Sending data and a file to a REST API using Python? Then you’ll need a multipart POST request, and this post might save you a little bit of time!
I’ll show you how to construct a basic script that sends both form data and an arbitrary file (in this case a CSV), and point out where I initially had a few issues.
The finished article:
import requests
from pprint import pprint
file = { 'file' : ('csvtest.csv', open('csvtest.csv', 'rt'), 'text/csv') }
payload = { 'parser' : '{object trimmed for comment}' }
headers = { 'accept': "application/json", 'authorization': "Basic ZGxhZG1pbjp0aGlua2JpZw==", }
url = "http://localhost:8400/proxy/v1/schema-discovery/hive/sample-file"
req = requests.post(url, data=payload, files=file, headers=headers)
pprint(req.text)
Let’s go through that part by part:
import requests
This makes the ‘requests’ library available to Python. It’s similar in nature to the standard library’s http.client module, but it offers a higher-level interface that makes putting together a multipart request much simpler.
file = { 'file' : ('csvtest.csv', open('csvtest.csv', 'rt'), 'text/csv') }
This builds a dictionary that maps the form field name (‘file’) to a tuple holding the file name, the open file object, and the content type. In a lot of the documentation I found, the name and type were optional, but the API I was using needed both before it would accept the upload.
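One additional thing to note is the open() call. Again, most tutorials I found suggested opening in binary mode (‘rb’), and the requests documentation itself strongly recommends binary mode because text mode can lead to an incorrect Content-Length. In my case, though, I needed to open the file as text (‘rt’) to get the text to appear in the request body within the multipart request.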
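If you want to see what the file tuple actually turns into, one option is to build the request without sending it and print the body. This is only a sketch: the in-memory file, its contents, and the example.invalid URL are stand-ins I made up so that it runs without a server or a file on disk.

```python
import io

import requests

# In-memory stand-in for the opened CSV; the contents are invented for the demo.
fake_csv = io.BytesIO(b"id,name\n1,alice\n")
files = {"file": ("csvtest.csv", fake_csv, "text/csv")}

# Prepare the request without sending it. The URL is a placeholder.
prepared = requests.Request(
    "POST", "http://example.invalid/upload", files=files
).prepare()

# The multipart body is bytes; decode it so the part headers are readable.
body = prepared.body.decode("utf-8")
print(body)
```

The printed body should show the file name and content type from the tuple in the part headers, followed by the file contents, which makes it easier to compare against what your API expects.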
payload = { 'parser' : '{object trimmed for comment}' }
This is the data I wanted to send alongside the file. The parser value was a huge JSON string, so I’ve trimmed it here for readability.
headers = { 'accept': "application/json", 'authorization': "Basic ZGxhZG1pbjp0aGlua2JpZw==", }
The thing to note here is the absence of the ‘content-type’ header. Supplying it yourself seems to confuse things: in the last step, where we make the POST request, we pass both files and data, and requests recognises this and builds the multipart content-type header (including the boundary) for us when the request is built. A hand-written header replaces that generated one.
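A quick way to convince yourself of this is to prepare the same request twice, once with and once without a hand-written content-type, and compare the resulting header. The helper name content_type_for, the placeholder URL, and the tiny in-memory CSV are all assumptions made purely for illustration.

```python
import io

import requests


def content_type_for(headers):
    """Prepare (but do not send) a multipart request and return its Content-Type."""
    files = {"file": ("csvtest.csv", io.BytesIO(b"a,b\n1,2\n"), "text/csv")}
    req = requests.Request(
        "POST",
        "http://example.invalid/upload",  # placeholder URL, never contacted
        data={"parser": "{}"},
        files=files,
        headers=headers,
    )
    return req.prepare().headers["Content-Type"]


# Left alone, requests generates the header and includes a boundary.
auto = content_type_for({"accept": "application/json"})

# Set by hand, the header is kept as written and no boundary is added.
manual = content_type_for(
    {"accept": "application/json", "content-type": "multipart/form-data"}
)

print(auto)
print(manual)
```

Without the boundary in the header, the server has no way to split the body into its parts, which would explain the confusing failures.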
url = "http://localhost:8400/proxy/v1/schema-discovery/hive/sample-file"
The URI of the REST API I am trying to hit.
req = requests.post(url, data=payload, files=file, headers=headers)
This is the basic construct for the request. The HTTP verb (“post”) is chosen by the function we call; requests.post is a shortcut, and requests.request("POST", ...) does the same thing. We pass the data, files and headers as keyword arguments, and the call makes the request.
The package then creates the request body and sets the relevant multipart boundaries and inserts the relevant data.
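If you end up doing this more than once, the steps above could be wrapped in a small helper along these lines. Treat it as a sketch rather than a finished client: post_csv, its parameters, and the 30 second timeout are names and values I have made up, and the mode argument defaults to ‘rt’ only because that is what worked against this particular API.

```python
import os

import requests


def post_csv(url, csv_path, parser_json, headers=None, mode="rt", timeout=30):
    """Hypothetical helper: send one CSV file plus a 'parser' form field."""
    # The with-block closes the file once the request has been sent.
    with open(csv_path, mode) as fh:
        files = {"file": (os.path.basename(csv_path), fh, "text/csv")}
        response = requests.request(
            "POST",
            url,
            data={"parser": parser_json},
            files=files,
            headers=headers,  # deliberately no content-type in here
            timeout=timeout,
        )
    # Raise an exception for 4xx/5xx responses instead of failing silently.
    response.raise_for_status()
    return response
```

Calling post_csv(url, 'csvtest.csv', payload['parser'], headers=headers) would then be roughly equivalent to the script at the top, with the file being closed afterwards and HTTP errors surfacing as exceptions.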
I believe that you can pass a list of files into the files argument, although I have not tested that here.
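For what it’s worth, the requests documentation describes passing a list of (field name, file tuple) pairs for multiple files. The sketch below only prepares the request and inspects the body, so it shows what requests would send, not whether the API above accepts it. The file names, contents, and placeholder URL are invented for the demo.

```python
import io

import requests

# Two in-memory files sent under the same form field name.
files = [
    ("file", ("first.csv", io.BytesIO(b"a,b\n1,2\n"), "text/csv")),
    ("file", ("second.csv", io.BytesIO(b"c,d\n3,4\n"), "text/csv")),
]

# Prepare without sending; the URL is a placeholder.
prepared = requests.Request(
    "POST", "http://example.invalid/upload", files=files
).prepare()

body = prepared.body.decode("utf-8")
print(body)
```

Each file ends up as its own part in the multipart body. Whether the server wants repeated field names or distinct ones depends on the API, so check its documentation before relying on this.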
Hopefully this will save someone the time it took me to figure it out.