Hi,
Based upon a variety of helpful postings, I am successfully using SAS and PROC HTTP to make REST API calls to Box.com. In the "working" column are getting OAUTH tokens, finding file and folder IDs, etc.
In the "close but not quite working" column is using Box.com's File upload APIs (see here). The file upload API uses a multi-part form to upload the file data. Based in part on things I read here, I developed code that partially works. In the code below, I can upload a text file to Box and it works fine. When I try to use this same code to upload a binary file (say, an *.egp file), the upload does not work correctly and what gets created is a corrupted file.
You can see in the code where the content-type is defined in the sub-form. Originally, I used "text/plain" and then switched to "application/octet-stream" so that it would expect binary data for the sub-form (note, when I use cURL to successfully upload the file, "octet-stream" is what it uses for the content-type).
Although I don't get any errors (from SAS or Box.com), the file is still not correct. Looking inside the REQUEST file that gets created, the file data is largely absent, and I think the issue may be that the infile statement is not reading the binary data properly. I tried modifying the infile statement to use recfm=N, but that results in errors from SAS, since it appears the put statements are incompatible with binary file processing.
So I'm stuck. Any ideas?
-Eric
*---------------------------------------------------------------------------------------;
*>> Create file request ***************************************************************;
*---------------------------------------------------------------------------------------;
data _null_;
infile copyfile end=eof;
file request;
if _n_ = 1 then do;
put "--&boundary";
put 'Content-Disposition: form-data; name="attributes"';
put ;
put '{"name":"' "&updestfile" '", "parent":{"id":"' "&boxfolderID" '"}}';
put "--&boundary";
put 'Content-Disposition: form-data; name="file"; filename="' "&updestfile" '"';
put "Content-Type: text/plain";
/* put "Content-Type: application/octet-stream";*/
put ;
end;
input;
put _infile_;
if eof then
do;
put "--&boundary--";
end;
run;
*---------------------------------------------------------------------------------------;
*>> Determine size of request and store in macro variable *****************************;
*---------------------------------------------------------------------------------------;
data _null_;
length bytes $1024;
fid = fopen("request");
rc = fread(fid);
bytes = finfo(fid, 'File Size (bytes)');
call symput("FileSize",trim(bytes));
rc = fclose(fid);
put bytes;
run;
*---------------------------------------------------------------------------------------;
*>> Submit the Upload request to Box **************************************************;
*---------------------------------------------------------------------------------------;
proc http
url="https://upload.box.com/api/2.0/files/content"
method = "POST"
out = resp
headerout = resphdrs
in = request
ct = "multipart/form-data; boundary=&boundary"
;
headers
"Authorization" = "Bearer &UPaccesstoken"
"Content-Length" = "&filesize"
;
run;
UPDATE:
Much closer to a fix, but not quite there.
I added an IGNOREDOSEOF option to the infile statement, and it now (almost) faithfully reads all the binary characters from the source file and transfers them to the PROC HTTP request file. The file gets created in Box.com, however, the file is still corrupt. Downloading the file from Box and comparing it (hex) to the original revealed the difference.
Below is a section of a screenshot (hex comparison) that shows an example. The left side of the screen is the source file, the right side is the file that gets created in Box. The red arrow points to where there is a null character in the source. The green arrow on the right (destination file) shows where this gets converted to a carriage return (0D hex). This happens throughout the file. As far as I can tell, this is the only difference between the two files.
Is there an option in SAS that controls this behavior? I see there is a TERMSTR option, but that does not seem to control this specific behavior.
-Eric
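A byte-level check like the one described above can also be done from within SAS. The following is only a minimal sketch, assuming the request fileref from the original post is still assigned; the variable name b and the 64-byte limit are arbitrary choices for illustration. It writes the first bytes of the file to the log as hex pairs, so a 00 that has turned into 0D is easy to spot.

```sas
/* Sketch: dump the first 64 bytes of the REQUEST file to the log as hex. */
/* Assumes the fileref "request" is assigned; "b" is an illustrative name. */
data _null_;
  infile request recfm=f lrecl=1 obs=64;  /* fixed one-byte records      */
  input b $char1.;                        /* read the byte unchanged     */
  put b $hex2. +1 @;                      /* two hex digits, stay on line */
run;
```

Running the same step against the copyfile fileref gives the source bytes to compare against, bearing in mind that the request file starts with the multipart headers rather than the file data.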
Tagging @JosephHenry - might be able to help.
When working with binary files, you need to "force" the DATA step to read and write one byte at a time.
The way you do this is with this little trick:
infile copyfile recfm=f lrecl=1; file request recfm=f lrecl=1;
But since you need to mix text and binary, it would probably be best to run multiple DATA steps, using mod to append to the request file.
This might work for you:
/* Step 1: write the text part of the form (headers), CRLF-terminated */
data _null_;
  file request termstr=CRLF;
  if _n_ = 1 then do;
    put "--&boundary";
    put 'Content-Disposition: form-data; name="attributes"';
    put ;
    put '{"name":"' "&updestfile" '", "parent":{"id":"' "&boxfolderID" '"}}';
    put "--&boundary";
    put 'Content-Disposition: form-data; name="file"; filename="' "&updestfile" '"';
    put "Content-Type: application/octet-stream";
    put ;
  end;
run;
/* Step 2: append the binary file one byte at a time */
data _null_;
  file request mod recfm=f lrecl=1;
  infile copyfile recfm=f lrecl=1;
  input;
  put _infile_;
run;
/* Step 3: append the closing boundary as text */
data _null_;
  file request mod termstr=CRLF;
  put "--&boundary--";
run;
Awesome. This worked.
I inserted a hard return ("put ;" ) immediately before the final boundary put statement in the last data step.
Thanks so much for your help.
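For anyone following along, the adjusted final step presumably looks like the sketch below. It assumes the same request fileref and &boundary macro variable used earlier in the thread, and the comments are added here for illustration. The extra line matters because, in a multipart body, the closing boundary has to start on its own line, so the binary data must be followed by a CRLF.

```sas
/* Sketch of the last DATA step with the extra hard return added. */
data _null_;
  file request mod termstr=CRLF;
  put ;                  /* CRLF that ends the binary part        */
  put "--&boundary--";   /* closing boundary on a line of its own */
run;
```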
I am working on getting multipart directly integrated into proc http, so hopefully in the future it will be easier.
I'm running into this same problem, but what you're proposing doesn't work. I get "unprocessable entity" errors...
Any ideas?
%let boundary=%sysfunc(uuidgen());
filename in '/folders/myfolders/test.txt';
data _null_;
  file in termstr=CRLF recfm=f lrecl=1;
  infile "&path./&filename." end=eof recfm=f lrecl=1 termstr=CRLF;
  if _n_ = 1 then do;
    put "--&boundary.";
    put 'Content-Disposition: form-data; name="file"; filename="data.xlsx"';
    put 'Content-Type: application/octet-stream';
    put ;
  end;
  input;
  put _infile_;
  if eof then do;
    put ;
    put "--&boundary.--";
  end;
run;
proc http
  method="post"
  url = "&url."
  in = in
  ct="multipart/form-data; boundary=&boundary."
  out = out
  headerout = hdrout
  headerin= hdrin
  HEADEROUT_OVERWRITE;
run;
What does your proc HTTP code look like?
I've just updated the original post to include the code.
If it's helpful, the response I'm getting back is:
HTTP/1.1 422 UNPROCESSABLE ENTITY
Server: openresty
Date: Thu, 13 Sep 2018 20:23:10 GMT
Content-Type: application/json
Content-Length: 51
Connection: keep-alive
X-Frame-Options: SAMEORIGIN
and
{"message": "The request did not contain any file"}
Incidentally, I'm certain that the excel file is valid.
You are having the same problem as the OP.
Try this:
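/* Step 1: write the text headers of the file part, CRLF-terminated */
data _null_;
  file in termstr=CRLF;
  if _n_ = 1 then do;
    put "--&boundary.";
    put 'Content-Disposition: form-data; name="file"; filename="data.xlsx"';
    put 'Content-Type: application/octet-stream';
    put ;
  end;
run;
/* Step 2: append the binary file one byte at a time */
data _null_;
  file in mod recfm=f lrecl=1;
  infile "&path./&filename." recfm=f lrecl=1;
  input;
  put _infile_;
run;
/* Step 3: append the closing boundary as text */
data _null_;
  file in mod termstr=CRLF;
  put ;  /* hard return so the closing boundary starts on its own line */
  put "--&boundary.--";
run;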
Yep... I already tried that. Same result:
HTTP/1.1 422 UNPROCESSABLE ENTITY
Server: openresty
Date: Thu, 13 Sep 2018 22:34:03 GMT
Content-Type: application/json
Content-Length: 51
Connection: keep-alive
X-Frame-Options: SAMEORIGIN
Set-Cookie: SERVERID=green; path=/
and
{"message": "The request did not contain any file"}
I checked the file which was received by the server. It's definitely corrupted.
My tests using the code that I posted all produce the correct multipart form output.
What are the additional headers you are sending?
I'm using the latest academic version. The only header that I'm sending is my API token. It is a text file with the following content:
Authentication: Token XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX