Importing a File

Accepted Solution Solved
Reply
Contributor
Posts: 43
Accepted Solution

Importing a File

Hi,

 

I have run the code below to bring in a file via an API:

 

filename resp TEMP;

 

options validvarname=any;

 

proc http

   url="https://api.diffbot.com/v3/crawl/data?token=*************************&name=freshfields_ex&format=csv..."

   out=resp;

run;

 

proc import

file=resp

out=work.resp replace

dbms=csv;

getnames=yes;

run;

 

I have starred out my password but the file that I import is attached instead. The problem is when I try to import it into SAS. The file doesn't translate into a structured file. The columns should be type, description, diffbotUri, emailAddress, experience, name,office, pageUrl, practiceArea, professionalLevel. The columns and rows on the imported table are all out of synch.

 

Please could you help me to import it. The file is a "file" extension.

 

Thanks

Chris

Attachment

Accepted Solutions
Solution
‎10-23-2017 06:00 AM
Super User
Super User
Posts: 7,405

Re: Importing a File

[ Edited ]

You cannot embed line breaks in the middle of a value.  It looks like your file does that, but it does add quotes around the fields that have the embedded line breaks, so you should be able to fix them.  See this thread.

https://communities.sas.com/t5/General-SAS-Programming/Carriage-Returns-Need-to-be-removed/td-p/8307...

Looks like your file should have only 1,613 lines instead of the 8,333 that it currently has.

48    data _null_ ;
49      if eof then put 'NOTE: Records read=' newn 'Records with missing quotes=' missq ;
50      infile old lrecl=10000 end=eof ;
51      file new lrecl=100000;
52      nq=0;
53      do until (mod(nq,2)=0 or eof );
54         input;
55         newn+1;
56         nq = nq + countc(_infile_,'"');
57         put _infile_ @;
58         if mod(nq,2) then do;
59           missq+1;
60           put '|' @;
61         end;
62      end;
63      put;
64    run;

NOTE: The infile OLD is:
      Filename=c:\downloads\sample.csv,
      RECFM=V,LRECL=10000,File Size (bytes)=1736314,
      Last Modified=22Oct2017:11:42:34,
      Create Time=22Oct2017:15:42:18

NOTE: The file NEW is:
      Filename=c:\downloads\fixed.csv,
      RECFM=V,LRECL=100000,File Size (bytes)=0,
      Last Modified=22Oct2017:15:57:31,
      Create Time=22Oct2017:15:56:08

NOTE: Records read=8333 Records with missing quotes=6720
NOTE: 8333 records were read from the infile OLD.
      The minimum record length was 0.
      The maximum record length was 1056.
NOTE: 1613 records were written to the file NEW.
      The minimum record length was 102.
      The maximum record length was 7781.

View solution in original post


All Replies
Super Contributor
Posts: 493

Re: Importing a File

Chris,

 

Your file contains text fields with embedded line breaks. That's what throws of proc import. A very common thing to happen.

 

Many examples have been given in this forum to deal with that. Have a look to see if any fits your needs (often but not always the termstr option on the filename statement helps). All I can do now is to let you now. I am not in a position now to experiment with this.

 

Good luck,

- Jan.

Super User
Super User
Posts: 7,405

Re: Importing a File

Your file appears to contain Unicode characters. Make sure you are using utf-8 encoding and not something like latin1 or wlatin1 that cannot represent all possible unicode characters.

Solution
‎10-23-2017 06:00 AM
Super User
Super User
Posts: 7,405

Re: Importing a File

[ Edited ]

You cannot embed line breaks in the middle of a value.  It looks like your file does that, but it does add quotes around the fields that have the embedded line breaks, so you should be able to fix them.  See this thread.

https://communities.sas.com/t5/General-SAS-Programming/Carriage-Returns-Need-to-be-removed/td-p/8307...

Looks like your file should have only 1,613 lines instead of the 8,333 that it currently has.

48    data _null_ ;
49      if eof then put 'NOTE: Records read=' newn 'Records with missing quotes=' missq ;
50      infile old lrecl=10000 end=eof ;
51      file new lrecl=100000;
52      nq=0;
53      do until (mod(nq,2)=0 or eof );
54         input;
55         newn+1;
56         nq = nq + countc(_infile_,'"');
57         put _infile_ @;
58         if mod(nq,2) then do;
59           missq+1;
60           put '|' @;
61         end;
62      end;
63      put;
64    run;

NOTE: The infile OLD is:
      Filename=c:\downloads\sample.csv,
      RECFM=V,LRECL=10000,File Size (bytes)=1736314,
      Last Modified=22Oct2017:11:42:34,
      Create Time=22Oct2017:15:42:18

NOTE: The file NEW is:
      Filename=c:\downloads\fixed.csv,
      RECFM=V,LRECL=100000,File Size (bytes)=0,
      Last Modified=22Oct2017:15:57:31,
      Create Time=22Oct2017:15:56:08

NOTE: Records read=8333 Records with missing quotes=6720
NOTE: 8333 records were read from the infile OLD.
      The minimum record length was 0.
      The maximum record length was 1056.
NOTE: 1613 records were written to the file NEW.
      The minimum record length was 102.
      The maximum record length was 7781.
Contributor
Posts: 43

Re: Importing a File

Thanks very much for your help. This worked well for me.


Cheers


Chris

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 164 views
  • 0 likes
  • 3 in conversation