BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cmoore
Obsidian | Level 7

Hi,

 

I have run the code below to bring in a file via an API:

 

filename resp TEMP;

 

options validvarname=any;

 

proc http

   url="https://api.diffbot.com/v3/crawl/data?token=*************************&name=freshfields_ex&format=csv..."

   out=resp;

run;

 

proc import

file=resp

out=work.resp replace

dbms=csv;

getnames=yes;

run;

 

I have starred out my password but the file that I import is attached instead. The problem is when I try to import it into SAS. The file doesn't translate into a structured file. The columns should be type, description, diffbotUri, emailAddress, experience, name,office, pageUrl, practiceArea, professionalLevel. The columns and rows on the imported table are all out of synch.

 

Please could you help me to import it. The file is a "file" extension.

 

Thanks

Chris

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You cannot embed line breaks in the middle of a value.  It looks like your file does that, but it does add quotes around the fields that have the embedded line breaks, so you should be able to fix them.  See this thread.

https://communities.sas.com/t5/General-SAS-Programming/Carriage-Returns-Need-to-be-removed/td-p/8307...

Looks like your file should have only 1,613 lines instead of the 8,333 that it currently has.

48    data _null_ ;
49      if eof then put 'NOTE: Records read=' newn 'Records with missing quotes=' missq ;
50      infile old lrecl=10000 end=eof ;
51      file new lrecl=100000;
52      nq=0;
53      do until (mod(nq,2)=0 or eof );
54         input;
55         newn+1;
56         nq = nq + countc(_infile_,'"');
57         put _infile_ @;
58         if mod(nq,2) then do;
59           missq+1;
60           put '|' @;
61         end;
62      end;
63      put;
64    run;

NOTE: The infile OLD is:
      Filename=c:\downloads\sample.csv,
      RECFM=V,LRECL=10000,File Size (bytes)=1736314,
      Last Modified=22Oct2017:11:42:34,
      Create Time=22Oct2017:15:42:18

NOTE: The file NEW is:
      Filename=c:\downloads\fixed.csv,
      RECFM=V,LRECL=100000,File Size (bytes)=0,
      Last Modified=22Oct2017:15:57:31,
      Create Time=22Oct2017:15:56:08

NOTE: Records read=8333 Records with missing quotes=6720
NOTE: 8333 records were read from the infile OLD.
      The minimum record length was 0.
      The maximum record length was 1056.
NOTE: 1613 records were written to the file NEW.
      The minimum record length was 102.
      The maximum record length was 7781.

View solution in original post

4 REPLIES 4
jklaverstijn
Rhodochrosite | Level 12

Chris,

 

Your file contains text fields with embedded line breaks. That's what throws of proc import. A very common thing to happen.

 

Many examples have been given in this forum to deal with that. Have a look to see if any fits your needs (often but not always the termstr option on the filename statement helps). All I can do now is to let you now. I am not in a position now to experiment with this.

 

Good luck,

- Jan.

Tom
Super User Tom
Super User

Your file appears to contain Unicode characters. Make sure you are using utf-8 encoding and not something like latin1 or wlatin1 that cannot represent all possible unicode characters.

Tom
Super User Tom
Super User

You cannot embed line breaks in the middle of a value.  It looks like your file does that, but it does add quotes around the fields that have the embedded line breaks, so you should be able to fix them.  See this thread.

https://communities.sas.com/t5/General-SAS-Programming/Carriage-Returns-Need-to-be-removed/td-p/8307...

Looks like your file should have only 1,613 lines instead of the 8,333 that it currently has.

48    data _null_ ;
49      if eof then put 'NOTE: Records read=' newn 'Records with missing quotes=' missq ;
50      infile old lrecl=10000 end=eof ;
51      file new lrecl=100000;
52      nq=0;
53      do until (mod(nq,2)=0 or eof );
54         input;
55         newn+1;
56         nq = nq + countc(_infile_,'"');
57         put _infile_ @;
58         if mod(nq,2) then do;
59           missq+1;
60           put '|' @;
61         end;
62      end;
63      put;
64    run;

NOTE: The infile OLD is:
      Filename=c:\downloads\sample.csv,
      RECFM=V,LRECL=10000,File Size (bytes)=1736314,
      Last Modified=22Oct2017:11:42:34,
      Create Time=22Oct2017:15:42:18

NOTE: The file NEW is:
      Filename=c:\downloads\fixed.csv,
      RECFM=V,LRECL=100000,File Size (bytes)=0,
      Last Modified=22Oct2017:15:57:31,
      Create Time=22Oct2017:15:56:08

NOTE: Records read=8333 Records with missing quotes=6720
NOTE: 8333 records were read from the infile OLD.
      The minimum record length was 0.
      The maximum record length was 1056.
NOTE: 1613 records were written to the file NEW.
      The minimum record length was 102.
      The maximum record length was 7781.
cmoore
Obsidian | Level 7

Thanks very much for your help. This worked well for me.


Cheers


Chris

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1649 views
  • 0 likes
  • 3 in conversation