BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cmoore
Obsidian | Level 7

Hi,

 

I have run the code below to bring in a file via an API:

 

filename resp TEMP;

 

options validvarname=any;

 

proc http

   url="https://api.diffbot.com/v3/crawl/data?token=*************************&name=freshfields_ex&format=csv..."

   out=resp;

run;

 

proc import

file=resp

out=work.resp replace

dbms=csv;

getnames=yes;

run;

 

I have starred out my password but the file that I import is attached instead. The problem is when I try to import it into SAS. The file doesn't translate into a structured file. The columns should be type, description, diffbotUri, emailAddress, experience, name,office, pageUrl, practiceArea, professionalLevel. The columns and rows on the imported table are all out of synch.

 

Please could you help me to import it. The file is a "file" extension.

 

Thanks

Chris

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You cannot embed line breaks in the middle of a value.  It looks like your file does that, but it does add quotes around the fields that have the embedded line breaks, so you should be able to fix them.  See this thread.

https://communities.sas.com/t5/General-SAS-Programming/Carriage-Returns-Need-to-be-removed/td-p/8307...

Looks like your file should have only 1,613 lines instead of the 8,333 that it currently has.

48    data _null_ ;
49      if eof then put 'NOTE: Records read=' newn 'Records with missing quotes=' missq ;
50      infile old lrecl=10000 end=eof ;
51      file new lrecl=100000;
52      nq=0;
53      do until (mod(nq,2)=0 or eof );
54         input;
55         newn+1;
56         nq = nq + countc(_infile_,'"');
57         put _infile_ @;
58         if mod(nq,2) then do;
59           missq+1;
60           put '|' @;
61         end;
62      end;
63      put;
64    run;

NOTE: The infile OLD is:
      Filename=c:\downloads\sample.csv,
      RECFM=V,LRECL=10000,File Size (bytes)=1736314,
      Last Modified=22Oct2017:11:42:34,
      Create Time=22Oct2017:15:42:18

NOTE: The file NEW is:
      Filename=c:\downloads\fixed.csv,
      RECFM=V,LRECL=100000,File Size (bytes)=0,
      Last Modified=22Oct2017:15:57:31,
      Create Time=22Oct2017:15:56:08

NOTE: Records read=8333 Records with missing quotes=6720
NOTE: 8333 records were read from the infile OLD.
      The minimum record length was 0.
      The maximum record length was 1056.
NOTE: 1613 records were written to the file NEW.
      The minimum record length was 102.
      The maximum record length was 7781.

View solution in original post

4 REPLIES 4
jklaverstijn
Rhodochrosite | Level 12

Chris,

 

Your file contains text fields with embedded line breaks. That's what throws of proc import. A very common thing to happen.

 

Many examples have been given in this forum to deal with that. Have a look to see if any fits your needs (often but not always the termstr option on the filename statement helps). All I can do now is to let you now. I am not in a position now to experiment with this.

 

Good luck,

- Jan.

Tom
Super User Tom
Super User

Your file appears to contain Unicode characters. Make sure you are using utf-8 encoding and not something like latin1 or wlatin1 that cannot represent all possible unicode characters.

Tom
Super User Tom
Super User

You cannot embed line breaks in the middle of a value.  It looks like your file does that, but it does add quotes around the fields that have the embedded line breaks, so you should be able to fix them.  See this thread.

https://communities.sas.com/t5/General-SAS-Programming/Carriage-Returns-Need-to-be-removed/td-p/8307...

Looks like your file should have only 1,613 lines instead of the 8,333 that it currently has.

48    data _null_ ;
49      if eof then put 'NOTE: Records read=' newn 'Records with missing quotes=' missq ;
50      infile old lrecl=10000 end=eof ;
51      file new lrecl=100000;
52      nq=0;
53      do until (mod(nq,2)=0 or eof );
54         input;
55         newn+1;
56         nq = nq + countc(_infile_,'"');
57         put _infile_ @;
58         if mod(nq,2) then do;
59           missq+1;
60           put '|' @;
61         end;
62      end;
63      put;
64    run;

NOTE: The infile OLD is:
      Filename=c:\downloads\sample.csv,
      RECFM=V,LRECL=10000,File Size (bytes)=1736314,
      Last Modified=22Oct2017:11:42:34,
      Create Time=22Oct2017:15:42:18

NOTE: The file NEW is:
      Filename=c:\downloads\fixed.csv,
      RECFM=V,LRECL=100000,File Size (bytes)=0,
      Last Modified=22Oct2017:15:57:31,
      Create Time=22Oct2017:15:56:08

NOTE: Records read=8333 Records with missing quotes=6720
NOTE: 8333 records were read from the infile OLD.
      The minimum record length was 0.
      The maximum record length was 1056.
NOTE: 1613 records were written to the file NEW.
      The minimum record length was 102.
      The maximum record length was 7781.
cmoore
Obsidian | Level 7

Thanks very much for your help. This worked well for me.


Cheers


Chris

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 850 views
  • 0 likes
  • 3 in conversation