rcsutherland
Fluorite | Level 6

Hi guys,

 

I'm here again. Basically, I'm having a hard time debugging why my code doesn't work.

 

I would like to open an external file, capture some strings, and save the results in data step variables. Here is my code:

 

 

%let sPath = root/documents/files;

data work._output;
   set work._input;
   length lname, fname, path $512. filename $32.;
   
   if filename ne "" then do;
      path = symget('sPath');
      f2r = catx("/", path, filename);
     
      infile a filevar=f2r;
      input @'Firstname:' fname $;
      input @'Lastname:' lname $;
      
      output work._output;
   end;

run;

 

 

Basically, I have files in folder root/documents/files which are:

 

cus_01.txt

cus_02.txt

cus_03.txt

Each file has values like the ones below. Please note that these text files are free format and may contain other characters; the only fixed part is that there is a label before each value (e.g. Firstname: 'Lonzo').

 

cus_01.txt:

ID:   '100001'
Firstname:  'Lonzo'
Lastname:   'Ball'

cus_02.txt:

<DUMMY OR BLANK TEXT FILE THAT DOES NOT CONTAIN THE FIRSTNAME AND LASTNAME KEYWORD TO BE SEARCHED>

cus_03.txt:

ID:   '100002'
Firstname:  'Lebron'
Lastname:   'James'

 

My work._input data set would look like this

 

filename
cus_01.txt
cus_02.txt
cus_03.txt

MY PROBLEM:

 

The first iteration of the data step reads the first obs, which contains filename = cus_01.txt. Since filename is not blank, the step concatenates the path and the text file name, so for this obs f2r looks like this:

 

root/documents/files/cus_01.txt

 

Then INFILE opens cus_01.txt, and INPUT searches for the strings Firstname: and Lastname: and puts the results into the data step variables fname and lname respectively.

 

So for obs 1 this will be the work._output:

 

fname       lname   filename
Lonzo       Ball    cus_01.txt

 

Next it will process obs 2 of the work._input data set which is filename = cus_02.txt. 

 

Here comes the problem: since there is no "Firstname:" or "Lastname:" string in cus_02.txt, the whole data step stops and does not continue with obs 3 of work._input.

 

May I know why? I've tried combinations of the options TRUNCOVER, SCANOVER, and MISSOVER, but it still doesn't work.

 

Thank you for your help. 🙂

Accepted Solutions
Tom
Super User

Sounds like the issue is that your INPUT statement is reading past the end of one of the files.

That will end the whole data step.  That is actually the normal way for a data step to end.

 

You might need to modify your logic for searching the files.  Instead of using the @'text' column pointer to locate where the values are, you might want to just read the values as NAME/VALUE pairs.  You could either keep the data in that structure or add logic to put the VALUE into the proper variable based on the value of the NAME (essentially a transpose).

 

So you probably want to change your step to look more like this:

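data work._output;
   set work._input;
   length path f2r $512. fname lname $50.;

   /* Build the full path of the file named on this observation */
   path = symget('sPath');
   f2r = catx("/", path, filename);
   /* Only open files that actually exist */
   if filename ne ""  and fileexist(f2r) then do;
      /* END= is set when the last record of the current file is read */
      infile a filevar=f2r end=eof truncover ;
      do while (not eof);
         /* Read each line as a NAME/VALUE pair */
         input Name :$32. value $50. ;
         if Name = 'Firstname:' then fname=dequote(value);
         if Name = 'Lastname:' then lname=dequote(value);
      end;
   end;
   drop NAME VALUE;
run;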


5 REPLIES
ballardw
Super User

First thing is that your data step does not run because your length statement has commas. The variables on a length statement are space delimited, not comma delimited.

 

You should show the entire LOG for that data step. Please post the log with any notes, warnings or errors into a code box opened using the forum's {I} icon.

 

I don't think that you really need to use symget. I suspect

f2r = catx("/","&spath.", filename);

would work as well.
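
As a minimal sketch of that point (the literal file name and the variable names a and b are only for illustration), both forms should build the same path. The difference is timing: "&sPath." is resolved when the step is compiled, while symget() looks the value up while the step runs.

```sas
%let sPath = root/documents/files;

data _null_;
   length a b $512;
   /* Looked up at execution time */
   a = catx("/", symget('sPath'), "cus_01.txt");
   /* Resolved by the macro processor before the step compiles */
   b = catx("/", "&sPath.", "cus_01.txt");
   /* Write both values to the log for comparison */
   put a= b=;
run;
```

Note that the macro reference only resolves inside double quotes; inside single quotes it would be left as literal text.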

 

Is the case of the filename values in your data set correct?

And if filename is in work._input then there isn't a need to set the length for filename.

You do realize that the attempted length statement would have set the lengths for lname and fname to 512, don't you?

 

Your code does not read one record made up of 3 rows of the text file. Each INPUT statement advances to the next line for reading once its ending semicolon is reached. Yours does nothing for the ID row, so when "Firstname" is not found nothing is read; it then advances to the second row, where "Lastname" is not found, so again nothing is read. Your input statement would look more like

Input @'ID:' id $
      /@'Firstname:' fname $      
      / @'Lastname:' lname $
;            

If you want to read 3 lines into a single record. The / tells the program to advance to next line of the file.
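
A self-contained sketch of the three-line read, using in-stream DATALINES in place of the external files so it can be run as is. The data set name work.demo and the informat widths are assumptions; DEQUOTE is added only to strip the quotes around the sample values.

```sas
data work.demo;
   infile datalines truncover;
   length id $10 fname lname $20;
   /* One INPUT statement reads three consecutive lines; */
   /* each / moves the pointer to the next line          */
   input @'ID:' id :$10.
       / @'Firstname:' fname :$20.
       / @'Lastname:' lname :$20.;
   /* Remove the surrounding quotes from each value */
   id    = dequote(id);
   fname = dequote(fname);
   lname = dequote(lname);
   datalines;
ID:   '100001'
Firstname:  'Lonzo'
Lastname:   'Ball'
ID:   '100002'
Firstname:  'Lebron'
Lastname:   'James'
;
run;
```

This only works when every record has all three lines in this order. It does not by itself protect against a file that lacks the labels.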

rcsutherland
Fluorite | Level 6

Hi ballardw,

 

Your suggestion of using ' / ' worked the same as my double INPUT statement. However, it still doesn't solve my problem, where the data step stops after processing cus_02.txt, which has no matches for the strings 'Firstname' and 'Lastname'.

 

1.) Sorry length was a typo, I don't have commas just space.

2.) Case of the values of the filenames are correct.

3.) Yes, fname and lname would have a length of 512.

 

There are no WARNINGS and ERRORS. Here is the NOTE after each execution:

 

NOTE: SAS went to a new line when INPUT @'character string' scanned past the end of a line.
rcsutherland
Fluorite | Level 6
Hi Tom!

Thanks for your suggestion and sorry for the late reply. SAS mods marked this post as spam, so I didn't get to come back and reply soon enough.

I will try your suggestion, Tom. I didn't know that a SAS data step would end like that, considering I still have a SET statement which hasn't been fully processed yet?
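
To illustrate the point about the step ending early: a DATA step stops as soon as any read, SET or INPUT, runs past the end of its source, even if another source still has rows left. A small sketch with made-up data (work.keys, work.demo, and the in-stream values are hypothetical):

```sas
/* Five observations to drive the SET statement */
data work.keys;
   do k = 1 to 5;
      output;
   end;
run;

data work.demo;
   set work.keys;   /* 5 observations available   */
   input x;         /* only 2 data lines available */
   datalines;
10
20
;
run;
```

work.demo should end up with 2 observations, not 5. On the third iteration SET still has a row to read, but INPUT has none, and the step ends there. That is the same situation as an INPUT that runs off the end of cus_02.txt.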
rcsutherland
Fluorite | Level 6
I've tried this approach but I modified some part and it worked! I just added delimiter " : " and added some string transformation to clean up the result. Thanks!
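
The modified step was not posted, so the following is only a guess at what it might look like, assuming DLM=':' on the INFILE statement plus STRIP and DEQUOTE for the clean-up. The fileref a is a placeholder, as in the accepted solution, and the variable lengths are assumptions.

```sas
%let sPath = root/documents/files;

data work._output;
   set work._input;
   length f2r $512 name $32 value fname lname $50;

   f2r = catx("/", symget('sPath'), filename);
   if filename ne "" and fileexist(f2r) then do;
      /* Split each line at the colon: label on the left, value on the right */
      infile a filevar=f2r end=eof dlm=':' truncover;
      do while (not eof);
         input name :$32. value :$50.;
         /* Clean up: drop surrounding blanks, then the quotes */
         value = dequote(strip(value));
         if upcase(name) = 'FIRSTNAME' then fname = value;
         else if upcase(name) = 'LASTNAME' then lname = value;
      end;
   end;
   drop name value;
run;
```

With the colon as the delimiter, the label no longer includes the trailing ':', so the comparisons are against the bare names. A value that itself contains a colon would be cut at that colon, and how a completely empty file behaves with END= is worth checking against your own files.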
