Hello SAS guys,
I need to import mutiple CSV files into SAS. Each files has same formats, but the beginning datarow is different.
For example, the data row begins at line 15 in test1, while the data row begins at line 13 in test2.
My code is the below:
%macro readfile;
%do i=1 %to 2;
data want&i. ;
infile "test&i." dlm='|' dsd lrecl=32767 firstobs=&first.;
input @;
itemid=dequote(scan(_INFILE_,2,"|"));
itemtype=dequote(scan(_INFILE_,1,"|",'M'));
author=dequote(scan(_INFILE_,3,"|"));
put _all_;
run;
%end;
%mend readfile;
How can I get the firstobs=13/15 automatically instead of hardcoding 13 or 15. Thanks very much for helping!
Just find the starting point yourself.
data want&i ;
 length Item_Type $20 Item_ID $20 Author $20 CorrectResponse DQ 8 DQ_Type $20 Folder_Name1 Folder_Name2 Item_Type2 $20 ;
 infile "text&i" dsd dlm='|' truncover ; 
 if _n_=1 then do until(Item_Type='Item Type'); 
    input Item_Type / @;
 end;
 input Item_Type -- Item_Type2 ;
run;Read it all and if it's not relevant delete it?
Or read each individually and append after?
Or is there some systematic order of where the datarow starts that can be automated?
No, I need to read each file in every loop. Do not need to append them together. Thanks for reply!
Please take a look at this:
data want&i. ;
   infile "test&i." dlm='|' dsd lrecl=32767 truncover end= dataend;
   informat itemtype itemid author $25.;
   /* reads the lines untile the header "Item Type"| appears*/
   do until (scan(_infile_,1,'|"') in ('Item Type') );
      INPUT ;
   END;
   /* now read the data*/
   do while (dataend=0);
      input itemtype itemid author ;
      output;
   end;
run;
Instead of trying to find firstobs and the retroactively setting the firstobs infile option the above reads the code until it finds a key value that says that you are reading the header row. Then reads the values from there until the end of the file. If you have blank lines at the end of the file you'll have missing data. The infile option END sets flag that is set to one when the end of the file is encountered. So we use that to control a do loop to execute inputs until the end is encountered. The Explicit output is needed to output each record as read.
Notice that when you use delimited data the quotes in the data file are read between for the text so the dequote and scanning aren't needed. Setting the informat sets the length of the variables so you should set reasonable values for your data.
Setting the informat sets the length of the variables so you should set reasonable values for your data.
Actually setting the INFORMAT tells SAS how to convert the text it reads into the value that it stores. But SAS already knows how to read character variables so adding $xx INFORMAT for a character variable does not really help SAS in any way.
It does have the SIDE EFFECT of defining the variables as being character variables with the lengths specified in the informat since the INFORMAT statement is the first place that you referenced the variable.
Why not just DEFINE the variables?
length itemtype itemid author $25;
The first row of the data you want is different but there is data before. I see two options, one is to pre-process each file and determine the start row and pass that to your macro. A second is to use INPUT to find the ITEM TYPE line and then set a flag. If that flag is set, then you read the data. I think this works but haven't tested it, so if you do run into issues post your code, log and details.
retain first_flag;
input @@;
if first_flag=1 then do;
input ... ;
end;
if _infile_ in: ('"ITEM TYPE"') then first_flag=1;
Just find the starting point yourself.
data want&i ;
 length Item_Type $20 Item_ID $20 Author $20 CorrectResponse DQ 8 DQ_Type $20 Folder_Name1 Folder_Name2 Item_Type2 $20 ;
 infile "text&i" dsd dlm='|' truncover ; 
 if _n_=1 then do until(Item_Type='Item Type'); 
    input Item_Type / @;
 end;
 input Item_Type -- Item_Type2 ;
run;Thank you Tom and Ballardw. I tried these two codes and both of them works. Thanks a lot!
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
