how to import mutiple CSV files with different datarows into SAS

Accepted Solution Solved
Reply
Highlighted
Contributor
Posts: 62
Accepted Solution

how to import mutiple CSV files with different datarows into SAS

Hello SAS guys,

I need to import mutiple CSV files into SAS. Each files has same formats, but the beginning datarow is different.

For example, the data row begins at line 15 in test1, while the data row begins at line 13 in test2.

My code is the below:

 

%macro readfile;

%do i=1 %to 2;

data want&i. ;
infile "test&i." dlm='|' dsd lrecl=32767 firstobs=&first.;
input @;
itemid=dequote(scan(_INFILE_,2,"|"));
itemtype=dequote(scan(_INFILE_,1,"|",'M'));
author=dequote(scan(_INFILE_,3,"|"));
put _all_;
run;

%end;

%mend readfile;

 

How can I get the firstobs=13/15  automatically instead of hardcoding 13 or 15.  Thanks very much for helping!


Accepted Solutions
Solution
a week ago
Super User
Super User
Posts: 7,076

Re: how to import mutiple CSV files with different datarows into SAS

Just find the starting point yourself.

data want&i ;
 length Item_Type $20 Item_ID $20 Author $20 CorrectResponse DQ 8 DQ_Type $20 Folder_Name1 Folder_Name2 Item_Type2 $20 ;
 infile "text&i" dsd dlm='|' truncover ; 
 if _n_=1 then do until(Item_Type='Item Type'); 
    input Item_Type / @;
 end;
 input Item_Type -- Item_Type2 ;
run;

View solution in original post


All Replies
Super User
Posts: 19,860

Re: how to import mutiple CSV files with different datarows into SAS

Read it all and if it's not relevant delete it?

 

Or read each individually and append after?

 

Or is there some systematic order of where the datarow starts that can be automated?

Contributor
Posts: 62

Re: how to import mutiple CSV files with different datarows into SAS

No, I need to read each file in every loop. Do not need to append them together.  Thanks for reply!

Super User
Posts: 11,343

Re: how to import mutiple CSV files with different datarows into SAS

Please take a look at this:

 

data want&i. ;
   infile "test&i." dlm='|' dsd lrecl=32767 truncover end= dataend;
   informat itemtype itemid author $25.;
   /* reads the lines untile the header "Item Type"| appears*/
   do until (scan(_infile_,1,'|"') in ('Item Type') );
      INPUT ;
   END;
   /* now read the data*/
   do while (dataend=0);
      input itemtype itemid author ;
      output;
   end;
run;

Instead of trying to find firstobs and the retroactively setting the firstobs infile option the above reads the code until it finds a key value that says that you are reading the header row. Then reads the values from there until the end of the file. If you have blank lines at the end of the file you'll have missing data. The infile option END sets flag that is set to one when the end of the file is encountered. So we use that to control a do loop to execute inputs until the end is encountered. The Explicit output is needed to output each record as read.

 

 

Notice that when you use delimited data the quotes in the data file are read between for the text so the dequote and scanning aren't needed. Setting the informat sets the length of the variables so you should set reasonable values for your data.

Super User
Super User
Posts: 7,076

Re: how to import mutiple CSV files with different datarows into SAS

Setting the informat sets the length of the variables so you should set reasonable values for your data.

Actually setting the INFORMAT tells SAS how to convert the text it reads into the value that it stores. But SAS already knows how to read character variables so adding $xx INFORMAT for a character variable does not really help SAS in any way.

 

It does have the SIDE EFFECT of defining the variables as being character variables with the lengths specified in the informat since the INFORMAT statement is the first place that you referenced the variable.

 

Why not just DEFINE the variables?

length itemtype itemid author $25;

 

Super User
Posts: 19,860

Re: how to import mutiple CSV files with different datarows into SAS

The first row of the data you want is different but there is data before. I see two options, one is to pre-process each file and determine the start row and pass that to your macro. A second is to use INPUT to find the ITEM TYPE line and then set a flag. If that flag is set, then you read the data. I think this works but haven't tested it, so if you do run into issues post your code, log and details.

 

retain first_flag;
input @@;
if first_flag=1 then do;
input ... ;
end;
if _infile_ in: ('"ITEM TYPE"') then first_flag=1;

Solution
a week ago
Super User
Super User
Posts: 7,076

Re: how to import mutiple CSV files with different datarows into SAS

Just find the starting point yourself.

data want&i ;
 length Item_Type $20 Item_ID $20 Author $20 CorrectResponse DQ 8 DQ_Type $20 Folder_Name1 Folder_Name2 Item_Type2 $20 ;
 infile "text&i" dsd dlm='|' truncover ; 
 if _n_=1 then do until(Item_Type='Item Type'); 
    input Item_Type / @;
 end;
 input Item_Type -- Item_Type2 ;
run;
Contributor
Posts: 62

Re: how to import mutiple CSV files with different datarows into SAS

Thank you Tom and Ballardw. I tried these two codes and both of them works. Thanks a lot!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 133 views
  • 0 likes
  • 4 in conversation