BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
daisy6
Quartz | Level 8

Hello SAS guys,

I need to import mutiple CSV files into SAS. Each files has same formats, but the beginning datarow is different.

For example, the data row begins at line 15 in test1, while the data row begins at line 13 in test2.

My code is the below:

 

%macro readfile;

%do i=1 %to 2;

data want&i. ;
infile "test&i." dlm='|' dsd lrecl=32767 firstobs=&first.;
input @;
itemid=dequote(scan(_INFILE_,2,"|"));
itemtype=dequote(scan(_INFILE_,1,"|",'M'));
author=dequote(scan(_INFILE_,3,"|"));
put _all_;
run;

%end;

%mend readfile;

 

How can I get the firstobs=13/15  automatically instead of hardcoding 13 or 15.  Thanks very much for helping!

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Just find the starting point yourself.

data want&i ;
 length Item_Type $20 Item_ID $20 Author $20 CorrectResponse DQ 8 DQ_Type $20 Folder_Name1 Folder_Name2 Item_Type2 $20 ;
 infile "text&i" dsd dlm='|' truncover ; 
 if _n_=1 then do until(Item_Type='Item Type'); 
    input Item_Type / @;
 end;
 input Item_Type -- Item_Type2 ;
run;

View solution in original post

7 REPLIES 7
Reeza
Super User

Read it all and if it's not relevant delete it?

 

Or read each individually and append after?

 

Or is there some systematic order of where the datarow starts that can be automated?

daisy6
Quartz | Level 8

No, I need to read each file in every loop. Do not need to append them together.  Thanks for reply!

ballardw
Super User

Please take a look at this:

 

data want&i. ;
   infile "test&i." dlm='|' dsd lrecl=32767 truncover end= dataend;
   informat itemtype itemid author $25.;
   /* reads the lines untile the header "Item Type"| appears*/
   do until (scan(_infile_,1,'|"') in ('Item Type') );
      INPUT ;
   END;
   /* now read the data*/
   do while (dataend=0);
      input itemtype itemid author ;
      output;
   end;
run;

Instead of trying to find firstobs and the retroactively setting the firstobs infile option the above reads the code until it finds a key value that says that you are reading the header row. Then reads the values from there until the end of the file. If you have blank lines at the end of the file you'll have missing data. The infile option END sets flag that is set to one when the end of the file is encountered. So we use that to control a do loop to execute inputs until the end is encountered. The Explicit output is needed to output each record as read.

 

 

Notice that when you use delimited data the quotes in the data file are read between for the text so the dequote and scanning aren't needed. Setting the informat sets the length of the variables so you should set reasonable values for your data.

Tom
Super User Tom
Super User

Setting the informat sets the length of the variables so you should set reasonable values for your data.

Actually setting the INFORMAT tells SAS how to convert the text it reads into the value that it stores. But SAS already knows how to read character variables so adding $xx INFORMAT for a character variable does not really help SAS in any way.

 

It does have the SIDE EFFECT of defining the variables as being character variables with the lengths specified in the informat since the INFORMAT statement is the first place that you referenced the variable.

 

Why not just DEFINE the variables?

length itemtype itemid author $25;

 

Reeza
Super User

The first row of the data you want is different but there is data before. I see two options, one is to pre-process each file and determine the start row and pass that to your macro. A second is to use INPUT to find the ITEM TYPE line and then set a flag. If that flag is set, then you read the data. I think this works but haven't tested it, so if you do run into issues post your code, log and details.

 

retain first_flag;
input @@;
if first_flag=1 then do;
input ... ;
end;
if _infile_ in: ('"ITEM TYPE"') then first_flag=1;

Tom
Super User Tom
Super User

Just find the starting point yourself.

data want&i ;
 length Item_Type $20 Item_ID $20 Author $20 CorrectResponse DQ 8 DQ_Type $20 Folder_Name1 Folder_Name2 Item_Type2 $20 ;
 infile "text&i" dsd dlm='|' truncover ; 
 if _n_=1 then do until(Item_Type='Item Type'); 
    input Item_Type / @;
 end;
 input Item_Type -- Item_Type2 ;
run;
daisy6
Quartz | Level 8

Thank you Tom and Ballardw. I tried these two codes and both of them works. Thanks a lot!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 719 views
  • 0 likes
  • 4 in conversation