I used the following SAS code to read in all .csv file within a work directory. If I need to read in line 3 to line 153 of each .csv file, how should I modify the code? Thanks much for the help!
%macro drive(dir,ext);
%local cnt filrf rc did memcnt name;
%let cnt=0;
%let filrf=mydir;
%let rc=%sysfunc(filename(filrf,&dir));
%let did=%sysfunc(dopen(&filrf));
%if &did ne 0 %then %do;
%let memcnt=%sysfunc(dnum(&did));
%do i=1 %to &memcnt;
%let name=%qscan(%qsysfunc(dread(&did,&i)),-1,.);
%if %qupcase(%qsysfunc(dread(&did,&i))) ne %qupcase(&name) %then %do;
%if %superq(ext) = %superq(name) %then %do;
%let cnt=%eval(&cnt+1);
%put %qsysfunc(dread(&did,&i));
proc import datafile="&dir\%qsysfunc(dread(&did,&i))" out=dsn&cnt
dbms=csv replace;
run;
%end;
%end;
%end;
%end;
%else %put &dir cannot be opened.;
%let rc=%sysfunc(dclose(&did));
%mend drive;
%drive(c:\temp,csv)
Take ONE file and figure out how to read it. For example if you are using PROC IMPORT try adding the DATAROW= statement. If you are using your own data step try using the FIRSTOBS= option on the INFILE statement.
Then adapt that code to replace these lines in your current program.
proc import datafile="&dir\%qsysfunc(dread(&did,&i))" out=dsn&cnt
dbms=csv replace;
run;
Note that if you are reading all of the CSV files in a single directory and they all have the exact same layout then you can ditch the macro code and just read all of the file in one data step.
data want;
length fname filename $256;
infile 'c:\temp\*.csv' dsd truncover filename=fname;
input @;
filename=fname;
if filename ne lag(filename) then do;
* Skip first two lines ;
input; input; delete;
end;
* Read the data from the line ;
input ... ;
run;
Thanks much for your suggestions!
I can now read in the files after modifying the code as below. (suppose I want to read in line 4 to 253).
obs=253;
proc import datafile="&dir\%qsysfunc(dread(&did,&i))" out=dsn&cnt
dbms=csv replace;
guessingrows=max;
datarow=4;
Since I need to continue to work on those csv files after reading in, I have a following up question: in the output file, those files are named dsn1, dsn2,.... etc. How can I name them as their original names, such as ppp38, ppp42, ppp45, etc. Could you please direct me how to do that? Thanks!
@superbug wrote:
Thanks much for your suggestions!
I can now read in the files after modifying the code as below. (suppose I want to read in line 4 to 253).
obs=253; proc import datafile="&dir\%qsysfunc(dread(&did,&i))" out=dsn&cnt dbms=csv replace; guessingrows=max; datarow=4;
Since I need to continue to work on those csv files after reading in, I have a following up question: in the output file, those files are named dsn1, dsn2,.... etc. How can I name them as their original names, such as ppp38, ppp42, ppp45, etc. Could you please direct me how to do that? Thanks!
Look carefully at the code you posted. It is naming the dataset by appending the value of &CNT to the prefix string DSN. If you want to name the data based on the name of the CSV file then the names of the CSV files have to be valid SAS dataset names. They need to be 32 bytes or less in length. Use only letters, digits and underscore characters. And not start with a digit. If those rules apply then the %SCAN() function should let you pull out the part of the name before the period.
proc import
datafile="&dir\%qsysfunc(dread(&did,&i))"
out=%scan(%qsysfunc(dread(&did,&i)),1,.)
dbms=csv replace
;
If not then you are better off using the numbered dataset names. You could use the filename as the label on the dataset.
proc import
datafile="&dir\%qsysfunc(dread(&did,&i))"
out=dsn&cnt( label=%sysfunc(dread(&did,&i),$quote.) )
dbms=csv replace
;
Thanks much for your teaching!
Using the following code as you suggested, I can read in all the .csv file and get the desired SAS dataset. The name of the SAS datasets are something like PPP035, PPP046, PPP058, etc.
proc import
datafile="&dir\%qsysfunc(dread(&did,&i))"
out=%scan(%qsysfunc(dread(&did,&i)),1,.)
dbms=csv replace
;
If I need to continue working on those SAS datasets, for the code below (the part below the comments), could you please teach me how do something like that in the original code (after successfully read in all the .csv file)? Thanks a bunch!
%macro yl;
%let aform=PPP035 PPP054 PPP065 ......;
%do i=1 %to 32;
%let form=%scan(&aform,&i);
obs=253;
proc import datafile="....\&form..csv"
dmbs=csv out=&form replace;
guessingrows=max;
datarow=4;
run;
/*for the code below, how to do something like below into the original code? */
data &form;
set &form;
keep var3 var16 count;
count=_N_;
if var16=. then delete;
run;
%end;
%mend;
%yl;
Don't.
Do NOT use PROC IMPORT to read a series of files that have the same layout. Each different set of values could result in variables being defined differently in each dataset just because they contain different samples of the universe of possible records. Character variables will be define with different lengths. Some might even be defined as numeric just because in this particular subset the field only contains digits.
If the intent is to combine all of the files into one dataset then read them into one dataset to begin with. This will make sure that all of the variables are defined the same.
You can use the code that PROC IMPORT generates to help you write the code to read the file. (But note that PROC IMPORT writes really ugly data step code. You could write a much clearer and easier to maintain data step yourself).
I have my own version of reading in multiple .csv files from one directory as code below. It is not efficient since it needs to list all the name of the files. Imagine if there are hundreds of .csv file in one directory, the code below is not efficient. So I am wondering based on the code I posted originally, how to keep the original name of each .csv file. Thanks!
%macro yl;
%let aform=PPP28 PPP35 PPP43.....;
%do i=1 %to 28;
%let form=%scan(&aform,&i);
obs=153;
proc import datafile=".....\&form..csv"
dmbs=csv out=&form replace;
guessingrows=max;
datarow=4;
run;
%end;
%mend;
% yl;
Please answer some questions.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.