Hello, I am new to SAS and I've been trying to write a sas code so it can automatically take all my csv tables from one directory. Mostly, I've found different proc import scripts, but I want to use data step, as proc import evaluated my columns differently from my what I need. This is my script.
data test; infile '\\PC0525\SAS_EG\IACS\AEZ_2008.csv' delimiter=';' missover firstobs=2 DSD LRECL=32767 ; informat ID best12. VS 12. DOTT $15. ; format ID best12.; format VS best12.; format DOTT $15.; input ID VS DOTT $ run;
I have files from 2008 to 2019, and also different names Like AEZ_2008... ABC_2008..
Thank you for your responses!
If anybody is looking for same answer as me, this was the solution to my problem 🙂
data
%macro smallLoop1(s,e);
%do year = &s. %to &e.;
want&year.
%end;
%mend smallLoop1;
%smallloop1(2008,2014)
wantOther
;
length F $ 200;
infile "\\PC0525\SAS_EG\IACS\*.csv" dlm=";" filename=F EOV=EOV missover Firstobs=2 DSD
LRECL=32767;
input @;
if not EOV then do;
informat
VS 12.
DOTT $15.
SUMAEU best12.
DOPL_1 best12.
;
format VS best12.;
format DOTT $15.;
format SUMAEU best12.;
format DOPL_1 best12.;
input
VS
DOTT $
SUMAEU
DOPL_1
;
select(compress(scan(F,-1,"\/"),,"KD"));
%macro smallLoop2(s,e);
%do year = &s. %to &e.;
when ("&year.") output want&year.;
%end;
%mend smallLoop2;
%smallLoop2(2008,2014)
otherwise output wantOther;
end;
end;
EOV = 0;
run;
hi,
did you try asterisk(*):
data test;
infile '\\PC0525\SAS_EG\IACS\*.csv'
delimiter=';'
...
?
All the best
Bart
Thank you, that works perfectly!
And is there any way, it can make separate tables, different for each year, from 2008 to 2019?
Check out the FILENAME= option in the INFILE statement.
Bart
EDIT: It works but it also add names of every columns from all the tables again, in between.
One solution would be like this [EDIT: lest make it more robust]:
filename f1 "%sysfunc(pathname(work))/file1.csvxxx";
filename f2 "%sysfunc(pathname(work))/file2.csvxxx";
filename f3 "%sysfunc(pathname(work))/file3.csvxxx";
data _null_;
file f1;
put "a,b,c";
put "1,2,3";
put "4,5,6";
file f2;
put "a,b,c";
put "7,8,9";
put "10,11,12";
file f3;
put "a,b,c";
put "13,14,15";
put "16,17,18";
run;
data want;
length F $ 200;
infile "%sysfunc(pathname(work))/*.csvxxx" dlm="," filename=F lrecl=512;
if 1 = _N_ then
do;
input @;
length _header_ $ 512; retain _header_; drop _header_;
_header_ = _infile_;
end;
input @;
curent_file = F;
if _infile_ ne _header_;
input a b c;
run;
proc print;
run;
All the best
Bart
Hi @data_null__ ,
I did consider EOV= but unfortunately asterisk(*) seems to not cooperate:
filename f1 "%sysfunc(pathname(work))/file1.csvxxx";
filename f2 "%sysfunc(pathname(work))/file2.csvxxx";
filename f3 "%sysfunc(pathname(work))/file3.csvxxx";
data _null_;
file f1;
put "a,b,c";
put "1,2,3";
put "4,5,6";
file f2;
put "a,b,c";
put "7,8,9";
put "10,11,12";
file f3;
put "a,b,c";
put "13,14,15";
put "16,17,18";
run;
data want;
infile "%sysfunc(pathname(work))/*.csvxxx" dlm="," EOV=EOV;
input @;
EV=EOV;
if EV then input;
else input a b c;
run;
proc print;
run;
the result is:
Obs EV a b c 1 0 . . . 2 0 1 2 3 3 0 4 5 6 4 1 . . . 5 1 . . . 6 1 . . . 7 1 . . . 8 1 . . . 9 1 . . .
All the best
Bart
P.S.
I would do it probably like:
filename D DUMMY;
data want2;
infile cards;
input FN : $ 200.;
FN = "%sysfunc(pathname(work))/" || FN;
do until(eof);
infile D FILEVAR=FN dlm="," FIRSTOBS=2 END=EOF;
input a b c;
output;
end;
cards;
file1.csvxxx
file2.csvxxx
file3.csvxxx
;
run;
but I would replace cards with extraction data list from the folder.
You have to RESET EOV as mentioned in the tip.
data want;
infile "%sysfunc(pathname(work))/*.csvxxx" dlm="," EOV=EOV;
input @;
if _n_ eq 1 or eov then do;
eov = 0;
delete;
end;
input a b c;
run;
Haha, I need to examine my eyes! Didn't spot that Tip 🙂 Thank you!
Now it looks nice, and firstobs=2 still works:
data want;
infile "%sysfunc(pathname(work))/*.csvxxx" dlm="," EOV=EOV Firstobs=2;
input @;
if not EOV then do;
input a b c;
output;
end;
EOV = 0;
run;
All the best
Bart
I use that pattern of resetting it at the end of the datastep. Works well for simple steps where every iteration traverses the full step.
For simple one line headers just conditionally execute an INPUT to remove the header. Note that you need to use the FIRSTOBS option to skip the first header line.
data want;
infile '*.txt' firstobs=2 eov=eov;
input @;
if eov then input;
input .... ;
eov=0;
run;
Note this will have trouble with 0 observation files that consist of just the header line. In that case you might want to make it a little more complex by putting the extra INPUT in a do loop. In which case then you no longer need to reset EOV at the end.
data want;
infile '*.txt' firstobs=2 eov=eov;
input @;
do while(eov); eov=0; input / @; end;
input .... ;
run;
@Tesera wrote:
EDIT: It works but it also add names of every columns from all the tables again, in between.
If the goal is to keep them as separate years (which is probably not what you want) you can make a loop (or a macro that is called multiple times) and read each file separately. The FIRSTOBS=2 option on the infile statement will skip the header row each time.
When you are reading multiple files you will need to use other logic to skip the header rows.
You could look at the line and figure it out. That is easy if the first variable is numeric so the header is always different than any real line of data.
data want;
infile .....;
input @;
if _infile_ =: 'VAR1' then delete;
...
run;
You could try to make heads or tails of the EOV option. But I have found using the FILENAME= option is easier. Plus then you have it if you want to pull the YEAR or other information from the file's name.
data want;
length fname $256 ;
infile ..... firstobs=1 filename=fname ;
input @;
if fname ne lag(fname) then delete;
...
run;
I would like to keep them as separate years, all 2008 together, all 2009 together, etc. etc.
Hi @Tesera ,
As I wrote use filename= option to extract file name and from the name the year, try below:
filename f1 "%sysfunc(pathname(work))/fileA_2008.csvxxx";
filename f2 "%sysfunc(pathname(work))/fileB_2009.csvxxx";
filename f3 "%sysfunc(pathname(work))/fileC_2019.csvxxx";
data _null_;
file f1;
put "a,b,c";
put "1,2,3";
put "4,5,6";
file f2;
put "a,b,c";
put "7,8,9";
put "10,11,12";
file f3;
put "a,b,c";
put "13,14,15";
put "16,17,18";
run;
data
%macro smallLoop1(s,e);
%do year = &s. %to &e.;
want&year.
%end;
%mend smallLoop1;
%smallLoop1(2008,2019)
wantOther
;
length F $ 200;
infile "%sysfunc(pathname(work))/*.csvxxx" dlm="," filename=F EOV=EOV Firstobs=2;
input @;
if not EOV then do;
input a b c;
select(compress(scan(F,-1,"\/"),,"KD"));
%macro smallLoop2(s,e);
%do year = &s. %to &e.;
when("&year.") output want&year.;
%end;
%mend smallLoop2;
%smallLoop2(2008,2019)
otherwise output wantOther;
end;
end;
EOV = 0;
run;
I'm using macro-loop here, it could be also done with hash-of-hashes, but this one do the job perfectly good.
Just adjust `input a b c;` to your needs and `%smallLoop1(2008,2019)` and `%smallLoop2(2008,2019)` to cover your data range.
All the best
Bart
Thank you! I will try it out and let you know if it works for me.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.