BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Tesera
Obsidian | Level 7

Hello, I am new to SAS and I've been trying to write a sas code so it can automatically take all my csv tables from one directory. Mostly, I've found different proc import scripts, but I want to use data step, as proc import evaluated my columns differently from my what I need. This is my script.

data test;
	infile '\\PC0525\SAS_EG\IACS\AEZ_2008.csv'
delimiter=';'
		missover
		firstobs=2
		DSD
		LRECL=32767
		;
informat 

ID best12.	
VS 12. 
DOTT $15. 
;

format ID best12.;	
format VS best12.;	
format DOTT	$15.;

input 

ID
VS 
DOTT $	

run;

I have files from 2008 to 2019, and also different names Like AEZ_2008... ABC_2008..

Thank you for your responses! 

1 ACCEPTED SOLUTION

Accepted Solutions
Tesera
Obsidian | Level 7

If anybody is looking for same answer as me, this was the solution to my problem 🙂

 

data 
%macro smallLoop1(s,e);
	%do year = &s. %to &e.;
	want&year.
	%end;
%mend smallLoop1;
%smallloop1(2008,2014)
wantOther
;

  length F $ 200;
  infile "\\PC0525\SAS_EG\IACS\*.csv" dlm=";" filename=F EOV=EOV missover Firstobs=2 	DSD
		LRECL=32767;
  input @;

if not EOV then do;

informat 
VS 12. 
DOTT $15. 
SUMAEU best12.	
DOPL_1 best12. 
;

format VS best12.;	
format DOTT	$15.;
format SUMAEU best12.;	
format DOPL_1 best12.;	

input 
VS 
DOTT $	
SUMAEU	
DOPL_1
;

select(compress(scan(F,-1,"\/"),,"KD"));
%macro smallLoop2(s,e);
%do year = &s. %to &e.;
when ("&year.") output want&year.;
%end;
%mend smallLoop2;
%smallLoop2(2008,2014)
otherwise output wantOther;
end;
end;
EOV = 0;
run;

View solution in original post

26 REPLIES 26
yabwon
Onyx | Level 15

hi,

 

did you try asterisk(*):

data test;
	infile '\\PC0525\SAS_EG\IACS\*.csv'
delimiter=';'
...

?

 

All the best

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Tesera
Obsidian | Level 7

Thank you, that works perfectly!

And is there any way, it can make separate tables, different for each year, from 2008 to 2019?

yabwon
Onyx | Level 15

Check out the FILENAME= option in the INFILE statement.

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Tesera
Obsidian | Level 7

EDIT: It works but it also add names of every columns from all the tables again, in between.

yabwon
Onyx | Level 15

One solution would be like this [EDIT: lest make it more robust]:

filename f1 "%sysfunc(pathname(work))/file1.csvxxx";
filename f2 "%sysfunc(pathname(work))/file2.csvxxx";
filename f3 "%sysfunc(pathname(work))/file3.csvxxx";

data _null_;
  file f1;
  put "a,b,c";
  put "1,2,3";
  put "4,5,6";
  file f2;
  put "a,b,c";
  put "7,8,9";
  put "10,11,12";
  file f3;
  put "a,b,c";
  put "13,14,15";
  put "16,17,18";
run;

data want;
  length F $ 200;
  infile "%sysfunc(pathname(work))/*.csvxxx" dlm="," filename=F lrecl=512;
  if 1 = _N_ then 
    do;
      input @;
      length _header_ $ 512; retain _header_; drop _header_;
      _header_ = _infile_;
    end;
  input @;
  curent_file = F;
  if _infile_ ne _header_;
  input a b c;
run;
proc print;
run;

 

All the best

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



data_null__
Jade | Level 19

@yabwon Consider the INFILE statement option EOV= 

 

Capture.PNG

yabwon
Onyx | Level 15

Hi @data_null__ ,

 

I did consider EOV= but unfortunately asterisk(*) seems to not cooperate:

filename f1 "%sysfunc(pathname(work))/file1.csvxxx";
filename f2 "%sysfunc(pathname(work))/file2.csvxxx";
filename f3 "%sysfunc(pathname(work))/file3.csvxxx";

data _null_;
  file f1;
  put "a,b,c";
  put "1,2,3";
  put "4,5,6";
  file f2;
  put "a,b,c";
  put "7,8,9";
  put "10,11,12";
  file f3;
  put "a,b,c";
  put "13,14,15";
  put "16,17,18";
run;

data want;
  infile "%sysfunc(pathname(work))/*.csvxxx" dlm="," EOV=EOV;
  input @;
  EV=EOV;
  if EV then input;
             else input a b c;
run;
proc print;
run;

the result is:

Obs    EV    a    b    c

 1      0    .    .    .
 2      0    1    2    3
 3      0    4    5    6
 4      1    .    .    .
 5      1    .    .    .
 6      1    .    .    .
 7      1    .    .    .
 8      1    .    .    .
 9      1    .    .    .

All the best

Bart

 

P.S. 

I would do it probably like:

filename D DUMMY;
data want2;
  infile cards;
  input FN : $ 200.;
  FN = "%sysfunc(pathname(work))/" || FN;

  do until(eof);
    infile D FILEVAR=FN dlm="," FIRSTOBS=2 END=EOF;
      input a b c;
      output;
  end;
cards;
file1.csvxxx
file2.csvxxx
file3.csvxxx
;
run;

but I would replace cards with extraction data list from the folder.

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



data_null__
Jade | Level 19

You have to RESET EOV as mentioned in the tip.  

 

data want;
   infile "%sysfunc(pathname(work))/*.csvxxx" dlm="," EOV=EOV;
   input @;
   if _n_ eq 1 or eov then do;
      eov = 0;
      delete;
      end;
   input a b c;
   run;
yabwon
Onyx | Level 15

Haha, I need to examine my eyes! Didn't spot that Tip 🙂 Thank you!

 

Now it looks nice, and firstobs=2 still works:

data want;
  infile "%sysfunc(pathname(work))/*.csvxxx" dlm="," EOV=EOV Firstobs=2;
  input @;
  if not EOV then do; 
            input a b c; 
            output; 
          end;
  EOV = 0;
run;

 

All the best

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Tom
Super User Tom
Super User

I use that pattern of resetting it at the end of the datastep.  Works well for simple steps where every iteration traverses the full step.

For simple one line headers just conditionally execute an INPUT to remove the header.  Note that you need to use the FIRSTOBS option to skip the first header line.

data want;
  infile '*.txt' firstobs=2 eov=eov;
  input @;
  if eov then input;
  input .... ;
  eov=0;
run;

Note this will have trouble with 0 observation files that consist of just the header line.  In that case you might want to make it a little more complex by putting the extra INPUT in a do loop.  In which case then you no longer need to reset EOV at the end.

data want;
  infile '*.txt' firstobs=2 eov=eov;
  input @;
  do while(eov); eov=0; input / @; end;
  input .... ;
run;

 

Tom
Super User Tom
Super User

@Tesera wrote:

EDIT: It works but it also add names of every columns from all the tables again, in between.


If the goal is to keep them as separate years (which is probably not what you want)  you can make a loop (or a macro that is called multiple times) and read each file separately.  The FIRSTOBS=2 option on the infile statement will skip the header row each time.

 

When you are reading multiple files you will need to use other logic to skip the header rows.

You could look at the line and figure it out. That is easy if the first variable is numeric so the header is always different than any real line of data.

data want;
  infile .....;
  input @;
  if _infile_ =: 'VAR1' then delete;
  ...
run;

You could try to make heads or tails of the EOV option. But I have found using the FILENAME= option is easier.  Plus then you have it if you want to pull the YEAR or other information from the file's name.

data want;
  length fname $256 ;
  infile ..... firstobs=1 filename=fname ;
  input @;
  if fname ne lag(fname) then delete;
  ...
run;

 

Tesera
Obsidian | Level 7

I would like to keep them as separate years, all 2008 together, all 2009 together, etc. etc.

 

yabwon
Onyx | Level 15

Hi @Tesera ,

 

As I wrote use filename= option to extract file name and from the name the year, try below:

filename f1 "%sysfunc(pathname(work))/fileA_2008.csvxxx";
filename f2 "%sysfunc(pathname(work))/fileB_2009.csvxxx";
filename f3 "%sysfunc(pathname(work))/fileC_2019.csvxxx";

data _null_;
  file f1;
  put "a,b,c";
  put "1,2,3";
  put "4,5,6";
  file f2;
  put "a,b,c";
  put "7,8,9";
  put "10,11,12";
  file f3;
  put "a,b,c";
  put "13,14,15";
  put "16,17,18";
run;

data 
  %macro smallLoop1(s,e);
    %do year = &s. %to &e.;
      want&year.
    %end;
  %mend smallLoop1;
  %smallLoop1(2008,2019)
  wantOther
;
  length F $ 200;
  infile "%sysfunc(pathname(work))/*.csvxxx" dlm="," filename=F EOV=EOV Firstobs=2;
  input @;
  if not EOV then do; 
            input a b c;
            select(compress(scan(F,-1,"\/"),,"KD"));
              %macro smallLoop2(s,e);
                %do year = &s. %to &e.;
                  when("&year.") output want&year.;
                %end;
              %mend smallLoop2;
              %smallLoop2(2008,2019)
              otherwise  output wantOther;
            end; 
          end;
  EOV = 0;
run;

I'm using macro-loop here, it could be also done with hash-of-hashes, but this one do the job perfectly good.

Just adjust `input a b c;` to your needs and `%smallLoop1(2008,2019)` and `%smallLoop2(2008,2019)` to cover your data range.

 

All the best

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Tesera
Obsidian | Level 7

Thank you! I will try it out and let you know if it works for me.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 26 replies
  • 2354 views
  • 0 likes
  • 4 in conversation