- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I have several sas datasets saved in a folder. The sas datasets have similar names like file_20210701.sas7bdat, file_20210601.sas7bdat, file_20210501 etc. They also have the same column names. I want to combine all of these files to make 1 dataset but before merging them all together, I have to add a new variable, date to each dataset. Date is the date in the filename. I have already made a code where I get the names of my files in sas library then I made a macro where I add the date. But my code is very long. Is there an easier way to do this?
Thanks!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Since a SET statement option INDSNAME will make the name of a contributing data set available in a data step you can combine and add the variable by parsing that variable. I really hope you are assigning a library to where those files are stored instead of referencing them file path and name.
Dummy code
data want; set lib.file_20210501 lib.file_20210601 lib.file_20210701 indsname=fname; /*this creates a character variable such as 20210501*/ length fdate $ 8. ; /* the position of 10 is using a 3 character Libname, adjust for yours*/ fdate= substr(fname,10); /* if you want an actual SAS Date value*/ fdate2 = input(substr(fname,10,8),yymmdd8); format fdate2 yymmdd10.; run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Posting the code is necessary to find out if it could be improved.
imho having datasets with same structure containing data of different reference date, is a bad design idea and creates the need of additional code, when multiple datasets are used in one job. So i would append all those datasets, add a variable holding the reference date and the new variable.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Since a SET statement option INDSNAME will make the name of a contributing data set available in a data step you can combine and add the variable by parsing that variable. I really hope you are assigning a library to where those files are stored instead of referencing them file path and name.
Dummy code
data want; set lib.file_20210501 lib.file_20210601 lib.file_20210701 indsname=fname; /*this creates a character variable such as 20210501*/ length fdate $ 8. ; /* the position of 10 is using a 3 character Libname, adjust for yours*/ fdate= substr(fname,10); /* if you want an actual SAS Date value*/ fdate2 = input(substr(fname,10,8),yymmdd8); format fdate2 yymmdd10.; run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you for this. Unfortunately I'm using an older version of SAS(7.1)
This is what my code looks like.
%macro loop(file,date,entry);
proc sql;
create table &file as select *, &entry as entry_date format=date9.
from &date where 'effective Date'n between '01jan2018'd and '01jul2021'd;
quit;
%mend loop;
%loop(file1,lib.DA20171214,'14dec2017'd)
%loop(file2,lib.DA20180116,'16jan2018'd)
%loop(file3,lib.DA20180402,'02apr2018'd)
%loop(file4,lib.DA20180502,'02may2018'd)
%loop(file5,lib.DA20180601,'01jun2018'd)
%loop(file6,lib.DA20180702,'02jul2018'd)
%loop(file7,lib.DA20180801,'01aug2018'd)
%loop(file8,lib.DA20180903,'03sep2018'd)
%loop(file9,lib.DA20181001,'01oct2018'd)
.
.
.
data want; set file:;run;
Is there a simpler way to do this without using indsname?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Why is the code @ballardw posted not working for you? What's the error?
Is 7.1 the version of your EG client or really your SAS version? I rather doubt that it's the SAS version.
To verify your SAS version please execute the following as code and let us know what you get in the SAS log:
%put &=SYSVLONG;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@JT99 wrote:
Thank you for this. Unfortunately I'm using an older version of SAS(7.1)
This is what my code looks like.
%macro loop(file,date,entry);
proc sql;
create table &file as select *, &entry as entry_date format=date9.
from &date where 'effective Date'n between '01jan2018'd and '01jul2021'd;
quit;
%mend loop;%loop(file1,lib.DA20171214,'14dec2017'd)
%loop(file2,lib.DA20180116,'16jan2018'd)
%loop(file3,lib.DA20180402,'02apr2018'd)
%loop(file4,lib.DA20180502,'02may2018'd)
%loop(file5,lib.DA20180601,'01jun2018'd)
%loop(file6,lib.DA20180702,'02jul2018'd)
%loop(file7,lib.DA20180801,'01aug2018'd)
%loop(file8,lib.DA20180903,'03sep2018'd)
%loop(file9,lib.DA20181001,'01oct2018'd).
.
.
data want; set file:;run;
Is there a simpler way to do this without using indsname?
You original question says that you have "The sas datasets have similar names like file_20210701.sas7bdat, file_20210601.sas7bdat, file_20210501 etc.". The code you show above does not create or use data sets named that way. It uses sets named like "lib.DA20171214" and creates "file1" (which would be in the WORK library). So of course my code won't work the names of the "file" sets do not have the information in them that you stated was there, the 20210701 for example.
So now it is up to you to show 1) the actual names of the data sets you want to combine and 2) what you expect the output to look like (the added variable).
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Your code is already attempting to add the variable ENTRY_DATE.
Does it not work? If not then try changing the order you have the new variable in the SELECT statement.
select &entry as entry_date format=date9., *
When the same variable name appears twice in a SELECT statement only the first one makes it into the output dataset.
Or did you want something else?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You can translate that code into a single data step.
data want;
set
lib.DA20171214
lib.DA20180116
lib.DA20180402
lib.DA20180502
lib.DA20180601
lib.DA20180702
lib.DA20180801
lib.DA20180903
lib.DA20181001
indsname=dsn
;
entry_date = input(substr(dsn,length(dsn)-7),yymmdd10.);
format entry_date date9.;
where 'effective Date'n between '01jan2018'd and '01jul2021'd;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Proof of concept for the use of INDSNAME:
data file_20210701;
set sashelp.class;
run;
data file_20210601;
set sashelp.class;
run;
data file_20210501;
set sashelp.class;
run;
data want;
length fname $41; * (=8 (lib) + 1 (dot) + 32 (dataset name));
set work.file: indsname=fname;
format date yymmdd10.;
date = input(scan(fname,-1,"_"),yymmdd8.);
run;
This will work in ALL Enterprise Guide versions, and has worked in all SAS versions I used (starting with 6.09 in 1998!)
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I tried the same code again by opening a new application and now it’s working.
My original code also works, I just wanted to find a better and simpler way to do it. This way, I could improve myself and learn more.
Thank you all again. You’ve been bery helpful.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Ex. Data want: set lib.da:; indsname=fname;