BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JT99
Obsidian | Level 7
Hi!
I have several sas datasets saved in a folder. The sas datasets have similar names like file_20210701.sas7bdat, file_20210601.sas7bdat, file_20210501 etc. They also have the same column names. I want to combine all of these files to make 1 dataset but before merging them all together, I have to add a new variable, date to each dataset. Date is the date in the filename. I have already made a code where I get the names of my files in sas library then I made a macro where I add the date. But my code is very long. Is there an easier way to do this?
Thanks!
1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Since a SET statement option INDSNAME will make the name of a contributing data set available in a data step you can combine and add the variable by parsing that variable. I really hope you are assigning a library to where those files are stored instead of referencing them file path and name.

Dummy code

data want;
    set lib.file_20210501 lib.file_20210601 lib.file_20210701
         indsname=fname;
  /*this creates a character variable such as 20210501*/
   length fdate $ 8. ;
  /* the position of 10 is using a 3 character Libname, adjust for yours*/
   fdate= substr(fname,10);
  /* if you want an actual SAS Date value*/
  fdate2 = input(substr(fname,10,8),yymmdd8);
   format fdate2 yymmdd10.;
run;

View solution in original post

10 REPLIES 10
andreas_lds
Jade | Level 19

Posting the code is necessary to find out if it could be improved.

imho having datasets with same structure containing data of different reference date, is a bad design idea and creates the need of additional code, when multiple datasets are used in one job. So i would append all those datasets, add a variable holding the reference date and the new variable.

ballardw
Super User

Since a SET statement option INDSNAME will make the name of a contributing data set available in a data step you can combine and add the variable by parsing that variable. I really hope you are assigning a library to where those files are stored instead of referencing them file path and name.

Dummy code

data want;
    set lib.file_20210501 lib.file_20210601 lib.file_20210701
         indsname=fname;
  /*this creates a character variable such as 20210501*/
   length fdate $ 8. ;
  /* the position of 10 is using a 3 character Libname, adjust for yours*/
   fdate= substr(fname,10);
  /* if you want an actual SAS Date value*/
  fdate2 = input(substr(fname,10,8),yymmdd8);
   format fdate2 yymmdd10.;
run;
JT99
Obsidian | Level 7

Thank you for this. Unfortunately I'm using an older version of SAS(7.1)

This is what my code looks like.

 

%macro loop(file,date,entry);
proc sql;
create table &file as select *, &entry as entry_date format=date9.
from &date where 'effective Date'n between '01jan2018'd and '01jul2021'd;
quit;
%mend loop;

%loop(file1,lib.DA20171214,'14dec2017'd)
%loop(file2,lib.DA20180116,'16jan2018'd)
%loop(file3,lib.DA20180402,'02apr2018'd)
%loop(file4,lib.DA20180502,'02may2018'd)
%loop(file5,lib.DA20180601,'01jun2018'd)
%loop(file6,lib.DA20180702,'02jul2018'd)
%loop(file7,lib.DA20180801,'01aug2018'd)
%loop(file8,lib.DA20180903,'03sep2018'd)
%loop(file9,lib.DA20181001,'01oct2018'd)

.

.

.

data want; set file:;run;

 

Is there a simpler way to do this without using indsname? 

Patrick
Opal | Level 21

Why is the code @ballardw posted not working for you? What's the error?

 

Is 7.1 the version of your EG client or really your SAS version? I rather doubt that it's the SAS version.

 

To verify your SAS version please execute the following as code and let us know what you get in the SAS log:

%put &=SYSVLONG;

ballardw
Super User

@JT99 wrote:

Thank you for this. Unfortunately I'm using an older version of SAS(7.1)

This is what my code looks like.

 

%macro loop(file,date,entry);
proc sql;
create table &file as select *, &entry as entry_date format=date9.
from &date where 'effective Date'n between '01jan2018'd and '01jul2021'd;
quit;
%mend loop;

%loop(file1,lib.DA20171214,'14dec2017'd)
%loop(file2,lib.DA20180116,'16jan2018'd)
%loop(file3,lib.DA20180402,'02apr2018'd)
%loop(file4,lib.DA20180502,'02may2018'd)
%loop(file5,lib.DA20180601,'01jun2018'd)
%loop(file6,lib.DA20180702,'02jul2018'd)
%loop(file7,lib.DA20180801,'01aug2018'd)
%loop(file8,lib.DA20180903,'03sep2018'd)
%loop(file9,lib.DA20181001,'01oct2018'd)

.

.

.

data want; set file:;run;

 

Is there a simpler way to do this without using indsname? 


You original question says that you have "The sas datasets have similar names like file_20210701.sas7bdat, file_20210601.sas7bdat, file_20210501 etc.". The code you show above does not create or use data sets named that way. It uses sets named like "lib.DA20171214" and creates "file1" (which would be in the WORK library). So of course my code won't work the names of the "file" sets do not have the information in them that you stated was there, the 20210701 for example.

 

So now it is up to you to show 1) the actual names of the data sets you want to combine and 2) what you expect the output to look like (the added variable).

Tom
Super User Tom
Super User

Your code is already attempting to add the variable ENTRY_DATE.

Does it not work?  If not then try changing the order you have the new variable in the SELECT statement.

select &entry as entry_date format=date9., *

When the same variable name appears twice in a SELECT statement only the first one makes it into the output dataset.


Or did you want something else?

Tom
Super User Tom
Super User

You can translate that code into a single data step.

data want;
  set 
   lib.DA20171214
   lib.DA20180116
   lib.DA20180402
   lib.DA20180502
   lib.DA20180601
   lib.DA20180702
   lib.DA20180801
   lib.DA20180903
   lib.DA20181001
    indsname=dsn 
  ;
  entry_date = input(substr(dsn,length(dsn)-7),yymmdd10.);
  format entry_date date9.;
  where 'effective Date'n between '01jan2018'd and '01jul2021'd;
run;
Kurt_Bremser
Super User

Proof of concept for the use of INDSNAME:

data file_20210701;
set sashelp.class;
run;

data file_20210601;
set sashelp.class;
run;

data file_20210501;
set sashelp.class;
run;

data want;
length fname $41; * (=8 (lib) + 1 (dot) + 32 (dataset name));
set work.file: indsname=fname;
format date yymmdd10.;
date = input(scan(fname,-1,"_"),yymmdd8.);
run;

This will work in ALL Enterprise Guide versions, and has worked in all SAS versions I used (starting with 6.09 in 1998!)

JT99
Obsidian | Level 7
I tried running the code using INDSNAME and it is now working. I have two SAS programs, SAS EG and Analytics Client. The code was working with Analytics Client but not in sas eg when I first tried it that’s why I thought maybe my SAS EG does not support indsname.
I tried the same code again by opening a new application and now it’s working.
My original code also works, I just wanted to find a better and simpler way to do it. This way, I could improve myself and learn more.
Thank you all again. You’ve been bery helpful.
JT99
Obsidian | Level 7
I think the reason it wasn’t working was because I added a semicolon before the INDSNAME statement.
Ex. Data want: set lib.da:; indsname=fname;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1021 views
  • 1 like
  • 6 in conversation