I have a library that contains ~30 datasets. Every week, new data is downloaded and the data gets saved with a date stamp.
I re-run my reports every week on newly updated data. The issues I run into is each time I have to copy the name of the date stamp since the variable name changes because the date stamp changes.
Is there a way to code this in SAS that can get rid of the date stamp and save the datasets without these datestamps so that the dataset name doesn't need to be changed in my report each week? Currently, I right-click, rename, remove the '_DATA_2019-07-29_1343', which is not the smartest way.
For ex: I have protocol01-Dataset1_DATA_2019-07-29_1342
protocol01-Dataset2_DATA_2019-07-29_1343
and so on..
Thanks for your help!
@saslove wrote:
I have a library that contains ~30 datasets. Every week, new data is downloaded and the data gets saved with a date stamp.
I re-run my reports every week on newly updated data. The issues I run into is each time I have to copy the name of the date stamp since the variable name changes because the date stamp changes.
Is there a way to code this in SAS that can get rid of the date stamp and save the datasets without these datestamps so that the dataset name doesn't need to be changed in my report each week? Currently, I right-click, rename, remove the '_DATA_2019-07-29_1343', which is not the smartest way.
For ex: I have protocol01-Dataset1_DATA_2019-07-29_1342
protocol01-Dataset2_DATA_2019-07-29_1343
and so on..
Thanks for your help!
What about changing your program to have it factor in as a wild card?
Replace the '_DATA_2019-07-29_1343' portion with a *.
This may depend on how you're importing the data but in general you can use wildcards in file names.
data want;
infile 'myFilename_*.csv' dsd truncover lrecl=500;
input a b c d e f g ... z;
run;
If you have multiple files with a similar name it will read all the files. You can store the filename as a new variable in the data set to identify the data source.
So you don't want to keep past data sets? You delete all but the latest?
Like this?
options validmemname=extend;
data 'Dataset1_DATA_2019 07 29_1342'n
'Dataset2_DATA_2019 07 29_1342'n
'Dataset3_DATA_2019 07 29_1344'n;
run;
proc sql noprint;
select max(MEMNAME) into :keepname
from DICTIONARY.TABLES
where LIBNAME='WORK'
and MEMNAME like 'DATASET__DATA_%'; %put &=keepname;
select catt('"',MEMNAME,'"n') into :deletenames separated by ' '
from DICTIONARY.TABLES
where LIBNAME='WORK'
and MEMNAME like 'DATASET__DATA_%'
and MEMNAME ne "&keepname";
quit;
proc datasets lib=WORK;
delete &deletenames.;
change "&keepname"n=NAMETOKEEP;
quit;
NOTE: Deleting WORK.'DATASET1_DATA_2019 07 29_1342'n (memtype=DATA).
NOTE: Deleting WORK.'DATASET2_DATA_2019 07 29_1342'n (memtype=DATA).
NOTE: Changing the name WORK.'DATASET3_DATA_2019 07 29_1344'n to WORK.NAMETOKEEP (memtype=DATA).
@saslove wrote:
I have a library that contains ~30 datasets. Every week, new data is downloaded and the data gets saved with a date stamp.
I re-run my reports every week on newly updated data. The issues I run into is each time I have to copy the name of the date stamp since the variable name changes because the date stamp changes.
Is there a way to code this in SAS that can get rid of the date stamp and save the datasets without these datestamps so that the dataset name doesn't need to be changed in my report each week? Currently, I right-click, rename, remove the '_DATA_2019-07-29_1343', which is not the smartest way.
For ex: I have protocol01-Dataset1_DATA_2019-07-29_1342
protocol01-Dataset2_DATA_2019-07-29_1343
and so on..
Thanks for your help!
What about changing your program to have it factor in as a wild card?
Replace the '_DATA_2019-07-29_1343' portion with a *.
This may depend on how you're importing the data but in general you can use wildcards in file names.
data want;
infile 'myFilename_*.csv' dsd truncover lrecl=500;
input a b c d e f g ... z;
run;
If you have multiple files with a similar name it will read all the files. You can store the filename as a new variable in the data set to identify the data source.
Thanks for the wildcard idea, Reeza. I used it in my import step for all my 33 datasets and it runs perfectly.
Sharing the code for future reference
%let datafiledt=2019_0729;
%macro import (myfilename=,workfile=);
proc import out=work.&workfile
datafile="\\server\study\data download\&datafiledt.\&myFilename._*.csv"
DBMS=CSV REPLACE; guessingrows=65465;
run;
%mend import;
%import (myfilename=study-datafile1,workfile=datafile1);
%import (myfilename=study-datafile2,workfile=datafile2);
and so on..
The smartest way would be to avoid creating those ill-named datasets at all and maybe use Generation Data Sets if older versions are required for anything.
But if you can't change that, you could use sashelp.vtable to get the name of the latest dataset in a library.
Thanks for the tip. However, the data dump occurs automatically from a very old database. So, there is no way to remove the date stamp. We had already looked into this to save time. I posted here since that was not a possibility.
I used the wildcard tip from @Reeza and it works great.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.