Solved: Re: Strip date stamp from 30 datasets

saslove · Posted 08-01-2019 06:48 PM

I have a library that contains ~30 datasets. Every week, new data is downloaded and the data gets saved with a date stamp.

I re-run my reports every week on newly updated data. The issues I run into is each time I have to copy the name of the date stamp since the variable name changes because the date stamp changes.

Is there a way to code this in SAS that can get rid of the date stamp and save the datasets without these datestamps so that the dataset name doesn't need to be changed in my report each week? Currently, I right-click, rename, remove the '_DATA_2019-07-29_1343', which is not the smartest way.

For ex: I have protocol01-Dataset1_DATA_2019-07-29_1342

protocol01-Dataset2_DATA_2019-07-29_1343

and so on..

Thanks for your help!

Reeza · Posted 08-01-2019 09:25 PM

@saslove wrote:

I have a library that contains ~30 datasets. Every week, new data is downloaded and the data gets saved with a date stamp.

I re-run my reports every week on newly updated data. The issues I run into is each time I have to copy the name of the date stamp since the variable name changes because the date stamp changes.

Is there a way to code this in SAS that can get rid of the date stamp and save the datasets without these datestamps so that the dataset name doesn't need to be changed in my report each week? Currently, I right-click, rename, remove the '_DATA_2019-07-29_1343', which is not the smartest way.

For ex: I have protocol01-Dataset1_DATA_2019-07-29_1342

protocol01-Dataset2_DATA_2019-07-29_1343

and so on..

Thanks for your help!

What about changing your program to have it factor in as a wild card?

Replace the '_DATA_2019-07-29_1343' portion with a *.

This may depend on how you're importing the data but in general you can use wildcards in file names.

data want;
infile 'myFilename_*.csv' dsd truncover lrecl=500;
input a b c d e f g ... z;
run;

If you have multiple files with a similar name it will read all the files. You can store the filename as a new variable in the data set to identify the data source.

View solution in original post

ChrisNZ · Posted 08-01-2019 07:48 PM

So you don't want to keep past data sets? You delete all but the latest?

High-Performance SAS Coding - Third Edition

saslove · Posted 08-01-2019 09:03 PM

That’s correct. I don’t need to keep the past data. But even I am going to, it’s alright.

ChrisNZ · Posted 08-01-2019 10:37 PM

Like this?

options validmemname=extend;
data 'Dataset1_DATA_2019 07 29_1342'n
     'Dataset2_DATA_2019 07 29_1342'n
     'Dataset3_DATA_2019 07 29_1344'n;
run;

proc sql noprint;
 select max(MEMNAME) into :keepname    
 from DICTIONARY.TABLES 
 where LIBNAME='WORK'
   and MEMNAME like 'DATASET__DATA_%';   %put &=keepname;

 select catt('"',MEMNAME,'"n') into :deletenames separated by ' '
 from DICTIONARY.TABLES 
 where LIBNAME='WORK'
   and MEMNAME like 'DATASET__DATA_%'
   and MEMNAME ne "&keepname";
quit;

proc datasets lib=WORK;
  delete &deletenames.; 
  change "&keepname"n=NAMETOKEEP;
quit;

NOTE: Deleting WORK.'DATASET1_DATA_2019 07 29_1342'n (memtype=DATA).
NOTE: Deleting WORK.'DATASET2_DATA_2019 07 29_1342'n (memtype=DATA).
NOTE: Changing the name WORK.'DATASET3_DATA_2019 07 29_1344'n to WORK.NAMETOKEEP (memtype=DATA).

High-Performance SAS Coding - Third Edition

Reeza · Posted 08-01-2019 09:25 PM

@saslove wrote:

I have a library that contains ~30 datasets. Every week, new data is downloaded and the data gets saved with a date stamp.

I re-run my reports every week on newly updated data. The issues I run into is each time I have to copy the name of the date stamp since the variable name changes because the date stamp changes.

Is there a way to code this in SAS that can get rid of the date stamp and save the datasets without these datestamps so that the dataset name doesn't need to be changed in my report each week? Currently, I right-click, rename, remove the '_DATA_2019-07-29_1343', which is not the smartest way.

For ex: I have protocol01-Dataset1_DATA_2019-07-29_1342

protocol01-Dataset2_DATA_2019-07-29_1343

and so on..

Thanks for your help!

What about changing your program to have it factor in as a wild card?

Replace the '_DATA_2019-07-29_1343' portion with a *.

This may depend on how you're importing the data but in general you can use wildcards in file names.

data want;
infile 'myFilename_*.csv' dsd truncover lrecl=500;
input a b c d e f g ... z;
run;

If you have multiple files with a similar name it will read all the files. You can store the filename as a new variable in the data set to identify the data source.

saslove · Posted 08-02-2019 12:51 PM

Thanks for the wildcard idea, Reeza. I used it in my import step for all my 33 datasets and it runs perfectly.

Sharing the code for future reference

%let datafiledt=2019_0729;

%macro import (myfilename=,workfile=);

proc import out=work.&workfile
datafile="\\server\study\data download\&datafiledt.\&myFilename._*.csv"
DBMS=CSV REPLACE; guessingrows=65465;
run;

%mend import;

%import (myfilename=study-datafile1,workfile=datafile1);
%import (myfilename=study-datafile2,workfile=datafile2);

and so on..

andreas_lds · Posted 08-02-2019 01:31 AM

The smartest way would be to avoid creating those ill-named datasets at all and maybe use Generation Data Sets if older versions are required for anything.

But if you can't change that, you could use sashelp.vtable to get the name of the latest dataset in a library.

saslove · Posted 08-02-2019 12:53 PM

Thanks for the tip. However, the data dump occurs automatically from a very old database. So, there is no way to remove the date stamp. We had already looked into this to save time. I posted here since that was not a possibility.
I used the wildcard tip from @Reeza and it works great.

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away