Macro to append datasets

Super Contributor
Posts: 648

Hi,
I will be getting data from different vendors (17 vendors) each month. All of these need some data cleaning and formatting, so I saved them as dataset_v1 or dataset_v2.
Without explicitly listing the dataset names, I want to append all the datasets that have either a _v1 or _v2 suffix into one dataset. How can I achieve this?

Also, not all files arrive on the same day, so if a dataset has already been appended, it shouldn't be appended again.
Super Contributor
Posts: 3,174

Re: Macro to append datasets

Generic-prefix specification of SAS file/member is not practical or supported by SAS, without some additional programming:

1) Consider using PROC SQL and DICTIONARY.MEMBERS - generate a macro variable with the list of members to concatenate for your SAS DATA step.

2) Or use a DATA step and SASHELP.VMEMBER on a SET to identify candidate members, and generate the SET statement to a TEMP file, and %INCLUDE TEMP; to invoke your SET statement for the concatenation process.
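
As a rough illustration of option 2, the sketch below writes a SET statement to a temporary file and then includes it. The libref VENDOR and the output name WORK.ALL_VENDORS are assumptions made for the example, not names from the original question; the suffix test relies on member names being stored in uppercase in SASHELP.VMEMBER.

```sas
/* Sketch only: VENDOR and WORK.ALL_VENDORS are assumed names. */
filename temp temp;

data _null_;
  /* FILE comes before SET so an empty file still exists when nothing matches */
  file temp;
  set sashelp.vmember end=last;
  where libname = 'VENDOR'
    and memtype = 'DATA'
    and prxmatch('/_V[12]\s*$/', memname);   /* names ending in _V1 or _V2 */
  length dsname $41;
  dsname = catx('.', libname, memname);
  if _n_ = 1 then put 'data work.all_vendors;' / '  set';
  put '    ' dsname;
  if last then put '  ;' / 'run;';
run;

/* Run the generated DATA step; SOURCE2 echoes the generated code to the log */
%include temp / source2;
```

If no member matches, the temporary file stays empty and the %INCLUDE does nothing. Option 1 would replace the generated file with a macro variable built by PROC SQL from DICTIONARY.MEMBERS.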

Suggest searching the SAS support http://support.sas.com/ website for more information on using DICTIONARY tables and generating SAS code.

Scott Barry
SBBWorks, Inc.
Super Contributor
Posts: 474

Re: Macro to append datasets

No simple or direct solution here. At least that I'm aware of.

Besides reading from the dictionary tables, and assuming that all the tables are in the same library, you could also gather information about the library contents with the CONTENTS procedure, something like this:

proc contents data = LIB._ALL_ out = WORK.CONTENTS noprint;
run;

From there you need to work the CONTENTS table, which will hold pretty much all the metainfo about each table found in LIB.

Do a select distinct over the MEMNAME variable (table name) where MEMTYPE = 'DATA' (only dataset members) and you'll get every dataset name in LIB.

From there, you just need to get the right names into a macro variable (a SELECT ... INTO : in PROC SQL will probably do the job here) and concatenate the desired datasets into a single one.
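
Put together, the steps above might look like the following sketch. LIB is the placeholder libref used earlier in this reply; WORK.ALL_VENDORS, the macro variable ds_list and the macro name concat_all are assumptions introduced only for illustration.

```sas
/* Sketch only: output dataset and macro names are hypothetical. */
proc contents data=LIB._ALL_ out=WORK.CONTENTS noprint;
run;

/* Initialise first: INTO leaves the variable untouched when no rows qualify */
%let ds_list = ;

proc sql noprint;
  select distinct catx('.', libname, memname)
    into :ds_list separated by ' '
    from WORK.CONTENTS
    where memtype = 'DATA'
      and prxmatch('/_V[12]\s*$/', upcase(memname));  /* _V1 or _V2 suffix */
quit;

%macro concat_all;
  %if %length(&ds_list) > 0 %then %do;
    data WORK.ALL_VENDORS;
      set &ds_list;
    run;
  %end;
  %else %put NOTE: No datasets with a _V1 or _V2 suffix were found.;
%mend concat_all;

%concat_all
```

The SELECT DISTINCT matters because the CONTENTS output has one row per variable, so each dataset name appears many times.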

After that, I would suggest moving the concatenated datasets to another library, or renaming them, so they won't be concatenated again on the next run (or you could work with the last-modified date of each, which is more work).
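
One possible way to do the move is PROC DATASETS with COPY ... MOVE, sketched below. ARCHIVE is a hypothetical libref that would have to be assigned beforehand, and mem_list and archive_done are illustrative names; the step is meant to run only after the concatenation has completed without errors.

```sas
/* Sketch only: ARCHIVE, mem_list and archive_done are assumed names. */
%let mem_list = ;

proc sql noprint;
  select memname
    into :mem_list separated by ' '
    from dictionary.tables
    where libname = 'LIB'
      and memtype = 'DATA'
      and prxmatch('/_V[12]\s*$/', memname);
quit;

%macro archive_done;
  %if %length(&mem_list) > 0 %then %do;
    proc datasets library=LIB nolist;
      /* MOVE deletes each member from LIB once it has been copied to ARCHIVE */
      copy out=ARCHIVE move memtype=data;
        select &mem_list;
      run;
    quit;
  %end;
%mend archive_done;

%archive_done
```

Because the processed datasets no longer sit in LIB, the next run only picks up vendor files that arrived since the last one.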

As you see, not very easy.

Good work!

Cheers from Portugal.

Daniel Santos @ www.cgd.pt.