BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
zz
Calcite | Level 5 zz
Calcite | Level 5

How do I go through a directory and combine datasets under it?  I also need to add a column / variable that identifies contributing dataset (dataset name).  

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Try the one step approach:

data _null_;
  set sashelp.vtable (where=(libname="DT")) end=last;
  if _n_=1 then call execute('data final; length dataset $100; set ');
  call execute(cat(' ',strip(name));
  if last the call execute(' indsname=ds; dataset=ds; run;');
run;

Basically the sashelp.vtable is pulled in, filetering on the where clause.  On first record the datastep part is written out, then for each obs returned that is written out and on the final record the datastep is closed, and then that whole code block is run.  You can aplpy any manipulations to it you want.

View solution in original post

5 REPLIES 5
RW9
Diamond | Level 26 RW9
Diamond | Level 26

The indsname option on set statement can be used to get dataset name:

data want;
  set have1 have2 have3 indsname=dsname;
run;

You can use the set to append datasets.  Proc append can also be used.  You could also do union all in proc sql.  They all have their pluses and negatives so check out the manual.

ballardw
Super User

One aproach

data want;
   Length dataset $ 41;
   set lib.file1 lib.file2 indsname=ds;
   dataset=ds;
run;

The varaible referenced by indsname option will be a temporary variable so you need to create a permanent one to keep the value as assigned above in the output.

 

You can use explicit dataset names as in the example or if you have similar names you could use a list:

 

set lib.sales: indsname=ds; would get all data sets in the library that start with SALES in the name.

Note that you my get warning about lengths of character variables different. The length of the first encountered set will be the length so truncation may be an issue. Also any variables with the same name must be of the same type. If VARX is numeric in one set and character in another this will fail (unless you rename the variable using dataset options).

zz
Calcite | Level 5 zz
Calcite | Level 5

Thanks for your input, ballardw!

 

 

I tested out the code with below:

 

LIBNAME dt '/data';
LIBNAME rs '/results';


proc sql noprint;
SELECT memname into : names
FROM dictionary.tables
WHERE libname='DT'
;


DATA RS.combined;
LENGTH dataset $ 41;
SET DT.&names indsname=ds;
dataset=ds;
RUN;

 

It created dataset that only included the first dataset 'ae' under 'DT' as source, with variable 'dataset having value 'DT.ae'.  There are about 30 some datasets under 'DT'.  How would I improve the above, so it includes all dataset in the 'combined' dataset?

 

Thanks in advance

ballardw
Super User

First you need to look at the results of %put &names;

Second, you would have had only one reference to the libname.

 

Warning: writing into the same library can result in appending the same datasets multiple times if you keep running the code as the second time you will get ds.combined appended (and some corresponding issues about the value of the indsname variable)

 

Easier to create the libname.datasetname construct when you select it.

 

proc sql;
   select catt(libname,'.',memname) into : Names separated by ' '
   from dictionary.tables
   where libname='DT' and memtype='DATA';
quit;

%put &names;    /* so you see what the names are*/
data rs.combined;
   length dataset $ 41;
   set &names indsname;
   dataset=ds;
run;

And if your goal was to combine the sets in DS and DT then use this for the SQL

 

proc sql;
   select catt(libname,'.',memname) into : Names separated by ' '
   from dictionary.tables
   where libname in ('DT','DS') and memtype='DATA';
quit;

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Try the one step approach:

data _null_;
  set sashelp.vtable (where=(libname="DT")) end=last;
  if _n_=1 then call execute('data final; length dataset $100; set ');
  call execute(cat(' ',strip(name));
  if last the call execute(' indsname=ds; dataset=ds; run;');
run;

Basically the sashelp.vtable is pulled in, filetering on the where clause.  On first record the datastep part is written out, then for each obs returned that is written out and on the final record the datastep is closed, and then that whole code block is run.  You can aplpy any manipulations to it you want.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 3035 views
  • 0 likes
  • 3 in conversation