Hi,
I have ~200 data-sets that are stored in 1 library (that I've called "mbuild"). Each dataset has 1 row of data that is the mean and percentiles for 1 variable. In each dataset, the variable information (e.g., height, weight, etc.) is only stored in the label. The datasets are numbered in sets of 54 (e.g. mbuild.ADmb_RBC1_1 -> mbuild.ADmb_RBC1_54, mbuild.ADmb_SF1_1 -> mbuild.ADmb_SF1_54... etc,).
I want them all in the same dataset, and initially, used this macro ( a few times, per the changing names) to stack them:
%MACRO GENSET(FILEPFX=, LOW=, HIGH=); %DO CYC=&LOW %TO &HIGH; &FILEPFX&CYC %END; %MEND GENSET;
data want ; set %GENSET(FILEPFX= mbuild.ADmb_RBC1_, LOW=1, HIGH=54); run;
It worked perfectly, but, the information in the label is lost, so I can't tell which row goes to which dataset (of course I could manually line them up, but I want to automate it).
I updated a macro I found on this site (thanks!!) to read through the list of datasets, extract the label and create a new dataset with the data labeled:
%macro getlabel (x=);
%let total_vars = %sysfunc(countw(&x) );
%do i = 1 %to &total_vars;
%let selected_var = %scan(&x, &i);
data new_&i ; set &selected_var;
comp = vlabel(Mean); run;
%end;
%mend;
I then loop in the %genset macros, and it should give me datasets new_1 to new_200, that I can stack together for additional analyses.
The macro works perfectly when I test it using datasets that I've copied into the working library:
data test1; set mbuild.ADmb_SF1_1; run;
data test2; set mbuild.ADmb_SF1_2; run;
data test3; set mbuild.ADmb_SF1_3; run;
data prac1; set mbuild.ADmb_SF1_1; run;
data prac2; set mbuild.ADmb_SF1_2; run;
data prac3; set mbuild.ADmb_SF1_3; run;
%getlabel (x = %GENSET(FILEPFX= test, LOW=1, HIGH=3);
%GENSET(FILEPFX= prac, LOW=1, HIGH=3););
All 6 datasets are created, and have the 5 variables that I want (the original 4+ the new variable "comp" with the label in it)
But, If I try to call in the datasets where they currently reside (the 'mbuild' library) - I'm getting an error that makes it seem like the macro can't read the libname? :
The %genset macro works perfectly on its own (and reads the libname correctly), but once its inside the %getlabel macro it crashes.
I tried to update the above macro to first copy all of the datasets into the working library (below, in case useful), but I'm getting similar errors
%macro movetowork (x=);
%let total_vars = %sysfunc(countw(&x) );
%do i = 1 %to &total_vars;
%let selected_var = %scan(&x, &i);
data new_&selected_var ; set &selected_var; run;
%end;
%mend;
Is there a way to update the macro so it reads "mbuild" as the dataset's libname?
Thank you so much!
Set the mprint option prior to running the macro:
Options mprint;
%anymacro(<parameters)
options nomprint; /*to stop the mprint*/
This will generate details of the code created by the macro and give you a better chance of finding a problem.
Your parameter X is apparently supposed to be a list of data set names. When you use %scan one of the default delimiters is a . So if pass data set named Mbuild.somevar each piece processed separately, not the entire data set, generating set not found errors.
Example:
%macro dummy(x=work.junk work.example); %let total_vars = %sysfunc(countw(&x) ); %do i = 1 %to &total_vars; %let selected_var = %scan(&x, &i); %put i=: &i. Selected_var= &selected_var.; %end; %mend; %dummy()
which using the default data set names generates this output:
i=: 1 Selected_var= work i=: 2 Selected_var= junk i=: 3 Selected_var= work i=: 4 Selected_var= example
So you need to 1) modify the SCAN to get the library.dataset and 2) the Countw to get the correct count of "words" , which means providing a list of other delimiters to use. If the list created by Genset is space delimited then maybe
%macro dummy(x=work.junk work.example); %let total_vars = %sysfunc(countw(&x,%str(' ')) ); %do i = 1 %to &total_vars; %let selected_var = %scan(&x, &i,%str(' ') ); %put i=: &i. Selected_var= &selected_var.; %end; %mend; %dummy()
Set the mprint option prior to running the macro:
Options mprint;
%anymacro(<parameters)
options nomprint; /*to stop the mprint*/
This will generate details of the code created by the macro and give you a better chance of finding a problem.
Your parameter X is apparently supposed to be a list of data set names. When you use %scan one of the default delimiters is a . So if pass data set named Mbuild.somevar each piece processed separately, not the entire data set, generating set not found errors.
Example:
%macro dummy(x=work.junk work.example); %let total_vars = %sysfunc(countw(&x) ); %do i = 1 %to &total_vars; %let selected_var = %scan(&x, &i); %put i=: &i. Selected_var= &selected_var.; %end; %mend; %dummy()
which using the default data set names generates this output:
i=: 1 Selected_var= work i=: 2 Selected_var= junk i=: 3 Selected_var= work i=: 4 Selected_var= example
So you need to 1) modify the SCAN to get the library.dataset and 2) the Countw to get the correct count of "words" , which means providing a list of other delimiters to use. If the list created by Genset is space delimited then maybe
%macro dummy(x=work.junk work.example); %let total_vars = %sysfunc(countw(&x,%str(' ')) ); %do i = 1 %to &total_vars; %let selected_var = %scan(&x, &i,%str(' ') ); %put i=: &i. Selected_var= &selected_var.; %end; %mend; %dummy()
Thank you so much! This worked perfectly 🙂
And the tip re: options mprint will surely save me in the future! thanks so much!
I don't think you need a macro.
Are the variable names in your dataset all the same and you just have different labels? And do those label always correspond to the dataset name. If so you just need a new variable specifying the name of the source dataset, and you will not have lost any information in the labels.
If so, then you can use the INDSNAME argument of the SET statement to get the dataset name as an identifier for each observation in your consolidated dataset:
data want_adm_rbc;
set mbuild.ADM_RBC: indsname=_dsn;
dsn=scan(_dsn,-1,'.');
run;
This reads every dataset in mbuild whose name starts with ADM_RBC. The variable _dsn (from "indsname=_dsn") has both the libname and dataset name, but it will be deleted at end of the data step. So I just stripped off the libname and saved the dataset name in the variable DSN (the rightmost text after a period). This should provide the information you need. If you have 54 such ADM_RBC datasets, you'll get 54 observations, with a new variable: DSN.
Just make sure that the SET statement accommodates all the datasets in mbuild that you want to consolidate.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.