I have a number of SAS datasets (around 30), each with the same structure but the sets are differentiated by a numerical suffix, e.g., DATA##, where # is some number.
I have a number of other SAS dataset, e.g., INDICES##, each with one row (observation) of two columns of numerical variables named H and A. This pair of numbers represents the suffixes ## of two of the DATA## datasets that need to be merged for analysis. The number of second datasets is less than 20, and their suffixes are unrelated to those of DATA.
I would like to be able to extract a pair of numbers from each of the second datasets in a way that I can use them to identify each of the pair of datasets from the first set that I need to merge, perhaps renaming them HDATAXX and ADATAZZ, where XX and ZZ are the relevant identifiers for the comparison.
Several attempts at assigning the value of H or A to a name in a macro then appending that name to DATAxx have failed. One example follows, with 5 suffix-containing datasets:
%macro trial;
%do I = 1 %to 5;
data temp; set INDICES&I;
%let key=A; run;
data DATA&key; set DATA&key;
rename coreoff = coredef;
run;
%end;
%mend trial;
%trial; run;
In this case, SAS interprets “DATA&key” as “DATAA”, which doesn’t exist. Any help appreciated.
SAS Ver. 9.4 running in Windows 10.
It is doing what you asked it to do. The macro processor finishes its work and passes the resulting text to SAS to actually run. Any macro code embedded inside of SAS code is going to be evaluated before SAS even starts to compile the code, much less begins to run it. So placing a %LET statement in the middle of data step is just going to confuse the humans reading the code.
You effectively ran these statements.
%let key=A;
data temp;
set INDICES&I;
run;
data DATA&key;
set DATA&key;
rename coreoff = coredef;
run;
If you want to make a macro variable from the value of a dataset variable you use the CALL SYMPUTX() function.
data _null_;
set INDICES&I;
call symputx('key',a);
run;
But that said your macro does not look like it is trying to do what you described.
And I don't understand what value it would be to do what you described. Why are you renaming datasets? What are going to do with the datasets once they are renamed that you couldn't do with them using their original names?
Why?
Why do you have separate datasets?
Why is the number part of the dataset name instead of variable in the dataset?
data all_data;
set data: indsname=dsname;
dsnum = input(substr(scan(dsname,-1,'.'),5),32.);
run;
Why do you have separate index dataset? Why not combine those into one also?
Why do you want to rename the datasets?
What are you planning to do once you have renamed the datasets?
Thanks for a prompt reply!
(1) Each of the separate DATA## sets (I currently have DATA1 through DATA32) evolves differently through time, and merging them would create upstream problems.
(2) In an earlier iteration of the problem I had all the indexes in one file, but I couldn't make that work either and so pulled them apart. It would be easy enough to put them back it that yields a solution.
(3) I don't want to rename the datasets, I just want to rename some variables in one of the pair of datasets. Both of the pair start out with the same variable names, so one needs to be changed before merging. Once merged, I will calculate some differences between the two datasets for the same original-named variables (and it is relevant which is subtracted from which), then the resulting dataset gets passed along for further analysis. As the datasets evolve through time (see (1)), their role as either the "subtractor" or "subtractee" changes.
You still haven't said enough about what you are trying to do to give any good advice.
There is no need to rename the variables IN THE EXISTING datasets. Rename them as part of the MERGE (what ever you mean by that) using the RENAME= dataset option.
Say you want to merge DATA34 and DATA87 by the key variable ID and create different names for non-key common variable named FRED. Your code might look like this:
data want;
merge data34(rename=(FRED=LEFT)) data87(rename=(FRED=RIGHT));
by id;
difference = left - right;
run;
So all that changes in that is the 34 and 87.
So make two macro variable to hold those numbers and replace the hard coded numbers with the macro variable references.
%let left=34;
%let right=87;
data want;
merge data&left.(rename=(FRED=LEFT)) data&right.(rename=(FRED=RIGHT));
by id;
difference = left - right;
run;
Now you could convert this into a macro with two input parameters. And call it with the values you want.
%macro combine(left,right);
data want;
merge data&left.(rename=(FRED=LEFT)) data&right.(rename=(FRED=RIGHT));
by id;
difference = left - right;
run;
%mend combine ;
%combine(left=34,right=87)
Now if you have a dataset named INDEX with the variable LEFT and RIGHT you can use it to generate one macro call for each observation.
data _null_;
set index;
call execute(cats('%nrstr(%combine)(left=',left,',right=',right,')'));
run;
Obviously your real macro will do more steps than just the one I have shown.
It is doing what you asked it to do. The macro processor finishes its work and passes the resulting text to SAS to actually run. Any macro code embedded inside of SAS code is going to be evaluated before SAS even starts to compile the code, much less begins to run it. So placing a %LET statement in the middle of data step is just going to confuse the humans reading the code.
You effectively ran these statements.
%let key=A;
data temp;
set INDICES&I;
run;
data DATA&key;
set DATA&key;
rename coreoff = coredef;
run;
If you want to make a macro variable from the value of a dataset variable you use the CALL SYMPUTX() function.
data _null_;
set INDICES&I;
call symputx('key',a);
run;
But that said your macro does not look like it is trying to do what you described.
And I don't understand what value it would be to do what you described. Why are you renaming datasets? What are going to do with the datasets once they are renamed that you couldn't do with them using their original names?
Alas, I'm still relatively new to macros, and indeed I was a confused human reading (or writing) the code. I will try to implement your suggestion. And sorry for the lack of clarity - I'm not renaming datasets (or at least that's not my intention), I'm just renaming some variables within one of the pair of datasets.
In general, it's a good idea to show the non-macro code that works before the macro code. Once you have working code, then it's usually easy to convert it to a macro.
That did it. "Symputx" was missing from my still limited set of macro statements. My original macro for this problem was essentially what you showed in another reply, starting by setting each of the indexes (LEFT and RIGHT in your example) "by hand." As the number of combinations grew large, this became onerous, leading to trying to extract the relevant indices from files generated elsewhere in the program. Thanks also for advice in another post on renaming variables while merging; that streamlines code in several downstream locations. Very much obliged for you patience and assistance.
Got it. Thanks.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.