Hello!
I hope someone can help me with what I'm sure is a simple issue cos I'm stumped!
I have a bunch of variables which are all indexed by the same variable (called specimen) and I want to pairwise inner join all of these variables together on specimen in separate outputs. I have created the macro below to help automate this:
%macro inner;
%let i = 1;
%let j = 2;
%do %while(&i < 3);
PROC SQL;
CREATE TABLE LI1_HOM_&i._&j AS SELECT
A.VAR_&i.,
B.VAR_&j
FROM LI1_HOM_&i A
INNER JOIN LI1_HOM_&j B
ON A.SPECIMEN=B.SPECIMEN
;
%let j = %eval(&j + 1);
%if &j = 4 %then %do;
%let i = %eval(&i + 1);
%let j = %eval(&i + 1);
%end;
%end;
%mend;
I know that the code does what I want up to the %if statement, as the expected outputs appear. However, the code then tries to create tables where i = 1 and j keeps increasing suggesting that the %if statement isn't working.
The %if statement is designed to add 1 to i and to set j to i + 1 and then start joining more variables. In this (limited) test, the maximum value of i is 2 and the maximum value of j is 3 (so there will be 6 outputs) but once the macro works it will be used for much higher values of i and j.
Can someone explain what is going wrong with the %if statement?
I am using SAS 9.4 and working in Enterprise Guide 7.1.
Many thanks,
Mike
Why would you want to increment the loop counters manually?
Why aren't you just using normal iterative DO loops?
%do i=1 %to 3 ;
%do j=&i+1 %to 4 ;
...
%end;
%end;
First thing that jumps out at me is:
%let j = %eval(&i + 1);
Did you mean to set j to i+1?
Second off, really not a good idea to create lots of datasets of the same data, its just going to make your life harder coding with it.
Provide some sample test data and required output, bet there is a simpler and faster method.
Thanks for the swift reply!
To answer your first point, I am not sure if I want j = i+1 or j = i+2. The macro starts with i = 1 and j = 2 and when the %if statement is processed for the first time (when j = 4) I want to move to i = 2 and j = 3.
Here's some input data:
Input 1 | Input 2 | Input 3 | |||||
Specimen | VAR_1 | Specimen | VAR_2 | Specimen | VAR_3 | ||
A 100 | 3 | A 100 | 0 | A 100 | 2 | ||
A 30 | 0 | A 76 | 1 | A 30 | 0 | ||
A 76 | 3 | SDR 1 | 0 | A 76 | 1 | ||
SDR 1 | 2 | K B | 1 | SDR 1 | 0 | ||
K B | 2 | K C | 1 | K B | 1 | ||
K C | 2 | K E | 0 | K C | 0 | ||
K E | 1 | K H | 0 | K E | 1 | ||
K H | 1 | K J | 0 | K H | 0 | ||
K J | 1 | K L | 0 | K J | 0 | ||
K L | 1 | LM 1 | 1 | K L | 2 | ||
LM 1 | 1 | M | 1 | LM 1 | 0 | ||
M | 1 | Ind I | 1 | M | 0 | ||
Ind I | 3 | Ind II | 0 | Ind I | 0 | ||
Ind II | 2 | Ind III | 1 | Ind II | 0 | ||
Ind III | 2 | Ind VII | 0 | Ind III | 0 | ||
Ind VII | 3 | Ind XV | 0 | Ind VII | 0 | ||
Ind XV | 2 | Ind XVI | 0 | Ind XV | 1 | ||
Ind XVI | 2 | Ind XVII | 0 | Ind XVI | 0 | ||
Ind XVII | 2 | Ind XX | 0 | Ind XVII | 2 | ||
Ind XX | 3 | Ind XXII | 1 | Ind XX | 2 | ||
Ind XXII | 3 | Ind XXIII | 0 | Ind XXII | 1 | ||
Ind XXIII | 2 | Ind XXIV | 0 | Ind XXIII | 1 | ||
Ind XXIV | 2 | Ind XXV | 0 | Ind XXIV | 1 | ||
Ind XXV | 3 | Ind XXVI | 0 | Ind XXV | 1 | ||
Ind XXVI | 2 | Ind XXXI | 0 | Ind XXVI | 0 | ||
Ind XXXI | 2 | LI1 isoa | 0 | Ind XXXI | 0 | ||
LI1 isoa | 4 | LI1 isob | 1 | LI1 isoa | 0 | ||
LI1 isob | 2 | LI1 isoc | 1 | LI1 isob | 0 | ||
LI1 isoc | 3 | LI1 isonu | 0 | LI1 isoc | 0 | ||
LI1 isonu | 3 | LI1 isox | 0 | LI1 isonu | 0 | ||
LI1 isox | 3 | SC 1 | 1 | LI1 isox | 0 | ||
SC 1 | 1 | SC 1 | 0 |
As you can see, Input 2 has no value for the specimen called A30 so it cannot inner join with the other two variables.
The outputs in this case would be:
Output 1 | Output 2 | Output 3 | ||||||||
Specimen | VAR_1 | VAR_2 | Specimen | VAR_1 | VAR_3 | Specimen | VAR_2 | VAR_3 | ||
A 100 | 3 | 0 | A 100 | 3 | 2 | A 100 | 0 | 2 | ||
A 76 | 3 | 1 | A 30 | 0 | 0 | A 76 | 1 | 1 | ||
SDR 1 | 2 | 0 | A 76 | 3 | 1 | SDR 1 | 0 | 0 | ||
K B | 2 | 1 | SDR 1 | 2 | 0 | K B | 1 | 1 | ||
K C | 2 | 1 | K B | 2 | 1 | K C | 1 | 0 | ||
K E | 1 | 0 | K C | 2 | 0 | K E | 0 | 1 | ||
K H | 1 | 0 | K E | 1 | 1 | K H | 0 | 0 | ||
K J | 1 | 0 | K H | 1 | 0 | K J | 0 | 0 | ||
K L | 1 | 0 | K J | 1 | 0 | K L | 0 | 2 | ||
LM 1 | 1 | 1 | K L | 1 | 2 | LM 1 | 1 | 0 | ||
M | 1 | 1 | LM 1 | 1 | 0 | M | 1 | 0 | ||
Ind I | 3 | 1 | M | 1 | 0 | Ind I | 1 | 0 | ||
Ind II | 2 | 0 | Ind I | 3 | 0 | Ind II | 0 | 0 | ||
Ind III | 2 | 1 | Ind II | 2 | 0 | Ind III | 1 | 0 | ||
Ind VII | 3 | 0 | Ind III | 2 | 0 | Ind VII | 0 | 0 | ||
Ind XV | 2 | 0 | Ind VII | 3 | 0 | Ind XV | 0 | 1 | ||
Ind XVI | 2 | 0 | Ind XV | 2 | 1 | Ind XVI | 0 | 0 | ||
Ind XVII | 2 | 0 | Ind XVI | 2 | 0 | Ind XVII | 0 | 2 | ||
Ind XX | 3 | 0 | Ind XVII | 2 | 2 | Ind XX | 0 | 2 | ||
Ind XXII | 3 | 1 | Ind XX | 3 | 2 | Ind XXII | 1 | 1 | ||
Ind XXIII | 2 | 0 | Ind XXII | 3 | 1 | Ind XXIII | 0 | 1 | ||
Ind XXIV | 2 | 0 | Ind XXIII | 2 | 1 | Ind XXIV | 0 | 1 | ||
Ind XXV | 3 | 0 | Ind XXIV | 2 | 1 | Ind XXV | 0 | 1 | ||
Ind XXVI | 2 | 0 | Ind XXV | 3 | 1 | Ind XXVI | 0 | 0 | ||
Ind XXXI | 2 | 0 | Ind XXVI | 2 | 0 | Ind XXXI | 0 | 0 | ||
LI1 isoa | 4 | 0 | Ind XXXI | 2 | 0 | LI1 isoa | 0 | 0 | ||
LI1 isob | 2 | 1 | LI1 isoa | 4 | 0 | LI1 isob | 1 | 0 | ||
LI1 isoc | 3 | 1 | LI1 isob | 2 | 0 | LI1 isoc | 1 | 0 | ||
LI1 isonu | 3 | 0 | LI1 isoc | 3 | 0 | LI1 isonu | 0 | 0 | ||
LI1 isox | 3 | 0 | LI1 isonu | 3 | 0 | LI1 isox | 0 | 0 | ||
SC 1 | 1 | 1 | LI1 isox | 3 | 0 | SC 1 | 1 | 0 | ||
SC 1 | 1 | 0 |
There are many more variables with considerably more combinations of specimens.
My purpose is to create pairs of variables, then append additional data from another source (also on SAS) with different specimens but the same variables where the number of new specimens is equal to the existing specimens (this will be different for each pair). Then an analysis will be conducted on these datasets.
My requirement here is to automate the process of matching variables so that I don't have to do it manually.
Thanks for your help.
Mike
What you are trying to do is to create a map of all the possible combinations. You can do this easier by:
Put all your input data into one dataset first, i.e. a long one then doing a cartesian join on the data. Something like this post:
Why not just use a normal MERGE statement to combine all of the input datasets into a single dataset with ALL of your measures?
If the dataset names have nice numeric suffixes then you can use a list of dataset names like this:
data want ;
merge LI1_HOM_1 - LI1_HOM_20 ;
by specimen ;
run;
You will need to make sure that they are sorted by SPECIMEN first.
So if you have 20 little datasets you could use a simple macro like this to sort them all.
%macro sort_them(basename,byvars,from,to);
%local i ;
%do i=&from %to &to ;
proc sort data &basename.&i ;
by &byvars;
run;
%end;
%mend sort_them;
%sort_them(basename=LI1_HOM_,byvars=SPECIMEN,from=1,to=20);
Before you start using the macro variables in code generated by the macro, use %put to check the "evolution" of your macro variables:
%macro inner;
%let i = 1;
%let j = 2;
%do %while(&i < 3);
%put i=&i;
%put j=&j;
%let j = %eval(&j + 1);
%if &j = 4 %then %do;
%let i = %eval(&i + 1);
%let j = %eval(&i + 1);
%end;
%end;
%mend;
%inner
This will provide a clue where you made a mistake.
Hello and thanks for your reply.
I tried running this but just got a log telling me that it had created a SASREPORT:
NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR
I don't know how to utilise this method!
That's a standard NOTE of code submitted by EG. Nothing to worry about.
The results of the %put statements in the log are important.
Why would you want to increment the loop counters manually?
Why aren't you just using normal iterative DO loops?
%do i=1 %to 3 ;
%do j=&i+1 %to 4 ;
...
%end;
%end;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.