BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
RyanSimmons
Pyrite | Level 9

I am a bit stuck on how to automate this programming issue. I am trying to write a macro to perform this but can't seem to figure out a clean way of doing it so it is generalizable. 

 

The input to this macro is going to be a numeric dataset with ~30-50 observations and anywhere from 2 to 10 variables. This dataset will be structure like this:

DATA input;
	do Item = 1 to 30;
		Score1 = rand('uniform');
		Score2 = rand('uniform');
		Score3 = rand('uniform');
		output;
	end;
run;

But the exact numbers will vary for each such input dataset.

 

What I need to do is evaluate for each Item (i.e. row) whether or not the set of Scores (columns) meets a condition. The condition is: exactly one of the Scores is >= 0.6 and each other score is <=0.4. Now, for the example dataset, this can be coded pretty easily using something like:

 

DATA output;
	set input;
	if Score1 >= 0.6 & Score2 <= 0.4 & Score3 <= 0.4 then condition = 1;
	else if Score1 <= 0.4 & Score2 >=0.6 & Score3 <= 0.4 then condition = 1;
	else if Score1 <= 0.4 & Score2 <= 0.4 & Score3 >=0.6 then condition = 1;
	else condition = 0;
run;

The problem of course is that this isn't easily generalizable, this only works if there are exactly 3 variables. 

How would I automate/generalize this logic in a macro? I'm pretty stuck, I don't really know what to do. I'm guessing there is a way to do this with arrays but I can't quite figure out how to implement that. 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Use LARGEST()

Your logic can be generalized to check that the largest value is greater than 0.6 and the second largest value must be less than 0.4 for the rest to be less than 0.4. No macros. Use the colon to automatically refer to all variables that start with score so its dynamic as well. If you have other variables that start with the prefix score this will not work. For other options in that case, see the short cut list link below.

if largest(1, of score:) > =0.6 and largest(2, of score:) <= 0.4 then condition=1; 
else condition = 0;

Largest function

https://documentation.sas.com/?docsetId=lefunctionsref&docsetTarget=n1hqpyj2pz85u2n1fcl1uoh5rx2l.htm...

 

Here is a reference that illustrates how to refer to variables and datasets in a short cut list:
https://blogs.sas.com/content/iml/2018/05/29/6-easy-ways-to-specify-a-list-of-variables-in-sas.html

 

 


@RyanSimmons wrote:

I am a bit stuck on how to automate this programming issue. I am trying to write a macro to perform this but can't seem to figure out a clean way of doing it so it is generalizable. 

 

The input to this macro is going to be a numeric dataset with ~30-50 observations and anywhere from 2 to 10 variables. This dataset will be structure like this:

DATA input;
	do Item = 1 to 30;
		Score1 = rand('uniform');
		Score2 = rand('uniform');
		Score3 = rand('uniform');
		output;
	end;
run;

But the exact numbers will vary for each such input dataset.

 

What I need to do is evaluate for each Item (i.e. row) whether or not the set of Scores (columns) meets a condition. The condition is: exactly one of the Scores is >= 0.6 and each other score is <=0.4. Now, for the example dataset, this can be coded pretty easily using something like:

 

DATA output;
	set input;
	if Score1 >= 0.6 & Score2 <= 0.4 & Score3 <= 0.4 then condition = 1;
	else if Score1 <= 0.4 & Score2 >=0.6 & Score3 <= 0.4 then condition = 1;
	else if Score1 <= 0.4 & Score2 <= 0.4 & Score3 >=0.6 then condition = 1;
	else condition = 0;
run;

The problem of course is that this isn't easily generalizable, this only works if there are exactly 3 variables. 

How would I automate/generalize this logic in a macro? I'm pretty stuck, I don't really know what to do. I'm guessing there is a way to do this with arrays but I can't quite figure out how to implement that. 


 

View solution in original post

2 REPLIES 2
Reeza
Super User

Use LARGEST()

Your logic can be generalized to check that the largest value is greater than 0.6 and the second largest value must be less than 0.4 for the rest to be less than 0.4. No macros. Use the colon to automatically refer to all variables that start with score so its dynamic as well. If you have other variables that start with the prefix score this will not work. For other options in that case, see the short cut list link below.

if largest(1, of score:) > =0.6 and largest(2, of score:) <= 0.4 then condition=1; 
else condition = 0;

Largest function

https://documentation.sas.com/?docsetId=lefunctionsref&docsetTarget=n1hqpyj2pz85u2n1fcl1uoh5rx2l.htm...

 

Here is a reference that illustrates how to refer to variables and datasets in a short cut list:
https://blogs.sas.com/content/iml/2018/05/29/6-easy-ways-to-specify-a-list-of-variables-in-sas.html

 

 


@RyanSimmons wrote:

I am a bit stuck on how to automate this programming issue. I am trying to write a macro to perform this but can't seem to figure out a clean way of doing it so it is generalizable. 

 

The input to this macro is going to be a numeric dataset with ~30-50 observations and anywhere from 2 to 10 variables. This dataset will be structure like this:

DATA input;
	do Item = 1 to 30;
		Score1 = rand('uniform');
		Score2 = rand('uniform');
		Score3 = rand('uniform');
		output;
	end;
run;

But the exact numbers will vary for each such input dataset.

 

What I need to do is evaluate for each Item (i.e. row) whether or not the set of Scores (columns) meets a condition. The condition is: exactly one of the Scores is >= 0.6 and each other score is <=0.4. Now, for the example dataset, this can be coded pretty easily using something like:

 

DATA output;
	set input;
	if Score1 >= 0.6 & Score2 <= 0.4 & Score3 <= 0.4 then condition = 1;
	else if Score1 <= 0.4 & Score2 >=0.6 & Score3 <= 0.4 then condition = 1;
	else if Score1 <= 0.4 & Score2 <= 0.4 & Score3 >=0.6 then condition = 1;
	else condition = 0;
run;

The problem of course is that this isn't easily generalizable, this only works if there are exactly 3 variables. 

How would I automate/generalize this logic in a macro? I'm pretty stuck, I don't really know what to do. I'm guessing there is a way to do this with arrays but I can't quite figure out how to implement that. 


 

RyanSimmons
Pyrite | Level 9
Thanks! I knew the solution would be really simple!

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 800 views
  • 0 likes
  • 2 in conversation