BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
RyanSimmons
Pyrite | Level 9

I am a bit stuck on how to automate this programming issue. I am trying to write a macro to perform this but can't seem to figure out a clean way of doing it so it is generalizable. 

 

The input to this macro is going to be a numeric dataset with ~30-50 observations and anywhere from 2 to 10 variables. This dataset will be structure like this:

DATA input;
	do Item = 1 to 30;
		Score1 = rand('uniform');
		Score2 = rand('uniform');
		Score3 = rand('uniform');
		output;
	end;
run;

But the exact numbers will vary for each such input dataset.

 

What I need to do is evaluate for each Item (i.e. row) whether or not the set of Scores (columns) meets a condition. The condition is: exactly one of the Scores is >= 0.6 and each other score is <=0.4. Now, for the example dataset, this can be coded pretty easily using something like:

 

DATA output;
	set input;
	if Score1 >= 0.6 & Score2 <= 0.4 & Score3 <= 0.4 then condition = 1;
	else if Score1 <= 0.4 & Score2 >=0.6 & Score3 <= 0.4 then condition = 1;
	else if Score1 <= 0.4 & Score2 <= 0.4 & Score3 >=0.6 then condition = 1;
	else condition = 0;
run;

The problem of course is that this isn't easily generalizable, this only works if there are exactly 3 variables. 

How would I automate/generalize this logic in a macro? I'm pretty stuck, I don't really know what to do. I'm guessing there is a way to do this with arrays but I can't quite figure out how to implement that. 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Use LARGEST()

Your logic can be generalized to check that the largest value is greater than 0.6 and the second largest value must be less than 0.4 for the rest to be less than 0.4. No macros. Use the colon to automatically refer to all variables that start with score so its dynamic as well. If you have other variables that start with the prefix score this will not work. For other options in that case, see the short cut list link below.

if largest(1, of score:) > =0.6 and largest(2, of score:) <= 0.4 then condition=1; 
else condition = 0;

Largest function

https://documentation.sas.com/?docsetId=lefunctionsref&docsetTarget=n1hqpyj2pz85u2n1fcl1uoh5rx2l.htm...

 

Here is a reference that illustrates how to refer to variables and datasets in a short cut list:
https://blogs.sas.com/content/iml/2018/05/29/6-easy-ways-to-specify-a-list-of-variables-in-sas.html

 

 


@RyanSimmons wrote:

I am a bit stuck on how to automate this programming issue. I am trying to write a macro to perform this but can't seem to figure out a clean way of doing it so it is generalizable. 

 

The input to this macro is going to be a numeric dataset with ~30-50 observations and anywhere from 2 to 10 variables. This dataset will be structure like this:

DATA input;
	do Item = 1 to 30;
		Score1 = rand('uniform');
		Score2 = rand('uniform');
		Score3 = rand('uniform');
		output;
	end;
run;

But the exact numbers will vary for each such input dataset.

 

What I need to do is evaluate for each Item (i.e. row) whether or not the set of Scores (columns) meets a condition. The condition is: exactly one of the Scores is >= 0.6 and each other score is <=0.4. Now, for the example dataset, this can be coded pretty easily using something like:

 

DATA output;
	set input;
	if Score1 >= 0.6 & Score2 <= 0.4 & Score3 <= 0.4 then condition = 1;
	else if Score1 <= 0.4 & Score2 >=0.6 & Score3 <= 0.4 then condition = 1;
	else if Score1 <= 0.4 & Score2 <= 0.4 & Score3 >=0.6 then condition = 1;
	else condition = 0;
run;

The problem of course is that this isn't easily generalizable, this only works if there are exactly 3 variables. 

How would I automate/generalize this logic in a macro? I'm pretty stuck, I don't really know what to do. I'm guessing there is a way to do this with arrays but I can't quite figure out how to implement that. 


 

View solution in original post

2 REPLIES 2
Reeza
Super User

Use LARGEST()

Your logic can be generalized to check that the largest value is greater than 0.6 and the second largest value must be less than 0.4 for the rest to be less than 0.4. No macros. Use the colon to automatically refer to all variables that start with score so its dynamic as well. If you have other variables that start with the prefix score this will not work. For other options in that case, see the short cut list link below.

if largest(1, of score:) > =0.6 and largest(2, of score:) <= 0.4 then condition=1; 
else condition = 0;

Largest function

https://documentation.sas.com/?docsetId=lefunctionsref&docsetTarget=n1hqpyj2pz85u2n1fcl1uoh5rx2l.htm...

 

Here is a reference that illustrates how to refer to variables and datasets in a short cut list:
https://blogs.sas.com/content/iml/2018/05/29/6-easy-ways-to-specify-a-list-of-variables-in-sas.html

 

 


@RyanSimmons wrote:

I am a bit stuck on how to automate this programming issue. I am trying to write a macro to perform this but can't seem to figure out a clean way of doing it so it is generalizable. 

 

The input to this macro is going to be a numeric dataset with ~30-50 observations and anywhere from 2 to 10 variables. This dataset will be structure like this:

DATA input;
	do Item = 1 to 30;
		Score1 = rand('uniform');
		Score2 = rand('uniform');
		Score3 = rand('uniform');
		output;
	end;
run;

But the exact numbers will vary for each such input dataset.

 

What I need to do is evaluate for each Item (i.e. row) whether or not the set of Scores (columns) meets a condition. The condition is: exactly one of the Scores is >= 0.6 and each other score is <=0.4. Now, for the example dataset, this can be coded pretty easily using something like:

 

DATA output;
	set input;
	if Score1 >= 0.6 & Score2 <= 0.4 & Score3 <= 0.4 then condition = 1;
	else if Score1 <= 0.4 & Score2 >=0.6 & Score3 <= 0.4 then condition = 1;
	else if Score1 <= 0.4 & Score2 <= 0.4 & Score3 >=0.6 then condition = 1;
	else condition = 0;
run;

The problem of course is that this isn't easily generalizable, this only works if there are exactly 3 variables. 

How would I automate/generalize this logic in a macro? I'm pretty stuck, I don't really know what to do. I'm guessing there is a way to do this with arrays but I can't quite figure out how to implement that. 


 

RyanSimmons
Pyrite | Level 9
Thanks! I knew the solution would be really simple!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 409 views
  • 0 likes
  • 2 in conversation