Solved: Need help with array to check multiple iterations of one variable to c...

Geoghegan · Posted 10-02-2025 10:23 AM

I have a messy dataset where I have up to 16 iterations of a variable Result1-Result16. Some people in the dataset only have a few iterations, some have 16 and everywhere in between. I'm trying to create a result summary variable that takes into account each of these versions. The Result variable is in number form and I have a format for it that interprets what the numbers mean, I'm not sure if that matters for the coding. When interpreting the result options into the SummaryResult variable I'd like to create, some only match up with one interpretation but others should be grouped together since they mean essentially the same thing for the SummaryResult variable. For example:

A result of 100 means UNKNOWN

A result of 200 means NO

But then a result of 300, 400, 500, etc means YES

Many are also missing with . as the result, so the ResultSummary should be . also

To complicate matters, I can't just look at the last/highest iteration of Result because the correct/best ResultSummary may not be there, it could be the first or middle or last. The ResultSummary should be determined by a hierarchy:

If there are any results that mean YES, the ResultSummary variable should be YES.

If there are no YES results, but there is at least one NO, the ResultSummary variable should be NO.

If there are no YES or NO results, but there is at least one UNKNOWN, the ResultSummary variable should be UNKNOWN.

If . is the only result for all iterations of Result for that person, the ResultSummary should be . (missing).

Here is a fake example:

Person	Result1	Result2	Result3	Result4	Result5	SummaryResult
111	.	.	100	.	100	UNKNOWN
112	200	200	.	.	500	YES
113	.	.	100	200	200	NO
114	.	.	.	.	.	.
115	.	300	100	100	400	YES

I know only basics about assays from a class about 6 years ago and I just can't wrap my head around how I could get it to check each iteration while not keeping the last version instead of the best summary based on the hierarchy.

Thank you for any help!!

Kurt_Bremser · Posted 10-02-2025 11:01 AM

Something like:

length summaryresult $7;
array res [*] result1-result5;
select (max(of res[*]));
  when (300,400,500) summaryresult = "YES";
  when (200) summaryresult = "NO";
  when (100) summaryresult = "UNKNOWN";
  otherwise summaryresult = ".";
end;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

View solution in original post

Kurt_Bremser · Posted 10-02-2025 11:01 AM

Something like:

length summaryresult $7;
array res [*] result1-result5;
select (max(of res[*]));
  when (300,400,500) summaryresult = "YES";
  when (200) summaryresult = "NO";
  when (100) summaryresult = "UNKNOWN";
  otherwise summaryresult = ".";
end;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Geoghegan · Posted 10-04-2025 03:58 PM

Thank you! My example used fake numbers but I adjusted a few of mine to be in a descending order like my example and your suggestion worked, thank you!

Thanks others for the ideas, if I have to do further I'll keep them in mind!

Tom · Posted 10-04-2025 06:00 PM

Note that the ARRAY is not needed for this solution since you only need the list of variables once you can just place the list into the function call directly.

select (max(of result1-result5));

FreelanceReinh · Posted 10-02-2025 12:02 PM

Alternatively, you could store the exact value of max(of result1-result16) in variable SummaryResult (and thus have more information available) and define a format to display 100 as "UNKNOWN", 200 as "NO", etc.

proc format;
value sumres
  .='MISSING'
100='UNKNOWN'
200='NO'
300-high='YES'
other='STRANGE' 
;
run;

data want;
set have;
if n(of result:) then SummaryResult=max(of result:);
format SummaryResult sumres.;
run;

This formatted variable could be used in CLASS statements (e.g. in PROC MEANS), as a group variable in PROC REPORT, for creating frequency tables (with PROC FREQ), etc. However, it would be slightly less convenient than a character variable if you had to retrieve the formatted values very often (as in if vvalue(SummaryResult)='YES' then ...).

Ksharp · Posted 10-02-2025 11:45 PM

data have;
infile cards truncover expandtabs;
input Person	Result1	Result2	Result3	Result4	Result5	;
cards;
111	.	.	100	.	100	
112	200	200	.	.	500	
113	.	.	100	200	200	
114	.	.	.	.	.	
115	.	300	100	100	400	
;


data want;
 set have;
 array x{*} Result:;
 length ResultSummary $ 40;
 if (300 in x) or (400 in x) or (500 in x)  then ResultSummary ='YES ';
  else  if 200 in x then ResultSummary ='NO ';
    else if 100 in x then ResultSummary ='UNKNOWN';
run;

Need help with array to check multiple iterations of one variable to create a new one

Re: Need help with array to check multiple iterations of one variable to create a new one

Re: Need help with array to check multiple iterations of one variable to create a new one

Re: Need help with array to check multiple iterations of one variable to create a new one

Re: Need help with array to check multiple iterations of one variable to create a new one

Re: Need help with array to check multiple iterations of one variable to create a new one

Re: Need help with array to check multiple iterations of one variable to create a new one

Registration is open

Call for Content EXTENDED

Need help with array to check multiple iterations of one variable to create a new one

Re: Need help with array to check multiple iterations of one variable to create a new one

Re: Need help with array to check multiple iterations of one variable to create a new one

Re: Need help with array to check multiple iterations of one variable to create a new one

Re: Need help with array to check multiple iterations of one variable to create a new one

Re: Need help with array to check multiple iterations of one variable to create a new one

Re: Need help with array to check multiple iterations of one variable to create a new one

Registration is open

Call for Content EXTENDED

SAS Training: Just a Click Away