Good morning,
I have a data set set similar to this.
DATA TEST;
INPUT EVENT_ID TEST$ RESULTS $;
DATALINES;
1234 NAATR N
1234 NAATG P
1234 NAATP N
1234 NAATU N
1256 NAATU N
1256 NAATP N
1256 NAATR N
1256 NAATG N
3456 NAATR P
3456 NAATG P
3456 NAATP P
;
RUN;
I sorted and transposed the data set
EVENT_ID NAATR NAATG NAATP NAATU
1234 N P N N
1256 N N N N
3456 P P P
i WOULD LIKE TO HAVE THIS (BELOW)
EVENT_ID ANY_NAAT
1234 P
1256 N
3456 P
Can be done directly from the initial dataset with by-processing and retain:
data have;
input event_id test $ results $;
datalines;
1234 NAATR N
1234 NAATG P
1234 NAATP N
1234 NAATU N
1256 NAATU N
1256 NAATP N
1256 NAATR N
1256 NAATG N
3456 NAATR P
3456 NAATG P
3456 NAATP P
;
run;
data want (keep=event_id any_naat);
set have;
by event_id;
retain any_naat;
if first.event_id then any_naat = 'N';
if results = 'P' then any_naat = 'P';
if last.event_id then output;
run;
proc print data=want noobs;
run;
Result:
event_id any_naat 1234 P 1256 N 3456 P
Its very hard to read code when it is SHOUTED at me. Perhaps something like:
data want; set have; any_naat=ifc(whichc("P",naatr,naatg,naatp,naatu),"P","N"); run;
Based on your transposed dataset. However you could do the same thing in proc sql without transposing:
proc sql; create table want as select distinct event_id, case when exists(select distinct event_id from test where results="P") then "P"
else "N" end as any_naat from test; quit;
The whole transposition business is unnecessary. See my code, and @RW9's SQL. Both work from the original dataset.
The first code I presented would work on your transposed data, the proc sql works on your original data without need for transposing. Please post test data in the form of a datastep in future:
please try
data want;
set have;
by event_id;
if results='N' then ord=0;
else if results='P' then ord=1;
run;
proc sort data=want;
by event_id ord;
run;
data want2;
set want;
by event_id ord;
if last.event_id;
run;
data have;
input event_id test $ results $;
datalines;
1234 NAATR N
1234 NAATG P
1234 NAATP N
1234 NAATU N
1256 NAATU N
1256 NAATP N
1256 NAATR N
1256 NAATG N
3456 NAATR P
3456 NAATG P
3456 NAATP P
;
run;
proc sql;
select EVENT_ID ,max(results) as ANY_NAAT
from have
group by EVENT_ID;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.