data tests;
input id $ test $ result $ ;
datalines;
XYZ001 ABC01A .
XYZ001 ABC01B NE
XYZ001 ABC01C NE
XYZ001 ABC02 6
XYZ001 ABC03 5
XYZ001 ABC04 1
XYZ001 ABC05 0
XYZ001 ABC06 .
XYZ001 ABC07 10
XYZ001 ABC08 6
XYZ002 ABC01A 2
XYZ002 ABC01B 1
XYZ002 ABC01C NE
XYZ002 ABC02 15
XYZ002 ABC03 9
XYZ002 ABC04 NE
XYZ002 ABC05 7
XYZ002 ABC06 6
XYZ002 ABC07 10
XYZ002 ABC08 9
XYZ003 ABC01A 6
XYZ003 ABC01B 4
XYZ003 ABC01C 6
XYZ003 ABC02 8
XYZ003 ABC03 9
;
RUN;
proc sql;
create table sub2 as
select id,
count(test) as missed_count,
case
when max(case when test = 'ABC01A' and result in ('NE', ' ') then 1 else 0 end) = 1
then 'Y'
else 'N'
end as test_01A
from tests
where result in ('NE', ' ')
group by id
having missed_count > 3;
quit;
Hello all,
I have dataset for test results and created a new dataset 'sub2' to summarize which id have more that 3 missed or 'NE' result, could anyone please help in updating this code to create a new column named 'missed' with values of test which are missing or NE and multiple test separated by ','? For example for in my current sub2 dataset I have XYZ001 id, for this id new column value would be ABC01A,ABC01B,ABC01C,ABC06
Thank you in advance.
You want to concatenate the values of a character variable from multiple rows in a table. There may be a way to do that in PROC SQL, but I am not enough of a practitioner to know how.
But I do know how to do this in a data step. Assuming your data are sorted by ID, then:
data want (keep=id missed_count test_01a missed_list);
do missed_count=1 by 1 until (last.id);
set tests (where=(result=' ' or result='NE'));
by id;
if test='ABC01A' then test_01a='Y';
length missed_list $100;
missed_list=catx(',',missed_list,test);
end;
if test_01A^='Y' then test_01A='N';
if missed_count>3 then output;
run;
You want to concatenate the values of a character variable from multiple rows in a table. There may be a way to do that in PROC SQL, but I am not enough of a practitioner to know how.
But I do know how to do this in a data step. Assuming your data are sorted by ID, then:
data want (keep=id missed_count test_01a missed_list);
do missed_count=1 by 1 until (last.id);
set tests (where=(result=' ' or result='NE'));
by id;
if test='ABC01A' then test_01a='Y';
length missed_list $100;
missed_list=catx(',',missed_list,test);
end;
if test_01A^='Y' then test_01A='N';
if missed_count>3 then output;
run;
@mkeintz @PaigeMiller thank you both, this worked.
Agreeing with @mkeintz , SQL is the wrong tool to do this. I don't even think it is possible in SQL.
However, I would also like to point out that creating these long character strings is usually a suboptimal way to proceed, and can cause more programming difficulties down the line. A better approach in most cases is to leave the data in the long form rather than the wide form that you ask for. I say "in most cases", but there are exceptions, and if we knew what you were trying to do (the big picture, not how to code this one step), we could provide more advice.
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.