BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
billi_billi
Calcite | Level 5
data tests;
 input id $ test $ result $ ;
 datalines;
 XYZ001 ABC01A .
 XYZ001 ABC01B NE
 XYZ001 ABC01C NE
 XYZ001 ABC02 6
 XYZ001 ABC03 5
 XYZ001 ABC04 1
 XYZ001 ABC05 0
 XYZ001 ABC06 .
 XYZ001 ABC07 10
 XYZ001 ABC08 6
 XYZ002 ABC01A 2
 XYZ002 ABC01B 1
 XYZ002 ABC01C NE
 XYZ002 ABC02 15
 XYZ002 ABC03 9
 XYZ002 ABC04 NE
 XYZ002 ABC05 7
 XYZ002 ABC06 6
 XYZ002 ABC07 10
 XYZ002 ABC08 9
 XYZ003 ABC01A 6
 XYZ003 ABC01B 4
 XYZ003 ABC01C 6
 XYZ003 ABC02 8
 XYZ003 ABC03 9
 ;
 RUN;

proc sql;
    create table sub2 as
    select id,
           count(test) as missed_count,
           case 
               when max(case when test = 'ABC01A' and result in ('NE', ' ') then 1 else 0 end) = 1 
               then 'Y' 
               else 'N' 
           end as test_01A
    from tests
    where result in ('NE', ' ')
    group by id
    having missed_count > 3;
quit;

 Hello all,

I have dataset for test results and created a new dataset 'sub2' to summarize which id have more that 3 missed or 'NE' result, could anyone please help in updating this code to create a new column named 'missed' with values of test which are missing or NE and multiple test separated by ','? For example for in my current sub2 dataset I have XYZ001 id, for this id new column value would be ABC01A,ABC01B,ABC01C,ABC06

 

Thank you in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

You want to concatenate the values of a character variable from multiple rows in a table.  There may be a way to do that in PROC SQL, but I am not enough of a practitioner to know how.

 

But I do know how to do this in a data step.  Assuming your data are sorted by ID, then:

 

data want (keep=id missed_count test_01a missed_list);
  do missed_count=1 by 1 until (last.id);
    set tests (where=(result=' ' or result='NE'));
    by id;
    if test='ABC01A' then test_01a='Y';
    length missed_list $100;
    missed_list=catx(',',missed_list,test);
  end;
  if test_01A^='Y' then test_01A='N';
  if missed_count>3 then output;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

3 REPLIES 3
mkeintz
PROC Star

You want to concatenate the values of a character variable from multiple rows in a table.  There may be a way to do that in PROC SQL, but I am not enough of a practitioner to know how.

 

But I do know how to do this in a data step.  Assuming your data are sorted by ID, then:

 

data want (keep=id missed_count test_01a missed_list);
  do missed_count=1 by 1 until (last.id);
    set tests (where=(result=' ' or result='NE'));
    by id;
    if test='ABC01A' then test_01a='Y';
    length missed_list $100;
    missed_list=catx(',',missed_list,test);
  end;
  if test_01A^='Y' then test_01A='N';
  if missed_count>3 then output;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
billi_billi
Calcite | Level 5

@mkeintz  @PaigeMiller thank you both, this worked.

PaigeMiller
Diamond | Level 26

Agreeing with @mkeintz , SQL is the wrong tool to do this. I don't even think it is possible in SQL.

 

However, I would also like to point out that creating these long character strings is usually a suboptimal way to proceed, and can cause more programming difficulties down the line. A better approach in most cases is to leave the data in the long form rather than the wide form that you ask for. I say "in most cases", but there are exceptions, and if we knew what you were trying to do (the big picture, not how to code this one step), we could provide more advice.

--
Paige Miller

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 440 views
  • 0 likes
  • 3 in conversation