BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
A_Halps
Obsidian | Level 7

Everyone has more than one result in my dataset. I am trying to create a new column that shows if they answered 1 and 2 OR 1 and 3 OR 1 and 5 OR 3 and 5, etc.

This is what I currently have, but it is not looking at the data by ssn which is what I need it to do:

 

proc sql;
create table diff_answer_combos as
select *
, case  when status_flg in (1,2) then "1+2"
		when status_flg in (1,3) then "1+3"
 		when status_flg in (1,4) then "1+4"
		when status_flg in (1,5) then "1+5"
		when status_flg in (1,6) then "1+6"
		when status_flg in (2,3) then "2+3"
		when status_flg in (2,4) then "2+4"
		when status_flg in (2,5) then "2+5"
		when status_flg in (2,6) then "2+6"
		when status_flg in (3,4) then "3+4"
		when status_flg in (3,5) then "3+5"
		when status_flg in (3,6) then "3+6"
		when status_flg in (4,5) then "4+5"
		when status_flg in (4,6) then "4+6"
		when status_flg in (5,6) then "5+6"
	else ""
end as status_combined

from duplicates_diff_answer
order by ssn;
quit;

Please help!

Thanks 🙂

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

So if you have two observations per group you can just build the new string from the MIN and MAX values.

proc sql;
create table diff_answer_combos as
select ssn
    , count(*) as nobs
    , catx('+',min(status_flg),max(status_flg)) as status_combined length=5
from duplicates_diff_answer
group by ssn
;
quit;

View solution in original post

5 REPLIES 5
A_Halps
Obsidian | Level 7
Correct
Kurt_Bremser
Super User

Then (IMO) a data step will be your best solution:

proc sort data=duplicates_diff_answer;
by ssn status_flg;
run;

data diff_answer_combos;
set duplicates_diff_answer;
by ssn;
if first.ssn
then status_combined = "   "; /* set a sufficient length here, if more is needed */
else status_combined = catx("+",status_combined,status_flg);
if last.ssn;
drop status_flg;
run;

If you need to make sure that only 2 rows exist per ssn, count the number of plus signs.

Reeza
Super User

If you're looking for status_flg across more than one row this type of logic will not work, it only evaluates once per row so I don't think this will achieve what you're looking to accomplish.

 

Can you show a sample of input data and desired results? 

Tom
Super User Tom
Super User

So if you have two observations per group you can just build the new string from the MIN and MAX values.

proc sql;
create table diff_answer_combos as
select ssn
    , count(*) as nobs
    , catx('+',min(status_flg),max(status_flg)) as status_combined length=5
from duplicates_diff_answer
group by ssn
;
quit;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1284 views
  • 0 likes
  • 4 in conversation