BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
A_Halps
Obsidian | Level 7

Everyone has more than one result in my dataset. I am trying to create a new column that shows if they answered 1 and 2 OR 1 and 3 OR 1 and 5 OR 3 and 5, etc.

This is what I currently have, but it is not looking at the data by ssn which is what I need it to do:

 

proc sql;
create table diff_answer_combos as
select *
, case  when status_flg in (1,2) then "1+2"
		when status_flg in (1,3) then "1+3"
 		when status_flg in (1,4) then "1+4"
		when status_flg in (1,5) then "1+5"
		when status_flg in (1,6) then "1+6"
		when status_flg in (2,3) then "2+3"
		when status_flg in (2,4) then "2+4"
		when status_flg in (2,5) then "2+5"
		when status_flg in (2,6) then "2+6"
		when status_flg in (3,4) then "3+4"
		when status_flg in (3,5) then "3+5"
		when status_flg in (3,6) then "3+6"
		when status_flg in (4,5) then "4+5"
		when status_flg in (4,6) then "4+6"
		when status_flg in (5,6) then "5+6"
	else ""
end as status_combined

from duplicates_diff_answer
order by ssn;
quit;

Please help!

Thanks 🙂

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

So if you have two observations per group you can just build the new string from the MIN and MAX values.

proc sql;
create table diff_answer_combos as
select ssn
    , count(*) as nobs
    , catx('+',min(status_flg),max(status_flg)) as status_combined length=5
from duplicates_diff_answer
group by ssn
;
quit;

View solution in original post

5 REPLIES 5
A_Halps
Obsidian | Level 7
Correct
Kurt_Bremser
Super User

Then (IMO) a data step will be your best solution:

proc sort data=duplicates_diff_answer;
by ssn status_flg;
run;

data diff_answer_combos;
set duplicates_diff_answer;
by ssn;
if first.ssn
then status_combined = "   "; /* set a sufficient length here, if more is needed */
else status_combined = catx("+",status_combined,status_flg);
if last.ssn;
drop status_flg;
run;

If you need to make sure that only 2 rows exist per ssn, count the number of plus signs.

Reeza
Super User

If you're looking for status_flg across more than one row this type of logic will not work, it only evaluates once per row so I don't think this will achieve what you're looking to accomplish.

 

Can you show a sample of input data and desired results? 

Tom
Super User Tom
Super User

So if you have two observations per group you can just build the new string from the MIN and MAX values.

proc sql;
create table diff_answer_combos as
select ssn
    , count(*) as nobs
    , catx('+',min(status_flg),max(status_flg)) as status_combined length=5
from duplicates_diff_answer
group by ssn
;
quit;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1382 views
  • 0 likes
  • 4 in conversation