BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
A_Halps
Obsidian | Level 7

Everyone has more than one result in my dataset. I am trying to create a new column that shows if they answered 1 and 2 OR 1 and 3 OR 1 and 5 OR 3 and 5, etc.

This is what I currently have, but it is not looking at the data by ssn which is what I need it to do:

 

proc sql;
create table diff_answer_combos as
select *
, case  when status_flg in (1,2) then "1+2"
		when status_flg in (1,3) then "1+3"
 		when status_flg in (1,4) then "1+4"
		when status_flg in (1,5) then "1+5"
		when status_flg in (1,6) then "1+6"
		when status_flg in (2,3) then "2+3"
		when status_flg in (2,4) then "2+4"
		when status_flg in (2,5) then "2+5"
		when status_flg in (2,6) then "2+6"
		when status_flg in (3,4) then "3+4"
		when status_flg in (3,5) then "3+5"
		when status_flg in (3,6) then "3+6"
		when status_flg in (4,5) then "4+5"
		when status_flg in (4,6) then "4+6"
		when status_flg in (5,6) then "5+6"
	else ""
end as status_combined

from duplicates_diff_answer
order by ssn;
quit;

Please help!

Thanks 🙂

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

So if you have two observations per group you can just build the new string from the MIN and MAX values.

proc sql;
create table diff_answer_combos as
select ssn
    , count(*) as nobs
    , catx('+',min(status_flg),max(status_flg)) as status_combined length=5
from duplicates_diff_answer
group by ssn
;
quit;

View solution in original post

5 REPLIES 5
A_Halps
Obsidian | Level 7
Correct
Kurt_Bremser
Super User

Then (IMO) a data step will be your best solution:

proc sort data=duplicates_diff_answer;
by ssn status_flg;
run;

data diff_answer_combos;
set duplicates_diff_answer;
by ssn;
if first.ssn
then status_combined = "   "; /* set a sufficient length here, if more is needed */
else status_combined = catx("+",status_combined,status_flg);
if last.ssn;
drop status_flg;
run;

If you need to make sure that only 2 rows exist per ssn, count the number of plus signs.

Reeza
Super User

If you're looking for status_flg across more than one row this type of logic will not work, it only evaluates once per row so I don't think this will achieve what you're looking to accomplish.

 

Can you show a sample of input data and desired results? 

Tom
Super User Tom
Super User

So if you have two observations per group you can just build the new string from the MIN and MAX values.

proc sql;
create table diff_answer_combos as
select ssn
    , count(*) as nobs
    , catx('+',min(status_flg),max(status_flg)) as status_combined length=5
from duplicates_diff_answer
group by ssn
;
quit;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1593 views
  • 0 likes
  • 4 in conversation