Hi,
I have a dataset that is at the level of person and values of a categorical variable (categ_1). There are several possible values of categ_1 but no person can have more than 2 values. Each row also has a value for categ_2 and a date variable.
ID categ_1 categ_2 date
1 abc jkl 1/1/2009
1 def mno 1/1/2010
2 abc mno 1/1/2009
What I'm trying to do is isolate those people with multiple values of categ_1, and find out what their values of categ_2 are for each value of categ_1, by date (so the one with the earlier date is their first categ_1 and the one with the later date is their second categ_1).
What I did was this:
proc sort data=want;
by ID date;
run;
data have;
set want;
by ID date;
if first.ID then first_categ_1=1;
else second_categ_1=1;
if first_categ_1=1 and categ_2='jkl' then new_var='First categ 1=JKL';
if first_categ_1=1 and categ_2='mno' then new_var='First categ 1=MNO';
if second_categ_1=1 and categ_2='mno' then new_var='Second categ 1=MNO';
if second_categ_1=1 and categ_2='mno' then new_var='Second categ 1=MNO';
run;
This works fine and gives me the information I want; the issue is that I want to exclude anyone who only has one value for categ_1 from my frequencies. I'm not sure how to keep both rows for people with two rows with different values of categ_1 and get rid of the rows for those who only have one row with one value of categ_1. I don't want to collapse because of new_var which tells me what happened to each person's first categ_1 and second categ_1.
Any help is much appreciated.
It isn't clear, at least to me, if you want to exclude from an analysis but leave them in the data OR to remove the records from the data.
You might be able to set a flag that allows either with:
Categ_1_flag= ( first.id and last.id);
If there is only one value for the ID variable then this flag will be 1 other wise 0.
You could either drop the record with
If categ_1_flag=0;
or exclude from an analysis with a where statement or clause:
proc freq data=have;
where categ_1_flag=0;
tables ....
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.