Hi,
I have a dataset which is supposed to be at the person-level, but really there are some people on multiple rows (I know because I've added an umbrella ID and there are some people on two rows with the same umbrella ID but different values of my former ID variable, ID2). My dataset has ID1 (the umbrella ID), ID2, indicators for each month (=1 if the person had an event in that month), and a number of categorical variables.
ID1  ID2  mth_200901  mth_200902 ... mth_201311  mth_201312  categ_1  categ_2
1    1    1           1              .           .           abc      pqr
1    2    .           .              1           1           def      xyz
What I want is two things. One is simple and is just that for each value of ID1, they should have a value of 1 for each mth_ indicator as long as one of their sub-IDs (ID2) had a 1 for that month. The other is that I want the values of categ_1 and categ_2 (and the several other categorical vars) as of the row with the latest month that =1. So my final dataset would have this summary row for ID1=1.
ID1  mth_200901  mth_200902 ... mth_201311  mth_201312  categ_1  categ_2
1    1           1              1           1           def      xyz
Any help is much appreciated.
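Assuming your data set is sorted by ID1 ID2, you could use an UPDATE statement with an empty master (obs=0). It writes one row per ID1 and, for every variable, keeps the last non-missing value seen within that ID1:
data want;
  /* empty master + all rows as transactions: one output row per ID1 */
  update have (obs=0) have;
  by ID1;
run;
That gives each mth_ indicator a 1 whenever any sub-ID had a 1. Note that the categorical values come from the last row in sort order (the highest ID2) that has a non-missing value. That matches your example, but it is only "the row with the latest month" if the data happen to be ordered that way. ID2 also stays in the output unless you add a DROP statement.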
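If the row with the latest event is not always the one with the highest ID2, one possible variation is to sort on the position of each row's last flagged month before collapsing. The following is only a sketch under a few assumptions: the mth_ variables are stored in chronological order in the data set, and the names staged and last_mth are made up for illustration (they are not in the original post).

```sas
/* Step 1: for each row, record the position of the last mth_ flag equal to 1. */
/* Assumes the mth_ variables appear in chronological order in HAVE.          */
data staged;
  set have;
  array m {*} mth_:;          /* all variables whose names start with mth_ */
  last_mth = 0;               /* hypothetical helper variable */
  do i = 1 to dim(m);
    if m{i} = 1 then last_mth = i;
  end;
  drop i;
run;

/* Step 2: within each ID1, put the row with the latest event last. */
proc sort data=staged;
  by ID1 last_mth;
run;

/* Step 3: collapse to one row per ID1. Month flags are combined across rows; */
/* categorical values come from the last row that has a non-missing value.    */
data want;
  update staged (obs=0) staged;
  by ID1;
  drop ID2 last_mth;
run;
```

One caveat with this approach: because UPDATE only overwrites with non-missing values, a categorical variable that is blank on the latest row will keep the value from an earlier row. If you need the latest row's values exactly, blanks included, a different technique (for example, selecting that row separately and merging it back) would be needed.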