Is it possible to merge multiple observations into one observation based on ID and a categorical variable? Right now I have multiple observations for each record when I only need one observation per record ID for each level of a categorical variable, this is what I need to do:
Record ID: Category: Record ID (grouped): Category (grouped):
1 2 1 2
1 2
1 3 1 3
1 3
1 3
2 4 2 4
2 4
How do I take the first two columns and group them to create the second two columns?
select distinct record_id, Category;
you could also add a counter to tell you how many record for each group.
It really is not clear what your output should be. Is it a report that people read or a data set? If a data set would it have 7 records with missing values for several or 3?
Also, are there other variables involved?
This may be one way:
proc sql; create table want as select distinct recordid, category from have; quit;
Since you don't actually show normally acceptable SAS variable names I'm guessing on those.
Proc sort using nodukey option might also do what you want.
data have;
input recordID Category;
datalines;
1 2
1 2
1 3
1 3
1 3
2 4
2 4
;
run;
proc sort data = have nodupkey; by recordID category; run;
You have plenty of good suggestions that work for what you have presented.
However, suppose your data set contains additional variables and you would like to preserve some values of those other variables?
Here's a solution that best addresses the issue of "additional variables":
data want;
update have (obs=0) have;
by id category;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.