Is it possible to merge multiple observations into one observation based on ID and a categorical variable? Right now I have multiple observations for each record when I only need one observation per record ID for each level of a categorical variable, this is what I need to do:
Record ID: Category: Record ID (grouped): Category (grouped):
1 2 1 2
1 2
1 3 1 3
1 3
1 3
2 4 2 4
2 4
How do I take the first two columns and group them to create the second two columns?
select distinct record_id, Category;
you could also add a counter to tell you how many record for each group.
It really is not clear what your output should be. Is it a report that people read or a data set? If a data set would it have 7 records with missing values for several or 3?
Also, are there other variables involved?
This may be one way:
proc sql; create table want as select distinct recordid, category from have; quit;
Since you don't actually show normally acceptable SAS variable names I'm guessing on those.
Proc sort using nodukey option might also do what you want.
data have;
input recordID Category;
datalines;
1 2
1 2
1 3
1 3
1 3
2 4
2 4
;
run;
proc sort data = have nodupkey; by recordID category; run;
You have plenty of good suggestions that work for what you have presented.
However, suppose your data set contains additional variables and you would like to preserve some values of those other variables?
Here's a solution that best addresses the issue of "additional variables":
data want;
update have (obs=0) have;
by id category;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.