Hi,
I have two datasets. One has ID number and several categorical variables, like this:
ID cat_var1 cat_var2 cat_var3
1 a d i
2 b f h
3 c e j
4 a e h
5 c d h
Dataset 1 is unique at the ID level. Dataset 2 has an ID variable; a numeric variable, which always equals 1; and a group variable. Each ID recurs several times in Dataset 2, like this:
ID count_var group_var
1 1 1
1 1 1
1 1 2
1 1 2
1 1 2
2 1 1
2 1 1
2 1 1
What I want is to sum the count_var by ID and group (that is, ID=1 would have sum=2 for group 1 and sum=3 for group 2) and add the sum variables to Dataset 1. I know the inefficient way to do this--do a proc sql sum of count_var grouping by ID and group_var, then transpose, then join to Dataset 1 on ID--but I'm hoping there's a more efficient way to get what I want.
Any help is much appreciated.
if you are not interested to use the proc sql , please try the datastep
dataset 2:
data dataset2;
set have;
by id group_var;
retain sum;
if first.group_var then sum=1;
else sum+1;
if last.group_var;
run;
Then this dataset will have one or more records per id, group_var with the sum information, which you could merge with the dataset1 by id variable.
Thanks,
Jag
To get your group counts
proc summary data=dataset2 nway;
class id group_var;
var count;
output out=summarycount(drop=_type_ _freq_) sum=;
run;
Then either a join using proc sql or a data step merge by ID with the first data set
I think you were looking for a way to do all of this in one step, not so much getting away from sql, is that right? If so this will get you what you want:
data have1;
input ID cat_var1 $ cat_var2 $ cat_var3 $;
cards;
1 a d i
2 b f h
3 c e j
4 a e h
5 c d h
;
data have2;
input ID count_var group_var;
cards;
1 1 1
1 1 1
1 1 2
1 1 2
1 1 2
2 1 1
2 1 1
2 1 1
;
proc sql;
create table want as
select distinct
a.id,
a.cat_var1,
a.cat_var2,
a.cat_var3,
count(b.group_var) as count_group
from have1 a left join
have2 b on
a.id = b.id
group by b.id,b.group_var;
Hash Table.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.