Hi SAS community,
I have 4 datasets that have data on same population across 6 time points.
I want to know how to perform the occurrence as show below.
Dataset 1
ID Var 1
1 0
2 0
3 0
4 0
5 0
Dataset 2
ID Var 1
1 0
2 0
3 1
4 0
5 0
Dataset 3
ID Var 1
1 1
2 0
3 1
4 0
5 0
Dataset 4
ID Var 1
1 1
2 0
3 0
4 1
5 0
I want to find of all the people who had an occurrence of 1 at-least once across the 4 time points or Datasets.
I would merge all of the datasets, then use SQL.
proc sql; create table mergedSet as select * from Dataset1; insert into mergedSet values(select * from Dataset2); insert into mergedSet values(select * from Dataset3); insert into mergedSet values(select * from Dataset4); insert into mergedSet values(select * from Dataset5); select id, sum(var1)>0 as hasValue from mergedSet group by id; quit;
I would merge all of the datasets, then use SQL.
proc sql; create table mergedSet as select * from Dataset1; insert into mergedSet values(select * from Dataset2); insert into mergedSet values(select * from Dataset3); insert into mergedSet values(select * from Dataset4); insert into mergedSet values(select * from Dataset5); select id, sum(var1)>0 as hasValue from mergedSet group by id; quit;
A data step approach:
data want;
set dataset1 dataset2 dataset3 dataset4;
by ID;
where var1=1;
if first.ID;
keep ID;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.