Hi all,
I have three data sets and want to make another one to contain only observarions that is shared in all three datasets. The two variables that shall match the condition in all three datasets are STRUCNUM and EN such as this;
STRUCNUM 12354 EN 410 from Fhwaed2015
STRUCNUM 12354 EN 410 from Fhwaed2016
STRUCNUM 12354 EN 410 from Fhwaed2017
If the above condition matched in all three datasets, I want to have them in my dataset.
I tried the following code, but could not get the job done.
I would appreciate if anyone could help me.
proc sql;
create table taxesdata as
select *
from data.Fhwaed2015, data.Fhwaed2016, data.Fhwaed2017
where Fhwaed2015.STRUCNUM=Fhwaed2016.STRUCNUM and Fhwaed2015.STRUCNUM=Fhwaed2017.STRUCNUM
Fhwaed2015.en=Fhwaed2016.en and Fhwaed2015.en=Fhwaed2017.en
;
quit;
mmhxc5
I'm still not following, but I'll take a guess. You have data sets for 3 years. You want records that are in all three years.
1. Stack the data (append, instead of merge)
2. Count the number of records per your grouping variables. If it's 3 you have records for three years, If it's not three you can drop it.
data want;
set data2015-data2017;
run;
proc sql;
create table want as
select *
from want
group by var1, var2, var3
having count(*) = 3;
quit;
@mmhxc5 wrote:
@Reeza, here is a screenshot of my data from fhwaed2015 with STRUCNUM and EN and the other variables. I want to combine this with the other two datasets - fhwaed2016, and fhwaed2017 that is matching STRUCNUM and EN condition and generate the dataset to have all variables matching in STRUCNUM and EN.
Thanks,
Obviously untested because no workable data was provided. If this isn't what you want, you'll need to clarify in detail.
EDIT: If you have duplicates of your grouping variable in the data sets, this method will not work.
@Reeza, here is a screenshot of my data from fhwaed2015 with STRUCNUM and EN and the other variables. I want to combine this with the other two datasets - fhwaed2016, and fhwaed2017 that is matching STRUCNUM and EN condition and generate the dataset to have all variables matching in STRUCNUM and EN.
Thanks,
I'm still not following, but I'll take a guess. You have data sets for 3 years. You want records that are in all three years.
1. Stack the data (append, instead of merge)
2. Count the number of records per your grouping variables. If it's 3 you have records for three years, If it's not three you can drop it.
data want;
set data2015-data2017;
run;
proc sql;
create table want as
select *
from want
group by var1, var2, var3
having count(*) = 3;
quit;
@mmhxc5 wrote:
@Reeza, here is a screenshot of my data from fhwaed2015 with STRUCNUM and EN and the other variables. I want to combine this with the other two datasets - fhwaed2016, and fhwaed2017 that is matching STRUCNUM and EN condition and generate the dataset to have all variables matching in STRUCNUM and EN.
Thanks,
Obviously untested because no workable data was provided. If this isn't what you want, you'll need to clarify in detail.
EDIT: If you have duplicates of your grouping variable in the data sets, this method will not work.
Following @Reeza 's guess, but doing everything in SQL:
proc sql;
create table records as
select unique Fhwaed2015.STRUCNUM, Fhwaed2015.en
from data.Fhwaed2015, data.Fhwaed2016, data.Fhwaed2017
where Fhwaed2015.STRUCNUM=Fhwaed2016.STRUCNUM and Fhwaed2015.STRUCNUM=Fhwaed2017.STRUCNUM
Fhwaed2015.en=Fhwaed2016.en and Fhwaed2015.en=Fhwaed2017.en;
create table taxesData as
select a.* from Fhwaed2015 as a inner join records as b on a.STRUCNUM=b.STRUCNUM and a.en=b.en
union all
select a.* from Fhwaed2016 as a inner join records as b on a.STRUCNUM=b.STRUCNUM and a.en=b.en
union all
select a.* from Fhwaed2017 as a inner join records as b on a.STRUCNUM=b.STRUCNUM and a.en=b.en
;
drop table records;
quit;
(untested)
Code Not Tested. proc sql; create table key as select STRUCNUM , en from data.Fhwaed2015 intersect select STRUCNUM , en from data.Fhwaed2016 intersect select STRUCNUM , en from data.Fhwaed2017 ; create table taxesdata as select * ,'From Fhwaed2015 ' as label from data.Fhwaed2015 where catx(' ', STRUCNUM , en) in (select catx(' ', STRUCNUM , en) from key) union all select * ,'From Fhwaed2016 ' as label from data.Fhwaed2016 where catx(' ', STRUCNUM , en) in (select catx(' ', STRUCNUM , en) from key) union all select * ,'From Fhwaed2017' as label from data.Fhwaed2017 where catx(' ', STRUCNUM , en) in (select catx(' ', STRUCNUM , en) from key) ; quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.