BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mmhxc5
Quartz | Level 8

Hi all,

I have three data sets and want to make another one to contain only observarions that is shared in all three datasets. The two variables that shall match the condition in all three datasets are STRUCNUM and EN such as this;

STRUCNUM 12354 EN 410 from Fhwaed2015

STRUCNUM 12354 EN 410 from Fhwaed2016

STRUCNUM 12354 EN 410 from Fhwaed2017 

If the above condition matched in all three datasets, I want to have them in my dataset.

I tried the following code, but could not get the job done.

I would appreciate if anyone could help me.

proc sql;
create table taxesdata as
select *
from data.Fhwaed2015, data.Fhwaed2016, data.Fhwaed2017
where Fhwaed2015.STRUCNUM=Fhwaed2016.STRUCNUM and Fhwaed2015.STRUCNUM=Fhwaed2017.STRUCNUM
Fhwaed2015.en=Fhwaed2016.en and Fhwaed2015.en=Fhwaed2017.en
;
quit;

  mmhxc5

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

I'm still not following, but I'll take a guess. You have data sets for 3 years. You want records that are in all three years. 

 

1. Stack the data (append, instead of merge)

2. Count the number of records per your grouping variables. If it's 3 you have records for three years, If it's not three you can drop it. 

 

data want;
set data2015-data2017;

run;

proc sql;
create table want as
select *
from want
group by var1, var2, var3
having count(*) = 3;
quit;

@mmhxc5 wrote:

@Reeza, here is a screenshot of my data from fhwaed2015 with STRUCNUM and EN and the other variables. I want to combine this with the other two datasets - fhwaed2016, and fhwaed2017 that is matching  STRUCNUM and EN condition and generate the dataset to have all variables matching in STRUCNUM and EN. Capture.JPG

Thanks,


Obviously untested because no workable data was provided. If this isn't what you want, you'll need to clarify in detail. 

 

EDIT: If you have duplicates of your grouping variable in the data sets, this method will not work. 

View solution in original post

7 REPLIES 7
Reeza
Super User
Can you expand your sample data set to show some more records of what you have an what you want? If you just want records that are in all three by key variables a data step merge may be more effective here.
mmhxc5
Quartz | Level 8

@Reeza, here is a screenshot of my data from fhwaed2015 with STRUCNUM and EN and the other variables. I want to combine this with the other two datasets - fhwaed2016, and fhwaed2017 that is matching  STRUCNUM and EN condition and generate the dataset to have all variables matching in STRUCNUM and EN. Capture.JPG

Thanks,

Reeza
Super User

I'm still not following, but I'll take a guess. You have data sets for 3 years. You want records that are in all three years. 

 

1. Stack the data (append, instead of merge)

2. Count the number of records per your grouping variables. If it's 3 you have records for three years, If it's not three you can drop it. 

 

data want;
set data2015-data2017;

run;

proc sql;
create table want as
select *
from want
group by var1, var2, var3
having count(*) = 3;
quit;

@mmhxc5 wrote:

@Reeza, here is a screenshot of my data from fhwaed2015 with STRUCNUM and EN and the other variables. I want to combine this with the other two datasets - fhwaed2016, and fhwaed2017 that is matching  STRUCNUM and EN condition and generate the dataset to have all variables matching in STRUCNUM and EN. Capture.JPG

Thanks,


Obviously untested because no workable data was provided. If this isn't what you want, you'll need to clarify in detail. 

 

EDIT: If you have duplicates of your grouping variable in the data sets, this method will not work. 

mmhxc5
Quartz | Level 8

@Reeza, and @Ksharp, thank you for our time and help. Your codes both worked well - same result. @Reeza ,  ur code gives me this note in the log.

NOTE: The query requires remerging summary statistics back with the original data.

Thank you @PGStats for ur time. 

Reeza
Super User
It's a note, not a warning.
PGStats
Opal | Level 21

Following @Reeza 's guess, but doing everything in SQL:

 

proc sql;
create table records as
select unique Fhwaed2015.STRUCNUM, Fhwaed2015.en
from data.Fhwaed2015, data.Fhwaed2016, data.Fhwaed2017
where Fhwaed2015.STRUCNUM=Fhwaed2016.STRUCNUM and Fhwaed2015.STRUCNUM=Fhwaed2017.STRUCNUM
Fhwaed2015.en=Fhwaed2016.en and Fhwaed2015.en=Fhwaed2017.en;

create table taxesData as
select a.* from Fhwaed2015 as a inner join records as b on a.STRUCNUM=b.STRUCNUM and a.en=b.en
union all
select a.* from Fhwaed2016 as a inner join records as b on a.STRUCNUM=b.STRUCNUM and a.en=b.en
union all
select a.* from Fhwaed2017 as a inner join records as b on a.STRUCNUM=b.STRUCNUM and a.en=b.en
;

drop table records;
quit;

(untested)

PG
Ksharp
Super User
Code Not Tested.


proc sql;
create table key as
select STRUCNUM , en from  data.Fhwaed2015
intersect
select STRUCNUM , en from  data.Fhwaed2016
intersect
select STRUCNUM , en from  data.Fhwaed2017
;


create table taxesdata as
select * ,'From Fhwaed2015 ' as label from data.Fhwaed2015  
 where catx(' ', STRUCNUM , en) in (select catx(' ', STRUCNUM , en) from key)
union all 
select * ,'From Fhwaed2016 ' as label from data.Fhwaed2016 
 where catx(' ', STRUCNUM , en) in (select catx(' ', STRUCNUM , en) from key)
union all 
select * ,'From Fhwaed2017' as label from data.Fhwaed2017  
 where catx(' ', STRUCNUM , en) in (select catx(' ', STRUCNUM , en) from key)
;
quit;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1047 views
  • 1 like
  • 4 in conversation