Dear All,
I have two sets, one is at an aggregated-level while the other is at a disaggregated-level. I would like to merge summary stat (e.g., mean, sum) from the second set with the first set by using proc SQL. I did this as shown below. I think it could be shorter, but I don't know how to do so. Please keep in mind that the aggregated-level has another variable that is not included in the the other data set. So I would like to use proc SQL to get summary information from the disaggregated set.
To illustrate, I generated the two datasets.
- ID_set: Set at an aggregated-level.
- Data_set: Set at a disaggregated-level.
*Generating ID_Set;
data ID_set;
set sashelp.cars;
keep Make Type;
run;
proc sort data= ID_set
nodupkey;
by Make;
run;
*N= 38;
*Generating Data_Set;
data Data_set;
set sashelp.cars;
keep Make EngineSize;
run;
*N= 428;
Using the two datasets, I would like to calculate sum and average of EngineSize and attach them to the first data set. I did PROC SQL and PROC SORT to keep the unique observations at an aggregated-level. I feel like they can be done at once in PROC SQL (without PROC SORT), but I don't know how to do so. Any help would be appreciated!
proc sql;
create table Merged
as select a.*,
sum(b.EngineSize) as Sum_Engine,
Mean(b.EngineSize) as Mean_Engine
from ID_set a left join Data_set b
on a.Make= b.Make
group by a.Make;
quit;
proc sort data= Merged nodupkey;
by Make;
run;
So you want summaries in your data at the make level but the data table at a make/engine size categories?
TYPE is just the first record? It doesn't matter which one you take? Do you need it the logic for what's first to be consistent at least? Your current code doesn't guarantee that.
proc sql;
create table Merged as
select a.*,
b. Sum_Engine,
b.Mean_Engine
from ID_set a
left join (
select Make,
sum(EngineSize) as Sum_Engine,
Mean(EngineSize) as Mean_Engine
from Data_set
group by a.Make
) b on a.Make= b.Make
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.