Hello
I want to ask a question please.
Suppose I have 4 data sets with same dimensions (same number of rows and columns) and same columns names.
Suppose that I want to sum up values from 4 data sets (in numeric fields) .
What is the best way to do it?
Data tbl1;
input z$ x w y;
a 10 20 30
b 20 10 20
c 30 30 40
d 40 15 50
;
run;
Data tbl2;
input z$ x w y;
a 20 30 60
b 40 30 20
c 40 30 50
d 60 45 50
;
run;
Data tbl3;
input z$ x w y;
a 10 30 55
b 30 20 20
c 40 30 20
d 20 45 30
;
run;
Data tbl4;
input z$ x w y;
a 20 10 45
b 20 30 30
c 30 40 30
d 20 25 40
;
run;
Data tbl1;
input z$ x w y;
cards;
a 10 20 30
b 20 10 20
c 30 30 40
d 40 15 50
;
run;
Data tbl2;
input z$ x w y;
cards;
a 20 30 60
b 40 30 20
c 40 30 50
d 60 45 50
;
run;
Data tbl3;
input z$ x w y;
cards;
a 10 30 55
b 30 20 20
c 40 30 20
d 20 45 30
;
run;
Data tbl4;
input z$ x w y;
cards;
a 20 10 45
b 20 30 30
c 30 40 30
d 20 25 40
;
run;
data all;
set tbl1-tbl4;
run;
proc means data=all nway noprint;
class z;
var x w y;
output out=want(drop=_:) sum=;
run;
z | x | w | y |
---|---|---|---|
a | 60 | 90 | 190 |
b | 110 | 90 | 90 |
c | 140 | 130 | 140 |
d | 140 | 130 | 170 |
sum by group (key enumeration)?
or sum all (enumerate) all?
or any other?
May I ask what's your desired?
The 4 data sets are outputs for 4 separate populations.
Now, the task is to create one table that merge all 4 populations to 1 population
How do you want the output to look? one table of sums after combining the four sets? Four separate sums tables? Are we supposed to sum x with w and y? or sums of the individual variables?
There will be one output table:
a | 60 | 90 | 190 |
b | 110 | 90 | 90 |
c | 130 | 130 | 130 |
d | 120 | 100 | 170 |
It means that we sum up values by positions....(like when you do Matrix addition )
Matrix addition, this is easily done in PROC IML.
If you don't want to use PROC IML, then PROC SUMMARY will work after you combine the data sets into one
data combined;
set tbl1 tbl2 tbl3 tbl4;
run;
proc summary nway data=combined;
class z;
var x w y;
output out=want sum=;
run;
Data tbl1;
input z$ x w y;
cards;
a 10 20 30
b 20 10 20
c 30 30 40
d 40 15 50
;
run;
Data tbl2;
input z$ x w y;
cards;
a 20 30 60
b 40 30 20
c 40 30 50
d 60 45 50
;
run;
Data tbl3;
input z$ x w y;
cards;
a 10 30 55
b 30 20 20
c 40 30 20
d 20 45 30
;
run;
Data tbl4;
input z$ x w y;
cards;
a 20 10 45
b 20 30 30
c 30 40 30
d 20 25 40
;
run;
data all;
set tbl1-tbl4;
run;
proc means data=all nway noprint;
class z;
var x w y;
output out=want(drop=_:) sum=;
run;
z | x | w | y |
---|---|---|---|
a | 60 | 90 | 190 |
b | 110 | 90 | 90 |
c | 140 | 130 | 140 |
d | 140 | 130 | 170 |
proc sql;
create table want as
select sum(x) as x, sum(w) as w,sum(y) as y
from
(select * from tbl1
union
select * from tbl2
union
select * from tbl3
union
select * from tbl4)
group by z;
quit;
Better use UNION ALL , if there is a same obs appeared in two tables, sql would remove one .
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.