Hello - thanks in advance for your time.
I have dataset1 with 100+ variables and dataset2 with 100+ variables. Some of the variables match across the two datasets. I'm looking for a dataset3 that is the sum of the matching variables but also includes the unique variables from each dataset. I understand this is a relatively simple task by renaming the variables, merging the datasets and summing the like variables manually. However I'm looking to remove any manual processes from this program as I receive this data often and these manual steps are driving me crazy! Additionally, the variable naming conventions do not follow any sequence, i.e., x1 - xn.
For example:
data have1;
input geo time
web_wib_aln
web_wib_ccd
web_wib_dch
web_wib_dsv
web_wib_efs
web_wib_hel
web_wib_iau;
datalines;
501 209 0 0 1 2 3 4 5
501 210 0 0 1 2 3 4 5
501 211 0 0 1 2 3 4 5
;
RUN;
data have2;
input geo time
web_wib_ccd
web_wib_dch
web_wib_dsv
web_wib_ilh
web_wib_inv
web_wib_mtg
web_wib_onl;
datalines;
501 209 6 7 8 9 10 11 12
501 210 6 7 8 9 10 11 12
501 211 6 7 8 9 10 11 12
;
RUN;
The dataset I'm looking for:
data want;
input geo time
web_wib_ccd
web_wib_dch
web_wib_dsv
web_wib_aln
web_wib_efs
web_wib_hel
web_wib_iau
web_wib_ilh
web_wib_inv
web_wib_mtg
web_wib_onl;
datalines;
501 209 11 8 10 1 3 4 5 9 10 11 12
501 210 9 8 10 1 3 4 5 9 10 11 12
501 211 10 8 10 1 3 4 5 9 10 11 12
;
run;
Again, thanks for your time and input!
If we can assume that the math in your example doesn't add up correctly, then the following might do what you want:
data want; set have2 have1; run; proc summary data=want nway; var web_:; class geo time; output out=want (drop=_:) sum=; run;
HTH,
Art, CEO, AnalystFinder.com
Try an inner join and I have assumed the variable 'time' could be used as a key. This should give you the desired results.
proc sql;
select a.geo, a.time,
a.web_wib_aln,
a.web_wib_ccd,
a.web_wib_dch,
a.web_wib_dsv,
a.web_wib_efs,
a.web_wib_hel,
a.web_wib_iau,
b.web_wib_ilh,
b.web_wib_inv,
b.web_wib_mtg,
b.web_wib_onl
from have1 as a
inner join have2 as b
on a.time=b.time ;
quit ;
If we can assume that the math in your example doesn't add up correctly, then the following might do what you want:
data want; set have2 have1; run; proc summary data=want nway; var web_:; class geo time; output out=want (drop=_:) sum=; run;
HTH,
Art, CEO, AnalystFinder.com
art297 - this is brilliant! And yes, your assumption is absolutely correct - I must have copied/pasted the wrong table. I didn't even think to use proc summary and I didn't realize you could specify all variables beginning with "web_" in the command. This perfectly solves my dilemma - thank you!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.