Hello - thanks in advance for your time.
I have dataset1 with 100+ variables and dataset2 with 100+ variables. Some of the variables match across the two datasets. I'm looking for a dataset3 that is the sum of the matching variables but also includes the unique variables from each dataset. I understand this is a relatively simple task by renaming the variables, merging the datasets and summing the like variables manually. However I'm looking to remove any manual processes from this program as I receive this data often and these manual steps are driving me crazy! Additionally, the variable naming conventions do not follow any sequence, i.e., x1 - xn.
For example:
data have1;
input geo time
web_wib_aln
web_wib_ccd
web_wib_dch
web_wib_dsv
web_wib_efs
web_wib_hel
web_wib_iau;
datalines;
501 209 0 0 1 2 3 4 5
501 210 0 0 1 2 3 4 5
501 211 0 0 1 2 3 4 5
;
RUN;
data have2;
input geo time
web_wib_ccd
web_wib_dch
web_wib_dsv
web_wib_ilh
web_wib_inv
web_wib_mtg
web_wib_onl;
datalines;
501 209 6 7 8 9 10 11 12
501 210 6 7 8 9 10 11 12
501 211 6 7 8 9 10 11 12
;
RUN;
The dataset I'm looking for:
data want;
input geo time
web_wib_ccd
web_wib_dch
web_wib_dsv
web_wib_aln
web_wib_efs
web_wib_hel
web_wib_iau
web_wib_ilh
web_wib_inv
web_wib_mtg
web_wib_onl;
datalines;
501 209 11 8 10 1 3 4 5 9 10 11 12
501 210 9 8 10 1 3 4 5 9 10 11 12
501 211 10 8 10 1 3 4 5 9 10 11 12
;
run;
Again, thanks for your time and input!
If we can assume that the math in your example doesn't add up correctly, then the following might do what you want:
data want; set have2 have1; run; proc summary data=want nway; var web_:; class geo time; output out=want (drop=_:) sum=; run;
HTH,
Art, CEO, AnalystFinder.com
Try an inner join and I have assumed the variable 'time' could be used as a key. This should give you the desired results.
proc sql;
select a.geo, a.time,
a.web_wib_aln,
a.web_wib_ccd,
a.web_wib_dch,
a.web_wib_dsv,
a.web_wib_efs,
a.web_wib_hel,
a.web_wib_iau,
b.web_wib_ilh,
b.web_wib_inv,
b.web_wib_mtg,
b.web_wib_onl
from have1 as a
inner join have2 as b
on a.time=b.time ;
quit ;
If we can assume that the math in your example doesn't add up correctly, then the following might do what you want:
data want; set have2 have1; run; proc summary data=want nway; var web_:; class geo time; output out=want (drop=_:) sum=; run;
HTH,
Art, CEO, AnalystFinder.com
art297 - this is brilliant! And yes, your assumption is absolutely correct - I must have copied/pasted the wrong table. I didn't even think to use proc summary and I didn't realize you could specify all variables beginning with "web_" in the command. This perfectly solves my dilemma - thank you!
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.