Hello, below is an example of my two datasets:
dataset A:
id | env | water | transport | production |
1 | 0 | 0 | 0 | 0 |
2 | 1 | 0 | 1 | 0 |
3 | 0 | 0 | 0 | 1 |
4 | 0 | 0 | 1 | 0 |
dataset B:
id | env | water | transport | production |
1 | 1 | 0 | 0 | 0 |
2 | 0 | 0 | 0 | 0 |
3 | 0 | 0 | 0 | 0 |
4 | 1 | 0 | 1 | 1 |
Just a background, why two datasets have the same column names is because I split my original data into two subsamples and let them run through the same codes to achieve this.
I want that as long as there is '1' in one of the datasets, the combined table should reflect '1'.
Below is my desired outcome:
id | env | water | transport | production |
1 | 1 | 0 | 0 | 0 |
2 | 1 | 0 | 1 | 0 |
3 | 0 | 0 | 0 | 1 |
4 | 1 | 0 | 1 | 1 |
I know there are probably time-consuming ways to achieve the desired outcome by renaming one of the table's column names, merging by id, summing the columns, restricting to greater than 1, and so on. However, I want to know if there is a fast and non-tedious way for operation as the reason why I initially split is that the original dataset was too huge. Hopefully, the expert can provide a more efficient method. Thank you.
Stack the two datasets, then use PROC MEANS:
data stacked,
set a b;
by id;
run;
proc means data=stacked;
by id;
var env water transport production;
output
out=want (drop=_type _freq_)
max()=
;
run;
Stack the two datasets, then use PROC MEANS:
data stacked,
set a b;
by id;
run;
proc means data=stacked;
by id;
var env water transport production;
output
out=want (drop=_type _freq_)
max()=
;
run;
A merge could be used, but requires renaming the variables of one dataset:
data want;
merge have_a have_b(rename=(env= b_env water=b_water transport = b_transport production = b_production));
by id;
env = env or b_env;
water = water or b_water;
transport = transport or b_transport;
production = production or b_production;
drop b_:;
run;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.