Hello, below is an example of my two datasets:
dataset A:
id | env | water | transport | production |
1 | 0 | 0 | 0 | 0 |
2 | 1 | 0 | 1 | 0 |
3 | 0 | 0 | 0 | 1 |
4 | 0 | 0 | 1 | 0 |
dataset B:
id | env | water | transport | production |
1 | 1 | 0 | 0 | 0 |
2 | 0 | 0 | 0 | 0 |
3 | 0 | 0 | 0 | 0 |
4 | 1 | 0 | 1 | 1 |
Just a background, why two datasets have the same column names is because I split my original data into two subsamples and let them run through the same codes to achieve this.
I want that as long as there is '1' in one of the datasets, the combined table should reflect '1'.
Below is my desired outcome:
id | env | water | transport | production |
1 | 1 | 0 | 0 | 0 |
2 | 1 | 0 | 1 | 0 |
3 | 0 | 0 | 0 | 1 |
4 | 1 | 0 | 1 | 1 |
I know there are probably time-consuming ways to achieve the desired outcome by renaming one of the table's column names, merging by id, summing the columns, restricting to greater than 1, and so on. However, I want to know if there is a fast and non-tedious way for operation as the reason why I initially split is that the original dataset was too huge. Hopefully, the expert can provide a more efficient method. Thank you.
Stack the two datasets, then use PROC MEANS:
data stacked,
set a b;
by id;
run;
proc means data=stacked;
by id;
var env water transport production;
output
out=want (drop=_type _freq_)
max()=
;
run;
Stack the two datasets, then use PROC MEANS:
data stacked,
set a b;
by id;
run;
proc means data=stacked;
by id;
var env water transport production;
output
out=want (drop=_type _freq_)
max()=
;
run;
A merge could be used, but requires renaming the variables of one dataset:
data want;
merge have_a have_b(rename=(env= b_env water=b_water transport = b_transport production = b_production));
by id;
env = env or b_env;
water = water or b_water;
transport = transport or b_transport;
production = production or b_production;
drop b_:;
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: