- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi. I have the following 5 datasets. Each has a variable named review_ind, which can be either 1 or 0. Can I merge the datasets together, but only keep the rows if review_ind=1 in at least one of the datasets.
Is there an easy/efficient way to do this during the merge or will I need to individually subset each of the 5 datasets (where review_ind=1) before I merge? I'm trying to reduce the overall lines of code in this program.
Any suggestions would be greatly appreciated.
DATA ALL_CLAIMS;
MERGE
&USERN._LIB.F3_DISTANCE_CAR_F
&USERN._LIB.F4_RFRG_NVR_BLG_CAR_F
&USERN._LIB.F5_BENE_NO_HIST_CAR_F
&USERN._LIB.F6_BENE_SHARE_CAR_F;
BY ID_NUM BENE_SK;
RUN;
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I only see 4 data sets? However, you can do something like this
data all_claims;
merge
&usern._lib.f3_distance_car_f(where=(review_ind=1))
&usern._lib.f4_rfrg_nvr_blg_car_f(where=(review_ind=1))
&usern._lib.f5_bene_no_hist_car_f(where=(review_ind=1))
&usern._lib.f6_bene_share_car_f(where=(review_ind=1));
by id_num bene_sk;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You could rename REVIEW_IND so that it has a different name in each data set (e.g. REVIEW_IND1 through REVIEW_IND5), then if the maximum of all of these new variables is 1, keep the record, otherwise don't keep it.
data all_claims;
merge
&usern._lib.f3_distance_car_f(rename=(review_ind=review_ind1))
&usern._lib.f4_rfrg_nvr_blg_car_f(rename=(review_ind=review_ind2))
&usern._lib.f5_bene_no_hist_car_f(rename=(review_ind=review_ind3))
&usern._lib.f6_bene_share_car_f(rename=(review_ind=review_ind4));
by id_num bene_sk;
if max(of review_ind:)=0 then delete;
run;
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I only see 4 data sets? However, you can do something like this
data all_claims;
merge
&usern._lib.f3_distance_car_f(where=(review_ind=1))
&usern._lib.f4_rfrg_nvr_blg_car_f(where=(review_ind=1))
&usern._lib.f5_bene_no_hist_car_f(where=(review_ind=1))
&usern._lib.f6_bene_share_car_f(where=(review_ind=1));
by id_num bene_sk;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Señor @PeterClemmensen:
Since neither you nor I really know the contents of these data sets, nor do we really know what else should be in the output data set, I simply mention the possibility that there may be variables with non-missing values in each of these data sets that would not get merged into the final result using your code, but those variables would be present given my code. Or maybe your code is the right thing to do. The explanation given is silent on this issue.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@PaigeMiller, you are absolutely right. I read the question too fast it seems 🙂
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content