- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data EVTQBP EVTnotQBP;
merge EVT.EVT_hometime(in=INfullcohort) EVT.Evt_qbp(in=INQBP);
by CIHI_KEY;
if INfullcohort =1 and INQBP=0 then output EVTnotQBP;
ELSE output EVTQBP;
run;
EVT.Home time data set has 696 keys and EVT.Evt_qbp has 687 keys
I am trying to write out the 9 keys in the EVTnotQBP dataset and want to know if my code is doing it correctly?
Also in the EVTQBP dataset I want to write out the keys that are only present in the EVT.Evt_qbp dataset which has 687 observations
Is my code correct to get the desired output?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This comment assumes your have one record per key in each data set.
Just because hometime has 696 unique key values and evt_qbp has 687 values does NOT mean there will be 9 keys in EVTnotQBP ... UNLESS you know that evt_qbp is a proper subset of hometime (i.e. no keys in evt_qbp that are not also in evt_hometime).
As to your code, it should work, but you don't really need the INfullcohort dummy. You could just:
data EVTQBP EVTnotQBP;
merge EVT.EVT_hometime EVT.Evt_qbp (in=inqbp);
by CIHI_KEY;
if INQBP = 0 then output EVTnotQBP;
ELSE output EVTQBP;
run;
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Why is it tat i dont need the infull cohort marker ?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you know that evt_qbp is a proper subset of evt_hometime, then having in= dummy for evt_hometime would result in that dummy getting a value of 1 for every merged observation. I.e. it would effectively be a constant, and offer no discriminatory power.
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
else if inqbp = 1 and infullcohort=0 then output; Now if I need the cases that are only in QBP then saying if INfullcohort =1 and INQBP=1 then output; would be correct right?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You've got all the tools presented to you. It's time to experiment on sample data. For this question, I suspect that will be far more valuable than an answer from even the most instructive forum response.
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It would be safer to account for every possibility:
data evt_qbp_only evt_hometime_only evt_qbp_hometime;
merge EVT.EVT_hometime(in=INfullcohort) EVT.Evt_qbp(in=INQBP);
by CIHI_KEY;
if INfullcohort =1 and INQBP=0 then output evt_hometime_only;
if INfullcohort =0 and INQBP=1 then output evt_qbp_only;
if INfullcohort =1 and INQBP=1 then output evt_qbp_hometime;
run;