02-12-2015 06:28 PM
Hi guys, I'll try to articulate this the best I can. I have two data sets: the original and the modified (cut from the original). One of the variables is a unique patient identifier that can occur on multiple observation lines (these observations are doctor visits, so a patient identifier that occurs on two separate observation lines indicates that the patient visited twice).
The modified set is concerned with visits after 5:30 pm, so all observations for visits before 5:30 pm were cut from the original to create it. However, I am interested in seeing whether the patient IDs in the modified set also occur elsewhere in the original data set. I would like to combine the two data sets to create one that has every instance of each patient ID that occurs in the modified data set, and of those patient IDs only. Let me write it out:
Original Modified New
A A A
B B B
C E E
D G G
D H H
E I H
So the original contains every visit.
The modified contains every visit after 5:30 pm.
The new contains every visit after 5:30 pm plus any other visits those after-5:30 patients may have had (regardless of the time).
So my question would be, how would I get the "New" dataset described above?
I tried something like:
set original (in=a) modified (in=b);
if a and b;
02-12-2015 06:58 PM
If the two sets are sorted by the ID variable, merge them by that variable in a data step:
merge modified original; by id;
If they are not sorted, sort them first.
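For reference, here is a minimal sketch of the full sort-and-merge, written as a variation on the one-liner above rather than as the only way to do it. It assumes the identifier variable is called patient_id (a hypothetical name; substitute your own). It first reduces the modified set to one row per patient so that the merge is one-to-many, then keeps only the original visits whose patient also appears in the modified set.

```sas
/* Assumption: the identifier variable is named patient_id (hypothetical). */

/* One row per patient who has at least one visit after 5:30 pm. */
proc sort data=modified(keep=patient_id) out=after_ids nodupkey;
    by patient_id;
run;

/* MERGE with BY needs the input sorted by the BY variable. */
proc sort data=original out=original_sorted;
    by patient_id;
run;

data new;
    merge original_sorted (in=in_orig)
          after_ids       (in=in_after);
    by patient_id;
    /* Keep every original visit for patients found in the modified set. */
    if in_orig and in_after;
run;
```

The IN= flags can both be true here because MERGE with BY combines matching rows into one observation. With SET, each observation comes from only one data set, so "if a and b" never keeps anything.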
Note: what happens to the other variables depends on the order in which the data sets appear on the MERGE statement, because for variables with the same name the value from the data set listed later overwrites the earlier one. If you want both values of such a variable, then one version needs to be renamed (the RENAME= data set option is useful to know).
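As a rough illustration of the renaming point, suppose both data sets carry a variable called visit_time and the identifier is patient_id (both names are hypothetical, and both data sets are assumed to be already sorted by patient_id). Renaming the copy that comes from modified lets both values survive the merge:

```sas
/* Hypothetical variable names: patient_id, visit_time. */
/* Assumes modified and original are already sorted by patient_id. */
data both;
    merge modified (rename=(visit_time=visit_time_mod))
          original;
    by patient_id;
run;
```

Without the RENAME=, only one visit_time would be left on each output row. Keep in mind that when both data sets have repeated IDs, rows are paired in order within each ID, so check the result carefully before relying on it.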