FrankReynolds
Calcite | Level 5

Hi guys, I'll try to articulate this the best I can. I have two data sets: the original and the modified (cut from the original). One of the variables is a unique patient identifier that can occur on multiple observation lines (the observations are doctor visits, so a patient identifier that occurs on two separate observation lines indicates the patient visited twice).

The modified set is concerned with visits after 5:30 pm, so all observations with visits before 5:30 pm were cut from the original to create it. However, I am interested in seeing whether the patient IDs in those observations occur elsewhere in the original data set as well. I would like to combine the two data sets to create one that has every instance of each patient ID that occurs in the modified data set, and only those patient IDs. Let me write it out:

Original    Modified    New
A           A           A
B           B           B
C           E           E
D           G           G
D           H           H
E           I           H
F                       H
F                       I
G                       I
H
H
H
I
I

So the original contains every visit.

The modified contains every visit after 5:30 pm.

The new contains every visit after 5:30 pm plus any other visits those after-5:30 patients may have had (regardless of time).

So my question would be, how would I get the "New" dataset described above?

I tried something like:

data new;
     set original (in=a) modified (in=b);
     by patient_id;
     if a and b;
run;

Thanks

-you ever see a frog boy?

frank reynolds

2 REPLIES
ballardw
Super User

If the two sets are sorted by the ID variable:

data want;
     merge modified original;
     by idvariable;
run;

If they are not sorted, sort them first.

Note: what happens to the other variables depends on the order in which the data sets appear on the MERGE statement; for variables with the same name, the value read from the data set listed later generally replaces the earlier one. If you want both values for other variables, then one version needs to be renamed (the RENAME= data set option is useful to know).
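
One thing to watch: a MERGE with no IN= flags keeps IDs found in either data set, so patients such as C, D and F from the example would still show up. And in the attempt from the question, SET interleaves rows rather than matching them, so a and b are never both 1 on the same row and "if a and b" keeps nothing. A minimal sketch of one way to restrict the result to patients present in the modified set follows. It assumes the key is called patient_id in both data sets (the name used in the question); late_ids, original_sorted, in_orig and in_late are made-up names for illustration.

```sas
/* Sketch only: assumes both data sets carry the key patient_id.
   late_ids and original_sorted are hypothetical work data sets. */

/* One row per patient who has at least one visit after 5:30 pm */
proc sort data=modified(keep=patient_id) out=late_ids nodupkey;
     by patient_id;
run;

/* BY-group processing needs the full visit list sorted by the same key */
proc sort data=original out=original_sorted;
     by patient_id;
run;

data new;
     merge original_sorted(in=in_orig)
           late_ids(in=in_late);
     by patient_id;
     /* keep every original visit whose patient also appears in modified */
     if in_orig and in_late;
run;
```

Reducing the modified set to one row per ID first keeps this a one-to-many merge, which avoids the surprises of merging two data sets that both repeat the key. Because late_ids carries only patient_id, all other variables in new come from the original data set.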

FrankReynolds
Calcite | Level 5

Thanks! I will try tomorrow back in the lab.

-and then I come out and start eating garbage

Frank Reynolds
