BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
axescot78
Quartz | Level 8

I have 2 datasets that look like the this:

 

 

data match_data;
input uniq_ID_A $ Sex $ Race $ Dx_Age 3. uniq_ID_B $ Sex $ Race $ Age_Begin 3. Age_End 3.;
datalines;
5018 F B 35 1013 F B 36 39
3127 M H 25 2996 M H 26 37
5571 M W 32 1290 M W 33 42
2058 F W 35 4130 F W 38 45
1785 F B 37 3053 F B 38 41
1278 M W 49 1686 M W 49 53
2446 M W 33 3716 M W 35 44
3067 M H 39 5010 M H 40 52
3146 M B 32 5299 M B 32 45
3429 M B 30 5243 M B 30 35
5771 M A 30 4027 M A 31 42
;


data pt_data;
input uniq_ID_B $ DOB:MMDDYY10. Visit_Date:MMDDYY10. Visit_Reason_Code $ Other_Demographic_Data $ ;
format DOB MMDDYY10. Visit_Date MMDDYY10.;
datalines;
2996 9/25/1990 9/24/2016 6 xzxz
2996 9/25/1990 11/15/2016 6 xzxz
2996 9/25/1990 12/5/2016 4 xzxz
5299 12/1/1984 12/1/2016 2 xzxz
5299 12/1/1984 12/27/2016 5 xzxz
5299 12/1/1984 2/21/2017 5 xzxz
5299 12/1/1984 3/14/2017 1 xzxz
5299 12/1/1984 4/26/2017 6 xzxz
5299 12/1/1984 5/21/2017 5 xzxz
4027 6/3/1990 6/2/2021 1 xzxz
4027 6/3/1990 7/5/2021 4 xzxz
;

The first has a study population matched with a control population. Due to size constraints, I couldn't pull all the data that I wanted from the control population. Now I would like to go back to the second data set and pull the remaining data. However, I am only interested in the entry whose age is closest to the match from the first data set. For example, for 2996 I am interested in the entry whose age is closest to 25 (the age of his match). For 5299, I am interested in the entry when he was 32 (the age of his match). Same for 4027, 30, the age of his match.

 

 

Any advice how to code this?

1 ACCEPTED SOLUTION

Accepted Solutions
axescot78
Quartz | Level 8
I ended up calculating the age difference, sorting by ID and age difference smallest to largest, and then selecting the first ID entry. Worked well.

View solution in original post

5 REPLIES 5
KachiM
Rhodochrosite | Level 12
"Same for 4027, 31, the age of his match."

Is it not 30?
axescot78
Quartz | Level 8

oh yea, 30... edited

andreas_lds
Jade | Level 19

Please show exactly which obs to pull from the second dataset and explain what what should happen, if two obs in the second dataset have the same age.

axescot78
Quartz | Level 8
Good catch! So this is fake data that I created for this post. The actual data has ages to 2 decimal places so there won't be any observations with the same age.
axescot78
Quartz | Level 8
I ended up calculating the age difference, sorting by ID and age difference smallest to largest, and then selecting the first ID entry. Worked well.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1075 views
  • 0 likes
  • 3 in conversation