BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Htein
Fluorite | Level 6

Hi everyone, help please! I am very new to using SAS and I would like to combine two datasets by ID and Date. Some visits in the two datasets fall on the same date while some can be different +/- 90 days. Please see the below for example datasets.

Dataset One

IDDatePulse Rate
14/2/1885
110/3/1990
13/2/2075
210/2/1885
215/3/1969
225/2/2099

 

Dataset Two

IDDateBP
14/2/18125
115/5/19110
13/2/20100
210/2/18160
215/3/19150
212/12/19120

 

The output dataset I want is 

IDDatePulse RateBP
14/2/1885125
110/3/1990110
13/2/2075100
210/2/1885160
215/3/1969150
225/2/2099120

 

As you can see, some visits are on different dates. But I want to merge them if the date in Dataset Two is within  +/- 90 days of the Dataset One. Could you please advise how to write the script for this. Thanks in advance.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

You can do this in SQL:

data one;
infile datalines dlm='09'x dsd truncover;
input id $ date :ddmmyy10. pulse_rate;
format date yymmddd10.;
datalines;
1	4/2/18	85
1	10/3/19	90
1	3/2/20	75
2	10/2/18	85
2	15/3/19	69
2	25/2/20	99
;

data two;
infile datalines dlm='09'x dsd truncover;
input id $ date :ddmmyy10. bp;
format date yymmddd10.;
datalines;
1	4/2/18	125
1	15/5/19	110
1	3/2/20	100
2	10/2/18	160
2	15/3/19	150
2	12/12/19	120
;

proc sql;
create table want as
  select
    one.id,
    one.date,
    one.pulse_rate,
    two.bp
  from one, two
  where one.id = two.id and abs(one.date - two.date) le 90
;
quit;

Note:

  • example data is presented in data steps with datalines, so everyone can easily recreate your dataset exactly as-is for testing
  • if several dates fall within the 90 day range, you will get a "cartesian join", increasing the number of observations

View solution in original post

4 REPLIES 4
Kurt_Bremser
Super User

You can do this in SQL:

data one;
infile datalines dlm='09'x dsd truncover;
input id $ date :ddmmyy10. pulse_rate;
format date yymmddd10.;
datalines;
1	4/2/18	85
1	10/3/19	90
1	3/2/20	75
2	10/2/18	85
2	15/3/19	69
2	25/2/20	99
;

data two;
infile datalines dlm='09'x dsd truncover;
input id $ date :ddmmyy10. bp;
format date yymmddd10.;
datalines;
1	4/2/18	125
1	15/5/19	110
1	3/2/20	100
2	10/2/18	160
2	15/3/19	150
2	12/12/19	120
;

proc sql;
create table want as
  select
    one.id,
    one.date,
    one.pulse_rate,
    two.bp
  from one, two
  where one.id = two.id and abs(one.date - two.date) le 90
;
quit;

Note:

  • example data is presented in data steps with datalines, so everyone can easily recreate your dataset exactly as-is for testing
  • if several dates fall within the 90 day range, you will get a "cartesian join", increasing the number of observations
Htein
Fluorite | Level 6

Thanks, Kurt. It helps. Just an additional question. I wonder if I can modify the script to link only to a nearest date in data two within +/- 90 days to avoid catersian join. Thanks again.

Kurt_Bremser
Super User

Expand the SQL by creating a column that holds the difference and a group by, and select for the minimum difference:

proc sql;
create table want as
  select
    one.id,
    one.date,
    one.pulse_rate,
    two.bp,
    abs(two.date - one.date) as datedif
  from one, two
  where one.id = two.id and abs(one.date - two.date) le 90
  group by one.id, one.date
  having datedif = min(datedif)
;
quit;

Note that you still might get two observations for a single date if you have two with the same distance.

To avoid this, you would need to run a data step:

proc sql;
create table want as
  select
    one.id,
    one.date,
    one.pulse_rate,
    two.bp,
    abs(two.date - one.date) as datedif
  from one, two
  where one.id = two.id and abs(one.date - two.date) le 90
  order by one.id, one.date, calculated datedif
;
quit;

data final_result;
set want;
by id date;
if first.date;
run;
Htein
Fluorite | Level 6
Thank you very much, Kurt. It helps me a lot.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 414 views
  • 3 likes
  • 2 in conversation