SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

conditional merge of two data sets

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 5
Accepted Solution

conditional merge of two data sets

Hi,

I am trying to merge two data sets. One of the datasets looks like:

ID testdate1

1 09/30/2015

1 06/15/2016

1 01/05/2017

2 03/20/2017

2 04/22/2017

The number of records for each person varies.

 

The 2nd data set looks like:

ID testdate2

1 07/06/2016

1 01/02/2017

2 05/19/2017

2 10/20/2017

 

I can only match them using ID. What I want is:

ID  testdate1      testdate2    Indicator

1 06/15/2016   07/06/2016     1

1 01/05/2017  01/02/2017      1

 

If the interval between testdate1 and testdate2 is greater than 6 months, an indicator variable will be coded 1, otherwise the indicator variable is coded 0.

When I used the ID to merge the two datasets, it turns out to be

ID  testdate1      testdate2   Indicator

1  09/30/2015 07/06/2016  0

1 06/15/2016   01/02/2017 0

1 01/05/2017  01/02/2017  1

 

In the merged dataset above, the first person is mismatched. I want testdate1=06/15/2016 to be matched with 07/06/2016 and testdate1=09/30/2015 should not be matched with any records in data set 2.

 

Thanks!


Accepted Solutions
Solution
‎01-09-2018 09:26 AM
Super User
Posts: 23,323

Re: conditional merge of two data sets


newbe wrote:

Thanks for your input! Problem is solved.


@newbe Please mark the appropriate solution or post the solution you found in the end. 

Thanks.

View solution in original post


All Replies
PROC Star
Posts: 1,592

Re: conditional merge of two data sets

Show us your code and log. Let's see if we can modify it. If nothing works out, we shall work on a new solution

Occasional Contributor
Posts: 5

Re: conditional merge of two data sets

My code is simple, but I don't think it is correct.

proc sort data=one;

by ID descending testdate1;

run;

proc sort data=two;

by ID descending testdate2;

run;

 

data three;

merge one two;

by ID;

run;

 

I created another variable in data set 1 and 2 by counting the number of observations for each person and then merge the two datasets using ID and the count. I am not sure whether this is correct.

data one1;

set one;

by ID descending testdate1;

if first.ID then count=0;

count+1;

run;

 

data two1;

set two;

by ID descending testdate2;

if first.ID then count=0;

count+1;

run;

 

data all;

merge one1 two1;

by ID count;

run;

 

PROC Star
Posts: 1,592

Re: conditional merge of two data sets

[ Edited ]

seems like sql is better for this this:

 

data have1;
input iD testdate1 :mmddyy10.;
format testdate1 mmddyy10.;
datalines;
1 09/30/2015
1 06/15/2016
1 01/05/2017
2 03/20/2017
2 04/22/2017
;

data have2;
input iD testdate1 :mmddyy10.;
format testdate1 mmddyy10.;
datalines;
1 07/06/2016
1 01/02/2017
2 05/19/2017
2 10/20/2017
;

 

Try and let me know
proc sql;
create table want as
select *,intck('month',testdate1,testdate2)>6 as indicator
from have1 a, have2(rename=(testdate1=testdate2)) b
where a.id=b.id;

quit;

Occasional Contributor
Posts: 5

Re: conditional merge of two data sets

My description of the problem is not clear. Let me try again. What I really want is an indicator variable in the merged data set with a code of 1 indicating the time interval between testdate2 and testdate1 is less than 6 months and a code of 0 if the time interval between testdate2 and testdate1 is more than 6 months.

PROC Star
Posts: 1,592

Re: conditional merge of two data sets

changing do lt (less than< 
proc sql;
create table want as
select *,intck('month',testdate1,testdate2)<6 as indicator
from have1 a, have2(rename=(testdate1=testdate2)) b
where a.id=b.id;

quit;

Occasional Contributor
Posts: 5

Re: conditional merge of two data sets

Thanks for your help! It doesn't work. Testdate1 and testdate2 are not in the same data set.

 

Occasional Contributor
Posts: 5

Re: conditional merge of two data sets

Thanks for your input! Problem is solved.

Solution
‎01-09-2018 09:26 AM
Super User
Posts: 23,323

Re: conditional merge of two data sets


newbe wrote:

Thanks for your input! Problem is solved.


@newbe Please mark the appropriate solution or post the solution you found in the end. 

Thanks.

PROC Star
Posts: 503

Re: conditional merge of two data sets

[ Edited ]

code copied from @novinosrin , add a case statement.

 

proc sql;
create table want as
select *, case when intck('month',a.testdate1,b.testdate2)<6 then 1 else 0 end as indicatior
from have1 a, have2 b
where a.id=b.id;

quit;

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 348 views
  • 0 likes
  • 4 in conversation