Help using Base SAS procedures

SAS QUERY

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 145
Accepted Solution

SAS QUERY

Hi Friends,

 

Can someone help me with below query. I have two datasets X and Y.

 

X has two variables ID and date

Y has two variables ID and date

 

ID is the key variable and can have different number of records in X and Y for the same ID.

 

For every ID and DATE in X, Date should be compared  with Y and should output only those where X.DATE=+_3 days.Y.date.

 

I Hope i explained and thanks in advance.

Sample data

x:

 

140011001 2011-09-09 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-20 
140011001 2011-09-20 
140011001 2011-09-20 

 

Y:

140011001 2011-09-09
140011001 2011-09-29

 

OUTPUT:

 

140011001 2011-09-20 
140011001 2011-09-20 
140011001 2011-09-20 

 


Accepted Solutions
Solution
‎01-12-2016 10:42 AM
Super User
Posts: 10,041

Re: SAS QUERY

Posted in reply to rakeshvvv
data x;
input id : $20. date yymmdd12.;
format date yymmdd.;
cards;
140011001 2011-09-09 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-20 
140011001 2011-09-20 
140011001 2011-09-20 
;
run;
data y;
input id : $20. date yymmdd12.;
format date yymmdd.;
cards;
140011001 2011-09-09
140011001 2011-09-29
;
run;
proc sql;
create table want as
 select x.*
  from x,y
   where x.id eq y.id and x.date between y.date-3 and y.date+3;
quit;


View solution in original post


All Replies
Super User
Posts: 19,855

Re: SAS QUERY

Posted in reply to rakeshvvv

It doesn't look like your data matches your rule, 2011-09-20 +-3 days is 2011-09-17 and 2011-09-23 and those dates aren't in the second table. Please clarify your requirements.

Super User
Posts: 11,343

Re: SAS QUERY

Posted in reply to rakeshvvv

Are your date variables SAS Date value or character? The comparisons will work much better with SAS data values.

 

Also your requirement seems to say that the X date is 3 days from the Y date but your example out put shows the X date 2011-09-20 to be 9 days from the displayed Y date of 2011-09-20. You may need to expand on the detales of the rule for us.

Frequent Contributor
Posts: 145

Re: SAS QUERY

Posted in reply to rakeshvvv

The data is correct.....date is a character variable in yyyymmdd format.

 

140011001 2011-09-09 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-20 
140011001 2011-09-20 
140011001 2011-09-20 

 

Y:

140011001 2011-09-09
140011001 2011-09-29

 

OUTPUT:

 

140011001 2011-09-09 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-11 

 

Reason for output: As the first 6 records in X(2011-09-09  2011-09-11 ) within 3 days of Y(2011-09-09). We should output those records.

 

140011001 2011-09-20= it is beyond 3 days from Y and hence we are not outputting those records 

Solution
‎01-12-2016 10:42 AM
Super User
Posts: 10,041

Re: SAS QUERY

Posted in reply to rakeshvvv
data x;
input id : $20. date yymmdd12.;
format date yymmdd.;
cards;
140011001 2011-09-09 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-11 
140011001 2011-09-20 
140011001 2011-09-20 
140011001 2011-09-20 
;
run;
data y;
input id : $20. date yymmdd12.;
format date yymmdd.;
cards;
140011001 2011-09-09
140011001 2011-09-29
;
run;
proc sql;
create table want as
 select x.*
  from x,y
   where x.id eq y.id and x.date between y.date-3 and y.date+3;
quit;


Occasional Contributor
Posts: 10

Re: SAS QUERY

Wouldn't it be better let the join take care of merging the two tables, and let the where-statement take care of the proper selection?

 

proc sql;

create table want as

select x.*

from x as a

left join y as b on (a.id = b.id)

where abs(a.date - b.date) <= 3;

quit;

 

Personally I prefer using abs(x-y)<=3 - makes it more explicit where the boundaries of the selection are...

Trusted Advisor
Posts: 1,118

Re: SAS QUERY

Posted in reply to rakeshvvv

Either way, you would need to consider the case that observations from X would be selected multiple times by PROC SQL if there were two or more "matching" Y records. For example, if Y contained a third observation with ID='140011001' and date='2011-09-10', the current result dataset would double from 6 to 12 observations. Of course, this could be avoided by selecting DISTINCT ... including an additional variable, e.g. a record number, which uniquely identifies observations in X.


Also, please note that with your character dates, the BETWEEN condition would read
input(x.date,yymmdd10.) between input(y.date,yymmdd10.)-3 and input(y.date,yymmdd10.)+3.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 418 views
  • 2 likes
  • 6 in conversation