Hi-
I'm trying to query a second table (b) using account ID from first table (a). I want to only capture first observation from table b where start date from table (b) should correspond either to the same date from table (a), or be the first closest date after (as the data from "second" closest date no longer valid).
Example tables:
Table a
Account_ID | Shopping Date
___________________________
00001 | 03/20/2016
Table b
Key |Account_ID| Start Date | End Date | Score
____|____________________________________________
26 | 000001 | 2/18/2016 | 3/19/2016 | 709
____|___________________________________________
27 | 000001 | 3/19/2016 | 3/18/2016 | 715
____|___________________________________________
28 | 000001 | 4/12/2016 | 5/11/2016 | 718
________________________________________________
29 | 000001 | 5/12/2016 | 12/31/9999| 702
So the result queried from table (b) should be key 27 with "score" of 715.
I would appreciate any suggested code, including brief explanation of the functions involved. Thanks in advance
Hello,
Instead of putting the where clause on the dates as an output dataset option,
create table c(where=(Shopping_Date1 between Start_Date1 AND End_Date1))
it's better to make a composite where clause below the "from"
where a.Account_ID = b.Account_ID
AND Shopping_Date1 between Start_Date1 AND End_Date1
To verify whether it makes a difference indeed, use some feedback options to acquire extra log-info:
OPTIONS MSGLEVEL=I;
PROC SQL _METHOD STIMER noprint;
Kind regards,
Koen
Hello,
dataset b: The end date for Key=27 is before the start date. I changed that.
I wrote a little program that also works when you have thousands of account_id's (instead of just 1).
I'm not completely sure that I fully grasp your request, but the below code (or something close) should do the job:
Take care:
This code is quite greedy (especially when you would / could have multiple shopping dates for an account in table a --> Cartesian product due to repeats of by-values in table a AND table b) but it has the advantage of simplicity. Faster solutions are possible (using hash tables for example) but they would be more difficult to read and understand.
data a;
Account_ID='00001'; Shopping_Date='03/20/2016'; output;
run;
data b;
Key=26; Account_ID='00001'; Start_Date='02/18/2016'; End_Date='03/19/2016'; Score=709; output;
Key=27; Account_ID='00001'; Start_Date='03/19/2016'; End_Date='04/11/2016'; Score=715; output;
Key=28; Account_ID='00001'; Start_Date='04/12/2016'; End_Date='05/11/2016'; Score=718; output;
Key=29; Account_ID='00001'; Start_Date='05/12/2016'; End_Date='12/31/9999'; Score=702; output;
run;
data a; set a;
Shopping_Date1=input(Shopping_Date,mmddyy10.);
format Shopping_Date1 date9.;
run;
data b; set b;
Start_Date1=input(Start_Date,mmddyy10.);
End_Date1 =input(End_Date, mmddyy10.);
format Start_Date1 End_Date1 date9.;
run;
PROC SQL noprint;
create table c(where=(Shopping_Date1 between Start_Date1 AND End_Date1)) as
select b.Key , b.Account_ID , b.Start_Date1 , b.End_Date1 , b.Score , a.Shopping_Date1
, INTCK('DAY',Start_Date1,Shopping_Date1) as DaysDiff
from work.a a
, work.b b
where a.Account_ID = b.Account_ID
order by b.Account_ID , DaysDiff;
QUIT;
data c; set c; by Account_ID DaysDiff; if first.Account_ID; run;
/* end of program */
Cheers,
Koen
Hello,
Instead of putting the where clause on the dates as an output dataset option,
create table c(where=(Shopping_Date1 between Start_Date1 AND End_Date1))
it's better to make a composite where clause below the "from"
where a.Account_ID = b.Account_ID
AND Shopping_Date1 between Start_Date1 AND End_Date1
To verify whether it makes a difference indeed, use some feedback options to acquire extra log-info:
OPTIONS MSGLEVEL=I;
PROC SQL _METHOD STIMER noprint;
Kind regards,
Koen
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.