BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
brulard
Pyrite | Level 9

Hi-
I'm trying to query a second table (b) using account ID from first table (a). I want to only capture first observation from table b where start date from table (b) should correspond either to the same date from table (a), or be the first closest date after (as the data from "second" closest date no longer valid).
Example tables:

Table a

Account_ID | Shopping Date
___________________________
  00001       | 03/20/2016


Table b

Key |Account_ID| Start Date | End Date | Score
____|____________________________________________
26    | 000001   | 2/18/2016  | 3/19/2016 | 709
____|___________________________________________
27    | 000001   | 3/19/2016  | 3/18/2016 | 715
____|___________________________________________
28    | 000001   | 4/12/2016  | 5/11/2016 | 718
________________________________________________
29    | 000001   | 5/12/2016  | 12/31/9999| 702

So the result queried from table (b) should be key 27 with "score" of 715.

I would appreciate any suggested code, including brief explanation of the functions involved. Thanks in advance

1 ACCEPTED SOLUTION

Accepted Solutions
sbxkoenk
SAS Super FREQ

Hello,


Instead of putting the where clause on the dates as an output dataset option, 

create table c(where=(Shopping_Date1 between Start_Date1 AND End_Date1))


it's better to make a composite where clause below the "from"

 where     a.Account_ID = b.Account_ID
       AND Shopping_Date1 between Start_Date1 AND End_Date1

To verify whether it makes a difference indeed, use some feedback options to acquire extra log-info:

OPTIONS MSGLEVEL=I; 
PROC SQL _METHOD STIMER noprint;

Kind regards,

Koen

View solution in original post

2 REPLIES 2
sbxkoenk
SAS Super FREQ

Hello,

 

dataset b: The end date for Key=27 is before the start date. I changed that.

I wrote a little program that also works when you have thousands of account_id's (instead of just 1).

I'm not completely sure that I fully grasp your request, but the below code (or something close) should do the job:

 

Take care:

This code is quite greedy (especially when you would / could have multiple shopping dates for an account in table a --> Cartesian product due to repeats of by-values in table a AND table b) but it has the advantage of simplicity. Faster solutions are possible (using hash tables for example) but they would be more difficult to read and understand.

 

data a;
Account_ID='00001'; Shopping_Date='03/20/2016'; output;
run;
data b;
Key=26; Account_ID='00001'; Start_Date='02/18/2016'; End_Date='03/19/2016'; Score=709; output;
Key=27; Account_ID='00001'; Start_Date='03/19/2016'; End_Date='04/11/2016'; Score=715; output;
Key=28; Account_ID='00001'; Start_Date='04/12/2016'; End_Date='05/11/2016'; Score=718; output;
Key=29; Account_ID='00001'; Start_Date='05/12/2016'; End_Date='12/31/9999'; Score=702; output;
run;
data a; set a;
Shopping_Date1=input(Shopping_Date,mmddyy10.);
format Shopping_Date1 date9.;
run;
data b; set b;
Start_Date1=input(Start_Date,mmddyy10.);
End_Date1  =input(End_Date,  mmddyy10.);
format Start_Date1 End_Date1 date9.;
run;
PROC SQL noprint;
 create table c(where=(Shopping_Date1 between Start_Date1 AND End_Date1)) as
 select   b.Key , b.Account_ID , b.Start_Date1 , b.End_Date1 , b.Score , a.Shopping_Date1
        , INTCK('DAY',Start_Date1,Shopping_Date1) as DaysDiff  
 from   work.a  a
      , work.b  b
 where a.Account_ID = b.Account_ID
 order by b.Account_ID , DaysDiff;  
QUIT;
data c; set c; by Account_ID DaysDiff; if first.Account_ID; run;
/* end of program */

 

Cheers,

Koen

sbxkoenk
SAS Super FREQ

Hello,


Instead of putting the where clause on the dates as an output dataset option, 

create table c(where=(Shopping_Date1 between Start_Date1 AND End_Date1))


it's better to make a composite where clause below the "from"

 where     a.Account_ID = b.Account_ID
       AND Shopping_Date1 between Start_Date1 AND End_Date1

To verify whether it makes a difference indeed, use some feedback options to acquire extra log-info:

OPTIONS MSGLEVEL=I; 
PROC SQL _METHOD STIMER noprint;

Kind regards,

Koen

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 470 views
  • 0 likes
  • 2 in conversation