BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sas_student1
Quartz | Level 8

Hello,

 

I have a question on how to join two tables but with conditions. I am using SAS 9.4. I have one table that has about 4400 records of clients these are duplicates Client ID with Start dates (see table 1 below). I have another table, about 233 rows, that has client_IDs begin program dates, end program dates and program names (see table 2), the client ID in table 2 are the same as in table 1 but they tell me which prgram a cient is enrolled in a certain time period. What I would like to do is to populate Table 1 with the program names where the start_date in table 1 is between or equal to the being and end program dates in table 2. So the output looks like table 3 below.

I used the following left join SQL, but I ended up with about 500 extra rows. Why was that? There are cases where items in table 2 dont match table 1, and if that is the case then I do not want that in table 3. Is that where the extra 500 cases come from?

 

PROC SQL;

CREATE TABLE want_table3 AS

SELECT D.*, B.Program_name, B.Begin_date, B.end_date

FROM table1 D left join table2 B

on D.client_ID = B.client_ID

and b.Begin_date <=d.start_date<= b.end_date;

RUN;

QUIT;

 

Table 1: Original Table
Client_IDStart_Date
11/1/2015
11/2/2015
11/3/2015
12/3/2015
12/4/2015
12/5/2015
21/2/2015
21/3/2015
21/4/2015
22/5/2015
22/6/2015
22/7/2015
33/1/2015
33/2/2015
34/5/2015
34/6/2015

 

Table 2: Referance Table
ClientBegin_dateEnd_dateProgram_name
14/1/20131/31/2015Inpatient
12/1/201512/31/9999Oupatient
21/1/20152/28/2015Inpatient
23/1/20154/1/2015Oupatient
24/2/201512/31/9999Hospital
35/5/20132/28/2015Oupatient

 

 

Table 3: Final Table
Client_IDStart_DateProgram_name
11/1/2015Inpatient
11/2/2015Inpatient
11/3/2015Inpatient
12/3/2015Outpatient
12/4/2015Outpatient
12/5/2015Outpatient
21/2/2015Inpatient
21/3/2015Inpatient
21/4/2015Inpatient
22/5/2015Inpatient
22/6/2015Inpatient
22/7/2015Inpatient
33/1/2015Outpatient
33/2/2015Outpatient
34/5/2015Hospital
34/6/2015Hospital

 

 

Than you for your help in advance

1 ACCEPTED SOLUTION

Accepted Solutions
sas_student1
Quartz | Level 8

Thank you for your reply.

 

But I think I figured out what the problem was.. I did not have a "distinct" after my select statement so...

 

PROC SQL;

CREATE TABLE want_table3 AS

SELECT DISTINCT D.*, B.Program_name, B.Begin_date, B.end_date

FROM table1 D left join table2 B

on D.client_ID = B.client_ID

and b.Begin_date <=d.start_date<= b.end_date;

RUN;

QUIT;

 

by adding the distinct I got the same number of rows as I was expecting.

 

Thank you again

 

View solution in original post

5 REPLIES 5
Reeza
Super User

Please expand your sample data to include some of the issues mentioned. It looks like a clean sample right now. 

sas_student1
Quartz | Level 8

Thank you for your reply. my updated table samples are below.. Notice Client 4 is not in table 2 and that is okay.. I would still like to bring client 4 to my final table.

 

Also notice I added client 5 to table 2 (and one of the cells is purposely left blank), since client 5 is not in table 1 I do not want that client info in my final table.

 

Table 1: Original Table

 
Client_IDStart_Date
11/1/2015
11/2/2015
11/3/2015
12/3/2015
12/4/2015
12/5/2015
21/2/2015
21/3/2015
21/4/2015
22/5/2015
22/6/2015
22/7/2015
33/1/2015
33/2/2015
34/5/2015
34/6/2015
45/1/2015
45/4/2016
45/6/2015

 

 

ClientBegin_dateEnd_dateProgram_name
14/1/20131/31/2015Inpatient
12/1/201512/31/9999Oupatient
21/1/20152/28/2015Inpatient
23/1/20154/1/2015Oupatient
24/2/201512/31/9999Hospital
35/5/20132/28/2015Oupatient
56/7/20135/5/2015 
55/6/201512/31/9999Hospital

 

Client_IDStart_DateProgram_name
11/1/2015Inpatient
11/2/2015Inpatient
11/3/2015Inpatient
12/3/2015Outpatient
12/4/2015Outpatient
12/5/2015Outpatient
21/2/2015Inpatient
21/3/2015Inpatient
21/4/2015Inpatient
22/5/2015Inpatient
22/6/2015Inpatient
22/7/2015Inpatient
33/1/2015Outpatient
33/2/2015Outpatient
34/5/2015Hospital
34/6/2015Hospital
45/1/2015NoREF
45/4/2016NoREF
45/6/2015NoREF

 

Hope this clarifies. Thank you again for your help in advance.

Appreciate it

 

Reeza
Super User

There's nothing wrong with the code you've shown. You're getting 'extra' records either because your condition is not unique, ie the between matches multiple ranges in table 2?

 

You should pull up a few rows and see what they look like. Post some samples if you can.

sas_student1
Quartz | Level 8

Thank you for your reply.

 

But I think I figured out what the problem was.. I did not have a "distinct" after my select statement so...

 

PROC SQL;

CREATE TABLE want_table3 AS

SELECT DISTINCT D.*, B.Program_name, B.Begin_date, B.end_date

FROM table1 D left join table2 B

on D.client_ID = B.client_ID

and b.Begin_date <=d.start_date<= b.end_date;

RUN;

QUIT;

 

by adding the distinct I got the same number of rows as I was expecting.

 

Thank you again

 

Reeza
Super User

You must have duplicates in one of your source tables. Distinct shouldn't be required. 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1975 views
  • 0 likes
  • 2 in conversation