Hello,
I have two tables that I need to join on but want to create a flag on the left table so that when it joins with the right table I get a flag in my final table indicating which row is from the left table. For example, I have one that has 683 rows, this include client ID, admit date, discharge date, service type and 30 other columns. I would join this table with a table that has 8 million rows. I would join on the condition of if the Client ID are the same and the admit date from the left table (683 row) is less then or equal to the admit date on the right table (8 million rows). when I join these two tables I get 1250 rows. Now I want to be able to flag the 1250 rows to indicate the first row that was from the 683 table.
Here is the initial table I would have:
data have; informat ID $1. start_date mmddyy10. end_date mmddyy10.; input ID start_date end_date ; format start_date end_date mmddyy10.; datalines; 1 01/01/2018 01/10/2018 2 01/05/2018 01/06/2018 3 02/05/2018 02/15/2018 ; run;
This is the table I would left join with:
data have2; informat ID $1. start_date mmddyy10. end_date mmddyy10.; input ID start_date end_date ; format start_date end_date mmddyy10.; datalines; 1 01/01/2018 01/10/2018 1 01/11/2018 01/31/2018 1 02/05/2018 02/15/2018 1 02/19/2018 05/30/2018 1 06/15/2018 06/30/2018 1 07/05/2018 07/20/2018 2 01/12/2018 01/15/2018 2 01/25/2018 01/30/2018 3 02/05/2018 02/15/2018 3 03/20/2018 04/15/2018 3 04/20/2018 07/31/2018 3 08/14/2018 09/10/2018 ; run;
This is the table I would want
ID | start_date | end_date | flag |
1 | 1/1/2018 | 1/10/2018 | 1 |
1 | 1/11/2018 | 1/31/2018 | 0 |
1 | 2/5/2018 | 2/15/2018 | 0 |
1 | 2/19/2018 | 5/30/2018 | 0 |
1 | 6/15/2018 | 6/30/2018 | 0 |
1 | 7/5/2018 | 7/20/2018 | 0 |
2 | 1/12/2018 | 1/15/2018 | 1 |
2 | 1/25/2018 | 1/30/2018 | 0 |
3 | 2/5/2018 | 2/15/2018 | 1 |
3 | 3/20/2018 | 4/15/2018 | 0 |
3 | 4/20/2018 | 7/31/2018 | 0 |
3 | 8/14/2018 | 9/10/2018 | 0 |
I have a case where the date in the first table is not exactly equal to the date that I am joining with, see ID 2 above, so I would need to make a condition of sort that may be similar to first. but how would I do that using SQL join?
Thanks
You can add the flag by a sas data step after the sql join or do it in one step using merge instead sql join:
1) add to your sql code the line order by ID, start_date then add the next step:
data want;
set want;
by id;
if first.ID then flag=1;
else flag=0;
run;
2) alternatively:
data want;
merge have (in=in1)
have2;
by ID;
if in1; /* equivalent to sql left join */
if first.id then flag = 1;
else flag = 0;
run;
Please show the SQL you have been using.
Your description of the desired result is not clear.
Does the small table have duplicate observations for the same ID?
Are you just asking to take the MINinum date?
I don't quite get how your join criteria correspond to the proposed output WANT. You say that a match occurs when start_date from HAVE is less than OR equal to start_date from HAVE1. Hence, even though start_date=01/05/2018 from HAVE for ID=2 isn't equal to any start_date from HAVE1 for ID=2, it should be in the output file since it is less than any of the start_date in the ID=2 group in HAVE1.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.