I have 3 datasets.
1. Based on 3 variables (ID, date and company) in dataset 1, I want to subset a dataset 2 and create new dataset (Char) containing matched observations of common variables
2. Based on 2 variables (ID and date) in dataset 1, I want to subset dataset 3 following a logic that if date in dataset 1 falls at or in between the begin date and end date, then keep that row in a new dataset (Act)
Dataset 1 | |||||
ID | date | Company | Type | variable3 | variable4 |
1 | Jan-10 | a | x | ||
1 | May-10 | a | y | ||
2 | Apr-11 | a | x | ||
3 | Dec-11 | a | x | ||
3 | Feb-12 | b | y |
Dataset 2 | ||||
ID | date | Company | variable1 | variable2 |
1 | Jan-10 | a | ||
1 | Feb-10 | a | ||
1 | Mar-10 | a | ||
1 | Apr-10 | a | ||
1 | May-10 | a | ||
1 | Jun-10 | a | ||
1 | Jul-10 | a | ||
1 | Aug-10 | a | ||
1 | Sep-10 | a | ||
1 | Oct-10 | a | ||
1 | Nov-10 | a | ||
1 | Dec-10 | a | ||
2 | Mar-11 | a | ||
2 | Apr-11 | a | ||
2 | May-11 | a | ||
2 | Jun-11 | a | ||
2 | Jul-11 | a | ||
2 | Aug-11 | a | ||
2 | Sep-11 | a | ||
3 | Feb-10 | a | ||
3 | Mar-10 | a | ||
3 | Apr-10 | a | ||
3 | May-10 | a | ||
3 | Jun-10 | a | ||
3 | Jul-10 | a | ||
3 | Aug-10 | a | ||
3 | Sep-10 | a | ||
3 | Oct-10 | a | ||
3 | Nov-10 | a | ||
3 | Dec-10 | a | ||
3 | Jan-11 | a | ||
3 | Feb-11 | a | ||
3 | Mar-11 | a | ||
3 | Apr-11 | a | ||
3 | May-11 | a | ||
3 | Jun-11 | a | ||
3 | Jul-11 | a | ||
3 | Aug-11 | a | ||
3 | Sep-11 | a | ||
3 | Oct-11 | a | ||
3 | Nov-11 | a | ||
3 | Dec-11 | a | ||
3 | Jan-12 | b | ||
3 | Feb-12 | b | ||
3 | Mar-12 | b |
Dataset 3 | |||||
ID | Status | Begin date | End date | variable5 | variable6 |
1 | c | Jan-10 | Apr-10 | ||
1 | w | May-10 | Nov-10 | ||
1 | t | Dec-10 | |||
2 | t | Mar-11 | May-11 | ||
2 | c | Jun-11 | Sep-11 | ||
3 | t | Feb-10 | Dec-11 | ||
3 | w | Jan-12 |
New dataset-Char
ID | date | Company | variable1 | variable2 |
1 | Jan-10 | a | ||
1 | May-10 | a | ||
2 | Apr-11 | a | ||
3 | Dec-11 | a | ||
3 | Feb-12 | b |
New dataset-Act
ID | Status | Begin date | End date | variable5 | variable6 |
1 | c | Jan-10 | Apr-10 | ||
1 | w | May-10 | Nov-10 | ||
2 | t | Mar-11 | May-11 | ||
3 | t | Feb-10 | Dec-11 | ||
3 | w | Jan-12 |
Any suggestion?
Thanks.
Hi @d0816 Have you tried any look up methods like hash, join, merge etc?
It seems very straight forward while I am too lazy to type. There are tons of examples you will find on a google search and all you need to do is copy paste
@d0816 No worries. I don't think your dates are in this monyy format.
10-Feb | |||
10-Mar | |||
10-Apr
|
So, do a favor Please provide a better representative sample with real dates to effectively test the condition
between the begin date and end date
If you could, I will write the code for you.
dates are in month/year format. Let me provide you better sample.
I did try the following to create new dataset Char. I think my date variable in dataset 1 is being read as number and in dataset 2 as character. I am importing excel file and I cant get to read it as both dates.
proc sql noprint;
create table Char as
select distinct a.*
from dataset1 a, dataset2 b
where a.ID = b.ID
and a.date = b.date
;
quit;
dataset 1 | ||||||
ID | date | Company | C Date | R Date | RR Date | Type |
1 | 2017-06 | AW | 6/5/2017 | 5/16/2017 | 7/11/2017 | IVA |
1 | 2017-10 | MS | 10/10/2017 | 10/3/2017 | 10/17/2017 | GA |
2 | 2016-02 | MS | 2/25/2016 | 2/4/2016 | 3/7/2016 | PA |
2 | 2017-07 | MS | 7/20/2017 | 7/11/2017 | 7/25/2017 | GA |
3 | 2017-03 | AW | 3/22/2017 | 3/5/2017 | 5/3/2017 | IVA |
4 | 2017-09 | RS | 9/28/2017 | 9/28/2017 | 9/28/2017 | MH |
5 | 2017-10 | RS | 10/25/2017 | 10/25/2017 | 10/25/2017 | MH |
ID | date | RYM | Company | place | Most Recent Placement Code |
1 | 2016-01 | 2016-01 | AW | MC | 8 |
1 | 2016-02 | 2016-02 | AW | MC | 8 |
1 | 2016-03 | 2016-03 | AW | MC | 1 |
1 | 2016-04 | 2016-04 | AW | MC | 1 |
1 | 2016-05 | 2016-05 | AW | MC | 2 |
1 | 2016-06 | 2016-06 | AW | MC | 2 |
1 | 2016-07 | 2016-07 | AW | MC | 2 |
1 | 2016-08 | 2016-08 | AW | MC | 2 |
1 | 2016-09 | 2016-09 | AW | MC | 2 |
1 | 2016-10 | 2016-10 | AW | MC | 2 |
1 | 2016-11 | 2016-11 | AW | MC | 2 |
1 | 2016-12 | 2016-12 | AW | MC | 2 |
1 | 2017-01 | 2017-01 | AW | MC | 1 |
1 | 2017-02 | 2017-02 | AW | MC | 1 |
1 | 2017-03 | 2017-03 | AW | MC | 1 |
1 | 2017-04 | 2017-04 | AW | MC | 1 |
1 | 2017-05 | 2017-05 | AW | MC | 1 |
1 | 2017-06 | 2017-06 | AW | MC | 1 |
1 | 2017-07 | 2017-07 | AW | MC | 1 |
1 | 2017-08 | 2017-08 | MS | MC | 1 |
1 | 2017-09 | 2017-09 | MS | MC | 1 |
1 | 2017-10 | 2017-10 | MS | MC | 1 |
1 | 2017-11 | 2017-11 | MS | MC | 1 |
1 | 2017-12 | 2017-12 | MS | MC | 1 |
1 | 2018-01 | 2018-01 | MS | MC | 1 |
1 | 2018-02 | 2018-02 | MS | MC | 1 |
1 | 2018-03 | 2018-03 | MS | MC | 1 |
1 | 2018-04 | 2018-04 | MS | MC | 1 |
2 | 2016-01 | 2016-01 | MS | MC | 1 |
2 | 2016-02 | 2016-02 | MS | MC | 1 |
2 | 2016-03 | 2016-03 | MS | MC | 8 |
2 | 2016-04 | 2016-04 | MS | MC | 8 |
2 | 2016-05 | 2016-05 | MS | MC | 1 |
2 | 2016-06 | 2016-06 | MS | MC | 1 |
2 | 2016-07 | 2016-07 | MS | MC | 1 |
2 | 2016-08 | 2016-08 | MS | MC | 1 |
2 | 2016-09 | 2016-09 | MS | MC | 1 |
2 | 2016-10 | 2016-10 | MS | MC | 1 |
2 | 2016-11 | 2016-11 | MS | MC | 1 |
2 | 2016-12 | 2016-12 | MS | MC | 2 |
2 | 2017-01 | 2017-01 | MS | MC | 1 |
2 | 2017-02 | 2017-02 | MS | MC | 1 |
2 | 2017-03 | 2017-03 | MS | MC | 1 |
2 | 2017-04 | 2017-04 | MS | MC | 1 |
2 | 2017-05 | 2017-05 | MS | MC | 2 |
2 | 2017-06 | 2017-06 | MS | MC | 2 |
2 | 2017-07 | 2017-07 | MS | MC | 2 |
2 | 2017-08 | 2017-08 | MS | MC | 2 |
2 | 2017-09 | 2017-09 | MS | MC | 2 |
2 | 2017-10 | 2017-10 | MS | MC | 2 |
3 | 2016-01 | 2016-01 | AW | MC | 1 |
3 | 2016-02 | 2016-02 | AW | MC | 1 |
3 | 2016-03 | 2016-03 | AW | MC | 1 |
3 | 2016-04 | 2016-04 | AW | MC | 1 |
3 | 2016-05 | 2016-05 | AW | MC | 1 |
3 | 2016-06 | 2016-06 | AW | MC | 1 |
3 | 2016-07 | 2016-07 | AW | MC | 1 |
3 | 2016-08 | 2016-08 | AW | MC | 1 |
3 | 2016-09 | 2016-09 | AW | MC | 1 |
3 | 2016-10 | 2016-10 | AW | MC | 1 |
3 | 2016-11 | 2016-11 | AW | MC | 1 |
3 | 2017-02 | 2017-02 | AW | MC | 1 |
3 | 2017-03 | 2017-03 | AW | MC | 1 |
3 | 2017-04 | 2017-04 | AW | MC | 1 |
3 | 2017-05 | 2017-05 | AW | MC | 1 |
3 | 2017-06 | 2017-06 | AW | MC | 1 |
3 | 2017-07 | 2017-07 | AW | MC | 1 |
3 | 2017-08 | 2017-08 | AW | MC | 1 |
3 | 2017-09 | 2017-09 | AW | MC | 1 |
3 | 2017-10 | 2017-10 | AW | MC | 1 |
3 | 2017-11 | 2017-11 | AW | MC | 1 |
3 | 2017-12 | 2017-12 | AW | MC | 1 |
3 | 2018-01 | 2018-01 | AW | MC | 1 |
3 | 2018-02 | 2018-02 | AW | MC | 1 |
3 | 2018-03 | 2018-03 | AW | MC | 13 |
3 | 2018-04 | 2018-04 | AW | MC | 13 |
4 | 2016-01 | 2016-01 | RS | MC | 1 |
4 | 2016-02 | 2016-02 | RS | MC | 1 |
4 | 2016-03 | 2016-03 | RS | MC | 1 |
4 | 2016-04 | 2016-04 | RS | MC | 1 |
4 | 2016-05 | 2016-05 | RS | MC | 1 |
4 | 2016-06 | 2016-06 | RS | MC | 1 |
4 | 2016-07 | 2016-07 | RS | MC | 1 |
4 | 2016-08 | 2016-08 | RS | MC | 1 |
4 | 2016-09 | 2016-09 | RS | MC | 1 |
4 | 2016-10 | 2016-10 | RS | MC | 1 |
4 | 2016-11 | 2016-11 | RS | MC | 1 |
4 | 2016-12 | 2016-12 | RS | MC | 1 |
4 | 2017-01 | 2017-01 | RS | MC | 1 |
4 | 2017-02 | 2017-02 | RS | MC | 1 |
4 | 2017-03 | 2017-03 | RS | MC | 1 |
4 | 2017-04 | 2017-04 | RS | MC | 1 |
4 | 2017-05 | 2017-05 | RS | MC | 1 |
4 | 2017-06 | 2017-06 | RS | MC | 1 |
4 | 2017-07 | 2017-07 | RS | MC | 1 |
4 | 2017-08 | 2017-08 | RS | MC | 1 |
4 | 2017-09 | 2017-09 | RS | MC | 1 |
4 | 2017-10 | 2017-10 | RS | MC | 1 |
4 | 2017-11 | 2017-11 | RS | MC | 1 |
4 | 2017-12 | 2017-12 | RS | MC | 1 |
4 | 2018-01 | 2018-01 | RS | MC | 1 |
4 | 2018-02 | 2018-02 | RS | MC | 1 |
4 | 2018-03 | 2018-03 | RS | MC | 1 |
4 | 2018-04 | 2018-04 | RS | MC | 1 |
5 | 2016-01 | 2016-01 | RS | MC | 1 |
5 | 2016-02 | 2016-02 | RS | MC | 1 |
5 | 2016-03 | 2016-03 | RS | MC | 1 |
5 | 2016-04 | 2016-04 | RS | MC | 1 |
5 | 2016-05 | 2016-05 | RS | MC | 1 |
5 | 2016-06 | 2016-06 | RS | MC | 1 |
5 | 2016-07 | 2016-07 | RS | MC | 1 |
5 | 2016-08 | 2016-08 | RS | MC | 1 |
5 | 2016-09 | 2016-09 | RS | MC | 1 |
5 | 2016-10 | 2016-10 | RS | MC | 1 |
5 | 2016-11 | 2016-11 | RS | MC | 1 |
5 | 2016-12 | 2016-12 | RS | MC | 1 |
5 | 2017-01 | 2017-01 | RS | MC | 1 |
5 | 2017-02 | 2017-02 | RS | MC | 2 |
5 | 2017-03 | 2017-03 | RS | MC | 2 |
5 | 2017-04 | 2017-04 | RS | MC | 2 |
5 | 2017-05 | 2017-05 | RS | MC | 2 |
5 | 2017-06 | 2017-06 | RS | MC | 2 |
5 | 2017-07 | 2017-07 | RS | MC | 2 |
5 | 2017-08 | 2017-08 | RS | MC | 2 |
5 | 2017-09 | 2017-09 | RS | MC | 2 |
5 | 2017-10 | 2017-10 | RS | MC | 2 |
5 | 2017-11 | 2017-11 | RS | MC | 1 |
5 | 2017-12 | 2017-12 | RS | MC | 1 |
5 | 2018-01 | 2018-01 | RS | MC | 1 |
5 | 2018-02 | 2018-02 | RS | MC | 1 |
5 | 2018-03 | 2018-03 | RS | MC | 1 |
5 | 2018-04 | 2018-04 | RS | MC | 1 |
dataset 3 | ||||
ID | Status | Begin Date | End Date | variable 5 |
1 | c | 5/1/2017 | 6/30/2017 | m |
1 | w | 7/1/2017 | 8/30/2017 | m |
1 | t | 9/1/2017 | m | |
2 | t | 3/1/2017 | 5/30/2017 | f |
2 | c | 6/1/2017 | 9/30/2017 | f |
3 | t | 2/1/2017 | 12/30/2017 | f |
3 | w | 1/1/2018 | f |
Here are the 3 representative datasets as I have it right now in excel.
Thank you.
excel attachment:
@d0816 Thank you for extending that help in providing a better sample.
Please notice here carefully:
Condition: Based on 2 variables (ID and date) in dataset 1, I want to subset dataset 3 following a logic that if date in dataset 1 falls at or in between the begin date and end date, then keep that row in a new dataset (Act)
So when i read the raw file to sas dataset using
data one;
infile cards truncover;
input ID date :anydtdte21. Company $ CDate :mmddyy10. RDate :mmddyy10. RRDate :mmddyy10. Type $;
format date mmddyy10.;
cards;
1 2017-06 AW 6/5/2017 5/16/2017 7/11/2017 IVA
1 2017-10 MS 10/10/2017 10/3/2017 10/17/2017 GA
2 2016-02 MS 2/25/2016 2/4/2016 3/7/2016 PA
2 2017-07 MS 7/20/2017 7/11/2017 7/25/2017 GA
3 2017-03 AW 3/22/2017 3/5/2017 5/3/2017 IVA
4 2017-09 RS 9/28/2017 9/28/2017 9/28/2017 MH
5 2017-10 RS 10/25/2017 10/25/2017 10/25/2017 MH
;
These year-month dates will revert to
date=06/01/2017
date=10/01/2017
date=02/01/2016
date=07/01/2017
date=03/01/2017
date=09/01/2017
date=10/01/2017
so when we look up dataset1 <->dataset3 using ID key and try to determine a date value 2017-06 (read as 06/01/2017) between a complete date
1 | c | 5/1/2017 | 6/30/2017 | m |
will cause serious erroneous results. because we do not know what's the real date value of the year month dates like 2017-06 are?
This is what you need to clarify.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.