I have 3 datasets.
1. Based on 3 variables (ID, date and company) in dataset 1, I want to subset a dataset 2 and create new dataset (Char) containing matched observations of common variables
2. Based on 2 variables (ID and date) in dataset 1, I want to subset dataset 3 following a logic that if date in dataset 1 falls at or in between the begin date and end date, then keep that row in a new dataset (Act)
Dataset 1 | |||||
ID | date | Company | Type | variable3 | variable4 |
1 | Jan-10 | a | x | ||
1 | May-10 | a | y | ||
2 | Apr-11 | a | x | ||
3 | Dec-11 | a | x | ||
3 | Feb-12 | b | y |
Dataset 2 | ||||
ID | date | Company | variable1 | variable2 |
1 | Jan-10 | a | ||
1 | Feb-10 | a | ||
1 | Mar-10 | a | ||
1 | Apr-10 | a | ||
1 | May-10 | a | ||
1 | Jun-10 | a | ||
1 | Jul-10 | a | ||
1 | Aug-10 | a | ||
1 | Sep-10 | a | ||
1 | Oct-10 | a | ||
1 | Nov-10 | a | ||
1 | Dec-10 | a | ||
2 | Mar-11 | a | ||
2 | Apr-11 | a | ||
2 | May-11 | a | ||
2 | Jun-11 | a | ||
2 | Jul-11 | a | ||
2 | Aug-11 | a | ||
2 | Sep-11 | a | ||
3 | Feb-10 | a | ||
3 | Mar-10 | a | ||
3 | Apr-10 | a | ||
3 | May-10 | a | ||
3 | Jun-10 | a | ||
3 | Jul-10 | a | ||
3 | Aug-10 | a | ||
3 | Sep-10 | a | ||
3 | Oct-10 | a | ||
3 | Nov-10 | a | ||
3 | Dec-10 | a | ||
3 | Jan-11 | a | ||
3 | Feb-11 | a | ||
3 | Mar-11 | a | ||
3 | Apr-11 | a | ||
3 | May-11 | a | ||
3 | Jun-11 | a | ||
3 | Jul-11 | a | ||
3 | Aug-11 | a | ||
3 | Sep-11 | a | ||
3 | Oct-11 | a | ||
3 | Nov-11 | a | ||
3 | Dec-11 | a | ||
3 | Jan-12 | b | ||
3 | Feb-12 | b | ||
3 | Mar-12 | b |
Dataset 3 | |||||
ID | Status | Begin date | End date | variable5 | variable6 |
1 | c | Jan-10 | Apr-10 | ||
1 | w | May-10 | Nov-10 | ||
1 | t | Dec-10 | |||
2 | t | Mar-11 | May-11 | ||
2 | c | Jun-11 | Sep-11 | ||
3 | t | Feb-10 | Dec-11 | ||
3 | w | Jan-12 |
New dataset-Char
ID | date | Company | variable1 | variable2 |
1 | Jan-10 | a | ||
1 | May-10 | a | ||
2 | Apr-11 | a | ||
3 | Dec-11 | a | ||
3 | Feb-12 | b |
New dataset-Act
ID | Status | Begin date | End date | variable5 | variable6 |
1 | c | Jan-10 | Apr-10 | ||
1 | w | May-10 | Nov-10 | ||
2 | t | Mar-11 | May-11 | ||
3 | t | Feb-10 | Dec-11 | ||
3 | w | Jan-12 |
Any suggestion?
Thanks.
Hi @d0816 Have you tried any look up methods like hash, join, merge etc?
It seems very straight forward while I am too lazy to type. There are tons of examples you will find on a google search and all you need to do is copy paste
@d0816 No worries. I don't think your dates are in this monyy format.
10-Feb | |||
10-Mar | |||
10-Apr
|
So, do a favor Please provide a better representative sample with real dates to effectively test the condition
between the begin date and end date
If you could, I will write the code for you.
dates are in month/year format. Let me provide you better sample.
I did try the following to create new dataset Char. I think my date variable in dataset 1 is being read as number and in dataset 2 as character. I am importing excel file and I cant get to read it as both dates.
proc sql noprint;
create table Char as
select distinct a.*
from dataset1 a, dataset2 b
where a.ID = b.ID
and a.date = b.date
;
quit;
dataset 1 | ||||||
ID | date | Company | C Date | R Date | RR Date | Type |
1 | 2017-06 | AW | 6/5/2017 | 5/16/2017 | 7/11/2017 | IVA |
1 | 2017-10 | MS | 10/10/2017 | 10/3/2017 | 10/17/2017 | GA |
2 | 2016-02 | MS | 2/25/2016 | 2/4/2016 | 3/7/2016 | PA |
2 | 2017-07 | MS | 7/20/2017 | 7/11/2017 | 7/25/2017 | GA |
3 | 2017-03 | AW | 3/22/2017 | 3/5/2017 | 5/3/2017 | IVA |
4 | 2017-09 | RS | 9/28/2017 | 9/28/2017 | 9/28/2017 | MH |
5 | 2017-10 | RS | 10/25/2017 | 10/25/2017 | 10/25/2017 | MH |
ID | date | RYM | Company | place | Most Recent Placement Code |
1 | 2016-01 | 2016-01 | AW | MC | 8 |
1 | 2016-02 | 2016-02 | AW | MC | 8 |
1 | 2016-03 | 2016-03 | AW | MC | 1 |
1 | 2016-04 | 2016-04 | AW | MC | 1 |
1 | 2016-05 | 2016-05 | AW | MC | 2 |
1 | 2016-06 | 2016-06 | AW | MC | 2 |
1 | 2016-07 | 2016-07 | AW | MC | 2 |
1 | 2016-08 | 2016-08 | AW | MC | 2 |
1 | 2016-09 | 2016-09 | AW | MC | 2 |
1 | 2016-10 | 2016-10 | AW | MC | 2 |
1 | 2016-11 | 2016-11 | AW | MC | 2 |
1 | 2016-12 | 2016-12 | AW | MC | 2 |
1 | 2017-01 | 2017-01 | AW | MC | 1 |
1 | 2017-02 | 2017-02 | AW | MC | 1 |
1 | 2017-03 | 2017-03 | AW | MC | 1 |
1 | 2017-04 | 2017-04 | AW | MC | 1 |
1 | 2017-05 | 2017-05 | AW | MC | 1 |
1 | 2017-06 | 2017-06 | AW | MC | 1 |
1 | 2017-07 | 2017-07 | AW | MC | 1 |
1 | 2017-08 | 2017-08 | MS | MC | 1 |
1 | 2017-09 | 2017-09 | MS | MC | 1 |
1 | 2017-10 | 2017-10 | MS | MC | 1 |
1 | 2017-11 | 2017-11 | MS | MC | 1 |
1 | 2017-12 | 2017-12 | MS | MC | 1 |
1 | 2018-01 | 2018-01 | MS | MC | 1 |
1 | 2018-02 | 2018-02 | MS | MC | 1 |
1 | 2018-03 | 2018-03 | MS | MC | 1 |
1 | 2018-04 | 2018-04 | MS | MC | 1 |
2 | 2016-01 | 2016-01 | MS | MC | 1 |
2 | 2016-02 | 2016-02 | MS | MC | 1 |
2 | 2016-03 | 2016-03 | MS | MC | 8 |
2 | 2016-04 | 2016-04 | MS | MC | 8 |
2 | 2016-05 | 2016-05 | MS | MC | 1 |
2 | 2016-06 | 2016-06 | MS | MC | 1 |
2 | 2016-07 | 2016-07 | MS | MC | 1 |
2 | 2016-08 | 2016-08 | MS | MC | 1 |
2 | 2016-09 | 2016-09 | MS | MC | 1 |
2 | 2016-10 | 2016-10 | MS | MC | 1 |
2 | 2016-11 | 2016-11 | MS | MC | 1 |
2 | 2016-12 | 2016-12 | MS | MC | 2 |
2 | 2017-01 | 2017-01 | MS | MC | 1 |
2 | 2017-02 | 2017-02 | MS | MC | 1 |
2 | 2017-03 | 2017-03 | MS | MC | 1 |
2 | 2017-04 | 2017-04 | MS | MC | 1 |
2 | 2017-05 | 2017-05 | MS | MC | 2 |
2 | 2017-06 | 2017-06 | MS | MC | 2 |
2 | 2017-07 | 2017-07 | MS | MC | 2 |
2 | 2017-08 | 2017-08 | MS | MC | 2 |
2 | 2017-09 | 2017-09 | MS | MC | 2 |
2 | 2017-10 | 2017-10 | MS | MC | 2 |
3 | 2016-01 | 2016-01 | AW | MC | 1 |
3 | 2016-02 | 2016-02 | AW | MC | 1 |
3 | 2016-03 | 2016-03 | AW | MC | 1 |
3 | 2016-04 | 2016-04 | AW | MC | 1 |
3 | 2016-05 | 2016-05 | AW | MC | 1 |
3 | 2016-06 | 2016-06 | AW | MC | 1 |
3 | 2016-07 | 2016-07 | AW | MC | 1 |
3 | 2016-08 | 2016-08 | AW | MC | 1 |
3 | 2016-09 | 2016-09 | AW | MC | 1 |
3 | 2016-10 | 2016-10 | AW | MC | 1 |
3 | 2016-11 | 2016-11 | AW | MC | 1 |
3 | 2017-02 | 2017-02 | AW | MC | 1 |
3 | 2017-03 | 2017-03 | AW | MC | 1 |
3 | 2017-04 | 2017-04 | AW | MC | 1 |
3 | 2017-05 | 2017-05 | AW | MC | 1 |
3 | 2017-06 | 2017-06 | AW | MC | 1 |
3 | 2017-07 | 2017-07 | AW | MC | 1 |
3 | 2017-08 | 2017-08 | AW | MC | 1 |
3 | 2017-09 | 2017-09 | AW | MC | 1 |
3 | 2017-10 | 2017-10 | AW | MC | 1 |
3 | 2017-11 | 2017-11 | AW | MC | 1 |
3 | 2017-12 | 2017-12 | AW | MC | 1 |
3 | 2018-01 | 2018-01 | AW | MC | 1 |
3 | 2018-02 | 2018-02 | AW | MC | 1 |
3 | 2018-03 | 2018-03 | AW | MC | 13 |
3 | 2018-04 | 2018-04 | AW | MC | 13 |
4 | 2016-01 | 2016-01 | RS | MC | 1 |
4 | 2016-02 | 2016-02 | RS | MC | 1 |
4 | 2016-03 | 2016-03 | RS | MC | 1 |
4 | 2016-04 | 2016-04 | RS | MC | 1 |
4 | 2016-05 | 2016-05 | RS | MC | 1 |
4 | 2016-06 | 2016-06 | RS | MC | 1 |
4 | 2016-07 | 2016-07 | RS | MC | 1 |
4 | 2016-08 | 2016-08 | RS | MC | 1 |
4 | 2016-09 | 2016-09 | RS | MC | 1 |
4 | 2016-10 | 2016-10 | RS | MC | 1 |
4 | 2016-11 | 2016-11 | RS | MC | 1 |
4 | 2016-12 | 2016-12 | RS | MC | 1 |
4 | 2017-01 | 2017-01 | RS | MC | 1 |
4 | 2017-02 | 2017-02 | RS | MC | 1 |
4 | 2017-03 | 2017-03 | RS | MC | 1 |
4 | 2017-04 | 2017-04 | RS | MC | 1 |
4 | 2017-05 | 2017-05 | RS | MC | 1 |
4 | 2017-06 | 2017-06 | RS | MC | 1 |
4 | 2017-07 | 2017-07 | RS | MC | 1 |
4 | 2017-08 | 2017-08 | RS | MC | 1 |
4 | 2017-09 | 2017-09 | RS | MC | 1 |
4 | 2017-10 | 2017-10 | RS | MC | 1 |
4 | 2017-11 | 2017-11 | RS | MC | 1 |
4 | 2017-12 | 2017-12 | RS | MC | 1 |
4 | 2018-01 | 2018-01 | RS | MC | 1 |
4 | 2018-02 | 2018-02 | RS | MC | 1 |
4 | 2018-03 | 2018-03 | RS | MC | 1 |
4 | 2018-04 | 2018-04 | RS | MC | 1 |
5 | 2016-01 | 2016-01 | RS | MC | 1 |
5 | 2016-02 | 2016-02 | RS | MC | 1 |
5 | 2016-03 | 2016-03 | RS | MC | 1 |
5 | 2016-04 | 2016-04 | RS | MC | 1 |
5 | 2016-05 | 2016-05 | RS | MC | 1 |
5 | 2016-06 | 2016-06 | RS | MC | 1 |
5 | 2016-07 | 2016-07 | RS | MC | 1 |
5 | 2016-08 | 2016-08 | RS | MC | 1 |
5 | 2016-09 | 2016-09 | RS | MC | 1 |
5 | 2016-10 | 2016-10 | RS | MC | 1 |
5 | 2016-11 | 2016-11 | RS | MC | 1 |
5 | 2016-12 | 2016-12 | RS | MC | 1 |
5 | 2017-01 | 2017-01 | RS | MC | 1 |
5 | 2017-02 | 2017-02 | RS | MC | 2 |
5 | 2017-03 | 2017-03 | RS | MC | 2 |
5 | 2017-04 | 2017-04 | RS | MC | 2 |
5 | 2017-05 | 2017-05 | RS | MC | 2 |
5 | 2017-06 | 2017-06 | RS | MC | 2 |
5 | 2017-07 | 2017-07 | RS | MC | 2 |
5 | 2017-08 | 2017-08 | RS | MC | 2 |
5 | 2017-09 | 2017-09 | RS | MC | 2 |
5 | 2017-10 | 2017-10 | RS | MC | 2 |
5 | 2017-11 | 2017-11 | RS | MC | 1 |
5 | 2017-12 | 2017-12 | RS | MC | 1 |
5 | 2018-01 | 2018-01 | RS | MC | 1 |
5 | 2018-02 | 2018-02 | RS | MC | 1 |
5 | 2018-03 | 2018-03 | RS | MC | 1 |
5 | 2018-04 | 2018-04 | RS | MC | 1 |
dataset 3 | ||||
ID | Status | Begin Date | End Date | variable 5 |
1 | c | 5/1/2017 | 6/30/2017 | m |
1 | w | 7/1/2017 | 8/30/2017 | m |
1 | t | 9/1/2017 | m | |
2 | t | 3/1/2017 | 5/30/2017 | f |
2 | c | 6/1/2017 | 9/30/2017 | f |
3 | t | 2/1/2017 | 12/30/2017 | f |
3 | w | 1/1/2018 | f |
Here are the 3 representative datasets as I have it right now in excel.
Thank you.
excel attachment:
@d0816 Thank you for extending that help in providing a better sample.
Please notice here carefully:
Condition: Based on 2 variables (ID and date) in dataset 1, I want to subset dataset 3 following a logic that if date in dataset 1 falls at or in between the begin date and end date, then keep that row in a new dataset (Act)
So when i read the raw file to sas dataset using
data one;
infile cards truncover;
input ID date :anydtdte21. Company $ CDate :mmddyy10. RDate :mmddyy10. RRDate :mmddyy10. Type $;
format date mmddyy10.;
cards;
1 2017-06 AW 6/5/2017 5/16/2017 7/11/2017 IVA
1 2017-10 MS 10/10/2017 10/3/2017 10/17/2017 GA
2 2016-02 MS 2/25/2016 2/4/2016 3/7/2016 PA
2 2017-07 MS 7/20/2017 7/11/2017 7/25/2017 GA
3 2017-03 AW 3/22/2017 3/5/2017 5/3/2017 IVA
4 2017-09 RS 9/28/2017 9/28/2017 9/28/2017 MH
5 2017-10 RS 10/25/2017 10/25/2017 10/25/2017 MH
;
These year-month dates will revert to
date=06/01/2017
date=10/01/2017
date=02/01/2016
date=07/01/2017
date=03/01/2017
date=09/01/2017
date=10/01/2017
so when we look up dataset1 <->dataset3 using ID key and try to determine a date value 2017-06 (read as 06/01/2017) between a complete date
1 | c | 5/1/2017 | 6/30/2017 | m |
will cause serious erroneous results. because we do not know what's the real date value of the year month dates like 2017-06 are?
This is what you need to clarify.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.