BookmarkSubscribeRSS Feed
d0816
Quartz | Level 8

I have 3 datasets.

 

1. Based on 3 variables (ID, date and company) in dataset 1, I want to subset a dataset 2 and create new dataset (Char) containing matched observations of common variables

 

2. Based on 2 variables (ID and date) in dataset 1, I want to subset dataset 3 following a logic that if date in dataset 1 falls at or in between the begin date and end date, then keep that row in a new dataset (Act)

Dataset 1     
IDdateCompanyTypevariable3variable4
1Jan-10ax  
1May-10ay  
2Apr-11ax  
3Dec-11ax  
3Feb-12by  

 

Dataset 2    
IDdateCompanyvariable1variable2
1Jan-10a  
1Feb-10a  
1Mar-10a  
1Apr-10a  
1May-10a  
1Jun-10a  
1Jul-10a  
1Aug-10a  
1Sep-10a  
1Oct-10a  
1Nov-10a  
1Dec-10a  
2Mar-11a  
2Apr-11a  
2May-11a  
2Jun-11a  
2Jul-11a  
2Aug-11a  
2Sep-11a  
3Feb-10a  
3Mar-10a  
3Apr-10a  
3May-10a  
3Jun-10a  
3Jul-10a  
3Aug-10a  
3Sep-10a  
3Oct-10a  
3Nov-10a  
3Dec-10a  
3Jan-11a  
3Feb-11a  
3Mar-11a  
3Apr-11a  
3May-11a  
3Jun-11a  
3Jul-11a  
3Aug-11a  
3Sep-11a  
3Oct-11a  
3Nov-11a  
3Dec-11a  
3Jan-12b  
3Feb-12b  
3Mar-12b  

 

Dataset 3     
IDStatusBegin dateEnd datevariable5variable6
1cJan-10Apr-10  
1wMay-10Nov-10  
1tDec-10   
2tMar-11May-11  
2cJun-11Sep-11  
3tFeb-10Dec-11  
3wJan-12   

 

 

New dataset-Char

IDdateCompanyvariable1variable2
1Jan-10a  
1May-10a  
2Apr-11a  
3Dec-11a  
3Feb-12b  

 

New dataset-Act

IDStatusBegin dateEnd datevariable5variable6
1cJan-10Apr-10  
1wMay-10Nov-10  
2tMar-11May-11  
3tFeb-10Dec-11  
3wJan-12   

Any suggestion?

 

Thanks.

7 REPLIES 7
novinosrin
Tourmaline | Level 20

Hi  @d0816 Have you tried any look up methods like hash, join, merge etc? 

 

It seems very straight forward while I am too lazy to type. There are tons of examples you will find on a google search and all you need to do is copy paste

 

d0816
Quartz | Level 8
I have done simple merges. But don't know hash or join methods. Can you point me to resource for this method?






novinosrin
Tourmaline | Level 20

@d0816 No worries. I don't think your dates are in this monyy format. 

10-Feb
10-Mar
10-Apr
10-Feb
10-Mar
10-Apr

 

So, do a favor Please provide a better representative sample with real dates to effectively test the condition

between the begin date and end date  

 

If you could, I will write the code for you.

d0816
Quartz | Level 8

dates are in month/year format. Let me provide you better sample.

 

I did try the following to create new dataset Char. I think my date variable in dataset 1 is being read as number and in dataset 2 as character. I am importing excel file and I cant get to read it as both dates.

 

proc sql noprint;

create table Char as

select distinct a.*

from dataset1 a, dataset2 b

where a.ID = b.ID

and a.date = b.date

;

quit;

 

d0816
Quartz | Level 8
dataset 1     
IDdateCompanyC DateR DateRR DateType
12017-06AW6/5/20175/16/20177/11/2017IVA
12017-10MS10/10/201710/3/201710/17/2017GA
22016-02MS2/25/20162/4/20163/7/2016PA
22017-07MS7/20/20177/11/20177/25/2017GA
32017-03AW3/22/20173/5/20175/3/2017IVA
42017-09RS9/28/20179/28/20179/28/2017MH
52017-10RS10/25/201710/25/201710/25/2017MH

 

IDdateRYMCompanyplaceMost Recent Placement Code
12016-012016-01AWMC8
12016-022016-02AWMC8
12016-032016-03AWMC1
12016-042016-04AWMC1
12016-052016-05AWMC2
12016-062016-06AWMC2
12016-072016-07AWMC2
12016-082016-08AWMC2
12016-092016-09AWMC2
12016-102016-10AWMC2
12016-112016-11AWMC2
12016-122016-12AWMC2
12017-012017-01AWMC1
12017-022017-02AWMC1
12017-032017-03AWMC1
12017-042017-04AWMC1
12017-052017-05AWMC1
12017-062017-06AWMC1
12017-072017-07AWMC1
12017-082017-08MSMC1
12017-092017-09MSMC1
12017-102017-10MSMC1
12017-112017-11MSMC1
12017-122017-12MSMC1
12018-012018-01MSMC1
12018-022018-02MSMC1
12018-032018-03MSMC1
12018-042018-04MSMC1
22016-012016-01MSMC1
22016-022016-02MSMC1
22016-032016-03MSMC8
22016-042016-04MSMC8
22016-052016-05MSMC1
22016-062016-06MSMC1
22016-072016-07MSMC1
22016-082016-08MSMC1
22016-092016-09MSMC1
22016-102016-10MSMC1
22016-112016-11MSMC1
22016-122016-12MSMC2
22017-012017-01MSMC1
22017-022017-02MSMC1
22017-032017-03MSMC1
22017-042017-04MSMC1
22017-052017-05MSMC2
22017-062017-06MSMC2
22017-072017-07MSMC2
22017-082017-08MSMC2
22017-092017-09MSMC2
22017-102017-10MSMC2
32016-012016-01AWMC1
32016-022016-02AWMC1
32016-032016-03AWMC1
32016-042016-04AWMC1
32016-052016-05AWMC1
32016-062016-06AWMC1
32016-072016-07AWMC1
32016-082016-08AWMC1
32016-092016-09AWMC1
32016-102016-10AWMC1
32016-112016-11AWMC1
32017-022017-02AWMC1
32017-032017-03AWMC1
32017-042017-04AWMC1
32017-052017-05AWMC1
32017-062017-06AWMC1
32017-072017-07AWMC1
32017-082017-08AWMC1
32017-092017-09AWMC1
32017-102017-10AWMC1
32017-112017-11AWMC1
32017-122017-12AWMC1
32018-012018-01AWMC1
32018-022018-02AWMC1
32018-032018-03AWMC13
32018-042018-04AWMC13
42016-012016-01RSMC1
42016-022016-02RSMC1
42016-032016-03RSMC1
42016-042016-04RSMC1
42016-052016-05RSMC1
42016-062016-06RSMC1
42016-072016-07RSMC1
42016-082016-08RSMC1
42016-092016-09RSMC1
42016-102016-10RSMC1
42016-112016-11RSMC1
42016-122016-12RSMC1
42017-012017-01RSMC1
42017-022017-02RSMC1
42017-032017-03RSMC1
42017-042017-04RSMC1
42017-052017-05RSMC1
42017-062017-06RSMC1
42017-072017-07RSMC1
42017-082017-08RSMC1
42017-092017-09RSMC1
42017-102017-10RSMC1
42017-112017-11RSMC1
42017-122017-12RSMC1
42018-012018-01RSMC1
42018-022018-02RSMC1
42018-032018-03RSMC1
42018-042018-04RSMC1
52016-012016-01RSMC1
52016-022016-02RSMC1
52016-032016-03RSMC1
52016-042016-04RSMC1
52016-052016-05RSMC1
52016-062016-06RSMC1
52016-072016-07RSMC1
52016-082016-08RSMC1
52016-092016-09RSMC1
52016-102016-10RSMC1
52016-112016-11RSMC1
52016-122016-12RSMC1
52017-012017-01RSMC1
52017-022017-02RSMC2
52017-032017-03RSMC2
52017-042017-04RSMC2
52017-052017-05RSMC2
52017-062017-06RSMC2
52017-072017-07RSMC2
52017-082017-08RSMC2
52017-092017-09RSMC2
52017-102017-10RSMC2
52017-112017-11RSMC1
52017-122017-12RSMC1
52018-012018-01RSMC1
52018-022018-02RSMC1
52018-032018-03RSMC1
52018-042018-04RSMC1

 

dataset 3   
IDStatusBegin DateEnd Datevariable 5
1c5/1/20176/30/2017m
1w7/1/20178/30/2017m
1t9/1/2017 m
2t3/1/20175/30/2017f
2c6/1/20179/30/2017f
3t2/1/201712/30/2017f
3w1/1/2018 f

 

 

Here are the 3 representative datasets as I have it right now in excel.

 

Thank you.

d0816
Quartz | Level 8

excel attachment:

novinosrin
Tourmaline | Level 20

@d0816 Thank you for extending that help in providing a better sample.

Please notice here carefully:

 

Condition: Based on 2 variables (ID and date) in dataset 1, I want to subset dataset 3 following a logic that if date in dataset 1 falls at or in between the begin date and end date, then keep that row in a new dataset (Act)

 

So when i read the raw file to sas dataset using 

 

data one;
infile cards truncover;
input ID date :anydtdte21. Company $ CDate :mmddyy10. RDate :mmddyy10. RRDate :mmddyy10. Type $;
format date mmddyy10.;
cards;
1 2017-06 AW 6/5/2017 5/16/2017 7/11/2017 IVA
1 2017-10 MS 10/10/2017 10/3/2017 10/17/2017 GA
2 2016-02 MS 2/25/2016 2/4/2016 3/7/2016 PA
2 2017-07 MS 7/20/2017 7/11/2017 7/25/2017 GA
3 2017-03 AW 3/22/2017 3/5/2017 5/3/2017 IVA
4 2017-09 RS 9/28/2017 9/28/2017 9/28/2017 MH
5 2017-10 RS 10/25/2017 10/25/2017 10/25/2017 MH
;

 

These year-month dates will revert to

date=06/01/2017
date=10/01/2017
date=02/01/2016
date=07/01/2017
date=03/01/2017
date=09/01/2017
date=10/01/2017

 

so when we look up dataset1 <->dataset3 using ID key and try to determine  a date value 2017-06 (read as 06/01/2017) between a complete date 

1c5/1/20176/30/2017m

will cause serious erroneous results. because we do not know what's the real date value of the year month dates like 2017-06  are?

 

This is what you need to clarify. 

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 2406 views
  • 1 like
  • 2 in conversation