BookmarkSubscribeRSS Feed
scb
Obsidian | Level 7 scb
Obsidian | Level 7

May I know how to get the "events" into the project dataset when the period is in between the event startdate and enddate?

 

data Project;
format Period Date9.;
input Period Date9. Project $;
datalines;
12JAN2019 ProjectA
16FEB2019 ProjectB
17APR2019 ProjectC
;
RUN;

data Event;
FORMAT DATESTART DATE9.; FORMAT DATEEND DATE9.;
input DATESTART:DATE9. DATEEND:DATE9. EVENT $10.;
DATALINES;
01JAN2019 12JAN2019 TESTING_A
13JAN2019 17FEB2019 TESTING_B
15FEB2019 18APR2019 TESTING_C

02MAY2019 04MAY2019 TESTING_D
;
RUN;

 

My desired result will be:

Period          Project           Event

12JAN2019 ProjectA         TESTING_A
16FEB2019 ProjectB         TESTING_B, TESTING_C
17APR2019 ProjectC         TESTING_C

 

Any idea? Thanks.

4 REPLIES 4
mkeintz
PROC Star

By what rule does your PROJECTA (date=12JAN2019) match with the EVENT of TESTINGA (with dates 01JAN2019 through 11JAN2019)? 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
scb
Obsidian | Level 7 scb
Obsidian | Level 7

I have rectified the code, thanks.

mkeintz
PROC Star

One approach to this is to read is to build a daily calendar from the date ranges in the EVENT dataset.  That calendar can be an array of character values, indexed by the date range of your data.

 

Let's say that you expect only dates in 2019.  Then an array with lower bound equivalent to 01jan2019 and upper bound equivalent to 31dec2019 could store a set of 40-character elements,  where each element can store a comma-separated list of events corresponding to that date.

 

Then you could read in PROJECT dataset, look up the PERIOD date in the array, and retrieve the corresponding event list:

 

data Project;
format Period Date9.;
input Period Date9. Project $;
datalines;
12JAN2019 ProjectA
16FEB2019 ProjectB
17APR2019 ProjectC
RUN;

data Event;
FORMAT DATESTART DATE9.; FORMAT DATEEND DATE9.;
input DATESTART DATE9. DATEEND  :DATE9. EVENT $10.;
DATALINES;
01JAN2019 12JAN2019 TESTING_A
13JAN2019 17FEB2019 TESTING_B
15FEB2019 18APR2019 TESTING_C
02MAY2019 04MAY2019 TESTING_D
RUN;

%let lower_bound=%sysfunc(inputn(01jan2019,date9.));
%let upper_bound=%sysfunc(inputn(30jun2019,date9.));

data want (keep=period project event_list);
  array test_list {&lower_bound:&upper_bound} $40 _temporary_;
  set event (in=inE) project (in=inP);
  if inE then do d=datestart to dateend;  
     test_list{d}=catx(',',test_list{d},event);
  end;
  if inP;
  event_list=test_list{period};
run;

 

 

  1. The %let lower_bound=  statement tells sas to make a macro variable LOWER_BOUND with the numeric value of 01jan2019 (01jan1960 is value zero).  The you can use the macrovars LOWER_BOUND and UPPER_BOUND as array limits.
  2. The SET statement reads all the EVENT records first, then all the PROJECT records.  If the record-in-hand is an event record (inE is true), then update each date in the array from DATESTART to DATEEND.
  3. The "if inP" statement is a subsetting if, so only PROJECT records pass this filter.
  4. Finally, retrieve the construct calendar entry for the PERIOD date.

 

Just be sure to (1) have LOWER_BOUND and UPPER_BOUND cover your entire date range, and (2) use a character lenght ($40 above) long enough to cover the busiest date in your study.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Kurt_Bremser
Super User

In order to not have your code mangled, ALWAYS post code using the "little running man" icon. PLEASE.

data project;
format period date9.;
input period :Date9. project $;
datalines;
12JAN2019 ProjectA
16FEB2019 ProjectB
17APR2019 ProjectC
;
run;

data event;
format datestart dateend date9.;
input datestart :date9. dateend :date9. event :$10.;
datalines;
01JAN2019 12JAN2019 TESTING_A
13JAN2019 17FEB2019 TESTING_B
15FEB2019 18APR2019 TESTING_C
02MAY2019 04MAY2019 TESTING_D
;
run;

proc sql;
create table want_int as
select
  a.*,
  b.event as _event
from
  project a
  left join
  event b
on a.period between b.datestart and b.dateend
;
quit;

data want;
set want_int;
by project;
retain event;
length event $100;
if first.project then event = "";
event = catx(',',event,_event);
if last.project;
drop _event;
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 576 views
  • 0 likes
  • 3 in conversation