BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cuan
Obsidian | Level 7

Dear SAS community,

 

I have a problem, it will be great if you can give me the solution.

 

I need to setup 3 periods:

dataset I have:

 

id date treat

1 10aug2019 1

1 16aug2019 1

1 20aug2019 1

1 30aug2019 1

1 30aug2019 2

1 02sep2019 2

1 06sep2019 2

1 09sep2019 2

1 15 sep2019 3

 

I need three periods:

 

dataset I want:

id period  startdt            stopdt

 1    1       10 aug2019 29aug2019

1     2        30aug2019  09sep2019 

1    3         10sep2019  15sep2019

 

Could you help me wo create these a dataset with 3 startdt and 3 stopdt?

 

Many thanks.

 

Cuan.

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

I presume period corresponds to treatment, except:

  1. If two periods have an overlapping day, you want the overlapping date to belong exclusively to the later period, correct?
  2. And if there is a gap between periods, you want the gap to belong to the later period.  Is that also correct?

If so, then:

data have;
  input id date :date9. treat;
  format date date9.;
datalines;
1 10aug2019 1
1 16aug2019 1
1 20aug2019 1
1 30aug2019 1
1 30aug2019 2
1 02sep2019 2
1 06sep2019 2
1 09sep2019 2
1 15sep2019 3
run;

data want (keep=id treat start_dt end_dt);
  set have (keep=id treat) end=end_of_have;
  by id treat;

  merge have have (firstobs=2 keep=date rename=(date=_nxt_date));

  retain start_dt end_dt;
  format start_dt end_dt date9.;

  if first.treat then start_dt=ifn(first.id,date,sum(end_dt,1));
  if end_of_have then end_dt=date;
  else end_dt=min(date,_nxt_date-1);

  if last.treat;

run;

Note this only works if the overlap is never more than 1 day.   Reviewed the program - it DOES work if the overlap is more than 1 day.

 

 

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

6 REPLIES 6
mkeintz
PROC Star

I presume period corresponds to treatment, except:

  1. If two periods have an overlapping day, you want the overlapping date to belong exclusively to the later period, correct?
  2. And if there is a gap between periods, you want the gap to belong to the later period.  Is that also correct?

If so, then:

data have;
  input id date :date9. treat;
  format date date9.;
datalines;
1 10aug2019 1
1 16aug2019 1
1 20aug2019 1
1 30aug2019 1
1 30aug2019 2
1 02sep2019 2
1 06sep2019 2
1 09sep2019 2
1 15sep2019 3
run;

data want (keep=id treat start_dt end_dt);
  set have (keep=id treat) end=end_of_have;
  by id treat;

  merge have have (firstobs=2 keep=date rename=(date=_nxt_date));

  retain start_dt end_dt;
  format start_dt end_dt date9.;

  if first.treat then start_dt=ifn(first.id,date,sum(end_dt,1));
  if end_of_have then end_dt=date;
  else end_dt=min(date,_nxt_date-1);

  if last.treat;

run;

Note this only works if the overlap is never more than 1 day.   Reviewed the program - it DOES work if the overlap is more than 1 day.

 

 

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
cuan
Obsidian | Level 7

Hi ,  many thanks for this,

 

but look like your code doesn't work with this dataset:

01AUG2017 1
16AUG2017 1
30AUG2017 1
30AUG2017 2
31AUG2017 2
01SEP2017 2
02SEP2017 2
03SEP2017 2
14SEP2017 3

 

It is getting the 3rd period: statdt=04sep2017 (correct, but stopdate=28nov2016 (incorrect),

it should be 14sep2017.

 

Not surre why you are getting 28nov2016.

 

Many tanks.

cuan
Obsidian | Level 7

ohh I see,

 

your code is working fine, what about tif we put two id?

id date treat

1 01AUG2017 1
1 16AUG2017 1
1 30AUG2017 1
1 30AUG2017 2
1 31AUG2017 2
1 01SEP2017 2
1 02SEP2017 2
1 03SEP2017 2
1 14SEP2017 3
2 29NOV2016 1
2 13DEC2016 1
2 27DEC2016 1

 

please could  you  your code with now 2 id? because if you run your actual code, with these two id is what stopdate (3 period) come from = 29NOV2016, and should be 14sep2017.

Many thanks.

cuan
Obsidian | Level 7

And if there is a gap between periods, you want the gap to belong to the later period.  Is that also correct?

 

No, I ma afraid, this is not correct, I would like to consider gaps between periods, for example , like this example for one ID:

 

treatment 3: should start the 7jun, and your code start the 29May.

 

08MAR2018 1
23MAR2018 1
26APR2018 1
10MAY2018 1
24MAY2018 1
24MAY2018 2
26MAY2018 2
27MAY2018 2
28MAY2018 2
29MAY2018 2
07JUN2018 3
21JUN2018 3
21JUN2018 4
22JUN2018 4
23JUN2018 4
24JUN2018 4
25JUN2018 4
05JUL2018 5

 

 

 

mkeintz
PROC Star

@cuan wrote:

And if there is a gap between periods, you want the gap to belong to the later period.  Is that also correct?

 

No, I ma afraid, this is not correct, I would like to consider gaps between periods, for example , like this example for one ID:

 

The last two observations of your initial problem statement had:

1 09sep2019 2

1 15 sep2019 3

which has 5 unaccounted-for days (10sep-14sep) between treatments 2 and 3 - i.e. a gap. 

Yet you stated that you expected these dates for treatments 2 and 3:

id period  startdt    stopdt

1     2    30aug2019  09sep2019 

1    3     10sep2019  15sep2019

 

This is clearly an example of assigning the gap to the later treatment.  Now you've introduced some mission creep.  Please take a long look at what you want and confirm before we attempt any more suggestions.   Help us help you - state the task clearly and precisely.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
cuan
Obsidian | Level 7

my apologize for the confusion.

 

Will be great if you can solve my problem:

 

08MAR2018 1
23MAR2018 1
26APR2018 1
10MAY2018 1
24MAY2018 1
24MAY2018 2
26MAY2018 2
27MAY2018 2
28MAY2018 2
29MAY2018 2
07JUN2018 3
21JUN2018 3
21JUN2018 4

 

start date stop date

08mar2018  23may2018

24may2018 29may2018

07jun2018 20jun2018

21jun2018 21jun2018

 

Cheers,

 

Cuan.

 

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1565 views
  • 1 like
  • 2 in conversation