BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cuan
Obsidian | Level 7

Dear SAS community,

 

I have a problem, it will be great if you can give me the solution.

 

I need to setup 3 periods:

dataset I have:

 

id date treat

1 10aug2019 1

1 16aug2019 1

1 20aug2019 1

1 30aug2019 1

1 30aug2019 2

1 02sep2019 2

1 06sep2019 2

1 09sep2019 2

1 15 sep2019 3

 

I need three periods:

 

dataset I want:

id period  startdt            stopdt

 1    1       10 aug2019 29aug2019

1     2        30aug2019  09sep2019 

1    3         10sep2019  15sep2019

 

Could you help me wo create these a dataset with 3 startdt and 3 stopdt?

 

Many thanks.

 

Cuan.

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

I presume period corresponds to treatment, except:

  1. If two periods have an overlapping day, you want the overlapping date to belong exclusively to the later period, correct?
  2. And if there is a gap between periods, you want the gap to belong to the later period.  Is that also correct?

If so, then:

data have;
  input id date :date9. treat;
  format date date9.;
datalines;
1 10aug2019 1
1 16aug2019 1
1 20aug2019 1
1 30aug2019 1
1 30aug2019 2
1 02sep2019 2
1 06sep2019 2
1 09sep2019 2
1 15sep2019 3
run;

data want (keep=id treat start_dt end_dt);
  set have (keep=id treat) end=end_of_have;
  by id treat;

  merge have have (firstobs=2 keep=date rename=(date=_nxt_date));

  retain start_dt end_dt;
  format start_dt end_dt date9.;

  if first.treat then start_dt=ifn(first.id,date,sum(end_dt,1));
  if end_of_have then end_dt=date;
  else end_dt=min(date,_nxt_date-1);

  if last.treat;

run;

Note this only works if the overlap is never more than 1 day.   Reviewed the program - it DOES work if the overlap is more than 1 day.

 

 

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

6 REPLIES 6
mkeintz
PROC Star

I presume period corresponds to treatment, except:

  1. If two periods have an overlapping day, you want the overlapping date to belong exclusively to the later period, correct?
  2. And if there is a gap between periods, you want the gap to belong to the later period.  Is that also correct?

If so, then:

data have;
  input id date :date9. treat;
  format date date9.;
datalines;
1 10aug2019 1
1 16aug2019 1
1 20aug2019 1
1 30aug2019 1
1 30aug2019 2
1 02sep2019 2
1 06sep2019 2
1 09sep2019 2
1 15sep2019 3
run;

data want (keep=id treat start_dt end_dt);
  set have (keep=id treat) end=end_of_have;
  by id treat;

  merge have have (firstobs=2 keep=date rename=(date=_nxt_date));

  retain start_dt end_dt;
  format start_dt end_dt date9.;

  if first.treat then start_dt=ifn(first.id,date,sum(end_dt,1));
  if end_of_have then end_dt=date;
  else end_dt=min(date,_nxt_date-1);

  if last.treat;

run;

Note this only works if the overlap is never more than 1 day.   Reviewed the program - it DOES work if the overlap is more than 1 day.

 

 

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
cuan
Obsidian | Level 7

Hi ,  many thanks for this,

 

but look like your code doesn't work with this dataset:

01AUG2017 1
16AUG2017 1
30AUG2017 1
30AUG2017 2
31AUG2017 2
01SEP2017 2
02SEP2017 2
03SEP2017 2
14SEP2017 3

 

It is getting the 3rd period: statdt=04sep2017 (correct, but stopdate=28nov2016 (incorrect),

it should be 14sep2017.

 

Not surre why you are getting 28nov2016.

 

Many tanks.

cuan
Obsidian | Level 7

ohh I see,

 

your code is working fine, what about tif we put two id?

id date treat

1 01AUG2017 1
1 16AUG2017 1
1 30AUG2017 1
1 30AUG2017 2
1 31AUG2017 2
1 01SEP2017 2
1 02SEP2017 2
1 03SEP2017 2
1 14SEP2017 3
2 29NOV2016 1
2 13DEC2016 1
2 27DEC2016 1

 

please could  you  your code with now 2 id? because if you run your actual code, with these two id is what stopdate (3 period) come from = 29NOV2016, and should be 14sep2017.

Many thanks.

cuan
Obsidian | Level 7

And if there is a gap between periods, you want the gap to belong to the later period.  Is that also correct?

 

No, I ma afraid, this is not correct, I would like to consider gaps between periods, for example , like this example for one ID:

 

treatment 3: should start the 7jun, and your code start the 29May.

 

08MAR2018 1
23MAR2018 1
26APR2018 1
10MAY2018 1
24MAY2018 1
24MAY2018 2
26MAY2018 2
27MAY2018 2
28MAY2018 2
29MAY2018 2
07JUN2018 3
21JUN2018 3
21JUN2018 4
22JUN2018 4
23JUN2018 4
24JUN2018 4
25JUN2018 4
05JUL2018 5

 

 

 

mkeintz
PROC Star

@cuan wrote:

And if there is a gap between periods, you want the gap to belong to the later period.  Is that also correct?

 

No, I ma afraid, this is not correct, I would like to consider gaps between periods, for example , like this example for one ID:

 

The last two observations of your initial problem statement had:

1 09sep2019 2

1 15 sep2019 3

which has 5 unaccounted-for days (10sep-14sep) between treatments 2 and 3 - i.e. a gap. 

Yet you stated that you expected these dates for treatments 2 and 3:

id period  startdt    stopdt

1     2    30aug2019  09sep2019 

1    3     10sep2019  15sep2019

 

This is clearly an example of assigning the gap to the later treatment.  Now you've introduced some mission creep.  Please take a long look at what you want and confirm before we attempt any more suggestions.   Help us help you - state the task clearly and precisely.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
cuan
Obsidian | Level 7

my apologize for the confusion.

 

Will be great if you can solve my problem:

 

08MAR2018 1
23MAR2018 1
26APR2018 1
10MAY2018 1
24MAY2018 1
24MAY2018 2
26MAY2018 2
27MAY2018 2
28MAY2018 2
29MAY2018 2
07JUN2018 3
21JUN2018 3
21JUN2018 4

 

start date stop date

08mar2018  23may2018

24may2018 29may2018

07jun2018 20jun2018

21jun2018 21jun2018

 

Cheers,

 

Cuan.

 

 

Catch up on SAS Innovate 2026

Dive into keynotes, announcements and breakthroughs on demand.

Explore Now →
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2516 views
  • 1 like
  • 2 in conversation