BookmarkSubscribeRSS Feed
interism79
Fluorite | Level 6

Hi, I am working on the project and stuck at this point.

I have a data with id, start date, end date for prescribed drug, drug as below.

This is the list of treatment days for certain drug for each patient.

I need to keep the sequence of prescribed drug.

 

The data 'have' is like 

ID start_date    end_date       drug

~~~~~~~~~~~~~~~~~~~~~~~~~~~

1 2003-01-01   2003-01-28     A

2003-02-01   2003-03-01     A

2003-03-15   2003-03-31     A

2003-04-11   2003-04-30     B

2003-06-01   2003-07-28     B

1 2003-07-31   2003-08-15     B

1 2003-09-01   2003-09-28     A

1 2003-10-01   2003-11-28     A

 

2 2005-04-30   2005-06-01   C

........

.....

...

 

if I use the code like

 

proc sql;
   create table want as
     select id, min(start_date) as start, max(end_date) as end, drug
     from have
     group by drug;
quit;

 

I would have a data like

 

id start_date   end_date   drug

~~~~~~~~~~~~~~~~~~~~~~~~~

2003-01-01 2003-11-28   A

1 2003-04-11 2003-08-15   B

this is wrong.

 

The output I want is

id start_date   end_date   drug

~~~~~~~~~~~~~~~~~~~~~~~~~

2003-01-01 2003-03-31   A

1 2003-04-11 2003-08-15   B

1 2003-09-01 2003-11-28   A

........

.....

I need to keep the sequence of drug and get the min. date and max. date for each sequence of drug.

 

anybody know what to do? I tried first.last. retain..... etc..

 

i am really stuck here.. please help me out!!

 

 

 

 

 

 

 

7 REPLIES 7
novinosrin
Tourmaline | Level 20
data have;
input	ID (start_date    end_date) (:yymmdd10.)       drug $;
format start_date    end_date yymmdd10.;
cards;
1 2003-01-01   2003-01-28     A
1 2003-02-01   2003-03-01     A
1 2003-03-15   2003-03-31     A
1 2003-04-11   2003-04-30     B
1 2003-06-01   2003-07-28     B
1 2003-07-31   2003-08-15     B
1 2003-09-01   2003-09-28     A
1 2003-10-01   2003-11-28     A
;

data want;
if 0 then set have;
call missing(start,end);
do until(last.drug);
set have;
by id drug notsorted;
start=min(start,start_date);
end=max(end,end_date);
end;
format start end yymmdd10.;
keep id start end drug;
run;
interism79
Fluorite | Level 6
I did not try this code yet, but it looks so great!
I did not think about that this way at all..
Thank you for your help!
PaigeMiller
Diamond | Level 26

You need to create a sequence variable somehow, assuming your input data is sorted properly.

 

UNTESTED CODE

 

data have2;
    set have;
    previous_id=lag(id);
    previous_drug=lag(drug);
    if id=previous_id and drug^=previous_drug then sequence+1;
run;
proc summary data=have2;
    class id drug sequence;
    var start_date end_date;
    output out=want min(start_date)= max(end_date)=;
run;
--
Paige Miller
Reeza
Super User

Try adding the NOTSORTED options to a BY statement within PROC SUMMARY or MEANS.

proc summary data=have2;
  by id drug notsorted;
  var start_date end_date;
  output out=want min(start_date)= max(end_date)=; 
run;
interism79
Fluorite | Level 6
"notsorted" option works like a magic! Thank you for the tips!!
interism79
Fluorite | Level 6
oh! Thanks Paige, I am not familiar with proc summary! but it perfectly works out! Thank you!!
interism79
Fluorite | Level 6
data have2;
	set have;
	by id;
	previous_drug=lag(drug);
	if first.id then previous_drug='';
	if drug^=previous_drug then sequence+1;
	if first.id then sequence=1;
run;

proc summary data=have2;
	by id drug notsorted;
	var start_date end_date ;
	output out=want min(start_date)= max(end_date)= ;
run;

I use the code above.. It works perfectly as I wanted! 

I really appreciated your help!! 

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 5840 views
  • 6 likes
  • 4 in conversation