Hi, I am working on the project and stuck at this point.
I have a data with id, start date, end date for prescribed drug, drug as below.
This is the list of treatment days for certain drug for each patient.
I need to keep the sequence of prescribed drug.
The data 'have' is like
ID start_date end_date drug
~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 2003-01-01 2003-01-28 A
1 2003-02-01 2003-03-01 A
1 2003-03-15 2003-03-31 A
1 2003-04-11 2003-04-30 B
1 2003-06-01 2003-07-28 B
1 2003-07-31 2003-08-15 B
1 2003-09-01 2003-09-28 A
1 2003-10-01 2003-11-28 A
2 2005-04-30 2005-06-01 C
........
.....
...
if I use the code like
proc sql;
create table want as
select id, min(start_date) as start, max(end_date) as end, drug
from have
group by drug;
quit;
I would have a data like
id start_date end_date drug
~~~~~~~~~~~~~~~~~~~~~~~~~
1 2003-01-01 2003-11-28 A
1 2003-04-11 2003-08-15 B
this is wrong.
The output I want is
id start_date end_date drug
~~~~~~~~~~~~~~~~~~~~~~~~~
1 2003-01-01 2003-03-31 A
1 2003-04-11 2003-08-15 B
1 2003-09-01 2003-11-28 A
........
.....
I need to keep the sequence of drug and get the min. date and max. date for each sequence of drug.
anybody know what to do? I tried first.last. retain..... etc..
i am really stuck here.. please help me out!!
data have;
input ID (start_date end_date) (:yymmdd10.) drug $;
format start_date end_date yymmdd10.;
cards;
1 2003-01-01 2003-01-28 A
1 2003-02-01 2003-03-01 A
1 2003-03-15 2003-03-31 A
1 2003-04-11 2003-04-30 B
1 2003-06-01 2003-07-28 B
1 2003-07-31 2003-08-15 B
1 2003-09-01 2003-09-28 A
1 2003-10-01 2003-11-28 A
;
data want;
if 0 then set have;
call missing(start,end);
do until(last.drug);
set have;
by id drug notsorted;
start=min(start,start_date);
end=max(end,end_date);
end;
format start end yymmdd10.;
keep id start end drug;
run;
You need to create a sequence variable somehow, assuming your input data is sorted properly.
UNTESTED CODE
data have2;
set have;
previous_id=lag(id);
previous_drug=lag(drug);
if id=previous_id and drug^=previous_drug then sequence+1;
run;
proc summary data=have2;
class id drug sequence;
var start_date end_date;
output out=want min(start_date)= max(end_date)=;
run;
Try adding the NOTSORTED options to a BY statement within PROC SUMMARY or MEANS.
proc summary data=have2;
by id drug notsorted;
var start_date end_date;
output out=want min(start_date)= max(end_date)=;
run;
data have2;
set have;
by id;
previous_drug=lag(drug);
if first.id then previous_drug='';
if drug^=previous_drug then sequence+1;
if first.id then sequence=1;
run;
proc summary data=have2;
by id drug notsorted;
var start_date end_date ;
output out=want min(start_date)= max(end_date)= ;
run;I use the code above.. It works perfectly as I wanted!
I really appreciated your help!!
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.