Dear friends,
I have a data as below.
I try to derive patient's treatment line name based on the algorithm in "Newline" column of the table below:
1. Start treatment: 1st line, 20 days drugs used in this period is the line name
2. Advance line: a: treatment Gap > 3 months or b: a new Chemo drug after 20 days of treatment or C: A new bio drug advance or not after 20 days of treatment based on the condition listed in the table
The complex part is to code the part when I see a new BIO. I am wondering whether there is good solution.
Thank you very much!
ID | DrugName | Dt | DrugCat | LineName | LineNumber | NewLine |
1 | A | 1/1/2019 | Chemo | A | 1 | Line name is generated based on 20 days period when patient starting treatment |
1 | B | 1/12/2019 | Chemo | A+B | 1 | |
1 | A | 1/26/2019 | Chemo | A+B | 1 | |
1 | B | 1/27/2019 | Chemo | A+B | 1 | |
1 | A | 2/20/2019 | Chemo | A+B | 1 | |
1 | B | 02/29/2019 | Chemo | A+B | 1 | |
1 | A | 02/30/2019 | Chemo | A+B | 1 | |
1 | D | 3/15/2019 | Bio | 1 | Not new line or | |
1 | A | 3/30/2019 | Chemo | 1 | ||
1 | D | 4/5/2019 | Bio | A+B+C | 1 | MAKE DECISION |
1 | A | 4/15/2019 | Chemo | A+B+C | 1 | |
1 | A | 5/5/2019 | Chemo | A+B+C | 1 | |
1 | C | 9/9/2019 | Bio | 2 | New line, because during 28 days period, no new chemo, and, stopping all previous line drugs | |
1 | C | 9/19/2019 | Bio | C | 2 | MAKE DECISION |
1 | C | 10/20/2019 | Bio | C | 2 | |
1 | C | 11/5/2019 | Bio | C | 2 | |
1 | C | 12/1/2020 | Bio | C | 3 | New line: Due to treatment GAP > 6 months |
1 | C | 12/2/2020 | Bio | C | 3 | |
1 | C | 12/3/2020 | Bio | C | 3 | |
1 | F | 12/29/2020 | Chemo | C | 4 | New line: Due to new chemo |
1 | F | 12/30/2020 | Chemo | C | 4 |
Hello @Ivy This solution is just purely for fun as i wanted some excitement, I'll prolly post the simple one later.
data have;
infile cards truncover;
input ID LineName :$75. samedaydrugs :$75. AdministeredDate :date9. RefDt :date9.;
format AdministeredDate RefDt date9.;
cards;
1 . rituximab+gemcitabine 9-Sep-15 .
1 . rituximab+gemcitabine 29-Sep-15 .
1 . rituximab+gemcitabine 30-Sep-15 .
1 rituximab+gemcitabine rituximab+gemcitabine 6-Oct-15 .
1 . crizotinib 2-Nov-15 29-Nov-15
1 . rituximab 16-Nov-15 29-Nov-15
1 . rituximab 23-Nov-15 29-Nov-15
1 . rituximab 30-Nov-15 .
1 . rituximab 7-Dec-15 .
;
data want;
if _n_=1 then do;
if 0 then set have(rename=(AdministeredDate=_d));
dcl hash H (dataset:'have(rename=(AdministeredDate=_d))',ordered: "A") ;
h.definekey ("id","_d") ;
h.definedata ("id","_d") ;
h.definedone () ;
dcl hiter hi('h');
end;
do until(last.id);
set have ;
by id;
if samedaydrugs='crizotinib' then do;
do _rc=hi.setcur(key:id,key:AdministeredDate) by 0 while(_rc=0);
if intck('days',AdministeredDate,_d)<28 then _iorc_=_d;
else leave;
_rc = hi.next();
end;
end;
combo=ifc(_iorc_=AdministeredDate,catx('+','crizotinib',samedaydrugs),' ');
output;
end;
call missing(_iorc_, _d);
drop _:;
run;
data have;
infile cards truncover;
input ID LineName :$75. samedaydrugs :$75. AdministeredDate :date9. RefDt :date9.;
format AdministeredDate RefDt date9.;
cards;
1 . rituximab+gemcitabine 9-Sep-15 .
1 . rituximab+gemcitabine 29-Sep-15 .
1 . rituximab+gemcitabine 30-Sep-15 .
1 rituximab+gemcitabine rituximab+gemcitabine 6-Oct-15 .
1 . crizotinib 2-Nov-15 29-Nov-15
1 . rituximab 16-Nov-15 29-Nov-15
1 . rituximab 23-Nov-15 29-Nov-15
1 . rituximab 30-Nov-15 .
1 . rituximab 7-Dec-15 .
;
data want;
do _n_=1 by 1 until(last.id);
set have ;
by id;
array t(999999) _temporary_;
if samedaydrugs='crizotinib' then _f=1;
if _f then t(_n_)=AdministeredDate;
end;
do _n_=1 to _n_;
set have ;
if samedaydrugs='crizotinib' then do;
_iorc_=AdministeredDate;
do while(intck('day',AdministeredDate,_iorc_)<27);
_iorc_=intnx('day',_iorc_,1);
if _iorc_ in t then _j=_iorc_;
end;
end;
combo=ifc(_j=AdministeredDate,catx('+','crizotinib',samedaydrugs),' ');
output;
end;
call missing(_iorc_,of t(*));
drop _:;
run;
Sorry, ignore the previous
data have;
infile cards truncover;
input ID LineName :$75. samedaydrugs :$75. AdministeredDate :date9. RefDt :date9.;
format AdministeredDate RefDt date9.;
cards;
1 . rituximab+gemcitabine 9-Sep-15 .
1 . rituximab+gemcitabine 29-Sep-15 .
1 . rituximab+gemcitabine 30-Sep-15 .
1 rituximab+gemcitabine rituximab+gemcitabine 6-Oct-15 .
1 . crizotinib 2-Nov-15 29-Nov-15
1 . rituximab 16-Nov-15 29-Nov-15
1 . rituximab 23-Nov-15 29-Nov-15
1 . rituximab 30-Nov-15 .
1 . rituximab 7-Dec-15 .
;
data want;
do _n_=1 by 1 until(last.id);
set have ;
by id;
if samedaydrugs='crizotinib' then do; _d= AdministeredDate;_f=1;end;
else if _f then if intck('days',_d,AdministeredDate)<28 then __d=AdministeredDate;
end;
do _n_=1 to _n_;
set have;
by id;
combo=ifc(__d=AdministeredDate,catx('+','crizotinib',samedaydrugs),' ');
output;
end;
drop _:;
run;
Hello, @novinosrin
Thank you so much for your help.
This works for this case.
However, in my real data situation, this part is part of my data processing to generate the LineName, there are other conditions lining up with this, which I should process together in one data processing step. I tried to identify the last date of this period and then output the combination drugs in one condition such as : If (ref data - AdministeredDate) <= 28 days then do: ---- end; Maybe there is no way to do it ? Thank you. I am guessing I am confusing you.....
Hi @Ivy To your " there are other conditions lining up with this, which I should process together in one data processing step. ". Well this sounds like your requirement is much more. However, I wouldn't know. But if i can guess those are similar, you could perhaps modify along the same lines. So, what's in your real data and the real requirement is something you and your org alone have control over.
My general opinion/approach would be:
1. If there are too many conditions making it too tedious let alone complex, I 'd split and approach in piecemeal.
2. It's also easier to maintain, modify and update
3. The high end processing involving very advance techniques might surface but the trade off should be considered what you prioritize.
4. A lot of times, the kind of fancy solutions I post or take from others here on the community are not always well comprehended by other users in my team.
5. First break it down, get a working solution, and then see if you can consolidate into fewer steps if possible
SQL:
data have;
infile cards truncover;
input ID LineName :$75. samedaydrugs :$75. AdministeredDate :date9. RefDt :date9.;
format AdministeredDate RefDt date9.;
cards;
1 . rituximab+gemcitabine 9-Sep-15 .
1 . rituximab+gemcitabine 29-Sep-15 .
1 . rituximab+gemcitabine 30-Sep-15 .
1 rituximab+gemcitabine rituximab+gemcitabine 6-Oct-15 .
1 . crizotinib 2-Nov-15 29-Nov-15
1 . rituximab 16-Nov-15 29-Nov-15
1 . rituximab 23-Nov-15 29-Nov-15
1 . rituximab 30-Nov-15 .
1 . rituximab 7-Dec-15 .
;
proc sql;
create table want(drop=t:) as
select *, ifc(max( AdministeredDate)=AdministeredDate and t1,catx('+','crizotinib',samedaydrugs),' ') as Combo
from
(select *,max((samedaydrugs='crizotinib')* AdministeredDate) as t, intck('days',calculated t,AdministeredDate)<28 as t1
from have
group by id)
group by id, t1
order by id, AdministeredDate;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.