Hi SAS folks,
I am working on some reseach project using hospital visit data. I have to identify the subjects who had switched their treatment. the defination of treatment switch is like:
Treatments are grouped by topical, non-biologic systemic, and biologic. A switching event is defined as the withdrawal of a medication that was not dispensed at the time of the last treatment withdrawal. There is a hierarchy, with biologics at the top, then non-bio systemics, then topical.
According to the logic above, once a patient has switched to a higher class of therapy, the lower classes are considered concomitant medication and a withdrawal of a concomitant medication does not indicate a treatment switch.
In nutshell, If a patient moves from lower class of treatment to upper or withing the same class(different ATC code) then it will consider as a Switch.
Treatment hierarchy
Biologics> Non_Bio_Sys>Topical
I am adding sample dataset for the reference, with variable information :
LPNR: unique subjID
EDATUM: prescrition date ( treatment recieved date)
ATC_CODE: codes for each of treament type
'D05AX52' ,
'D05AX02' = PSorasis(Topical)
'L01XC02' ,
'L04AA24',
'L04AB01',
'L04AB02',
'L04AB04',
'L04AB05',
'L04AB06',
'L04AC03',
'L04AC07' =Non_Bio_systemic
'A11CC03' ,
'D05BB02',
'L01BA01',
'L01BB03' ,
'L01XX05' ,
'L04AA21' ,
'L04AD01' ,
'L04AX03' =Biologics
TYPE: Treatment type
It would be great , if you can help me with SAS code to indentify the Subjects with Switches .
proc format;
value treatType 1 = "Topical"
2 = "Non-biologic systemics"
3 = "Biologics";
run;
data have;
format pnr 8. tType treatType. tCode $10. date YYMMDDs8.;
informat tType 1. date yymmdd10.;
input pnr tCode tType date;
datalines;
478 L04AX03 2 2010-02-10
478 D05AX52 1 2010-02-23
478 L04AX03 2 2010-03-05
478 L04AX03 2 2010-05-18
478 L04AX03 2 2010-08-02
478 L04AX03 2 2010-11-03
478 L04AX03 2 2011-02-07
478 L04AX03 2 2011-08-16
478 L04AX03 2 2011-11-22
603 D05AX02 1 2005-08-24
603 D05AX02 1 2005-10-13
603 D05AX02 1 2006-06-02
603 D05AX02 1 2006-07-19
603 D05AX02 1 2007-04-03
603 D05AX02 1 2007-12-07
603 D05AX02 1 2008-01-04
603 D05AX02 1 2008-01-04
603 D05AX02 1 2008-10-02
603 D05AX52 1 2009-03-16
603 D05AX52 1 2009-03-16
603 D05AX52 1 2009-05-07
603 D05AX52 1 2009-08-21
603 D05AX52 1 2009-11-09
603 D05AX52 1 2010-03-19
603 D05AX52 1 2010-07-30
603 D05AX52 1 2010-07-30
603 D05AX52 1 2011-01-13
603 D05AX52 1 2011-05-03
603 D05AX52 1 2011-12-12
1103 L04AX03 2 2006-01-17
1103 L04AX03 2 2006-08-23
1103 L04AX03 2 2006-11-09
1103 D05AX52 1 2007-04-19
1103 L04AX03 2 2007-04-19
1103 D05AX52 1 2007-10-26
1103 D05AX52 1 2008-04-01
1103 L04AX03 2 2008-10-20
1103 D05AX52 1 2008-10-20
1103 L04AX03 2 2009-04-07
1103 L04AX03 2 2009-11-20
1103 L04AX03 2 2010-11-12
1103 L04AX03 2 2011-08-12
1103 D05AX52 1 2011-10-22
1103 D05AX52 1 2011-12-29
run;
proc sort data=have;
by pnr date;
run;
data want;
set have;
by pnr;
format switch 1. tCode_prev $10. tType_prev 3.;
retain tCode_prev '' tType_prev .;
format switch 1.;
if first.pnr then do;
switch = 0;
end;
else if tType_prev = 1 then do;
if tType > 1 or tCode ne tCode_prev then switch = 1;
else switch = 0;
end;
else if tType_prev = 2 then do;
if tType > 2 or (tType = 2 and tCode ne tCode_prev) then switch = 1;
else switch = 0;
end;
else if tType_prev = 3 then do;
/* Here's your homework */
end;
output;
tCode_prev = tCode;
tType_prev = tType;
drop tCode_prev tType_prev;
run;
Hi mate.
I couldn't understand what you mean exactly but i think you are looking for something to check where are these dependencies changed.
My guess is you can use the LAST. or FIRST. options.
Check the code below:
FILENAME REFFILE "/PATH/sample_data.xlsx" TERMSTR=CR;
PROC IMPORT DATAFILE=REFFILE
DBMS=XLSX
OUT=WORK.IMPORT;
GETNAMES=YES;
RUN;
PROC SORT DATA=IMPORT;
BY LPNR ATC_CODE TYPE;
RUN;
DATA NEED;
LENGTH INFO_CHANGE $ 20;
SET IMPORT;
BY LPNR ATC_CODE TYPE;
IF LAST.LPNR AND LAST.ATC_CODE AND LAST.TYPE THEN INFO_CHANGE = "CHANGED";
ELSE INFO_CHANGE = "UNCHANGED";
RUN;
The last data step in this code prints the changed if the dependencies occured in the last time they changed.
And unchanged for the rest.
If you need the first time jus switch to FIRST. and checkout the result.
Hope this helps
@DartRodrigo Hi , Thanks for the quick reply. I am sorry for the unclear explaination, but what i exactly means is:
as you see the data, we have three treatment type in TYPE variable. Now for each Type variable we have different ATC_CODEs. I need to identify the event (SWITCH) , whenever subject changes his treatment type from Lower treatment type( i,e Topical) to Higher treatment group(i,e Non_bio_sys or Biologics) during his treatment period. Same is true for Non_sys_Bio , if Subject changes to Biologics, then it will also consider as Switch. Event is also to be counted whenever subjects changes his ATC_CODEs within his treatment TYPE , ie in case, when Treatment TYPE is "Psorasis" and if Its' ATC_CODE changes , then we have to identify it as SWITCH.
same is true for other treatment arm.
order of treatment: BIOLOGICS> NON_BIO_SYS> TOPICAL.
Hope i am able to clear you doubt. please help. thanks
data have;
length TYPE $20.;
input pnr ATC_CODE $ type $ EDATUM $ 10. ;
date=input(edatum,mmddyy10.);
format date mmddyy10.;
cards;
478 L04AX03 Nonbiologic_systemic 2/10/2010
478 D05AX52 Psoriasis 2/23/2010
478 L04AX03 Nonbiologic_systemic 3/5/2010
478 L04AX03 Nonbiologic_systemic 5/18/2010
478 L04AX03 Nonbiologic_systemic 8/2/2010
478 L04AX03 Nonbiologic_systemic 11/3/2010
478 L04AX03 Nonbiologic_systemic 2/7/2011
478 L04AX03 Nonbiologic_systemic 8/16/2011
478 L04AX03 Nonbiologic_systemic 11/22/2011
603 D05AX02 Psoriasis 8/24/2005
603 D05AX02 Psoriasis 10/13/2005
603 D05AX02 Psoriasis 6/2/2006
603 D05AX02 Psoriasis 7/19/2006
603 D05AX02 Psoriasis 4/3/2007
603 D05AX02 Psoriasis 12/7/2007
603 D05AX02 Psoriasis 1/4/2008
603 D05AX02 Psoriasis 1/4/2008
603 D05AX02 Psoriasis 10/2/2008
603 D05AX52 Psoriasis 3/16/2009
603 D05AX52 Psoriasis 3/16/2009
603 D05AX52 Psoriasis 5/7/2009
603 D05AX52 Psoriasis 8/21/2009
603 D05AX52 Psoriasis 11/9/2009
603 D05AX52 Psoriasis 3/19/2010
603 D05AX52 Psoriasis 7/30/2010
603 D05AX52 Psoriasis 7/30/2010
603 D05AX52 Psoriasis 1/13/2011
603 D05AX52 Psoriasis 5/3/2011
603 D05AX52 Psoriasis 12/12/2011
1103 L04AX03 Nonbiologic_systemic 1/17/2006
1103 L04AX03 Nonbiologic_systemic 8/23/2006
1103 L04AX03 Nonbiologic_systemic 11/9/2006
1103 D05AX52 Psoriasis 4/19/2007
1103 L04AX03 Nonbiologic_systemic 4/19/2007
1103 D05AX52 Psoriasis 10/26/2007
1103 D05AX52 Psoriasis 4/1/2008
1103 L04AX03 Nonbiologic_systemic 10/20/2008
1103 D05AX52 Psoriasis 10/20/2008
1103 L04AX03 Nonbiologic_systemic 4/7/2009
1103 L04AX03 Nonbiologic_systemic 11/20/2009
1103 L04AX03 Nonbiologic_systemic 11/12/2010
1103 L04AX03 Nonbiologic_systemic 8/12/2011
1103 D05AX52 Psoriasis 10/22/2011
1103 D05AX52 Psoriasis 12/29/2011
;
run;
/*sort the data to keep the date in increase order to know subject moved to other treatment or not*/
proc sort data=have;
by pnr date;
run;
data need;
set have;
length change $ 10.;
by pnr;
/*create new variable change and keep constant if it is topical for first observation*/
/*compare topical values with next values if it changes to other non bio or bio then switched*/
if first.pnr and atc_code in ( 'D05AX52','D05AX02') then change="";
else if not first.pnr and atc_code not in('D05AX52','D05AX02','L01XC02','L04AA24','L04AB01','L04AB02','L04AB04','L04AB05','L04AB06','L04AC03','L04AC07','A11CC03','D05BB02','L01BA01','L01BB03' ,'L01XX05' ,'L04AA21' ,'L04AD01' ,'L04AX03')
then change='switched';
/*compare non bio values with next values if it changes to other non bio or bio then switched*/
if first.pnr and atc_code in ('L01XC02','L04AA24','L04AB01','L04AB02','L04AB04','L04AB05','L04AB06','L04AC03','L04AC07') then change="";
else if not first.pnr and atc_code not in( 'D05AX52','D05AX02') and atc_code in('L01XC02','L04AA24','L04AB01','L04AB02','L04AB04','L04AB05','L04AB06','L04AC03','L04AC07','A11CC03','D05BB02','L01BA01','L01BB03' ,'L01XX05' ,'L04AA21' ,'L04AD01' ,'L04AX03')
then change='switched';
/*compare bio with next values if it changes to other bio then switched*/
if first.pnr and atc_code in('A11CC03','D05BB02','L01BA01','L01BB03' ,'L01XX05' ,'L04AA21' ,'L04AD01' ,'L04AX03') then change='';
else if not first.pnr and atc_code not in( 'D05AX52','D05AX02') and atc_code in('A11CC03','D05BB02','L01BA01','L01BB03' ,'L01XX05' ,'L04AA21' ,'L04AD01' ,'L04AX03') then change='switched';
if atc_code=lag(atc_code) then change="";
run;
Hope this may help you 🙂
@ramchinna24 Hi, Thanks for the logic, but what if ATC code withing the psorasis or within other treatment group changes, we also need to identify those changes as switches.
i want this :
lpnr | ATC_CODE | TYPE | EDATUM | Flag_switch | Comment |
478 | L04AX03 | Nonbiologic_systemic | 2/10/2010 | ||
478 | D05AX52 | Psoriasis | 2/23/2010 | ||
478 | L04AX03 | Nonbiologic_systemic | 3/5/2010 | Y | This is switch because subject is changing his treatment from lower(psorais) to higher treatment(non_bio_sys/bio) |
478 | L04AX03 | Nonbiologic_systemic | 5/18/2010 | ||
478 | L04AX03 | Nonbiologic_systemic | 8/2/2010 | ||
478 | L04AX03 | Nonbiologic_systemic | 11/3/2010 | ||
478 | L04AX03 | Nonbiologic_systemic | 2/7/2011 | ||
478 | L04AX03 | Nonbiologic_systemic | 8/16/2011 | ||
478 | L04AX03 | Nonbiologic_systemic | 11/22/2011 | ||
603 | D05AX02 | Psoriasis | 8/24/2005 | ||
603 | D05AX02 | Psoriasis | 10/13/2005 | ||
603 | D05AX02 | Psoriasis | 6/2/2006 | ||
603 | D05AX02 | Psoriasis | 7/19/2006 | ||
603 | D05AX02 | Psoriasis | 4/3/2007 | ||
603 | D05AX02 | Psoriasis | 12/7/2007 | ||
603 | D05AX02 | Psoriasis | 1/4/2008 | ||
603 | D05AX02 | Psoriasis | 10/2/2008 | ||
603 | D05AX52 | Psoriasis | 3/16/2009 | Y | This is switch because subject is changing his ATC code within Psorasis(or any other treatment group) |
603 | D05AX52 | Psoriasis | 5/7/2009 | ||
603 | D05AX52 | Psoriasis | 8/21/2009 | ||
603 | D05AX52 | Psoriasis | 11/9/2009 | ||
603 | D05AX52 | Psoriasis | 3/19/2010 | ||
603 | D05AX52 | Psoriasis | 7/30/2010 | ||
603 | D05AX52 | Psoriasis | 1/13/2011 | ||
603 | D05AX52 | Psoriasis | 5/3/2011 | ||
603 | D05AX52 | Psoriasis | 12/12/2011 | ||
1103 | L04AX03 | Nonbiologic_systemic | 1/17/2006 | ||
1103 | L04AX03 | Nonbiologic_systemic | 8/23/2006 | ||
1103 | L04AX03 | Nonbiologic_systemic | 11/9/2006 | ||
1103 | L04AX03 | Nonbiologic_systemic | 4/19/2007 | ||
1103 | D05AX52 | Psoriasis | 10/26/2007 | ||
1103 | D05AX52 | Psoriasis | 4/1/2008 | ||
1103 | L04AX03 | Nonbiologic_systemic | 10/20/2008 | Y | This is switch because subject is changing his treatment from lower(psorais) to higher treatment(non_bio_sys/bio) |
1103 | D05AX52 | Psoriasis | 10/20/2008 | ||
1103 | L04AX03 | Nonbiologic_systemic | 4/7/2009 | Y | This is switch because subject is changing his treatment from lower(psorais) to higher treatment(non_bio_sys/bio) |
1103 | L04AX03 | Nonbiologic_systemic | 11/20/2009 | ||
1103 | L04AX03 | Nonbiologic_systemic | 11/12/2010 | ||
1103 | L04AX03 | Nonbiologic_systemic | 8/12/2011 | ||
1103 | D05AX52 | Psoriasis | 10/22/2011 | ||
1103 | D05AX52 | Psoriasis | 12/29/2011 |
even if code changes with in the group, it works. Just try it once. hope it works. If it doesnt work, tell me .
@ramchinna24 Hi , code is working on treatment Type level, however whenever atc_codes are changing withing the treatment type, it's not working. I have added the desired output in previous reply, please kindly check.
Many thanks for your's help!!
@ramchinna24 I think, sorting order is conflicting with data, for example on: PNR 478, Switch was recorded at 5th March 2010 ,
Nonbiologic_systemic | 2/10/2010 |
Psoriasis | 2/23/2010 |
Nonbiologic_systemic | 3/5/2010 |
but when we are sorting it on atc_code and then type and followed by edatum , switch is recording at 2/10/2010.
@ramchinna24 Hi I have tried this:
data test;
set new_have;/*sorted on pnr date*/
by pnr;
if lag(type)='Psoriasis' and type in ('Nonbiologic_systemic' 'Biologics') then f=1;
if lag(type)='Psoriasis' and type='Psoriasis' and lag(atc_code) ne atc_code then f=1;
if lag(type)='Nonbiologic_systemic' and type in ('Biologics') then f=1;
if lag(type)='Nonbiologic_systemic' and type='Nonbiologic_systemic' and lag(atc_code) ne atc_code then f=1;
if lag(type) in ('Nonbiologic_systemic' 'Psoriasis') and type='Biologics' then f=1;
if lag(type)='Biologics' and type='Biologics' and lag(atc_code) ne atc_code then f=1;
/* if lag(pnr) =pnr and lag(atc_code) ne atc_code then f=1; */
run;
But it's ignoring the by group and adding flagging for situations when patient by group is changes.
proc format;
value treatType 1 = "Topical"
2 = "Non-biologic systemics"
3 = "Biologics";
run;
data have;
format pnr 8. tType treatType. tCode $10. date YYMMDDs8.;
informat tType 1. date yymmdd10.;
input pnr tCode tType date;
datalines;
478 L04AX03 2 2010-02-10
478 D05AX52 1 2010-02-23
478 L04AX03 2 2010-03-05
478 L04AX03 2 2010-05-18
478 L04AX03 2 2010-08-02
478 L04AX03 2 2010-11-03
478 L04AX03 2 2011-02-07
478 L04AX03 2 2011-08-16
478 L04AX03 2 2011-11-22
603 D05AX02 1 2005-08-24
603 D05AX02 1 2005-10-13
603 D05AX02 1 2006-06-02
603 D05AX02 1 2006-07-19
603 D05AX02 1 2007-04-03
603 D05AX02 1 2007-12-07
603 D05AX02 1 2008-01-04
603 D05AX02 1 2008-01-04
603 D05AX02 1 2008-10-02
603 D05AX52 1 2009-03-16
603 D05AX52 1 2009-03-16
603 D05AX52 1 2009-05-07
603 D05AX52 1 2009-08-21
603 D05AX52 1 2009-11-09
603 D05AX52 1 2010-03-19
603 D05AX52 1 2010-07-30
603 D05AX52 1 2010-07-30
603 D05AX52 1 2011-01-13
603 D05AX52 1 2011-05-03
603 D05AX52 1 2011-12-12
1103 L04AX03 2 2006-01-17
1103 L04AX03 2 2006-08-23
1103 L04AX03 2 2006-11-09
1103 D05AX52 1 2007-04-19
1103 L04AX03 2 2007-04-19
1103 D05AX52 1 2007-10-26
1103 D05AX52 1 2008-04-01
1103 L04AX03 2 2008-10-20
1103 D05AX52 1 2008-10-20
1103 L04AX03 2 2009-04-07
1103 L04AX03 2 2009-11-20
1103 L04AX03 2 2010-11-12
1103 L04AX03 2 2011-08-12
1103 D05AX52 1 2011-10-22
1103 D05AX52 1 2011-12-29
run;
proc sort data=have;
by pnr date;
run;
data want;
set have;
by pnr;
format switch 1. tCode_prev $10. tType_prev 3.;
retain tCode_prev '' tType_prev .;
format switch 1.;
if first.pnr then do;
switch = 0;
end;
else if tType_prev = 1 then do;
if tType > 1 or tCode ne tCode_prev then switch = 1;
else switch = 0;
end;
else if tType_prev = 2 then do;
if tType > 2 or (tType = 2 and tCode ne tCode_prev) then switch = 1;
else switch = 0;
end;
else if tType_prev = 3 then do;
/* Here's your homework */
end;
output;
tCode_prev = tCode;
tType_prev = tType;
drop tCode_prev tType_prev;
run;
@ramchinna24 Many Thanks for the help and time! . It worked
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.