Hi everyone,
I have a dataset with a similar structure as the one shown below:
data have;
input PAT_ID RX_DT : ddmmyy10. RX_Name $;
format RX_DT ddmmyy10.;
datalines;;
201 18/07/2013 DrugA
201 27/12/2013 DrugA
201 13/06/2013 DrugC
201 19/08/2013 DrugC
201 01/05/2014 DrugC
201 14/10/2013 DrugC
201 20/02/2014 DrugC
201 27/03/2014 DrugA
201 16/09/2013 DrugB
201 18/12/2013 DrugC
202 16/04/2014 DrugC
202 16/01/2014 DrugC
202 02/10/2013 DrugC
202 15/07/2013 DrugC
202 13/05/2013 DrugC
202 11/03/2014 DrugB
202 11/11/2013 DrugB
202 23/09/2013 DrugA
202 05/08/2013 DrugA
202 24/06/2013 DrugA
;
I would like to count the number of switches per patient from DrugA -> DrugB or from DrugB -> DrugA.
For example, both patients have one swhitch.
Thank you very much for your help in advance!
I like the use of the where and first.id techniques. In the interest of compactness, I think it makes sense to reduce the amount of code to one data step:
data want (keep=pat_id nswitch);
set have;
where drug in ('A','B');
by pat_id drug notsorted;
if first.pat_id then nswitch=0;
else if first.drug then nswitch+1;
if last.pat_id;
run;
m
This seems to work for your example data:
data have; input PAT_ID RX_DT : ddmmyy10. RX_Name $; format RX_DT ddmmyy10.; ld = lag(rx_name); switch = (ld='DrugA' and Rx_name='DrugB') or (ld='DrugB' and Rx_name='DrugA') ; drop ld; datalines; 201 18/07/2013 DrugA 201 27/12/2013 DrugA 201 13/06/2013 DrugC 201 19/08/2013 DrugC 201 01/05/2014 DrugC 201 14/10/2013 DrugC 201 20/02/2014 DrugC 201 27/03/2014 DrugA 201 16/09/2013 DrugB 201 18/12/2013 DrugC 202 16/04/2014 DrugC 202 16/01/2014 DrugC 202 02/10/2013 DrugC 202 15/07/2013 DrugC 202 13/05/2013 DrugC 202 11/03/2014 DrugB 202 11/11/2013 DrugB 202 23/09/2013 DrugA 202 05/08/2013 DrugA 202 24/06/2013 DrugA ; run;
May get cumbersome to use this approach for more specific comparisons though.
And I hope your real data has very consistent spelling.
data have;
input PAT_ID RX_DT : ddmmyy10. RX_Name $;
format RX_DT ddmmyy10.;
datalines;;
201 18/07/2013 DrugA
201 27/12/2013 DrugA
201 13/06/2013 DrugC
201 19/08/2013 DrugC
201 01/05/2014 DrugC
201 14/10/2013 DrugC
201 20/02/2014 DrugC
201 27/03/2014 DrugA
201 16/09/2013 DrugB
201 18/12/2013 DrugC
202 16/04/2014 DrugC
202 16/01/2014 DrugC
202 02/10/2013 DrugC
202 15/07/2013 DrugC
202 13/05/2013 DrugC
202 11/03/2014 DrugB
202 11/11/2013 DrugB
202 23/09/2013 DrugA
202 05/08/2013 DrugA
202 24/06/2013 DrugA
;
run;
data temp;
set have(where=(RX_Name in ('DrugA' 'DrugB')));
by PAT_ID RX_Name notsorted;
if first.RX_Name;
run;
proc sql;
select PAT_ID,int(count(*)/2) as n_switch
from temp
group by PAT_ID ;
quit;
I like the use of the where and first.id techniques. In the interest of compactness, I think it makes sense to reduce the amount of code to one data step:
data want (keep=pat_id nswitch);
set have;
where drug in ('A','B');
by pat_id drug notsorted;
if first.pat_id then nswitch=0;
else if first.drug then nswitch+1;
if last.pat_id;
run;
m
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.