Hi everyone,
I have a dataset with a similar structure as the one shown below:
data have;
input PAT_ID RX_DT : ddmmyy10. RX_Name $;
format RX_DT ddmmyy10.;
datalines;;
201 18/07/2013 DrugA
201 27/12/2013 DrugA
201 13/06/2013 DrugC
201 19/08/2013 DrugC
201 01/05/2014 DrugC
201 14/10/2013 DrugC
201 20/02/2014 DrugC
201 27/03/2014 DrugA
201 16/09/2013 DrugB
201 18/12/2013 DrugC
202 16/04/2014 DrugC
202 16/01/2014 DrugC
202 02/10/2013 DrugC
202 15/07/2013 DrugC
202 13/05/2013 DrugC
202 11/03/2014 DrugB
202 11/11/2013 DrugB
202 23/09/2013 DrugA
202 05/08/2013 DrugA
202 24/06/2013 DrugA
;
I would like to count the number of switches per patient from DrugA -> DrugB or from DrugB -> DrugA.
For example, both patients have one swhitch.
Thank you very much for your help in advance!
I like the use of the where and first.id techniques. In the interest of compactness, I think it makes sense to reduce the amount of code to one data step:
data want (keep=pat_id nswitch);
set have;
where drug in ('A','B');
by pat_id drug notsorted;
if first.pat_id then nswitch=0;
else if first.drug then nswitch+1;
if last.pat_id;
run;
m
This seems to work for your example data:
data have; input PAT_ID RX_DT : ddmmyy10. RX_Name $; format RX_DT ddmmyy10.; ld = lag(rx_name); switch = (ld='DrugA' and Rx_name='DrugB') or (ld='DrugB' and Rx_name='DrugA') ; drop ld; datalines; 201 18/07/2013 DrugA 201 27/12/2013 DrugA 201 13/06/2013 DrugC 201 19/08/2013 DrugC 201 01/05/2014 DrugC 201 14/10/2013 DrugC 201 20/02/2014 DrugC 201 27/03/2014 DrugA 201 16/09/2013 DrugB 201 18/12/2013 DrugC 202 16/04/2014 DrugC 202 16/01/2014 DrugC 202 02/10/2013 DrugC 202 15/07/2013 DrugC 202 13/05/2013 DrugC 202 11/03/2014 DrugB 202 11/11/2013 DrugB 202 23/09/2013 DrugA 202 05/08/2013 DrugA 202 24/06/2013 DrugA ; run;
May get cumbersome to use this approach for more specific comparisons though.
And I hope your real data has very consistent spelling.
data have;
input PAT_ID RX_DT : ddmmyy10. RX_Name $;
format RX_DT ddmmyy10.;
datalines;;
201 18/07/2013 DrugA
201 27/12/2013 DrugA
201 13/06/2013 DrugC
201 19/08/2013 DrugC
201 01/05/2014 DrugC
201 14/10/2013 DrugC
201 20/02/2014 DrugC
201 27/03/2014 DrugA
201 16/09/2013 DrugB
201 18/12/2013 DrugC
202 16/04/2014 DrugC
202 16/01/2014 DrugC
202 02/10/2013 DrugC
202 15/07/2013 DrugC
202 13/05/2013 DrugC
202 11/03/2014 DrugB
202 11/11/2013 DrugB
202 23/09/2013 DrugA
202 05/08/2013 DrugA
202 24/06/2013 DrugA
;
run;
data temp;
set have(where=(RX_Name in ('DrugA' 'DrugB')));
by PAT_ID RX_Name notsorted;
if first.RX_Name;
run;
proc sql;
select PAT_ID,int(count(*)/2) as n_switch
from temp
group by PAT_ID ;
quit;
I like the use of the where and first.id techniques. In the interest of compactness, I think it makes sense to reduce the amount of code to one data step:
data want (keep=pat_id nswitch);
set have;
where drug in ('A','B');
by pat_id drug notsorted;
if first.pat_id then nswitch=0;
else if first.drug then nswitch+1;
if last.pat_id;
run;
m
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.