DATA HAVE;
INPUT patientid 1-2 dx1 3-5 dx2 7-9 dx3 11-13 dx4 15-17 date mmddyy10.;
datalines;
1 250 223 224 444 5/5/2009
1 555 666 120 290 10/6/2009
2 120 666 120 290 1/2/2007
2 120 666 120 290 1/1/2007
3 250 120 290 120 2/2/2004
3 240 250 120 290 3/3/2004
3 250 120 290 120 1/1/2004
;
run;
I want to flag patients who meets the following conditions:
have two dx='666' occurring with one year
or those who have one dx='666' and one dx='250' within one year
The output for this example:
1 250 223 224 444 5/5/2009
1 555 666 120 290 10/6/2009
2 120 666 1/2/2007
2 120 666 1/1/2007
Well, this works, its a bit verbose:
data have; input patientid 1-2 dx1 3-5 dx2 7-9 dx3 11-13 dx4 15-17 date mmddyy10.; datalines; 1 250 223 224 444 5/5/2009 1 555 666 120 290 10/6/2009 2 120 666 120 290 1/2/2007 2 120 666 120 290 1/1/2007 3 250 120 290 120 2/2/2004 3 240 250 120 290 3/3/2004 3 250 120 290 120 1/1/20042 ; run; data inter; set have; by patientid; retain tmp1 tmp2; if first.patientid then do; tmp1=index(catx(",",of dx:),"666"); tmp2=index(catx(",",of dx:),"250"); end; if last.patientid and intck("year",lag(date),date) <= 1 then do; if tmp1 and index(catx(",",of dx:),"666") then output; if (tmp1 and index(catx(",",of dx:),"250")) or (tmp2 and index(catx(",",of dx:),"666")) then output; end; run; proc sql; create table want as select * from have where patientid in (select patientid from inter); quit;
Could probably drop it to one step with a do while loop.
Your test datastep does not work. Therefore, just theory:
Retain a variable across rows which is the sum condition, do this by patient id year period - another variable to create?
If any of array dx is 666 or 250, then add one to sum condition.
If last.patientid and sum>=2 then output.
Thank you for the response, I corrected the data. I can flag the variables but I am not sure I fully understood how to differentiate between the two conditions in the code.
If variables a would flag 666
and variable b flag 250
the condition is that if a=1 and a=1 within a year then 1
or if a=1 and b=1 withing a year then 1
However of b=1 and b=1 only (meaning no other a=1) within a year then 0
Well, this works, its a bit verbose:
data have; input patientid 1-2 dx1 3-5 dx2 7-9 dx3 11-13 dx4 15-17 date mmddyy10.; datalines; 1 250 223 224 444 5/5/2009 1 555 666 120 290 10/6/2009 2 120 666 120 290 1/2/2007 2 120 666 120 290 1/1/2007 3 250 120 290 120 2/2/2004 3 240 250 120 290 3/3/2004 3 250 120 290 120 1/1/20042 ; run; data inter; set have; by patientid; retain tmp1 tmp2; if first.patientid then do; tmp1=index(catx(",",of dx:),"666"); tmp2=index(catx(",",of dx:),"250"); end; if last.patientid and intck("year",lag(date),date) <= 1 then do; if tmp1 and index(catx(",",of dx:),"666") then output; if (tmp1 and index(catx(",",of dx:),"250")) or (tmp2 and index(catx(",",of dx:),"666")) then output; end; run; proc sql; create table want as select * from have where patientid in (select patientid from inter); quit;
Could probably drop it to one step with a do while loop.
data have;
input patientid 1-2 dx1 3-5 dx2 7-9 dx3 11-13 dx4 15-17 date mmddyy10.;
year=year(date);
format date mmddyy10.;
datalines;
1 250 223 224 444 5/5/2009
1 555 666 120 290 10/6/2009
2 120 666 120 290 1/2/2007
2 120 666 120 290 1/1/2007
3 250 120 290 120 2/2/2004
3 240 250 120 290 3/3/2004
3 250 120 290 120 1/1/20042
;
run;
proc sql;
create table temp as
select patientid,year,
sum(dx1=666)+sum(dx2=666)+sum(dx3=666) as sum_666,
sum(dx1=250)+sum(dx2=250)+sum(dx3=250) as sum_250
from have
group by patientid,year
having calculated sum_666=2 or (calculated sum_666=1 and calculated sum_250=1);
create table want as
select *
from have
where patientid in (select patientid from temp);
quit;
Thank you both!
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.