Hello Community,
I am working with a health record data set that consists of a patient identifier, diagnoses for each patient, diagnosis dates, and an index date (i.e., starting date) for a prescription of interest- please see example dataset 'Have' below. I would like to create a variable (e.g., variable= flag) indicating whether each patient has ANY diagnosis within a group of diagnoses (diagnosis= A, B, or C) that has a diagnosis date ON or within 6 months BEFORE the index date. I have provided an example of the desired output dataset below. Any help with how to program this would be much appreciated!!
Have:
Patient_ID | Diagnosis | Diagnosis_date | Index_date |
|
1 | A | 01JAN2016 | 01JAN2016 | *Flag=1 because ‘A’ diagnosis date on index date |
1 | D | 01JAN2016 | 01JAN2016 | |
1 | E | 01FEB2016 | 01JAN2016 | |
1 | A | 01FEB2016 | 01JAN2016 |
|
2 | B | 15MAY2018 | 15JUN2018 | *Flag=1 because ‘B’ diagnosis date within 6 months before index date |
2 | E | 15MAY2018 | 15JUN2018 | |
2 | F | 01JUN2018 | 15JUN2018 | |
3 | A | 05MAR2014 | 20FEB2015 | *Flag=1 because ‘C’ diagnosis date within 6 months before index date although ‘A’ diagnosis is not |
3 | G | 10OCT2014 | 20FEB2015 | |
3 | C | 01FEB2015 | 20FEB2015 | |
3 | H | 01FEB2015 | 20FEB2015 | |
4 | J | 02JUL2014 | 02JUL2014 | *Flag=0 because ‘B’ diagnosis date after index date |
4 | K | 01AUG2014 | 02JUL2014 | |
4 | B | 01AUG2014 | 02JUL2014 | |
5 | W | 01NOV2009 | 01JAN2010 | *Flag=0 because no diagnoses A, B, or C within 6 months before index date |
5 | Z | 01JAN2010 | 01JAN2010 | |
5 | H | 01FEB2010 | 01JAN2010 | |
6 | A | 10MAR2011 | 15JUN2011 | *Flag=1 because there is a diagnosis ‘A’ and ‘B’ within 6 months before index date |
6 | B | 12MAY2011 | 15JUN2011 | |
6 | X | 15JUL2011 | 15JUN2011 | |
6 | A | 15JUL2011 | 15JUN2011 |
Want:
Patient_ID | Flag |
1 | 1 |
2 | 1 |
3 | 1 |
4 | 0 |
5 | 0 |
6 | 1 |
data have;
input Patient_ID Diagnosis $ (Diagnosis_date Index_date)(:date9.);
format Diagnosis_date Index_date date9.;
datalines;
1 A 01JAN2016 01JAN2016
1 D 01JAN2016 01JAN2016
1 E 01FEB2016 01JAN2016
1 A 01FEB2016 01JAN2016
2 B 15MAY2018 15JUN2018
2 E 15MAY2018 15JUN2018
2 F 01JUN2018 15JUN2018
3 A 05MAR2014 20FEB2015
3 G 10OCT2014 20FEB2015
3 C 01FEB2015 20FEB2015
3 H 01FEB2015 20FEB2015
4 J 02JUL2014 02JUL2014
4 K 01AUG2014 02JUL2014
4 B 01AUG2014 02JUL2014
5 W 01NOV2009 01JAN2010
5 Z 01JAN2010 01JAN2010
5 H 01FEB2010 01JAN2010
6 A 10MAR2011 15JUN2011
6 B 12MAY2011 15JUN2011
6 X 15JUL2011 15JUN2011
6 A 15JUL2011 15JUN2011
;
Hi, hope the below helps:
%let diagnosis = 'A','B','C';
data have;
input Patient_ID Diagnosis $ (Diagnosis_date Index_date)(:date9.);
format Diagnosis_date Index_date date9. Flag_Variable $1.;
if diagnosis in (&diagnosis.) and Diagnosis_date <= index_date and Diagnosis_date >= intnx('month',index_date,-6,'s') then
Flag_Variable = "Y";
else Flag_Variable = " ";
datalines;
1 A 01JAN2016 01JAN2016
1 D 01JAN2016 01JAN2016
1 E 01FEB2016 01JAN2016
1 A 01FEB2016 01JAN2016
2 B 15MAY2018 15JUN2018
2 E 15MAY2018 15JUN2018
2 F 01JUN2018 15JUN2018
3 A 05MAR2014 20FEB2015
3 G 10OCT2014 20FEB2015
3 C 01FEB2015 20FEB2015
3 H 01FEB2015 20FEB2015
4 J 02JUL2014 02JUL2014
4 K 01AUG2014 02JUL2014
4 B 01AUG2014 02JUL2014
5 W 01NOV2009 01JAN2010
5 Z 01JAN2010 01JAN2010
5 H 01FEB2010 01JAN2010
6 A 10MAR2011 15JUN2011
6 B 12MAY2011 15JUN2011
6 X 15JUL2011 15JUN2011
6 A 15JUL2011 15JUN2011
;
run;
Hi, hope the below helps:
%let diagnosis = 'A','B','C';
data have;
input Patient_ID Diagnosis $ (Diagnosis_date Index_date)(:date9.);
format Diagnosis_date Index_date date9. Flag_Variable $1.;
if diagnosis in (&diagnosis.) and Diagnosis_date <= index_date and Diagnosis_date >= intnx('month',index_date,-6,'s') then
Flag_Variable = "Y";
else Flag_Variable = " ";
datalines;
1 A 01JAN2016 01JAN2016
1 D 01JAN2016 01JAN2016
1 E 01FEB2016 01JAN2016
1 A 01FEB2016 01JAN2016
2 B 15MAY2018 15JUN2018
2 E 15MAY2018 15JUN2018
2 F 01JUN2018 15JUN2018
3 A 05MAR2014 20FEB2015
3 G 10OCT2014 20FEB2015
3 C 01FEB2015 20FEB2015
3 H 01FEB2015 20FEB2015
4 J 02JUL2014 02JUL2014
4 K 01AUG2014 02JUL2014
4 B 01AUG2014 02JUL2014
5 W 01NOV2009 01JAN2010
5 Z 01JAN2010 01JAN2010
5 H 01FEB2010 01JAN2010
6 A 10MAR2011 15JUN2011
6 B 12MAY2011 15JUN2011
6 X 15JUL2011 15JUN2011
6 A 15JUL2011 15JUN2011
;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.