Hello SAS Community,
I need help creating a variable that flags if a person was tested during an encounter and if they were not tested during the encounter but tested during a previous encounter within 12 months it still flags the encounter as tested.
I have a dataset of multiple encounters by patient ID. It includes the admission date (adm_date) of the encounter and if the patient (patient_mrn) was tested (tested) for a disease during that encounter or not. It is possible a patient would not have been tested during a current encounter because they were seen and tested during a previous encounter within 12 months. If that is the case, then I want to flag the current encounter as tested because it meets the testing definition (new variable = tested_12). I have a variable if tested (Tested) during the encounter and another variable if the current visit is within 12 months of the previous visit (visit12mo). I need to create the tested_12 variable as described above.
Here is an example of the data structure and the variable I need to create (tested_12) and how it should look afterwards. Any help is appreciated!
PATIENT_MRN | ADM_DATE | Tested | Visit12mo | Tested_12 |
10521 | 7/21/2016 | 0 | 0 | 0 |
10764 | 7/5/2013 | 0 | 0 | 0 |
10764 | 5/1/2015 | 1 | 0 | 1 |
10764 | 1/21/2016 | 0 | 1 | 1 |
100002 | 12/27/2012 | 1 | 0 | 1 |
100088 | 1/2/2014 | 1 | 0 | 1 |
100293 | 10/16/2013 | 1 | 0 | 1 |
100293 | 5/15/2015 | 0 | 0 | 0 |
100293 | 5/18/2015 | 1 | 1 | 1 |
100293 | 7/31/2015 | 0 | 1 | 1 |
100369 | 3/25/2017 | 0 | 0 | 0 |
109825 | 8/28/2016 | 0 | 0 | 0 |
600563 | 8/10/2012 | 1 | 0 | 1 |
1000063 | 10/23/2018 | 1 | 0 | 1 |
1000083 | 5/20/2015 | 1 | 0 | 1 |
1000262 | 10/30/2017 | 0 | 0 | 0 |
1000270 | 1/5/2012 | 0 | 0 | 0 |
1000647 | 8/22/2018 | 0 | 0 | 0 |
1086737 | 3/28/2014 | 0 | 0 | 0 |
20000039 | 6/15/2015 | 1 | 0 | 1 |
100000486 | 3/20/2014 | 0 | 0 | 0 |
100000486 | 4/14/2014 | 0 | 1 | 0 |
167000635 | 8/2/2018 | 0 | 0 | 0 |
167000635 | 11/12/2018 | 0 | 1 | 0 |
190000370 | 2/25/2016 | 0 | 0 | 0 |
190000370 | 10/20/2016 | 1 | 1 | 1 |
190000370 | 12/26/2016 | 0 | 1 | 1 |
190000370 | 1/8/2017 | 0 | 1 | 1 |
190000370 | 4/8/2017 | 0 | 1 | 1 |
190000370 | 6/16/2017 | 0 | 1 | 1 |
190000370 | 9/11/2017 | 0 | 1 | 1 |
190000370 | 9/22/2017 | 0 | 1 | 1 |
190000370 | 2/14/2018 | 0 | 1 | 0 |
190000370 | 6/1/2018 | 0 | 1 | 0 |
210000242 | 8/24/2015 | 0 | 0 | 0 |
500000243 | 12/2/2012 | 1 | 0 | 1 |
500000506 | 10/6/2012 | 1 | 0 | 1 |
600000248 | 10/2/2012 | 1 | 0 | 1 |
600000248 | 6/5/2015 | 0 | 0 | 0 |
1670000502 | 3/14/2013 | 0 | 0 | 0 |
1670000502 | 11/15/2014 | 0 | 0 | 0 |
1670000502 | 11/15/2014 | 0 | 1 | 0 |
1670000502 | 12/1/2014 | 1 | 1 | 1 |
1670000502 | 12/8/2014 | 0 | 1 | 1 |
1670000502 | 6/25/2015 | 1 | 1 | 1 |
3100000238 | 12/30/2014 | 1 | 0 | 1 |
3100000238 | 7/30/2015 | 0 | 1 | 1 |
3100000238 | 1/27/2016 | 0 | 1 | 0 |
3100000238 | 7/7/2016 | 0 | 1 | 0 |
@novicesas74532 sorry I didn't catch all cases. please try this one.
patient_mrn | adm_date | tested | prev_date | prev_ID | retain | test_date | test_dateCF | visit12mo | tested_12 | |
---|---|---|---|---|---|---|---|---|---|---|
11 | 190000370 | 02/25/2016 | 0 | 07/31/2015 | 100293 | . | . | . | 0 | 0 |
12 | 190000370 | 10/20/2016 | 1 | 02/25/2016 | 190000370 | 10/20/2016 | 10/20/2016 | 10/20/2016 | 1 | 1 |
13 | 190000370 | 12/26/2016 | 0 | 10/20/2016 | 190000370 | 10/20/2016 | . | 10/20/2016 | 1 | 1 |
14 | 190000370 | 01/08/2017 | 0 | 12/26/2016 | 190000370 | 10/20/2016 | . | 10/20/2016 | 1 | 1 |
15 | 190000370 | 04/08/2017 | 0 | 01/08/2017 | 190000370 | 10/20/2016 | . | 10/20/2016 | 1 | 1 |
16 | 190000370 | 06/16/2017 | 0 | 04/08/2017 | 190000370 | 10/20/2016 | . | 10/20/2016 | 1 | 1 |
17 | 190000370 | 09/11/2017 | 0 | 06/16/2017 | 190000370 | 10/20/2016 | . | 10/20/2016 | 1 | 1 |
18 | 190000370 | 09/22/2017 | 0 | 09/11/2017 | 190000370 | 10/20/2016 | . | 10/20/2016 | 1 | 1 |
19 | 190000370 | 02/14/2018 | 0 | 09/22/2017 | 190000370 | 10/20/2016 | . | 10/20/2016 | 1 | 0 |
20 | 190000370 | 06/01/2018 | 0 | 02/14/2018 | 190000370 | 10/20/2016 | . | 10/20/2016 | 1 | 0 |
data have2;
set have;
by patient_mrn adm_date;
retain prev_date prev_ID prev_test retain;
if first.patient_mrn then do;
prev_date=.;
prev_ID=.;
retain=.;
end;
if tested=1 then test_date=adm_date;
if test_date ne . then retain=test_Date;
if test_date=. then do;
test_dateCF=retain;
end;
else test_dateCF=test_date;
prev_date=lag(adm_date);
prev_ID=lag(patient_mrn);
visit12mo=(patient_mrn=prev_ID and (adm_date <=prev_date+365));
if (tested=1) or (patient_mrn=prev_ID and adm_date <=test_dateCF+365) then tested_12=1;
else tested_12=0;
/* if (tested=1) or (patient_mrn=prev_ID and prev_test=1 and visit12mo=1) then tested_12=1; */
/* else tested_12=0; */
format prev_date test_date retain test_dateCF mmddyy10.;
run;
First thing: when discussing rules for coding refer to VARIABLES in your data set. "Encounter" is not a variable and you do not define what an encounter actually is in terms of the data shown.
Second: Show the data you have first, then show a desired result.
Third: Show the data you have in the form of data step code. This becomes somewhat critical when dealing with certain types of data, such as dates, times or datetime values. We see a "date" and will assume that it is a SAS date value when you my not actually have such. This is especially important if you provide example dates that look like 01/02/03. That value without a description could be 2Jan2003, 1Feb2003, 3Feb2001 (or possibly years 1903 and 1901).
For instance, ID variables such as your Patient_mrn quite often are not numeric but character. When sort order may be important, as in this problem, incorrect guess on our part my mean that our code doesn't work as expected.
I have built a partial data set and have some suggested code:
data have; input PATIENT_MRN ADM_DATE : mmddyy10. Tested ; format adm_date mmddyy10.; datalines; 10521 7/21/2016 0 0 0 10764 7/5/2013 0 0 0 10764 5/1/2015 1 0 1 10764 1/21/2016 0 1 1 100002 12/27/2012 1 0 1 100088 1/2/2014 1 0 1 100293 10/16/2013 1 0 1 100293 5/15/2015 0 0 0 100293 5/18/2015 1 1 1 100293 7/31/2015 0 1 1 100369 3/25/2017 0 0 0 109825 8/28/2016 0 0 0 600563 8/10/2012 1 0 1 1000063 10/23/2018 1 0 1 1000083 5/20/2015 1 0 1 1000262 10/30/2017 0 0 0 1000270 1/5/2012 0 0 0 ; /*may need to include this step with your data*/ /*proc sort data=have;*/ /* by PATIENT_MRN ADM_DATE;*/ /*run;*/ data want; set have; retain testdate; lastdate = lag(adm_date); by patient_mrn ; if first.patient_mrn then call missing(lastdate,testdate); if lastdate then visit12mo = (intck('month',lastdate,adm_date,'C') le 12); else visit12mo=0; if tested then testdate=adm_date; if testdate then tested_12= (intck('month',testdate,adm_date,'C') le 12); else tested_12=0; drop lastdate testdate; run;
The Sort order is important to use BY processing which is critical to doing things correctly for the first record of each patient.
By group processing creates automatic variables in a data step that indicate whether the current record is the first or last of the group that are 1/0 coded values which SAS will use a true/false in comparisons.
The Retain function keeps the value of a variable across data step boundaries. So we can keep the last date that a test was performed (assuming 1 in tested means that, you did not state such). The Lag function returns the value of a variable from the previous time the condition executes (use of Lag in conditional statements like IF gets tricky) as used here it will have the value of the previous record's Adm_date which is what I have to assume is an encounter.
The INTCK function is used to return the number of intervals between two date, time or datetime values and is the basic tool for testing duration. Your definition of "12 months" is a bit incomplete as you do not clearly state whether the interval from 31Jan2016 to 1Feb2016 would one month or not, i.e. counting month boundaries. I have guessed that you mean more of the duration and used the 'C' switch, for continuous, so that dates within the month are considered.
If you have a different interpretation of "month" you may want a different switch or option. Note that Intck will do 'Year' as well. In which case you might want to use that and compare to 1 instead of 12.
https://communities.sas.com/t5/SAS-Communities-Library/Working-with-Dates-and-Times-in-SAS-Tutorial/... has a PDF with much information about dates.
guessing this is hospital data so encounter is all the visits when they are admitted to the hospital.
Thank you @ballardw for this information. I will remember your tips for a better post next time. Sorry for a poor explanation. Each row in the dataset is an encounter and the adm_date = the test date. Yes 0=not tested or visit not within 12 months of the previous visit. 1=tested or visit within 12 months of the previous visit. And within 12 months should equal within the year. So for example, between 12/20/2014 and 12/19/2015 would be within 12 months.
Your response almost works perfectly! Except the 2nd to last row of the original data (modified below to include all of the encounters for the last patient) returns a 1 for tested when it should be a 0 because the last test date (adm_date) was 12/20/2014 and new adm date )1/27/2016) is more than 12 months from when the person was tested on 12/20/2014. I can't seem to figure out why because the code seems right. Am I missing something?
3100000238 12/20/2014 1 0 1
3100000238 7/30/2015 0 1 1
3100000238 1/27/2016 0 1 1 (should be 0)
3100000238 7/7/2016 0 1 0
please try my code. it gets your desired output...
Obs | patient_mrn | adm_date | tested | prev_date | prev_ID | prev_test | visit12mo | tested_12 |
---|---|---|---|---|---|---|---|---|
11 | 3100000238 | 12/20/2014 | 1 | 07/31/2015 | 100293 | 0 | 0 | 1 |
12 | 3100000238 | 07/30/2015 | 0 | 12/20/2014 | 3100000238 | 1 | 1 | 1 |
13 | 3100000238 | 01/27/2016 | 0 | 07/30/2015 | 3100000238 | 0 | 1 | 0 |
14 | 3100000238 | 07/07/2016 | 0 | 01/27/2016 | 3100000238 | 0 | 1 | 0 |
Hi @tarheel13 ! Thank you for your code! It works on the last patient scenario but it doesn't work for this patient scenario below. I think because of the prev_test variable. Is there a way to make the prev_Test variable retain for 12 months? That would solve it. I've made the column in blue what it should be. The tested_12_new is your results with code as-is. Because the patient was tested on 10/20/2016 the new tested variable should be tested=1 through any encounters by 10/19/2017.
PATIENT_MRN | ADM_DATE | Tested | Visit12mo | Tested_12 | prev_date | prev_ID | prev_test | tested_12_new |
190000370 | 25-Feb-16 | 0 | 0 | 0 | 11/12/2018 | 167000635 | 0 | 0 |
190000370 | 20-Oct-16 | 1 | 1 | 1 | 2/25/2016 | 190000370 | 0 | 1 |
190000370 | 26-Dec-16 | 0 | 1 | 1 | 10/20/2016 | 190000370 | 1 | 1 |
190000370 | 8-Jan-17 | 0 | 1 | 1 | 12/26/2016 | 190000370 | 0 | 0 |
190000370 | 8-Apr-17 | 0 | 1 | 1 | 1/8/2017 | 190000370 | 0 | 0 |
190000370 | 16-Jun-17 | 0 | 1 | 1 | 4/8/2017 | 190000370 | 0 | 0 |
190000370 | 11-Sep-17 | 0 | 1 | 1 | 6/16/2017 | 190000370 | 0 | 0 |
190000370 | 22-Sep-17 | 0 | 1 | 1 | 9/11/2017 | 190000370 | 0 | 0 |
190000370 | 14-Feb-18 | 0 | 1 | 0 | 9/22/2017 | 190000370 | 0 | 0 |
190000370 | 1-Jun-18 | 0 | 1 | 0 | 2/14/2018 | 190000370 | 0 | 0 |
@novicesas74532 sorry I didn't catch all cases. please try this one.
patient_mrn | adm_date | tested | prev_date | prev_ID | retain | test_date | test_dateCF | visit12mo | tested_12 | |
---|---|---|---|---|---|---|---|---|---|---|
11 | 190000370 | 02/25/2016 | 0 | 07/31/2015 | 100293 | . | . | . | 0 | 0 |
12 | 190000370 | 10/20/2016 | 1 | 02/25/2016 | 190000370 | 10/20/2016 | 10/20/2016 | 10/20/2016 | 1 | 1 |
13 | 190000370 | 12/26/2016 | 0 | 10/20/2016 | 190000370 | 10/20/2016 | . | 10/20/2016 | 1 | 1 |
14 | 190000370 | 01/08/2017 | 0 | 12/26/2016 | 190000370 | 10/20/2016 | . | 10/20/2016 | 1 | 1 |
15 | 190000370 | 04/08/2017 | 0 | 01/08/2017 | 190000370 | 10/20/2016 | . | 10/20/2016 | 1 | 1 |
16 | 190000370 | 06/16/2017 | 0 | 04/08/2017 | 190000370 | 10/20/2016 | . | 10/20/2016 | 1 | 1 |
17 | 190000370 | 09/11/2017 | 0 | 06/16/2017 | 190000370 | 10/20/2016 | . | 10/20/2016 | 1 | 1 |
18 | 190000370 | 09/22/2017 | 0 | 09/11/2017 | 190000370 | 10/20/2016 | . | 10/20/2016 | 1 | 1 |
19 | 190000370 | 02/14/2018 | 0 | 09/22/2017 | 190000370 | 10/20/2016 | . | 10/20/2016 | 1 | 0 |
20 | 190000370 | 06/01/2018 | 0 | 02/14/2018 | 190000370 | 10/20/2016 | . | 10/20/2016 | 1 | 0 |
data have2;
set have;
by patient_mrn adm_date;
retain prev_date prev_ID prev_test retain;
if first.patient_mrn then do;
prev_date=.;
prev_ID=.;
retain=.;
end;
if tested=1 then test_date=adm_date;
if test_date ne . then retain=test_Date;
if test_date=. then do;
test_dateCF=retain;
end;
else test_dateCF=test_date;
prev_date=lag(adm_date);
prev_ID=lag(patient_mrn);
visit12mo=(patient_mrn=prev_ID and (adm_date <=prev_date+365));
if (tested=1) or (patient_mrn=prev_ID and adm_date <=test_dateCF+365) then tested_12=1;
else tested_12=0;
/* if (tested=1) or (patient_mrn=prev_ID and prev_test=1 and visit12mo=1) then tested_12=1; */
/* else tested_12=0; */
format prev_date test_date retain test_dateCF mmddyy10.;
run;
Thank you @tarheel13! This worked! Many thanks!
I'm happy I could do it and thanks for pointing out the error in my previous code 😄
I only tested the first 10 rows. it seems to get your desired output. it helps us more if you post the data as datalines.
data have;
input patient_mrn adm_date :mmddyy10. tested;
format adm_date mmddyy10.;
datalines;
10521 7/21/2016 0
10764 7/5/2013 0
10764 5/1/2015 1
10764 1/21/2016 0
100002 12/27/2012 1
100088 1/2/2014 1
100293 10/16/2013 1
100293 5/15/2015 0
100293 5/18/2015 1
100293 7/31/2015 0
;
run;
proc sort data=have;
by patient_mrn adm_date;
run;
data have2;
set have;
by patient_mrn adm_date;
retain prev_date prev_ID prev_test;
if first.patient_mrn then do;
prev_date=.;
prev_ID=.;
prev_test=.;
end;
prev_date=lag(adm_date);
prev_ID=lag(patient_mrn);
prev_test=lag(tested);
visit12mo=(patient_mrn=prev_ID and (adm_date <=prev_date+365));
if (tested=1) or (patient_mrn=prev_ID and prev_test=1 and visit12mo=1) then tested_12=1;
else tested_12=0;
format prev_date mmddyy10.;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.