BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
novicesas74532
Fluorite | Level 6

Hello SAS Community,

I need help creating a variable that flags if a person was tested during an encounter and if they were not tested during the encounter but tested during a previous encounter within 12 months it still flags the encounter as tested. 

 

I have a dataset of multiple encounters by patient ID. It includes the admission date (adm_date) of the encounter and if the patient (patient_mrn) was tested (tested) for a disease during that encounter or not. It is possible a patient would not have been tested during a current encounter because they were seen and tested during a previous encounter within 12 months. If that is the case, then I want to flag the current encounter as tested because it meets the testing definition (new variable = tested_12).  I have a variable if tested (Tested) during the encounter and another variable if the current visit is within 12 months of the previous visit (visit12mo).  I need to create the tested_12 variable as described above. 

 

Here is an example of the data structure and the variable I need to create (tested_12) and how it should look afterwards. Any help is appreciated!

 

PATIENT_MRNADM_DATETestedVisit12moTested_12
105217/21/2016000
107647/5/2013000
107645/1/2015101
107641/21/2016011
10000212/27/2012101
1000881/2/2014101
10029310/16/2013101
1002935/15/2015000
1002935/18/2015111
1002937/31/2015011
1003693/25/2017000
1098258/28/2016000
6005638/10/2012101
100006310/23/2018101
10000835/20/2015101
100026210/30/2017000
10002701/5/2012000
10006478/22/2018000
10867373/28/2014000
200000396/15/2015101
1000004863/20/2014000
1000004864/14/2014010
1670006358/2/2018000
16700063511/12/2018010
1900003702/25/2016000
19000037010/20/2016111
19000037012/26/2016011
1900003701/8/2017011
1900003704/8/2017011
1900003706/16/2017011
1900003709/11/2017011
1900003709/22/2017011
1900003702/14/2018010
1900003706/1/2018010
2100002428/24/2015000
50000024312/2/2012101
50000050610/6/2012101
60000024810/2/2012101
6000002486/5/2015000
16700005023/14/2013000
167000050211/15/2014000
167000050211/15/2014010
167000050212/1/2014111
167000050212/8/2014011
16700005026/25/2015111
310000023812/30/2014101
31000002387/30/2015011
31000002381/27/2016010
31000002387/7/2016010
1 ACCEPTED SOLUTION

Accepted Solutions
tarheel13
Rhodochrosite | Level 12

@novicesas74532 sorry I didn't catch all cases. please try this one. 

patient_mrn adm_date tested prev_date prev_ID retain test_date test_dateCF visit12mo tested_12
11 190000370 02/25/2016 0 07/31/2015 100293 . . . 0 0
12 190000370 10/20/2016 1 02/25/2016 190000370 10/20/2016 10/20/2016 10/20/2016 1 1
13 190000370 12/26/2016 0 10/20/2016 190000370 10/20/2016 . 10/20/2016 1 1
14 190000370 01/08/2017 0 12/26/2016 190000370 10/20/2016 . 10/20/2016 1 1
15 190000370 04/08/2017 0 01/08/2017 190000370 10/20/2016 . 10/20/2016 1 1
16 190000370 06/16/2017 0 04/08/2017 190000370 10/20/2016 . 10/20/2016 1 1
17 190000370 09/11/2017 0 06/16/2017 190000370 10/20/2016 . 10/20/2016 1 1
18 190000370 09/22/2017 0 09/11/2017 190000370 10/20/2016 . 10/20/2016 1 1
19 190000370 02/14/2018 0 09/22/2017 190000370 10/20/2016 . 10/20/2016 1 0
20 190000370 06/01/2018 0 02/14/2018 190000370 10/20/2016 . 10/20/2016 1 0
data have2;
	set have;
	by patient_mrn adm_date;
	retain prev_date prev_ID prev_test retain;
	if first.patient_mrn then do;
		prev_date=.;
		prev_ID=.;
		retain=.;
	end;
	if tested=1 then test_date=adm_date;
	if test_date ne . then retain=test_Date;
		if test_date=. then do;
		test_dateCF=retain;
		end;
	else test_dateCF=test_date;

	prev_date=lag(adm_date);
	prev_ID=lag(patient_mrn);
	
	visit12mo=(patient_mrn=prev_ID and (adm_date <=prev_date+365));
	if (tested=1) or (patient_mrn=prev_ID and adm_date <=test_dateCF+365) then tested_12=1;
	else tested_12=0;
/* 	if (tested=1) or (patient_mrn=prev_ID and prev_test=1 and visit12mo=1) then tested_12=1; */
/* 	else tested_12=0; */
	format prev_date test_date retain test_dateCF mmddyy10.;
run;

View solution in original post

9 REPLIES 9
ballardw
Super User

First thing: when discussing rules for coding refer to VARIABLES in your data set. "Encounter" is not a variable and you do not define what an encounter actually is in terms of the data shown.

 

Second: Show the data you have first, then show a desired result.

 

Third: Show the data you have in the form of data step code. This becomes somewhat critical when dealing with certain types of data, such as dates, times or datetime values. We see a "date" and will assume that it is a SAS date value when you my not actually have such. This is especially important if you provide example dates that look like 01/02/03. That value without a description could be 2Jan2003, 1Feb2003, 3Feb2001 (or possibly years 1903 and 1901).

 

For instance, ID variables such as your Patient_mrn quite often are not numeric but character. When sort order may be important, as in this problem, incorrect guess on our part my mean that our code doesn't work as expected.

 

I have built a partial data set and have some suggested code:

data have;
   input PATIENT_MRN	 ADM_DATE : mmddyy10.	Tested ;
   format adm_date mmddyy10.;
datalines;
10521	7/21/2016	0	0	0
10764	7/5/2013	0	0	0
10764	5/1/2015	1	0	1
10764	1/21/2016	0	1	1
100002	12/27/2012	1	0	1
100088	1/2/2014	1	0	1
100293	10/16/2013	1	0	1
100293	5/15/2015	0	0	0
100293	5/18/2015	1	1	1
100293	7/31/2015	0	1	1
100369	3/25/2017	0	0	0
109825	8/28/2016	0	0	0
600563	8/10/2012	1	0	1
1000063	10/23/2018	1	0	1
1000083	5/20/2015	1	0	1
1000262	10/30/2017	0	0	0
1000270	1/5/2012	0	0	0
;
/*may need to include this step with your data*/
/*proc sort data=have;*/
/*   by PATIENT_MRN	 ADM_DATE;*/
/*run;*/

data want;
   set have;
   retain testdate;
   lastdate = lag(adm_date);
   by patient_mrn ;
   if first.patient_mrn then call missing(lastdate,testdate);
   if lastdate then visit12mo = (intck('month',lastdate,adm_date,'C') le 12);
   else visit12mo=0;
   if tested then testdate=adm_date;
   if testdate then tested_12= (intck('month',testdate,adm_date,'C') le 12);
   else tested_12=0;
   drop lastdate testdate;
run;

The Sort order is important to use BY processing which is critical to doing things correctly for the first record of each patient.

By group processing creates automatic variables in a data step that indicate whether the current record is the first or last of the group that are 1/0 coded values which SAS will use a true/false in comparisons.

The Retain function keeps the value of a variable across data step boundaries. So we can keep the last date that a test was performed (assuming 1 in tested means that, you did not state such). The Lag function returns the value of a variable from the previous time the condition executes (use of Lag in conditional statements like IF gets tricky) as used here it will have the value of the previous record's Adm_date which is what I have to assume is an encounter.

 

The INTCK function is used to return the number of intervals between two date, time or datetime values and is the basic tool for testing duration. Your definition of "12 months" is a bit incomplete as you do not clearly state whether the interval from 31Jan2016 to 1Feb2016 would one month or not, i.e. counting month boundaries. I have guessed that you mean more of the duration and used the 'C' switch, for continuous, so that dates within the month are considered.

If you have a different interpretation of "month" you may want a different switch or option. Note that Intck will do 'Year' as well. In which case you might want to use that and compare to 1 instead of 12.

 

https://communities.sas.com/t5/SAS-Communities-Library/Working-with-Dates-and-Times-in-SAS-Tutorial/... has a PDF with much information about dates.

 

tarheel13
Rhodochrosite | Level 12

guessing this is hospital data so encounter is all the visits when they are admitted to the hospital. 

novicesas74532
Fluorite | Level 6

Thank you @ballardw for this information. I will remember your tips for a better post next time. Sorry for a poor explanation. Each row in the dataset is an encounter and the adm_date = the test date. Yes 0=not tested or visit not within 12 months of the previous visit. 1=tested or visit within 12 months of the previous visit. And within 12 months should equal within the year. So for example, between 12/20/2014 and 12/19/2015 would be within 12 months. 

 

Your response almost works perfectly! Except the 2nd to last row of the original data (modified below to include all of the encounters for the last patient) returns a 1 for tested when it should be a 0 because the last test date (adm_date) was 12/20/2014 and new adm date )1/27/2016) is more than 12 months from when the person was tested on 12/20/2014. I can't seem to figure out why because the code seems right. Am I missing something?

 

3100000238	12/20/2014	1	0	1
3100000238 7/30/2015 0 1 1
3100000238 1/27/2016 0 1 1 (should be 0)
3100000238 7/7/2016 0 1 0

 

tarheel13
Rhodochrosite | Level 12

please try my code. it gets your desired output...

 
Obs patient_mrn adm_date tested prev_date prev_ID prev_test visit12mo tested_12
11 3100000238 12/20/2014 1 07/31/2015 100293 0 0 1
12 3100000238 07/30/2015 0 12/20/2014 3100000238 1 1 1
13 3100000238 01/27/2016 0 07/30/2015 3100000238 0 1 0
14 3100000238 07/07/2016 0 01/27/2016 3100000238 0 1 0
novicesas74532
Fluorite | Level 6

Hi @tarheel13 ! Thank you for your code! It works on the last patient scenario but it doesn't work for this patient scenario below. I think because of the prev_test variable. Is there a way to make the prev_Test variable retain for 12 months? That would solve it. I've made the column in blue what it should be. The tested_12_new is your results with code as-is. Because the patient was tested on 10/20/2016 the new tested variable should be tested=1 through any encounters by 10/19/2017.

PATIENT_MRNADM_DATETestedVisit12moTested_12prev_dateprev_IDprev_testtested_12_new
19000037025-Feb-1600011/12/201816700063500
19000037020-Oct-161112/25/201619000037001
19000037026-Dec-1601110/20/201619000037011
1900003708-Jan-1701112/26/201619000037000
1900003708-Apr-170111/8/201719000037000
19000037016-Jun-170114/8/201719000037000
19000037011-Sep-170116/16/201719000037000
19000037022-Sep-170119/11/201719000037000
19000037014-Feb-180109/22/201719000037000
1900003701-Jun-180102/14/201819000037000
tarheel13
Rhodochrosite | Level 12

@novicesas74532 sorry I didn't catch all cases. please try this one. 

patient_mrn adm_date tested prev_date prev_ID retain test_date test_dateCF visit12mo tested_12
11 190000370 02/25/2016 0 07/31/2015 100293 . . . 0 0
12 190000370 10/20/2016 1 02/25/2016 190000370 10/20/2016 10/20/2016 10/20/2016 1 1
13 190000370 12/26/2016 0 10/20/2016 190000370 10/20/2016 . 10/20/2016 1 1
14 190000370 01/08/2017 0 12/26/2016 190000370 10/20/2016 . 10/20/2016 1 1
15 190000370 04/08/2017 0 01/08/2017 190000370 10/20/2016 . 10/20/2016 1 1
16 190000370 06/16/2017 0 04/08/2017 190000370 10/20/2016 . 10/20/2016 1 1
17 190000370 09/11/2017 0 06/16/2017 190000370 10/20/2016 . 10/20/2016 1 1
18 190000370 09/22/2017 0 09/11/2017 190000370 10/20/2016 . 10/20/2016 1 1
19 190000370 02/14/2018 0 09/22/2017 190000370 10/20/2016 . 10/20/2016 1 0
20 190000370 06/01/2018 0 02/14/2018 190000370 10/20/2016 . 10/20/2016 1 0
data have2;
	set have;
	by patient_mrn adm_date;
	retain prev_date prev_ID prev_test retain;
	if first.patient_mrn then do;
		prev_date=.;
		prev_ID=.;
		retain=.;
	end;
	if tested=1 then test_date=adm_date;
	if test_date ne . then retain=test_Date;
		if test_date=. then do;
		test_dateCF=retain;
		end;
	else test_dateCF=test_date;

	prev_date=lag(adm_date);
	prev_ID=lag(patient_mrn);
	
	visit12mo=(patient_mrn=prev_ID and (adm_date <=prev_date+365));
	if (tested=1) or (patient_mrn=prev_ID and adm_date <=test_dateCF+365) then tested_12=1;
	else tested_12=0;
/* 	if (tested=1) or (patient_mrn=prev_ID and prev_test=1 and visit12mo=1) then tested_12=1; */
/* 	else tested_12=0; */
	format prev_date test_date retain test_dateCF mmddyy10.;
run;
tarheel13
Rhodochrosite | Level 12

I'm happy I could do it and thanks for pointing out the error in my previous code 😄

tarheel13
Rhodochrosite | Level 12

I only tested the first 10 rows. it seems to get your desired output. it helps us more if you post the data as datalines. 

data have;
input patient_mrn adm_date :mmddyy10. tested;
format adm_date mmddyy10.;
datalines;
10521 7/21/2016 0
10764 7/5/2013 0
10764 5/1/2015 1	
10764 1/21/2016 0	
100002 12/27/2012 1	
100088 1/2/2014 1	
100293 10/16/2013 1	
100293 5/15/2015 0	
100293 5/18/2015 1	
100293 7/31/2015 0	
;
run;

proc sort data=have;
	by patient_mrn adm_date;
run;

data have2;
	set have;
	by patient_mrn adm_date;
	retain prev_date prev_ID prev_test;
	if first.patient_mrn then do;
		prev_date=.;
		prev_ID=.;
		prev_test=.;
	end;
	prev_date=lag(adm_date);
	prev_ID=lag(patient_mrn);
	prev_test=lag(tested);
	visit12mo=(patient_mrn=prev_ID and (adm_date <=prev_date+365));
	if (tested=1) or (patient_mrn=prev_ID and prev_test=1 and visit12mo=1) then tested_12=1;
	else tested_12=0;
	format prev_date mmddyy10.;
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 800 views
  • 1 like
  • 3 in conversation