Hi, I want to know the number of days between the diagnosis date and the treatment date.
This is the code I used.
/*Calculating timeframe between treatment and diagnosis*/
data CDCSTVST.CDC_Site_Timeframe;
set CDCSTVST.CDC_Site_PositiveOnly;
days=(INTCK('day', colldate1, Treatment_Date1));
put days=;
run;
data CDCSTVST.CDC_Site_Timeframe1;
set CDCSTVST.CDC_Site_Timeframe;
where days NE .;
run;
proc freq data=CDCSTVST.CDC_Site_Timeframe1;
table incidentID days;
run;
data CDCSTVST.CDC_Site_Timeframe2;
set CDCSTVST.CDC_Site_Timeframe1;
if days=0 then timeframe="sameday";
if days < 7 and days >0 then timeframe="Week";
if days < 15 and days >7 then timeframe="2Weeks";
if days < 21 and days >15 then timeframe="3weeks";
if days < 28 and days >21 then timeframe="4weeks";
if days = >28 then timeframe=">4weeks";
run;
/*Checking to see if the code above worked*/
Proc freq data=CDCSTVST.CDC_Site_Timeframe2;
table days*timeframe/norow nocol nopercent;
run;
This is the output I got:
days timeframe
2Weeks 3weeks 4weeks >4weeks Week sameday Total
-31 0 0 0 0 0 0 0
-21 0 0 0 0 0 0 0
-13 0 0 0 0 0 0 0
-6 0 0 0 0 0 0 0
-2 0 0 0 0 0 0 0
-1 0 0 0 0 0 0 0
0 0 0 0 0 0 653 0
1 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0
4 0 0 0 0 0 0 0
5 0 0 0 0 0 0 0
6 0 0 0 0 8 0 0
12 0 0 0 0 13 0 0
14 0 0 0 0 2 0 0
Help me please.
Hi, when posting code, you ought to click on the running man icon and paste the code into the window that appears. This will preserve your formatting and make the code much more readable. Thanks!
INTCK returns negative values when the first date (time or datetime) parameter is later than the second.
Brief example:
data example;
   date1 = '01Jan2019'd;
   date2 = '01Feb2019'd;
   days1 = intck('day', date1, date2);
   days2 = intck('day', date2, date1);
   put days1= days2=;
run;
If you have a "treatment date" prior to your "diagnosis date" then you may have typos or transposed values in the data.
Real-world data is messy and seldom to be completely trusted. I routinely test whether dates of medical tests fall after the date of birth and before today, and whether the patient's sex, date of birth, race or ethnicity changes between tests.
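Checks of this kind could be sketched in a single data step. The data set name have and the variables birth_date and test_date are hypothetical, used only for illustration, and both variables are assumed to hold SAS date values.

```sas
/* Hypothetical names: have, birth_date, test_date (SAS date values). */
/* Keeps only the records that fail one of the sanity checks.          */
data date_problems;
   set have;
   length problem $ 40;
   if missing(test_date) then problem = 'Test date missing';
   else if not missing(birth_date) and test_date < birth_date
      then problem = 'Test before date of birth';
   else if test_date > today() then problem = 'Test date in the future';
   if not missing(problem) then output;
run;
```

Each record gets at most one reason, the first check it fails, which is usually enough to decide where to start looking in the source.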
I have some data collected almost 20 years ago that indicate medical tests will be performed sometime in the next 2 years...
Instead of a bunch of if/then/else statements you might consider using a format such as:
proc format library=work;
   value mydays
      0       = 'Same'
      1 - 7   = 'Week'
      8 - 14  = '2 Weeks'
      15 - 21 = '3 Weeks'
      22 - 28 = '4 Weeks'
      29-high = '>4 Weeks'
   ;
run;

proc freq data=have;
   tables days;
   format days mydays.;
run;
This shows negative values as they are, if any appear, and the formatted label otherwise.
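If a separate character variable is still wanted, as in the original if/then approach, one option is to apply the same format with the PUT function. This is only a sketch: it assumes the mydays format above has already been created, and the data set names have and want are placeholders.

```sas
/* Assumes the mydays format has been created and have contains days. */
data want;
   set have;
   length timeframe $ 8;
   /* Values outside every range (for example negatives) are not      */
   /* labelled; PUT returns the number itself as right-aligned text,  */
   /* so STRIP removes the leading blanks.                            */
   timeframe = strip(put(days, mydays.));
run;
```

Because the ranges in the format have no gaps between 0 and high, every non-negative whole number of days gets a label, which avoids the unclassified boundary values (7, 15, 21) left by strict less-than and greater-than comparisons.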
What do you want to get from this code?
Try this code after your data step to identify the records where your treatment is before the diagnosis.
data problems;
   set CDCSTVST.CDC_Site_Timeframe;
   where days < 0;
run;
Then go back to the data source, if practical, for corrections.
Or assume the treatment and diagnosis date values got transposed (very dangerous to assume such) and flip them.
Or perhaps the data system can hold more than one "diagnosis" or "treatment" date, and for a few records the values you used to calculate days are inappropriate.
Or perhaps treatment started with a presumption of the diagnosis before confirmation for some reason and the actual confirmed diagnosis came later. Though I wouldn't expect large differences in this case.
But there is a data problem at the heart of the negative values for days.
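To make the flagged records easier to check against the source, they could be listed with both dates shown in a readable form. This sketch reuses the problems data set and the variable names from the question (incidentID, colldate1, Treatment_Date1), and assumes both date variables hold SAS date values rather than datetimes.

```sas
/* Lists the records where treatment precedes diagnosis, most negative first. */
proc sort data=problems out=problems_sorted;
   by days;
run;

proc print data=problems_sorted noobs;
   var incidentID colldate1 Treatment_Date1 days;
   /* Assumption: both variables are SAS dates, so DATE9. is appropriate. */
   format colldate1 Treatment_Date1 date9.;
run;
```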