Hi guys,
Suppose I have the following dataset:
data DB;
input ID :$20. Admission :date9. Discharge :date9. Diagnosis :$20.;
format Admission date9. Discharge date9.;
cards;
0001 06DEC2014 14DEC2014 VIRUS_A
0001 08NOV2020 11NOV2020 FLU
0004 14MAY2014 02JUN2014 FLU
0004 30JUN2015 15AUG2015 FLU
0004 16FEB2019 18FEB2019 VIRUS_A
0005 10AUG2019 11SEP2019 FLU
....
;
I have to fit a time-series model to estimate the weekly number of hospitalizations for VIRUS_A. The dataset shown is just an example of the real dataset. I don't know how to derive the "weekly" periods from the admission dates I have (I also have discharge dates). The study starts in 2014, but patients are hospitalized at different times: some at the start, others in different months of 2014. Moreover, does the week number start from 01 January? If yes, what happens if 01 Jan falls in the middle of the week?
Apart from the practical SAS programming, I am also unclear on the theory behind mapping dates to weeks. This is the first time I have dealt with this kind of data and these questions.
Thank you in advance
For a time series analysis, it is counterproductive to define an arbitrary "week" scale that runs 1 through N. Instead, use the week beginning date to define your time periods. With your data, you could approach the problem in this way:
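data admits;
set have; /* HAVE is a placeholder: use your own dataset (DB in the question) */
where diagnosis = 'VIRUS_A';
/* first day (Sunday by default) of the week containing the admission date */
week_begin = intnx('week', admission, 0);
format week_begin date9.;
run;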
proc freq data=admits;
tables week_begin / out=admissions (keep=week_begin count rename=(count=n_admissions));
run;
This gives you a variable named N_ADMISSIONS with the number of admissions for the week. Steps to be taken at a later point:
Also note that SAS weeks run from Sunday through Saturday. You might want to change this if you decide that your study doesn't really begin on a Sunday. In that case, the interval argument of the INTNX function accepts a shift index that sets the day on which your weeks begin (for example, 'week.2' for weeks beginning on Monday). But all of this should be considered now, before the programming takes place.
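As a rough sketch of the shifted-week idea: the interval name can carry a shift index, so 'week.2' asks for weeks that begin on Monday rather than Sunday. The dataset name HAVE follows the reply above, and the names ADMITS_MON and WEEK_BEGIN_MON are made up for illustration.

```sas
data admits_mon;
    set have;                        /* placeholder dataset name, as above */
    where diagnosis = 'VIRUS_A';
    /* 'week.2' shifts the week start from Sunday (the default) to Monday */
    week_begin_mon = intnx('week.2', admission, 0);
    format week_begin_mon date9.;
run;
```

Whichever start day is chosen, it should be applied consistently to every step that derives or groups by week.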
To get help, you will need to understand the question being asked. You might have access to the person asking the question, but we certainly don't. These types of questions readily pop out:
Basically, what is it you are trying to count?
The programming is relatively straightforward. But the key to solving it correctly is understanding the question.
Hi, thank you for your attention to my post. The duration of the hospital stay doesn't matter. I included the discharge date, but the admission date is sufficient for the count. So:
Edit: I need to count new admissions. The discharge date is there, but admission dates are enough.
It would be worth your while reviewing official definitions of week numbering since you are unsure about that. One such definition is contained in ISO standard 8601. You can read about it here: https://en.wikipedia.org/wiki/ISO_week_date
SAS has implemented this definition in the WEEK function using the 'V' option. Note there are also WEEK formats, so if you wanted to group your data using the ISO 8601 week standard all you would need to do is apply the WEEKV SAS format to your admission date.
Try this:
proc freq data = DB;
table admission;
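format admission weekv9.; /* width 9 shows year and week only; the default width of 11 also shows the day, which would split each week by day */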
run;
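A small sketch of the two ISO 8601 tools mentioned above, assuming the DB dataset from the question. The names DB_WEEKS, ISO_WEEK_NO, ISO_WEEK and WEEKLY_ISO are illustrative. Keep in mind that an ISO week belongs to an ISO year, which can differ from the calendar year in the days around 1 January, so a label that carries both year and week is probably a safer grouping key than the week number alone.

```sas
data db_weeks;
    set DB;
    where diagnosis = 'VIRUS_A';
    iso_week_no = week(admission, 'v');      /* ISO 8601 week number only */
    iso_week    = put(admission, weekv9.);   /* ISO year and week as text, without the day */
run;

proc freq data=db_weeks;
    tables iso_week / out=weekly_iso (keep=iso_week count);
run;
```

Because ISO_WEEK is a character label rather than a SAS date, a week-beginning date may still be more convenient as the time axis for the model itself.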
You really need to answer those questions yourself. And probably many more, such as which other variables you want to include in your time-series analysis.
If you are just interested in calendar dates (rather than patient-specific dates relative to some initial event such as a diagnosis or vaccination), then you should be able to use the date variable itself. I doubt that it matters if you start counting weeks from Jan 1, for example by using the INTCK() function.
week=intck('week','01JAN2014'd,admission);
You could even just divide the number of days since your starting date by 7 if you want to break the admission dates into weeks.
week=(admission-'01JAN2014'd)/7;
Perhaps converting to an integer?
week=ceil((admission-'01JAN2014'd)/7);
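To see how these definitions differ, here is a small sketch that computes them side by side for a few dates. The dataset and variable names are illustrative, and WEEK_FLOOR is an extra variant that is not part of the reply above. Two things are worth checking in the printed output. First, INTCK with 'week' counts the Sunday boundaries crossed, so because 1 January 2014 was a Wednesday, its week 0 covers only 1 to 4 January. Second, the CEIL version puts 1 January itself in week 0 and the following seven days in week 1, whereas FLOOR gives uniform seven-day blocks numbered from 0.

```sas
data week_compare;
    input admission :date9.;
    start = '01JAN2014'd;
    week_intck = intck('week', start, admission);  /* Sunday boundaries crossed since start */
    week_ceil  = ceil((admission - start) / 7);     /* as in the reply above */
    week_floor = floor((admission - start) / 7);    /* 7-day blocks counted from the start date */
    format admission start date9.;
    datalines;
01JAN2014
05JAN2014
08JAN2014
06DEC2014
;

proc print data=week_compare;
run;
```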