NewUsrStat
Pyrite | Level 9

Hi guys, 

Suppose I have the following dataset:

data DB;
  input ID :$20. Admission :date9. Discharge :date9. Diagnosis :$20.;
  format Admission date9. Discharge date9.;
cards;
0001 06DEC2014   14DEC2014  VIRUS_A
0001 08NOV2020   11NOV2020  FLU
0004 14MAY2014   02JUN2014   FLU
0004 30JUN2015   15AUG2015   FLU
0004 16FEB2019   18FEB2019   VIRUS_A
0005 10AUG2019  11SEP2019   FLU
....
;

I have to fit a time-series model to estimate the weekly number of hospitalizations for VIRUS_A. The dataset shown is just an example of the real dataset. I don't know how to derive the weeks from the admission dates I have (I also have discharge dates). The study starts in 2014, but patients are admitted at different times: some in various months of 2014, others after that. Moreover, does the week numbering start from 01 January? If so, what happens if 01 January falls in the middle of a week?

 

Apart from the practical SAS programming, I am also unclear about the theory behind mapping dates to weeks. This is the first time I have dealt with this kind of data and these questions.

 

Thank you in advance

ACCEPTED SOLUTION
Astounding
PROC Star

For a time series analysis, it is counterproductive to define an arbitrary "week" scale that runs 1 through N.  Instead, use the week beginning date to define your time periods.  With your data,  you could approach the problem in this way:

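data admits;
  set DB;   /* DB is the example data set from the question */
  where diagnosis = 'VIRUS_A';
  /* Align each admission date to the first day (Sunday) of its week */
  week_begin = intnx('week', admission, 0);
  format week_begin date9.;
run;

/* Count admissions per week; OUT= holds one row per week that has data */
proc freq data=admits;
  tables week_begin / out=admissions (keep=week_begin count rename=(count=n_admissions));
run;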

This gives you a variable named N_ADMISSIONS with the number of admissions for each week.  Steps to be taken at a later point:

  • running the same process for all your variables, to generate a time series data set
  • filling in weeks that don't appear in your data with a zero for your n_admissions variable

Also note that SAS weeks run from Sunday through Saturday by default.  You might want to change this if you decide that your study doesn't really begin on a Sunday.  In that case, a shift index on the interval name passed to the INTNX function (for example, 'week.2' for weeks that begin on Monday) determines which day of the week your week definitions begin on.  But all of this should be considered now, before the programming takes place.
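
As a rough sketch of the "filling in weeks" step mentioned above: one way is to build a calendar of every week in the study window and merge it with the counts. This assumes the ADMISSIONS data set (WEEK_BEGIN, N_ADMISSIONS) produced by the PROC FREQ step above. The data set names ALL_WEEKS and WEEKLY_SERIES and the window dates 01JAN2014 to 31DEC2019 are placeholders, not taken from the real study.

```sas
/* Hypothetical study window: replace the two dates with the real ones. */
data all_weeks;
  /* One row per week, keyed by the week's first day (Sunday by default). */
  do week_begin = intnx('week', '01JAN2014'd, 0)
               to intnx('week', '31DEC2019'd, 0) by 7;
    output;
  end;
  format week_begin date9.;
run;

/* Both inputs are in WEEK_BEGIN order, so a plain MERGE with BY works. */
data weekly_series;
  merge all_weeks admissions;
  by week_begin;
  /* Weeks with no admissions are absent from ADMISSIONS: set them to zero. */
  if missing(n_admissions) then n_admissions = 0;
run;
```

If the weeks should start on Monday instead, the same interval name (for example 'week.2') has to be used in every INTNX call, both when deriving WEEK_BEGIN and when building the calendar, otherwise the keys will not line up.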


6 REPLIES
Astounding
PROC Star

To get help,  you will need to understand the question being asked.  You might have access to the person asking the question, but we certainly don't.  These types of questions readily pop out:

  • If a hospitalization lasts for 3 weeks does that count as one hospitalization or as three hospitalizations?
  • If a hospitalization lasts for 10 days, does that count as one, two, or one-and-a-half hospitalizations?
  • Does any of the counting depend on the day of the week in which a hospitalization begins?

Basically, what is it you are trying to count?

The programming is relatively straightforward.  But the key to solving it correctly is understanding the question.

NewUsrStat
Pyrite | Level 9

Hi, thank you for your attention to my post. The duration of the hospital stay doesn't matter. I included the discharge date, but the admission date is sufficient for the count. So:

  • point 1: "If a hospitalization lasts for 3 weeks does that count as one hospitalization or as three hospitalizations?" It counts as 1, because my objective is the number of new admissions. During the second and third week the patient is already there;
  • point 2: "If a hospitalization lasts for 10 days, does that count as one, two, or one-and-a-half hospitalizations?" It counts as 1 because, as before, I need to count new admissions;
  • point 3: "Does any of the counting depend on the day of the week in which a hospitalization begins?" The counting is independent of the day of the week on which a hospitalization begins. My post is really "skinny" because I have no other details to add (based on the request).
NewUsrStat
Pyrite | Level 9

Edit: I need to count new admissions. The discharge date is there, but the admission dates are enough.

SASKiwi
PROC Star

It would be worth your while reviewing official definitions of week numbering since you are unsure about that. One such definition is contained in ISO standard 8601. You can read about it here: https://en.wikipedia.org/wiki/ISO_week_date

 

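SAS has implemented this definition in the WEEK function using the 'V' descriptor. Note there are also WEEK formats, so if you wanted to group your data using the ISO 8601 week standard, all you would need to do is apply the WEEKV SAS format to your admission date, with a width that stops at the week number (the default width of 11 also prints the day of the week).

 

Try this:

proc freq data = DB;
  table admission;
  /* WEEKV9. prints yyyy-Www, so PROC FREQ groups admissions by ISO week */
  format admission weekv9.;
run;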
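
If a numeric week number or a stored label is needed rather than a formatted date, a minimal sketch along the same lines could look like this. The data set name ISO_WEEKS and the variables ISO_WEEK and ISO_LABEL are made up for illustration; DB is the example data set from the question.

```sas
data iso_weeks;
  set DB;
  /* ISO 8601 week number: weeks start on Monday, week 1 contains 4 January. */
  iso_week = week(admission, 'v');
  /* Character label such as yyyy-Www. The year shown is the ISO week-year,   */
  /* which can differ from the calendar year for dates around 1 January.      */
  iso_label = put(admission, weekv9.);
run;
```

Because the week number restarts every year, the label (or a week-begin date) is the safer grouping key for a multi-year series.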
Tom
Super User

I have to fit a time-series model to estimate the weekly number of hospitalizations for VIRUS_A. The dataset shown is just an example of the real dataset. I don't know how to derive the weeks from the admission dates I have (I also have discharge dates). The study starts in 2014, but patients are admitted at different times: some in various months of 2014, others after that. Moreover, does the week numbering start from 01 January? If so, what happens if 01 January falls in the middle of a week?

You really need to answer those questions yourself, and probably many more, such as what other variables you want to include in your time-series analysis.

 

If you are just interested in calendar dates (rather than patient-specific dates relative to some type of initial diagnosis, vaccination, etc.), then you should be able to just use the date variable itself.  I doubt that it matters if you start counting weeks on Jan 1, for example by using the INTCK() function.

 

week=intck('week','01JAN2014'd,admission);

 

 

 

You could even just divide the number of days since your starting date by 7 if you want to break the admission dates into weeks.

 

week=(admission-'01JAN2014'd)/7;

Perhaps converting to an integer?

week=ceil((admission-'01JAN2014'd)/7);
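
To make the difference between the two approaches concrete, here is a small sketch; the three test dates and the variable names WK_INTCK and WK_FIXED are chosen purely for illustration. With its default counting method, INTCK with the 'week' interval counts the Sunday boundaries crossed between the two dates, whereas dividing the day difference by 7 counts complete 7-day blocks measured from the start date. 01JAN2014 was a Wednesday, so the two can disagree:

```sas
data _null_;
  start = '01JAN2014'd;   /* a Wednesday */
  do admission = '04JAN2014'd, '05JAN2014'd, '08JAN2014'd;
    /* Number of week boundaries (Sundays) crossed since START */
    wk_intck = intck('week', start, admission);
    /* Number of complete 7-day blocks elapsed since START */
    wk_fixed = floor((admission - start) / 7);
    put admission= date9. wk_intck= wk_fixed=;
  end;
run;
```

The two counts should agree for 04JAN2014 and 08JAN2014 but differ for 05JAN2014, a Sunday: INTCK has already moved on to the next calendar week, while the fixed 7-day blocks anchored on 01 January have not. Either convention can work for a weekly series, as long as it is applied consistently.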

 

 

 

