BookmarkSubscribeRSS Feed
sjarvis847
Fluorite | Level 6

Hello all! 

 

I used the following code to create a database which had the number of patients enrolled per day, that way I could compare the average patients per day during each enrollment period. However it's not accounting for the days with 0 patients enrolled. Anyone have an idea on how to add rows for days with 0 patients?

 

proc sql;
create table strcovid.new as
select arrdate, count (mrn) as patients
from strcovid.cov
group by arrdate;

create table average as
select month(arrdate) as month, day(arrdate) as day, avg (patients)
from strcovid.new
group by month, day;
quit;

 

Thank you! 

Stephanie

 

12 REPLIES 12
Reeza
Super User
You need to ensure you have a record for every day or do a custom calculation for the average where you specify the denominator.
Is this over a fixed period each time?
sjarvis847
Fluorite | Level 6

Yes, we are comparing during COVID this year, to same period in 2019 and same period in 2018. There exists no rows for dates without a patient enrolled which is why those dates don't have a value....

Kurt_Bremser
Super User

Insert a data step with a look-ahead that adds the missing days:

proc sql;
create table strcovid.new as
select arrdate, count (mrn) as patients
from strcovid.cov
group by arrdate;
quit;

data strcovid.complete;
merge
  strcovid.new
  strcovid.new (firstobs=2 keep=arrdate rename=(arrdate=_arrdate))
;
output;
patients = 0;
do arrdate = arrdate + 1 to _arrdate - 1;
  output;
end;
drop _arrdate;
run;

proc sql;
create table average as
select month(arrdate) as month, day(arrdate) as day, avg (patients)
from strcovid.complete
group by month, day;
quit;
sjarvis847
Fluorite | Level 6

Thanks for your reply! 

 

I just tried this and it gave me an error that there is an invalid DO loop, either the initial or TO expression is missing or the by expression is missing, zero or invalid. 

Kurt_Bremser
Super User

@sjarvis847 wrote:

Thanks for your reply! 

 

I just tried this and it gave me an error that there is an invalid DO loop, either the initial or TO expression is missing or the by expression is missing, zero or invalid. 


Then you must have made a mistake copying the code. See this example with some made-up data:

data cov;
input arrdate:yymmdd10. mrn $;
format arrdate yymmdd10.;
datalines;
2020-08-01 A
2020-08-01 B
2020-08-01 C
2020-08-02 A
2020-08-02 B
2020-08-05 D
2020-08-05 E
2020-08-05 F
;

proc sql;
create table new as
select arrdate, count (mrn) as patients
from cov
group by arrdate;
quit;

data complete;
merge
  new
  new (firstobs=2 keep=arrdate rename=(arrdate=_arrdate))
;
output;
patients = 0;
do arrdate = arrdate + 1 to _arrdate - 1;
  output;
end;
drop _arrdate;
run;

proc sql;
create table average as
select month(arrdate) as month, day(arrdate) as day, avg (patients)
from complete
group by month, day;
quit;

proc print data=complete noobs;
run;

Result:

arrdate	patients
2020-08-01	3
2020-08-02	2
2020-08-03	0
2020-08-04	0
2020-08-05	3

Please post the log of your failed code; use this button to post the log:

Bildschirmfoto 2020-04-07 um 08.32.59.jpg

PaigeMiller
Diamond | Level 26

You could determine how many days in a month, this is a known fact. You compute the sum of patients enrolled in a month (this is the numerator of the mean enrollments per month) and then divide by the number of days in the month (the denominator) to give you the mean enrollments per day in that month.

--
Paige Miller
sjarvis847
Fluorite | Level 6

Hi Paige, 

 

Thanks for your reply. I am hoping to look at patients/ day not per month & eventually want to use statistical methods to compare the patients per day enrolled per group. I know how to calculate an average by hand. I need it in sas so I can determine the p-value when I compare. Thanks for the suggestion though! 

 

Steph

PaigeMiller
Diamond | Level 26

@sjarvis847 wrote:

Hi Paige, 

 

Thanks for your reply. I am hoping to look at patients/ day not per month & eventually want to use statistical methods to compare the patients per day enrolled per group. I know how to calculate an average by hand. I need it in sas so I can determine the p-value when I compare. Thanks for the suggestion though! 

 

Steph


I'm afraid I don't understand. Your code will compute average per day, which is the same as the number of patients in a day. The averaging does nothing.

--
Paige Miller
sjarvis847
Fluorite | Level 6

Hi Paige, 

 

The code creates the number of patients enrolled per day. Then I take the average and compare if the average number of patients per day changed during the enrollment periods. Currently there is a difference observed in the average number of patients enrolled per day (in 2019 = 4 patients came in per day, during COVID = 2 patients came in per day). However because the days with 0 patients per day don't exist, the difference could be even greater that what I previously wrote. 

 

As there are not statistical tools to compare raw counts (I have the total volume [#] of patients during each enrollment period), I am comparing the number of patients who came in by day. I am sorry to hear this doesn't make sense to you. 

 

Thanks anyway, 

Steph

 

Steph

PaigeMiller
Diamond | Level 26

The code creates the number of patients enrolled per day.

If there are 80 patients enrolled on a given day, it creates an average of 80. Why do this?

--
Paige Miller
Reeza
Super User

Your code is currently operating at a daily level so your average will always be the same as the daily value. What are you really trying to do here overall? Get the average # of days over the month into your data set?

 


@sjarvis847 wrote:

Hello all! 

 

I used the following code to create a database which had the number of patients enrolled per day, that way I could compare the average patients per day during each enrollment period. However it's not accounting for the days with 0 patients enrolled. Anyone have an idea on how to add rows for days with 0 patients?

 

proc sql;
create table strcovid.new as
select arrdate, count (mrn) as patients
from strcovid.cov
group by arrdate;

create table average as
select month(arrdate) as month, day(arrdate) as day, avg (patients)
from strcovid.new
group by month, day;
quit;

 

Thank you! 

Stephanie

 


 

sjarvis847
Fluorite | Level 6

Hi Reeza, 

 

Thanks for your reply! 

 

Yes it is operating at the daily level and is currently providing the accurate number of patients enrolled per day, except for days when no patients are enrolled. The values are not the same for each day. For example, 4/1/2018 4 patients were enrolled, 4/2/2018 1 patient was enrolled, 4/8/2018 no patients were enrolled but currently there exists no value for 4/8/2018 in the dataset for 4/82018 as it's only count days with patients enrolled. 

 

The goal is get a dataset with number of patients per day. 

Then assign the groups using enrollment date. 

Check normality of data, determine median or mean patients per day for each group. Use proper statistics to compare the mean or median patients per day in each group. 

Without the 0 values entered I won't have the true average number of patients per day, I will have the average number of patients enrolled per day for days which enrollment occured if that makes sense (which is not what I want). 

 

We are comparing admission of patients before and after COVID, assumption that COVID resulted in fewer admissions/

 

Thanks again for your help!!! 

 

Stephanie

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 2275 views
  • 1 like
  • 4 in conversation