Weeks are always the same length and comparable week over week.
A month is a non-standard unit that should not be used for analysis, IMO. Months vary from 28 to 31 days, so use 30-day intervals if needed.
200k observations is not a lot at all, and I would strongly recommend against aggregating solely to have smaller data. If you do choose to analyze at a coarser level than weekly, I would recommend bi-weekly or 4-week intervals rather than months, since those are whole multiples of the weeks you already have.
Are you running into processing issues that make you think you need to reduce the size of your data set?
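If you do end up aggregating, note that the query in your post keeps every row because DATE is still in both the SELECT and the GROUP BY, so each weekly date remains its own group. PROC SQL also remerges the sum back onto the detail rows when you select columns that are not in the GROUP BY. Here is a minimal sketch of what I think you are after. It assumes the data set is named weekly with the variables you listed, and that habitat is constant within a municipality. The names aggregated, period_start and period_cases are ones I made up for illustration.

```sas
proc sql;
  create table aggregated as
  select country,
         municipality,
         habitat,
         /* first day of the two-week bin that contains each weekly date */
         intnx('week2', date, 0, 'b') as period_start format=mmddyy10.,
         /* total cases across all weekly rows that fall in that bin */
         sum(cases) as period_cases
  from weekly
  /* every non-summed column in the SELECT is also listed here */
  group by country, municipality, habitat, calculated period_start;
quit;
```

Swapping 'week2' for 'month' in the INTNX call should give calendar months instead, if you really need them. Either way, the raw DATE variable must not appear in the SELECT or the GROUP BY, otherwise you are back to one row per week.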
@natbee wrote:
I have a data set with weekly case counts of a disease by municipality by country. I need to aggregate it to monthly case counts to reduce the number of observations for future analysis. I've looked around and can't get the various tips to work for me. Any advice? I currently have 200,000 observations and I need to reduce them, thus the desire to have a data set by monthly counts.
Variables I have
country - character, 3 levels
municipality - character, ~6000 levels
date - numeric, MMDDYY10.
cases - numeric
habitat - character, 5 levels
PROC SQL; *taking weekly data and aggregating to monthly;
create table monthly as
select id,country,month,date,habitat,SUM(cases)
as Monthly_cases
FROM weekly *name of dataset;
GROUP BY ID,DATE;
QUIT;
Running this code still leaves me with the 200,000 observations...