BlueNose
Quartz | Level 8

Dear all,

I have daily data on how many people entered a certain shopping center, and the weather on that day (temperature). I wish to find out if there is a relation between the weather and the number of people who entered the shopping center.

In addition, I have covariates such as the average income in that region.

 

  1. Number of people entering the shopping center (daily)
  2. Temperature (daily)
  3. Mean monthly income in the region (monthly)

 

The problem is, the covariates, such as mean income, are monthly, not daily. So for my main dependent and independent variables, I have a daily time series, while the covariates are monthly.

How should I handle this situation?

I thought of several options, not sure which is best:

1. Aggregate the daily variables to monthly values using means. This loses information.
2. Make the monthly data daily, i.e., give every day in a month the same income value. Wouldn't this lead to a model with a random effect?

How would you handle this problem, and which model would you use (regression, time series, mixed model)?

Thank you in advance!

 

(Using SAS 9.4)

1 ACCEPTED SOLUTION

Accepted Solutions
rselukar
SAS Employee

From your description, a time series model seems like a reasonable choice. Time series models can capture a time-varying level, day-of-the-week seasonality, and regression effects such as temperature and monthly income. Monthly income being constant across the days of a month is not particularly troublesome, as long as it is an informative predictor for the overall series. You could use procedures such as ARIMA, AUTOREG, or UCM in SAS/ETS for this kind of analysis. To get you started, here is a sample program for UCM. Assume that your daily data are stored in a data set "shopping" with the following columns: date, NPeople, temp, and income.

 

proc ucm data=shopping;
   id date interval=day;
   model NPeople = temp income;
   /* specifies a smooth trend component */
   level variance=0 noest plot=smooth;
   slope;
   /* specifies a day of the week component */
   season length=7 type=trig plot=smooth;
   /* noise component */
   irregular;
   /* residual diagnostics */
   estimate plot=panel;
   forecast plot=(forecasts decomp);
run;
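
If the daily counts and the monthly income sit in separate tables, the "shopping" data set assumed above could be built by repeating each month's income on every day of that month. The following is only a minimal sketch: the input tables daily (date, NPeople, temp) and monthly_income (month, income) are hypothetical names, and month is assumed to be a SAS date set to the first day of its month.

```sas
/* Hypothetical inputs (not from the original post):
     work.daily          : date, NPeople, temp   (one row per day)
     work.monthly_income : month, income         (one row per month,
                           month = first day of the month as a SAS date) */
proc sql;
   create table shopping as
   select d.date, d.NPeople, d.temp, m.income
   from daily as d
        left join monthly_income as m
        /* map each daily date to the first day of its month */
        on intnx('month', d.date, 0, 'b') = m.month
   order by d.date;
quit;
```

The left join keeps every daily row even when a month has no income value, so gaps in the monthly table show up as missing income rather than as dropped days.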

 

In your example, the effect of temp could be nonlinear. You could capture that with the SPLINEREG statement (see "Example 42.6 Using Splines to Incorporate Nonlinear Effects" in the UCM doc:

https://go.documentation.sas.com/?docsetId=etsug&docsetTarget=etsug_ucm_examples06.htm&docsetVersion... ).
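
As a rough illustration of that suggestion, the sample program could be adapted by moving temp out of the MODEL statement and into a SPLINEREG statement. This is a sketch under assumptions, not a tuned model: the knot locations below are placeholders that would need to be chosen from the actual temperature range, and the options should be checked against the SPLINEREG syntax for your SAS/ETS release.

```sas
proc ucm data=shopping;
   id date interval=day;
   /* income stays a linear regressor; temp moves to SPLINEREG */
   model NPeople = income;
   /* cubic spline in temp; the knots 5 15 25 are placeholder values,
      and variance=0 noest keeps the spline coefficients fixed over time */
   splinereg temp knots=5 15 25 degree=3 variance=0 noest;
   level variance=0 noest plot=smooth;
   slope;
   season length=7 type=trig plot=smooth;
   irregular;
   estimate plot=panel;
   forecast plot=(forecasts decomp);
run;
```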

 

Hope this helps.


4 REPLIES 4
KachiM
Rhodochrosite | Level 12

Do you see much monthly variation in the regional income? If not, it could very well be treated as a constant, and you could use temperature alone. If there is wide monthly variation, you could bin income into 4 or 5 groups, which would lend itself to an Analysis of Covariance.

 

Cheers,

DATASP

BlueNose
Quartz | Level 8

I see what you mean, but even if I group it, the same problem remains: for each day within a month, the income will be the same, a constant within the month. So temperature and number of people vary by day, while income varies by month.

 

 

KachiM
Rhodochrosite | Level 12

 

I was thinking of something like this:

Suppose you have made 3 income groups. Then you would fit a linear regression of the number of people on temperature within each group.

Compare the slopes and intercepts using Analysis of Covariance.
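
One way this comparison might look in SAS is sketched below. It assumes the combined daily data set "shopping" (date, NPeople, temp, income) described in the accepted solution; the output data set shopping_grp and the variable income_grp are hypothetical names introduced only for illustration.

```sas
/* Bin income into 3 roughly equal-sized groups (income_grp = 0, 1, 2) */
proc rank data=shopping groups=3 out=shopping_grp;
   var income;
   ranks income_grp;
run;

/* ANCOVA: the income_grp term compares intercepts across groups,
   and the temp*income_grp term tests whether the temp slopes differ */
proc glm data=shopping_grp;
   class income_grp;
   model NPeople = income_grp temp temp*income_grp / solution;
run;
quit;
```

Note that this treats the daily observations as independent, so it does not account for the serial correlation or day-of-week pattern that a time series model would handle.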

