Ishor
Calcite | Level 5

I would really appreciate it if someone could provide the syntax for fitting a quasi-Poisson regression with a spline. Sample data are below.

 

Data have;
input Year month $ count;
cards;
2017 Jan 0
2017 Feb 0
2017 Mar 2
2017 Apr 3
2017 May 1
2017 Jun 6
2017 Jul 12
2017 Aug 23
2017 Sept 25
2017 Oct 39
2017 Nov 40
2017 Dec 45
2018 Jan 44
2018 Feb 40
2018 Mar 38
2018 Apr 37
2018 May 35
2018 Jun 34
2018 Jul 24
2018 Aug 23
2018 Sept 25
2018 Oct 39
2018 Nov 40
2018 Dec 45
2019 Jan 0
2019 Feb 0
2019 Mar 2
2019 Apr 3
2019 May 1
2019 Jun 6
2019 Jul 12
2019 Aug 23
2019 Sept 25
2019 Oct 39
2019 Nov 40
2019 Dec 45
; run;

 

 

sbxkoenk
SAS Super FREQ

Hello,

 

I can't give any guarantees, but I think you can fit a quasi-Poisson regression with PROC GLIMMIX by directly specifying the functional relationship between the variance and the mean and making no distributional assumption in the MODEL statement, as demonstrated below. Note also the spline effect.

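proc glimmix data = work.have;
  /* X and Y are placeholders for your predictor and your count response */
  effect spl = spline(X);
  /* No DIST= option: only the link and the variance function are specified */
  model Y = spl / link = log solution;
  _variance_ = _mu_;    /* variance proportional to the mean */
  random _residual_;    /* adds the multiplicative scale (overdispersion) parameter */
run;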

 

The Poisson model has been criticized for its restrictive property that the conditional variance must equal the conditional mean. Real-life data are often characterized by overdispersion (that is, the variance exceeds the mean). Allowing for overdispersion can improve model predictions because the Poisson restriction of equal mean and variance results in the underprediction of zeros when overdispersion exists. The most commonly used model that accounts for overdispersion is the negative binomial model. Conway-Maxwell-Poisson regression enables you to model both overdispersion and underdispersion.
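
For reference, the variance assumptions contrasted in this paragraph can be written out side by side. This is only a sketch in standard GLM notation; the symbols $\mu_i$ (conditional mean), $\phi$ (scale or overdispersion parameter), and $k$ (negative binomial dispersion) are my own labels, not something taken from the code above.

```latex
\begin{align*}
\text{Poisson:} \quad & \operatorname{Var}(Y_i \mid x_i) = \mu_i \\
\text{Quasi-Poisson:} \quad & \operatorname{Var}(Y_i \mid x_i) = \phi\,\mu_i \\
\text{Negative binomial:} \quad & \operatorname{Var}(Y_i \mid x_i) = \mu_i + k\,\mu_i^{2}
\end{align*}
% with the log link in every case: \log \mu_i = x_i^{\top}\beta
```

In the quasi-Poisson case the regression coefficients are the same as in the ordinary Poisson fit, and the standard errors are inflated by $\sqrt{\phi}$. In the PROC GLIMMIX code above, `_variance_ = _mu_` supplies the $\mu_i$ part and `random _residual_` supplies $\phi$.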

 

Cheers,

Koen

PGStats
Opal | Level 21

Adapted from the SAS code provided with this publication, you could start with:

 

Data have;
input year $ month $ count;
monthDate = input(cats("01", substr(month, 1, 3), year), date9.);
drop year month;
format monthDate yymmdd10.;
cards;
2017 Jan 0
2017 Feb 0
2017 Mar 2
2017 Apr 3
2017 May 1
2017 Jun 6
2017 Jul 12
2017 Aug 23
2017 Sept 25
2017 Oct 39
2017 Nov 40
2017 Dec 45
2018 Jan 44
2018 Feb 40
2018 Mar 38
2018 Apr 37
2018 May 35
2018 Jun 34
2018 Jul 24
2018 Aug 23
2018 Sept 25
2018 Oct 39
2018 Nov 40
2018 Dec 45
2019 Jan 0
2019 Feb 0
2019 Mar 2
2019 Apr 3
2019 May 1
2019 Jun 6
2019 Jul 12
2019 Aug 23
2019 Sept 25
2019 Oct 39
2019 Nov 40
2019 Dec 45
; 
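proc glimmix data=have;
/* Spline basis for the date, using the default knot settings */
effect splDate = spline(monthDate);
/* dist=Poisson uses the log link by default */
model count = splDate / dist=Poisson covb ddfm = residual;
/* Adds a multiplicative scale parameter, which makes this a quasi-Poisson fit */
random _residual_;
/* ILINK returns predictions and confidence limits on the count scale */
output out=glmixout pred(noblup ilink) = p 
                        lcl(noblup ilink)= l 
                        ucl(noblup ilink)= u;
run;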

proc sgplot data=glmixout;
band x=monthDate lower=l upper=u;
scatter y=count x=monthDate;
series y=p x=monthDate;
xaxis type=time;
run;

[Image: SGPLOT output showing the observed counts, the fitted curve, and the confidence band]

(Not sure the confidence interval makes much sense at the beginning of the series... )

PG
Ishor
Calcite | Level 5
Thank you so much for the quick response.
I have a few more issues to follow up on.
a. I am looking for quasi-Poisson regression, but you are using plain Poisson regression. I could not find how to specify quasi-Poisson as the distribution in PROC GLIMMIX.

b. Can I forecast the counts for 2020 based on the model?

PGStats
Opal | Level 21

a) The quasi-Poisson regression is requested with the RANDOM _RESIDUAL_ statement, as explained here.

b) IMHO, extrapolation using splines is nearly impossible, since the spline is fitted segment by segment (between knots), i.e. there is not even a function defined beyond the last knot.

 

PG
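
If a forecast for 2020 is still wanted, one possible workaround is to swap the spline for a parametric form that is defined outside the observed range. The sketch below is not the spline model from this thread: it assumes a linear trend plus a single annual sine/cosine pair, and it assumes the `have` data set with `monthDate` and `count` built in the DATA step earlier in the thread. The names `future`, `both`, `fc`, `t`, `s1`, and `c1` are made up for illustration.

```sas
/* Hypothetical sketch: trend + annual harmonic in place of the spline */
data future;                          /* the 12 months of 2020, count left missing */
  do m = 0 to 11;
    monthDate = intnx('month', '01JAN2020'd, m);
    count = .;
    output;
  end;
  drop m;
  format monthDate yymmdd10.;
run;

data both;
  set have future;
  t  = intck('month', '01JAN2017'd, monthDate);       /* months since Jan 2017 */
  s1 = sin(2*constant('pi')*month(monthDate)/12);     /* annual cycle, sine part */
  c1 = cos(2*constant('pi')*month(monthDate)/12);     /* annual cycle, cosine part */
run;

proc glimmix data=both;
  model count = t s1 c1 / dist=poisson link=log solution ddfm=residual;
  random _residual_;                                  /* quasi-Poisson scale parameter */
  output out=fc pred(noblup ilink)=p
                lcl(noblup ilink)=l
                ucl(noblup ilink)=u;
run;

proc print data=fc;
  where monthDate >= '01JAN2020'd;                    /* forecast rows only */
  var monthDate p l u;
run;
```

Rows with a missing `count` do not contribute to the fit, but as far as I know the OUTPUT statement still computes predicted values for them as long as the predictors are nonmissing, which is what produces the 2020 rows. Whether a trend plus one harmonic is adequate is a separate question: 2018 looks quite different from 2017 and 2019 in the sample data, so treat the result as an illustration of the mechanics rather than a trustworthy forecast.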
