Ameurgen
Obsidian | Level 7

Hi everyone,

I am working on a linear mixed model using PROC HPMIXED, but I have encountered an issue with my dependent variable, which is not normally distributed as it is a count trait. Since I am dealing with count data, I considered using a zero-inflated model with PROC NLMIXED. However, this is not feasible due to the large size of my dataset (over 100k observations).

I am exploring other approaches, such as Gibbs sampling or Bayesian methods, for resampling. My question is: Does SAS have a procedure that can handle this approach, as I do not have time to explore R or other programs?

Thank you for your suggestions.

Best regards,

13 REPLIES
StatDave
SAS Super FREQ

For a count response, you can fit appropriate models (Poisson, negative binomial, or zero-inflated versions of either) in PROC GENMOD. They can also be fit in PROC HPGENSELECT, as well as in PROC COUNTREG and PROC HPCOUNTREG in SAS/ETS. Zero-inflated models can also be fit using PROC FMM.
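
As a rough sketch of the GENMOD route, a zero-inflated Poisson model could look like the following. The data set name `mydata` and the variables `count`, `x1`, and `x2` are placeholders, not names from this thread. Note that this model has fixed effects only; there is no RANDOM statement in it.

```sas
/* Hypothetical names: mydata, count, x1, x2 */
proc genmod data=mydata;
   /* Count part: zero-inflated Poisson (DIST=ZINB would request
      the zero-inflated negative binomial instead) */
   model count = x1 x2 / dist=zip;
   /* Zero-inflation part: logit model for the probability of an excess zero */
   zeromodel x1 / link=logit;
run;
```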

Ksharp
Super User
@StatDave is right.
A GEE model (PROC GEE or PROC GENMOD) is a good, highly efficient choice for your large table.
Note that a GEE model is somewhat different from a mixed model, though. You can use both of them.
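
A minimal sketch of the GEE approach, assuming a hypothetical data set `mydata` with a count response `count`, a covariate `x1`, and a clustering variable `herd` (none of these names come from the thread). The REPEATED statement replaces the random effect with a working correlation structure, so the estimates are population-averaged rather than subject-specific.

```sas
/* Hypothetical names: mydata, count, x1, herd */
proc genmod data=mydata;
   class herd;
   /* Poisson marginal model with a log link */
   model count = x1 / dist=poisson link=log;
   /* GEE with an exchangeable working correlation within each herd */
   repeated subject=herd / type=exch;
run;
```

PROC GEE accepts the same MODEL and REPEATED statements, so the sketch should carry over by changing the procedure name.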
ballardw
Super User

Did you actually run the zero-inflated model and encounter problems? If so, what problems?

Ameurgen
Obsidian | Level 7

Yes, I used a zero-inflated model, but as I mentioned, I am using quite a large dataset, and PROC NLMIXED cannot handle the large dimension of the X matrix. So now I am looking for another method, and I thought a Bayesian method might do the job.

My data is around 200,000 records.

Thank you for your response.

Regards

Ksharp
Super User
1) You can use PROC GLIMMIX to fit a mixed model with a Poisson distribution, but you have a very large table.
2) If you want to use a Bayesian method for the mixed model, check PROC MCMC.

It would also be better to post your statistics question in the Statistical Procedures forum:
https://communities.sas.com/t5/Statistical-Procedures/bd-p/statistical_procedures
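
For reference, the GLIMMIX option in point 1 might be sketched as below. The names `mydata`, `count`, `x1`, and `herd` are placeholders, and a single random intercept is only an assumption about the model structure.

```sas
/* Hypothetical names: mydata, count, x1, herd */
proc glimmix data=mydata;
   class herd;
   /* Poisson response with a log link; SOLUTION prints the fixed-effect estimates */
   model count = x1 / dist=poisson link=log solution;
   /* Random intercept for each herd */
   random intercept / subject=herd;
run;
```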
Ameurgen
Obsidian | Level 7

Thank you for your response, Ksharp. I'll check the MCMC procedure, but it is quite complex.

What do you think about PROC BGLIMM (Bayesian Generalized Linear Mixed Models)? Can it do the job? What are the differences?

Regards

Ksharp
Super User
Yes. You can fit a Bayesian mixed model with PROC BGLIMM.
I had almost forgotten this procedure, which @SteveDenham mentioned before.
Ksharp
Super User
"What are the differences?"
No real difference for this purpose.
PROC BGLIMM is for newcomers to Bayesian methods.
PROC MCMC is for Bayesian experts who want to customize the method.
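
To make the comparison concrete, here is a sketch of the same random-intercept Poisson model in both procedures. Everything specific is an assumption for illustration: the names `mydata`, `count`, `x1`, and `herd`, the chain length, the seed, and the priors in the PROC MCMC version. PROC BGLIMM chooses default priors itself, while PROC MCMC requires you to spell out the parameters, priors, and likelihood.

```sas
/* Hypothetical names: mydata, count, x1, herd */

/* PROC BGLIMM: GLIMMIX-like syntax, default priors */
proc bglimm data=mydata nmc=10000 seed=12345;
   class herd;
   model count = x1 / dist=poisson;
   random intercept / subject=herd;
run;

/* PROC MCMC: the same model structure written out by hand */
proc mcmc data=mydata nmc=10000 seed=12345 outpost=post;
   parms b0 0 b1 0;                              /* fixed effects           */
   parms s2u 1;                                  /* random-effect variance  */
   prior b0 b1 ~ normal(0, var=1e4);             /* assumed diffuse priors  */
   prior s2u ~ igamma(shape=0.01, scale=0.01);   /* assumed variance prior  */
   random u ~ normal(0, var=s2u) subject=herd;   /* herd-level intercept    */
   mu = exp(b0 + b1*x1 + u);                     /* log link                */
   model count ~ poisson(mu);
run;
```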
Ameurgen
Obsidian | Level 7

Thank you, Ksharp. A follow-up to my question:

I do like PROC BGLIMM; it is a straightforward way to fit an LMM with a Bayesian method. But can anyone actually run it in SAS? I get the error 'proc bglimm not found'. I am using SAS 9.4, but this procedure is not available.

Thank you

Regards

Ksharp
Super User

I can run PROC BGLIMM without any problem.

49   data MultiCenter;
50   input Center Group$ N SideEffect @@;
51   datalines;

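NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.MULTICENTER has 30 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds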


67   ;
68
69   proc bglimm data=MultiCenter nmc=10000 thin=2 seed=976352
NOTE: Writing HTML Body file: sashtml.htm
70   plots=all;
71   class Center Group;
72   model SideEffect/N = Group / noint;
73   random int / subject = Center;
74   run;

NOTE: Generating the burn-in samples.
NOTE: Beginning sample generation.
NOTE: Beginning calculation of summary and diagnostics statistics.
NOTE: Generating diagnostic plots.
NOTE: PROCEDURE BGLIMM used (Total process time):
      real time           2.59 seconds
      cpu time            0.31 seconds

What version of SAS are you using? Mine is SAS 9.4M7.

78   %put &=sysvlong. ;
SYSVLONG=9.04.01M7P080520

If your SAS version is too old to run PROC BGLIMM, you could try @StatDave's suggestion of a GEE model with PROC GEE or PROC GENMOD.
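
A quick way to check what is installed, as a sketch: the first statement prints the maintenance release, and PROC PRODUCT_STATUS writes the version of each installed product, including SAS/STAT, to the log. As far as I know, PROC BGLIMM first shipped with SAS/STAT 15.1 (SAS 9.4M6), so an older SAS/STAT release would explain a "not found" error; treat that version cutoff as something to confirm in the documentation.

```sas
/* Print the full SAS version string, e.g. 9.04.01M7P080520 */
%put &=sysvlong;

/* List installed products and their release numbers in the log */
proc product_status;
run;
```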

Ameurgen
Obsidian | Level 7

Hi,

Thanks for the response.

My SAS version is: SYSVLONG=9.04.01M5P091317

I think I am far behind the most recent release. Do you have any idea how I can upgrade mine? I think this is the reason why I can't run PROC BGLIMM.

Thank you so much

Ksharp
Super User
You can use SAS for free through SAS OnDemand for Academics:

https://welcome.oda.sas.com/login
