Dan4
Calcite | Level 5

I would like to estimate the Mean and Standard Deviation of a Latent Variable using an Ordered Probit Model.

 

I simulated a simple data set in which the latent variable is normal (mean = 2), and created a "Likert" ordinal variable with 5 categories labelled 1 to 5.

 

Using PROC QLIM, I am able to recover what I believe is the mean of the latent variable by placing restrictions on the thresholds (i.e., setting the lower threshold to 1.5 and the upper threshold to 4.5).
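
For reference, here is a sketch of why two threshold restrictions can pin down the latent location and scale, assuming the textbook ordered probit parameterization. The symbols $\mu$, $\sigma$, and $\tau_k$ are notation introduced here, not QLIM parameter names, and how PROC QLIM normalizes the scale internally is not established in this thread.

```latex
% Latent variable and observed 5-point rating
y^{*} \sim N(\mu, \sigma^{2}), \qquad
y = k \iff \tau_{k-1} < y^{*} \le \tau_{k}, \qquad
\tau_{0} = -\infty,\ \tau_{5} = +\infty

% Category probabilities, with \Phi(-\infty) = 0 and \Phi(+\infty) = 1
\Pr(y = k) = \Phi\!\left(\frac{\tau_{k} - \mu}{\sigma}\right)
           - \Phi\!\left(\frac{\tau_{k-1} - \mu}{\sigma}\right),
\qquad k = 1, \dots, 5

% Invariance: for any a and any b > 0
(\mu,\ \sigma,\ \tau_{k}) \;\mapsto\; (a + b\mu,\ b\sigma,\ a + b\tau_{k})
\quad \text{leaves every } \Pr(y = k) \text{ unchanged}
```

Because of this invariance, two quantities must be fixed before the rest can be estimated. The usual normalization fixes the scale and one location parameter (for example $\sigma = 1$ and one threshold or the intercept). Fixing two thresholds at known values instead would leave $\mu$ and $\sigma$ free, provided the software does not also fix the scale.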

 

The estimate of the mean is close (but seems to underestimate the true mean). The threshold values look reasonable.

 

The Parameter Estimates table includes two estimates whose meaning I am not sure of:

RESTRICT1 and RESTRICT2.

 

I wasn't able to find any explanation of these lines in the documentation.

 

Any thoughts and pointers would be greatly appreciated.

 

Dan

8 REPLIES
sbxkoenk
SAS Super FREQ

Hello,

 

Because you do Ordered Data Modeling with a latent dependent variable, there are some inherent restrictions of course (like those on the limit parameters / thresholds for example).
I guess the RESTRICT variables are linked to that.

 

If your data are simulated, perhaps you can post your code here.
Then I can probably tell you more.

 

Thanks,

Koen

sbxkoenk
SAS Super FREQ

Hello,

 

On top of my reply above, I add this:

 

This usage note could be beneficial to you:

Usage Note 22871: Types of logistic (or logit) models that can be fit using SAS®

https://support.sas.com/kb/22/871.html

 

Cheers,

Koen

Dan4
Calcite | Level 5

Thank you Koen,

 

I am interested in estimating the mean and std of the latent variable, and am comfortable setting some restrictions on the thresholds to do this.

 

I am using QLIM as I believe it will do what I want.  However if there is a better procedure to do this in SAS please let me know.

 

The program I am using to understand what SAS is doing is below:

 

**************************************************************************
* Sample program which simulates a basic data set, and uses the
* RESTRICT statement to allow estimation of the latent parameters
**************************************************************************;

* Simulates a single latent variable with specified mean and sd;
data temp;
   call streaminit(235);   * Fixed seed so the simulation is reproducible;
   nits = 200;             * Number of simulated respondents;

   latent_mean = 4;
   latent_sd = 1.1;

   do id = 1 to nits;
      latent_y = rand("normal", latent_mean, latent_sd);

      * Cuts the latent value at 1.5, 2.5, 3.5 and 4.5 to give a 5-point rating;
      if latent_y < 1.5 then obs_likert_r = 1;
      else if latent_y < 2.5 then obs_likert_r = 2;
      else if latent_y < 3.5 then obs_likert_r = 3;
      else if latent_y < 4.5 then obs_likert_r = 4;
      else obs_likert_r = 5;
      output;
   end;
run;

* Plots the latent variable as a check of the simulation;
proc sgplot data=temp;
   histogram latent_y;
run;

* Tabulates the "observed Likert ratings" as a check of the conversion;
proc freq data=temp;
   table obs_likert_r;
run;

* Runs QLIM with RESTRICT statements that fix two of the thresholds;
proc qlim data=temp;
   endogenous obs_likert_r ~ discrete (dist=Probit);
   model obs_likert_r = ;   * Fits model with only intercept;
   restrict _limit2=2.5;
   restrict _limit4=4.5;
run;

 

The output for the parameter estimates:

Parameter Estimates

Parameter    DF    Estimate     Standard Error    t Value    Approx Pr > |t|
Intercept     1    3.856617     0.082268          46.88      <.0001
_Limit2       1    2.500000     0                 .          .
_Limit3       1    3.665176     0.080374          45.60      <.0001
_Limit4       1    4.500000     0                 .          .
Restrict1    -1    10.684632    6.768865          1.58       0.1147*
Restrict2    -1    9.731302     6.900138          1.41       0.1590*

 

 

In this example, the estimate of the intercept is reasonable, but not as good as I would like.

The thresholds make sense.

But I am not sure what the Restrict1 and Restrict2 rows refer to, as they are not part of the model.

 

Thank you for your thoughts.


Dan

 

sbxkoenk
SAS Super FREQ

Hello @Dan4 ,

 

I think PROC QLIM (SAS/ETS) is the right choice.

There are also PROC HPQLIM (HP = High-Performance) and PROC CQLIM (SAS Viya Econometrics), but those two procedures add no value in your case.

You can also try PROC PROBIT (SAS/STAT) and PROC GLIMMIX (SAS/STAT) to see if you get closer to the expected result.
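
One further option, not raised elsewhere in this thread, is to maximize the ordered probit likelihood directly. The sketch below uses PROC NLMIXED (SAS/STAT) with a general log-likelihood. It assumes that all four cut points are known and fixed at the values used in the simulation (1.5, 2.5, 3.5, 4.5) and that the data set temp with variable obs_likert_r exists as in the posted program. The names mu, sigma, p and ll are illustrative, the starting values are arbitrary, and the code has not been run against the simulated data.

```sas
/* Sketch: direct ML for the latent mean and sd with all cut points fixed. */
/* Assumes data set TEMP and variable OBS_LIKERT_R from the simulation.    */
proc nlmixed data=temp;
   parms mu=3 sigma=1;        /* arbitrary starting values        */
   bounds sigma > 0;          /* the latent sd must be positive   */

   /* Probability of the observed category under N(mu, sigma**2) */
   if obs_likert_r = 1 then
      p = cdf('normal', 1.5, mu, sigma);
   else if obs_likert_r = 2 then
      p = cdf('normal', 2.5, mu, sigma) - cdf('normal', 1.5, mu, sigma);
   else if obs_likert_r = 3 then
      p = cdf('normal', 3.5, mu, sigma) - cdf('normal', 2.5, mu, sigma);
   else if obs_likert_r = 4 then
      p = cdf('normal', 4.5, mu, sigma) - cdf('normal', 3.5, mu, sigma);
   else
      p = 1 - cdf('normal', 4.5, mu, sigma);

   ll = log(max(p, 1e-300));  /* guard against log(0)             */
   model obs_likert_r ~ general(ll);
run;
```

With the thresholds fixed this way, mu and sigma are estimated on the scale of the simulation, so they can be compared directly with latent_mean = 4 and latent_sd = 1.1.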

 

I have to check exactly what RESTRICT1 and RESTRICT2 represent and how they are calculated.

Will do that tomorrow.

Dinner time here in Western-Europe. 😉

 

Kind regards,

Koen

Dan4
Calcite | Level 5

Thank you Koen,

 

I will check out PROC PROBIT and the others.

 

I was able to get this running in R using oglmx, but would prefer SAS.

 

Looking forward to hearing about the RESTRICT statement.

 

Thank you,


Dan

 

SASCom1
SAS Employee

Hi @Dan4 ,

 

When you impose restrictions on parameters using the RESTRICT or BOUNDS statement, the RESTRICT1 and RESTRICT2 parameters in the Parameter Estimates table are the Lagrange multipliers corresponding to the active restrictions. Since your code imposes two restrictions on the _Limit parameters, and both are active, you get two Lagrange multiplier parameters, namely RESTRICT1 and RESTRICT2.

The PROC QLIM documentation seems to have missed this information, but other SAS/ETS procedures with RESTRICT/BOUNDS statement functionality document it; see, for example, the PROC AUTOREG documentation:

 

https://go.documentation.sas.com/doc/en/pgmsascdc/v_016/etsug/etsug_autoreg_syntax14.htm

 

Lagrange multipliers are reported in the "Parameter Estimates" table for all the active linear constraints. They are identified by the names Restrict1, Restrict2, and so on. The probabilities of these Lagrange multipliers are computed using a beta distribution (LaMotte 1994). Nonactive (nonbinding) restrictions have no effect on the estimation results and are not noted in the output.
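
To make the role of these multipliers concrete, here is the standard constrained maximum likelihood setup, written for equality restrictions of the form $\theta_j = c_j$ (such as `_limit2=2.5`). The notation ($\ell$, $\theta$, $\lambda$, $R$) is introduced here for illustration, and sign conventions for $\lambda$ differ between texts and software.

```latex
% Maximize the log-likelihood subject to equality restrictions
\max_{\theta}\ \ell(\theta)
\quad \text{subject to} \quad \theta_{j} = c_{j}, \qquad j \in R

% Lagrangian
\mathcal{L}(\theta, \lambda)
  = \ell(\theta) - \sum_{j \in R} \lambda_{j}\,(\theta_{j} - c_{j})

% First-order conditions at the restricted estimate \tilde{\theta}
\frac{\partial \ell}{\partial \theta_{j}}(\tilde{\theta}) = \lambda_{j}
  \quad (j \in R),
\qquad
\frac{\partial \ell}{\partial \theta_{i}}(\tilde{\theta}) = 0
  \quad (i \notin R)
```

Under this setup, each multiplier is the slope of the log-likelihood with respect to the restricted parameter, evaluated at the restricted estimate. A multiplier that is not significantly different from zero suggests that relaxing the corresponding restriction would improve the fit only slightly.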

 

The QLIM documentation should be updated with this information in the future.

 

I hope this helps.

 

 

Dan4
Calcite | Level 5

Thank you.

 

This is very helpful, and helps me on my way.

 

My apologies for the late Thank You!

 

Dan

SASCom1
SAS Employee

You are welcome, and no problem :-)
