jhmoon
Obsidian | Level 7

Dear all,

 

My problem is that when I run an event-study analysis with PROC GLM and with PROC GLMSELECT, the results are not the same, and I want to know why.

Specifically, I use the event-study method: the outcome variable is earnings, and the right-hand-side variables are 9 event-time dummies, 18 year dummies, and 18 age dummies.

Everything is fine without weights, but the results become weird when the weights are applied (in a way that reference dummies are not omitted).

 

Here is the code I used.

/*Overall results */
/*glmselect has a problem in this case*/
PROC GLM DATA=SAMPLE_CHT70  ;
CLASS EVENT_TIME(REF='-1') AGE(REF='39') STD_YYYY(REF='2019');
MODEL EARNINGS = EVENT_TIME AGE STD_YYYY/ SOLUTION CLPARM;
WEIGHT C1_WEIGHT;
RUN;

PROC GLMSELECT DATA=SAMPLE_CHT70 ;
CLASS EVENT_TIME(REF='-1')  AGE(REF='39') STD_YYYY(REF='2019') ;
MODEL EARNINGS = EVENT_TIME AGE STD_YYYY/ SELECTION=NONE ;
WEIGHT C1_WEIGHT;
RUN;

 

The results are as follows:

1) With PROC GLM, the reference dummies ("Event_time -1", "Age 39", "Std_yyyy 2019") are omitted, so I think the other estimates are well estimated.

2) With PROC GLMSELECT, "Event_time -1" and "Age 39" are omitted, but "Std_yyyy 2019" is not.

 

Without the weight, both results are all right.

 

Does anyone know about this type of issue?

 

9 REPLIES
sbxkoenk
SAS Super FREQ

Hello,

 

PROC GLM and PROC GLMSELECT use different parameterization methods for the classification variables:

  • PROC GLM --> PARAM=GLM (no choice, param= does not even exist as an option) 
  • PROC GLMSELECT --> the default is PARAM=EFFECT.
    (Effect coding = deviation-from-the-mean coding)

You can specify PARAM=GLM in PROC GLMSELECT CLASS statement (after forward slash).

PARAM=GLM

specifies less-than-full-rank, reference-cell coding; this option can be used only as a global option (i.e. for all class effects).

You need to specify PARAM=GLM in PROC GLMSELECT to mimic the design matrix from PROC GLM.
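
A minimal sketch of that syntax, reusing the data set and variable names from the question. The global PARAM= option goes after the slash in the CLASS statement; whether it changes the fit depends on what the procedure's default coding already is.

```sas
/* Sketch only: request GLM (less-than-full-rank) coding explicitly */
/* for all CLASS variables in PROC GLMSELECT.                       */
proc glmselect data=SAMPLE_CHT70;
   class EVENT_TIME(ref='-1') AGE(ref='39') STD_YYYY(ref='2019') / param=glm;
   model EARNINGS = EVENT_TIME AGE STD_YYYY / selection=none;
   weight C1_WEIGHT;
run;
```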

 

Koen

Rick_SAS
SAS Super FREQ

No, @sbxkoenk is not correct. The GLM parameterization is the default parameterization for PROC GLM and PROC GLMSELECT. I think Koen might be confusing GLMSELECT with PROC LOGISTIC, which uses the EFFECT parameterization. For more information about default parameterizations for CLASS variables, see 

Encodings of CLASS variables in SAS regression procedures: A cheat sheet - The DO Loop

sbxkoenk
SAS Super FREQ

Hello @jhmoon ,

 

@Rick_SAS is correct.
I was wrong in my first reply.

I had read the GLMSELECT doc too quickly.
https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/statug/statug_glmselect_syntax03.htm#statu...

 

This is what it says:
===========

PARAM=keyword

specifies the parameterization method for the classification variable or variables. Design matrix columns are created from CLASS variables according to the following coding schemes. If the PARAM= option is not specified with any individual CLASS variable, by default, PARAM=GLM. Otherwise, the default is PARAM=EFFECT. If PARAM=ORTHPOLY or PARAM=POLY, and the CLASS levels are numeric, then the ORDER= option in the CLASS statement is ignored, and the internal, unformatted values are used. See the section CLASS Variable Parameterization and the SPLIT Option for further details.
===========

Koen

 

Rick_SAS
SAS Super FREQ

> the results become weird when the weights are applied. (in a way that reference dummies are not omitted.)

I think we need more details than "become weird." What you describe should NOT be happening because both procedures should be solving the same OLS problem.

 

1. So that we can see what is happening, please use

   ODS SELECT ParameterEstimates;
   in each procedure and copy/paste the parameter estimates into this thread.

 

2. You say that the issue only occurs when you use a weight variable. I can't think of why that might be, but it would be interesting to see the pattern of missing values in your data. Please run the following code to show us the missing value structure in your data:

/* for reporting, map all invalid weights to missing */
data sample / view=sample;
   set SAMPLE_CHT70;
   if C1_WEIGHT <= 0 then C1_WEIGHT = .;
run;

proc mi data=sample nimpute=0 displaypattern=nomeans;   /* SAS 9.4M5 option */
   ods select MissPattern;
   var EARNINGS EVENT_TIME AGE STD_YYYY C1_WEIGHT;
run;
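
For step 1, a sketch of where the ODS SELECT statement could go, reusing the statements from the original question. Leaving out CLPARM is an assumption of this sketch, made only to keep the table narrow.

```sas
/* Print only the parameter-estimates table from each procedure. */
proc glm data=SAMPLE_CHT70;
   ods select ParameterEstimates;
   class EVENT_TIME(ref='-1') AGE(ref='39') STD_YYYY(ref='2019');
   model EARNINGS = EVENT_TIME AGE STD_YYYY / solution;
   weight C1_WEIGHT;
run;
quit;   /* PROC GLM is interactive, so end it explicitly */

proc glmselect data=SAMPLE_CHT70;
   ods select ParameterEstimates;
   class EVENT_TIME(ref='-1') AGE(ref='39') STD_YYYY(ref='2019');
   model EARNINGS = EVENT_TIME AGE STD_YYYY / selection=none;
   weight C1_WEIGHT;
run;
```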

 

 

jhmoon
Obsidian | Level 7

Thank you for your help.

 

I followed your recommendation 2 and confirmed that there are no negative weights or missing values that affect the regression results.

 

The results tables for each regression are below:

 

PROC GLM DATA=SAMPLE_CHT70  ;
CLASS EVENT_TIME(REF='-1') AGE(REF='39') STD_YYYY(REF='2019');
MODEL EARNINGS = EVENT_TIME AGE STD_YYYY/ SOLUTION CLPARM;
WEIGHT C1_WEIGHT;
OUTPUT OUT= PRED_CALC_CPI_WC1 PREDICTED=P RESIDUAL=R;
ODS OUTPUT ParameterEstimates=PARAM_CALC_CPI_WC1;
RUN;
The GLM Procedure

Class Level Information
Class         Levels  Values
EVENT_TIME    9       -3 -2 0 1 2 3 4 5 -1
AGE           18      22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
STD_YYYY      18      2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019

Number of Observations Read    5882292
Number of Observations Used    5713157

Weight: C1_WEIGHT

Source             DF          Sum of Squares  Mean Square    F Value   Pr > F
Model              42          2.40E+14        5.71E+12       5122.4    <.0001
Error              5.71E+06    6.37E+15        1115212709
Corrected Total    5.71E+06    6.61E+15

R-Square    Coeff Var   Root MSE   EARNINGS Mean
0.036291    147.9133    33394.8    22577.28

Source        DF    Type I SS   Mean Square    F Value    Pr > F
EVENT_TIME    8     5.76E+13    7.20E+12       6459.7     <.0001
AGE           17    1.73E+14    1.02E+13       9143.87    <.0001
STD_YYYY      17    8.94E+12    5.25956E+11    471.62     <.0001

Source        DF    Type III SS   Mean Square    F Value    Pr > F
EVENT_TIME    8     1.90E+14      2.38E+13       21351.3    <.0001
AGE           17    3.09E+13      1.82E+12       1632.25    <.0001
STD_YYYY      17    8.94E+12      5.25956E+11    471.62     <.0001

Parameter       Estimate          Standard Error  t Value   Pr > |t|  95% Confidence Limits         Expected Value
Intercept       57493.58092 B     598.5360918     96.06     <.0001    56320.47149    58666.69036    Intercept + [EVENT_TIME -1] + [AGE 39] + [STD_YYYY 2019]
EVENT_TIME -3   1151.43048 B      67.746409       17        <.0001    1018.64993     1284.21103     [EVENT_TIME -3] - [EVENT_TIME -1]
EVENT_TIME -2   1290.48506 B      63.7409616      20.25     <.0001    1165.55505     1415.41508     [EVENT_TIME -2] - [EVENT_TIME -1]
EVENT_TIME 0    -6709.90206 B     63.2582534      -106.07   <.0001    -6833.88599    -6585.91814    [EVENT_TIME 0] - [EVENT_TIME -1]
EVENT_TIME 1    -16152.14249 B    65.8422001      -245.32   <.0001    -16281.19086   -16023.09412   [EVENT_TIME 1] - [EVENT_TIME -1]
EVENT_TIME 2    -18193.03079 B    67.8515907      -268.13   <.0001    -18326.01749   -18060.04408   [EVENT_TIME 2] - [EVENT_TIME -1]
EVENT_TIME 3    -19273.77713 B    70.6162547      -272.94   <.0001    -19412.18248   -19135.37179   [EVENT_TIME 3] - [EVENT_TIME -1]
EVENT_TIME 4    -20520.90805 B    73.7837183      -278.12   <.0001    -20665.52151   -20376.29459   [EVENT_TIME 4] - [EVENT_TIME -1]
EVENT_TIME 5    -21161.09299 B    77.6282235      -272.6    <.0001    -21313.24154   -21008.94444   [EVENT_TIME 5] - [EVENT_TIME -1]
EVENT_TIME -1   0 B               .               .         .         .              .
AGE 22          -33011.33226 B    453.9691843     -72.72    <.0001    -33901.0957    -32121.56882   [AGE 22] - [AGE 39]
AGE 23          -31030.13856 B    417.0497167     -74.4     <.0001    -31847.54116   -30212.73596   [AGE 23] - [AGE 39]
AGE 24          -28456.20384 B    404.3707065     -70.37    <.0001    -29248.75603   -27663.65165   [AGE 24] - [AGE 39]
AGE 25          -26188.46721 B    398.0201918     -65.8     <.0001    -26968.57262   -25408.36181   [AGE 25] - [AGE 39]
AGE 26          -24184.50521 B    394.2489704     -61.34    <.0001    -24957.21915   -23411.79126   [AGE 26] - [AGE 39]
AGE 27          -22544.71583 B    391.5407827     -57.58    <.0001    -23312.12182   -21777.30983   [AGE 27] - [AGE 39]
AGE 28          -20991.74814 B    389.4779457     -53.9     <.0001    -21755.11105   -20228.38523   [AGE 28] - [AGE 39]
AGE 29          -19253.74308 B    387.720454      -49.66    <.0001    -20013.66136   -18493.82479   [AGE 29] - [AGE 39]
AGE 30          -17464.85106 B    386.093841      -45.23    <.0001    -18221.58124   -16708.12087   [AGE 30] - [AGE 39]
AGE 31          -15496.64241 B    384.8382356     -40.27    <.0001    -16250.91165   -14742.37317   [AGE 31] - [AGE 39]
AGE 32          -13496.68391 B    383.4668709     -35.2     <.0001    -14248.26533   -12745.1025    [AGE 32] - [AGE 39]
AGE 33          -11624.09823 B    381.945376      -30.43    <.0001    -12372.69757   -10875.49889   [AGE 33] - [AGE 39]
AGE 34          -9663.06563 B     380.3173421     -25.41    <.0001    -10408.47408   -8917.65718    [AGE 34] - [AGE 39]
AGE 35          -7642.13794 B     377.3506788     -20.25    <.0001    -8381.73183    -6902.54404    [AGE 35] - [AGE 39]
AGE 36          -5740.0934 B      374.7912853     -15.32    <.0001    -6474.67097    -5005.51582    [AGE 36] - [AGE 39]
AGE 37          -3900.21078 B     373.8001485     -10.43    <.0001    -4632.84576    -3167.57579    [AGE 37] - [AGE 39]
AGE 38          -1949.43689 B     377.7374853     -5.16     <.0001    -2689.78891    -1209.08486    [AGE 38] - [AGE 39]
AGE 39          0 B               .               .         .         .              .
STD_YYYY 2002   -16249.4725 B     723.7739728     -22.45    <.0001    -17668.04372   -14830.90128   [STD_YYYY 2002] - [STD_YYYY 2019]
STD_YYYY 2003   -13821.10917 B    715.5893301     -19.31    <.0001    -15223.63878   -12418.57956   [STD_YYYY 2003] - [STD_YYYY 2019]
STD_YYYY 2004   -11888.85238 B    712.2320858     -16.69    <.0001    -13284.80191   -10492.90285   [STD_YYYY 2004] - [STD_YYYY 2019]
STD_YYYY 2005   -11398.11108 B    710.2950105     -16.05    <.0001    -12790.26401   -10005.95815   [STD_YYYY 2005] - [STD_YYYY 2019]
STD_YYYY 2006   -9609.6762 B      708.9337937     -13.56    <.0001    -10999.1612    -8220.19121    [STD_YYYY 2006] - [STD_YYYY 2019]
STD_YYYY 2007   -7473.76677 B     707.8289783     -10.56    <.0001    -8861.08636    -6086.44717    [STD_YYYY 2007] - [STD_YYYY 2019]
STD_YYYY 2008   -5956.85432 B     706.8830583     -8.43     <.0001    -7342.31995    -4571.38869    [STD_YYYY 2008] - [STD_YYYY 2019]
STD_YYYY 2009   -5790.10949 B     706.004101      -8.2      <.0001    -7173.85239    -4406.36658    [STD_YYYY 2009] - [STD_YYYY 2019]
STD_YYYY 2010   -5811.82232 B     705.1423246     -8.24     <.0001    -7193.87617    -4429.76847    [STD_YYYY 2010] - [STD_YYYY 2019]
STD_YYYY 2011   -5341.59152 B     704.5219164     -7.58     <.0001    -6722.42939    -3960.75364    [STD_YYYY 2011] - [STD_YYYY 2019]
STD_YYYY 2012   -4584.69534 B     703.8424324     -6.51     <.0001    -5964.20145    -3205.18923    [STD_YYYY 2012] - [STD_YYYY 2019]
STD_YYYY 2013   -4165.11988 B     703.2508086     -5.92     <.0001    -5543.46643    -2786.77333    [STD_YYYY 2013] - [STD_YYYY 2019]
STD_YYYY 2014   -3306.63579 B     702.6738482     -4.71     <.0001    -4683.85151    -1929.42006    [STD_YYYY 2014] - [STD_YYYY 2019]
STD_YYYY 2015   -2796.6873 B      700.2987696     -3.99     <.0001    -4169.24796    -1424.12665    [STD_YYYY 2015] - [STD_YYYY 2019]
STD_YYYY 2016   -2641.52688 B     700.5926249     -3.77     0.0002    -4014.66349    -1268.39028    [STD_YYYY 2016] - [STD_YYYY 2019]
STD_YYYY 2017   -2145.88732 B     704.80958       -3.04     0.0023    -3527.28901    -764.48564     [STD_YYYY 2017] - [STD_YYYY 2019]
STD_YYYY 2018   -1733.12014 B     722.7610174     -2.4      0.0165    -3149.70601    -316.53428     [STD_YYYY 2018] - [STD_YYYY 2019]
STD_YYYY 2019   0 B               .               .         .         .              .

Note: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.

 

 

PROC GLMSELECT DATA=SAMPLE_CHT70 ;
CLASS EVENT_TIME(REF='-1') AGE STD_YYYY ;
MODEL CALC_CPI = EVENT_TIME AGE STD_YYYY/ SELECTION=NONE ;
WEIGHT C1_WEIGHT;
OUTPUT OUT= PRED_CALC_CPI_WC1 PREDICTED=P RESIDUAL=R;
ODS OUTPUT ParameterEstimates=PARAM_CALC_CPI_WC1;
RUN;
The GLMSELECT Procedure

Data Set                       SAMPLE_CHT70
Weight                         C1_WEIGHT
Dependent Variable             EARNINGS
Selection Method               None
Number of Observations Read    5882292
Number of Observations Used    5713157

Class Level Information
Class         Levels  Values
EVENT_TIME    9       -3 -2 0 1 2 3 4 5 -1
AGE           18      22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
STD_YYYY      18      2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019

Dimensions
Number of Effects       4
Number of Parameters    46

The GLMSELECT Procedure
Least Squares Summary
Step  Effect Entered  Number Effects In  Number Parms In  SBC
0     Intercept       1                  1                119229427
1     EVENT_TIME      2                  9                119179530
2     AGE             3                  26               119026637
3     STD_YYYY        4                  44               119018905*
* Optimal Value of Criterion

The GLMSELECT Procedure
Least Squares Model (No Selection)

Analysis of Variance
Source             DF          Sum of Squares  Mean Square    F Value   Pr > F
Model              43          2.40E+14        5.58E+12       5003.28   <.0001
Error              5.71E+06    6.37E+15        1.115E+09
Corrected Total    5.71E+06    6.61E+15

Root MSE          33395
Dependent Mean    22577
R-Square          0.0363
Adj R-Sq          0.0363
AIC               124731467
AICC              124731467
SBC               119018905

Parameter Estimates
Parameter       DF  Estimate        Standard Error  t Value   Pr > |t|
Intercept       1   5106.229845     5578113         0         0.9993
EVENT_TIME -3   1   1151.430482     67.746415       17        <.0001
EVENT_TIME -2   1   1290.485072     63.740967       20.25     <.0001
EVENT_TIME 0    1   -6709.902053    63.258259       -106.07   <.0001
EVENT_TIME 1    1   -16152          65.842206       -245.32   <.0001
EVENT_TIME 2    1   -18193          67.851597       -268.13   <.0001
EVENT_TIME 3    1   -19274          70.616261       -272.94   <.0001
EVENT_TIME 4    1   -20521          73.783725       -278.12   <.0001
EVENT_TIME 5    1   -21161          77.62823        -272.6    <.0001
EVENT_TIME -1   0   0               .               .         .
AGE 22          1   -33011          453.96922       -72.72    <.0001
AGE 23          1   -31030          417.04975       -74.4     <.0001
AGE 24          1   -28456          404.37074       -70.37    <.0001
AGE 25          1   -26188          398.02023       -65.8     <.0001
AGE 26          1   -24185          394.249         -61.34    <.0001
AGE 27          1   -22545          391.54082       -57.58    <.0001
AGE 28          1   -20992          389.47798       -53.9     <.0001
AGE 29          1   -19254          387.72049       -49.66    <.0001
AGE 30          1   -17465          386.09387       -45.23    <.0001
AGE 31          1   -15497          384.83827       -40.27    <.0001
AGE 32          1   -13497          383.4669        -35.2     <.0001
AGE 33          1   -11624          381.94541       -30.43    <.0001
AGE 34          1   -9663.065559    380.31737       -25.41    <.0001
AGE 35          1   -7642.137867    377.35071       -20.25    <.0001
AGE 36          1   -5740.093332    374.79132       -15.32    <.0001
AGE 37          1   -3900.210716    373.80018       -10.43    <.0001
AGE 38          1   -1949.436833    377.73752       -5.16     <.0001
AGE 39          0   0               .               .         .
STD_YYYY 2002   1   36138           5578113         0.01      0.9948
STD_YYYY 2003   1   38566           5578113         0.01      0.9945
STD_YYYY 2004   1   40498           5578113         0.01      0.9942
STD_YYYY 2005   1   40989           5578113         0.01      0.9941
STD_YYYY 2006   1   42778           5578113         0.01      0.9939
STD_YYYY 2007   1   44914           5578113         0.01      0.9936
STD_YYYY 2008   1   46430           5578113         0.01      0.9934
STD_YYYY 2009   1   46597           5578113         0.01      0.9933
STD_YYYY 2010   1   46576           5578113         0.01      0.9933
STD_YYYY 2011   1   47046           5578113         0.01      0.9933
STD_YYYY 2012   1   47803           5578113         0.01      0.9932
STD_YYYY 2013   1   48222           5578113         0.01      0.9931
STD_YYYY 2014   1   49081           5578113         0.01      0.993
STD_YYYY 2015   1   49591           5578113         0.01      0.9929
STD_YYYY 2016   1   49746           5578113         0.01      0.9929
STD_YYYY 2017   1   50241           5578113         0.01      0.9928
STD_YYYY 2018   1   50654           5578113         0.01      0.9928
STD_YYYY 2019   1   52387           5578113         0.01      0.9925

 

As you can see in the second result (GLMSELECT), the year dummies and the intercept all have the same, very large standard error.

As for C1_WEIGHT, I constructed it to match the age structure at event time t=0, so that I can compare the results with those of other samples (SAMPLE_CHT80).

I think this problem has something to do with the weight, because observations that had the same age at event time t=0 all have the same weight value.
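
Reading the two parameter tables against each other, the GLMSELECT fit looks like the GLM fit with the STD_YYYY 2019 level not set to zero: the GLMSELECT intercept plus its STD_YYYY 2019 estimate is close to the GLM intercept, and a difference between two GLMSELECT year estimates is close to the corresponding GLM year estimate. Below is a small arithmetic check, with the values typed in by hand from the tables above. The variable names are made up for this sketch.

```sas
/* Sketch: compare the two fits using estimates copied from the output. */
data _null_;
   gs_intercept  = 5106.229845;   /* GLMSELECT Intercept                */
   gs_yyyy2019   = 52387;         /* GLMSELECT STD_YYYY 2019 (as printed) */
   gs_yyyy2002   = 36138;         /* GLMSELECT STD_YYYY 2002 (as printed) */
   glm_intercept = 57493.58092;   /* GLM Intercept                      */
   glm_yyyy2002  = -16249.4725;   /* GLM STD_YYYY 2002                  */

   /* If both procedures fit the same model, these should nearly match. */
   implied_intercept = gs_intercept + gs_yyyy2019;
   implied_yyyy2002  = gs_yyyy2002 - gs_yyyy2019;

   put implied_intercept= glm_intercept=;
   put implied_yyyy2002= glm_yyyy2002=;
run;
```

The sums agree to within the rounding of the printed GLMSELECT values: about 57493.23 against 57493.58, and -16249 against -16249.47.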

 

 

jhmoon
Obsidian | Level 7

Thank you for your help!

 

I followed your second recommendation first, and I confirmed that there are no negative weights or missing values affecting the results.

 

Below are the results for GLM and GLMSELECT, respectively.

 

The GLM Procedure

Class Level Information
Class         Levels  Values
EVENT_TIME    9       -3 -2 0 1 2 3 4 5 -1
AGE           18      22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
STD_YYYY      18      2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019

Number of Observations Read    5882292
Number of Observations Used    5713157

The GLM Procedure

Dependent Variable: EARNINGS
Weight: C1_WEIGHT

Source             DF          Sum of Squares  Mean Square    F Value   Pr > F
Model              42          2.40E+14        5.71E+12       5122.4    <.0001
Error              5.71E+06    6.37E+15        1.115E+09
Corrected Total    5.71E+06    6.61E+15

R-Square    Coeff Var   Root MSE   EARNINGS Mean
0.036291    147.9133    33394.8    22577.28

Source        DF    Type I SS   Mean Square    F Value    Pr > F
EVENT_TIME    8     5.76E+13    7.20E+12       6459.7     <.0001
AGE           17    1.73E+14    1.02E+13       9143.87    <.0001
STD_YYYY      17    8.94E+12    5.26E+11       471.62     <.0001

Source        DF    Type III SS   Mean Square    F Value    Pr > F
EVENT_TIME    8     1.90E+14      2.38E+13       21351.3    <.0001
AGE           17    3.09E+13      1.82E+12       1632.25    <.0001
STD_YYYY      17    8.94E+12      5.26E+11       471.62     <.0001

Parameter       Estimate        Standard Error  t Value   Pr > |t|  95% Confidence Limits       Expected Value
Intercept       57493.581 B     598.53609       96.06     <.0001    56320.471     58666.69      Intercept + [EVENT_TIME -1] + [AGE 39] + [STD_YYYY 2019]
EVENT_TIME -3   1151.4305 B     67.746409       17        <.0001    1018.6499     1284.211      [EVENT_TIME -3] - [EVENT_TIME -1]
EVENT_TIME -2   1290.4851 B     63.740962       20.25     <.0001    1165.5551     1415.4151     [EVENT_TIME -2] - [EVENT_TIME -1]
EVENT_TIME 0    -6709.9021 B    63.258253       -106.07   <.0001    -6833.886     -6585.9181    [EVENT_TIME 0] - [EVENT_TIME -1]
EVENT_TIME 1    -16152.142 B    65.8422         -245.32   <.0001    -16281.191    -16023.094    [EVENT_TIME 1] - [EVENT_TIME -1]
EVENT_TIME 2    -18193.031 B    67.851591       -268.13   <.0001    -18326.017    -18060.044    [EVENT_TIME 2] - [EVENT_TIME -1]
EVENT_TIME 3    -19273.777 B    70.616255       -272.94   <.0001    -19412.182    -19135.372    [EVENT_TIME 3] - [EVENT_TIME -1]
EVENT_TIME 4    -20520.908 B    73.783718       -278.12   <.0001    -20665.522    -20376.295    [EVENT_TIME 4] - [EVENT_TIME -1]
EVENT_TIME 5    -21161.093 B    77.628224       -272.6    <.0001    -21313.242    -21008.944    [EVENT_TIME 5] - [EVENT_TIME -1]
EVENT_TIME -1   0 B             .               .         .         .             .
AGE 22          -33011.332 B    453.96918       -72.72    <.0001    -33901.096    -32121.569    [AGE 22] - [AGE 39]
AGE 23          -31030.139 B    417.04972       -74.4     <.0001    -31847.541    -30212.736    [AGE 23] - [AGE 39]
AGE 24          -28456.204 B    404.37071       -70.37    <.0001    -29248.756    -27663.652    [AGE 24] - [AGE 39]
AGE 25          -26188.467 B    398.02019       -65.8     <.0001    -26968.573    -25408.362    [AGE 25] - [AGE 39]
AGE 26          -24184.505 B    394.24897       -61.34    <.0001    -24957.219    -23411.791    [AGE 26] - [AGE 39]
AGE 27          -22544.716 B    391.54078       -57.58    <.0001    -23312.122    -21777.31     [AGE 27] - [AGE 39]
AGE 28          -20991.748 B    389.47795       -53.9     <.0001    -21755.111    -20228.385    [AGE 28] - [AGE 39]
AGE 29          -19253.743 B    387.72045       -49.66    <.0001    -20013.661    -18493.825    [AGE 29] - [AGE 39]
AGE 30          -17464.851 B    386.09384       -45.23    <.0001    -18221.581    -16708.121    [AGE 30] - [AGE 39]
AGE 31          -15496.642 B    384.83824       -40.27    <.0001    -16250.912    -14742.373    [AGE 31] - [AGE 39]
AGE 32          -13496.684 B    383.46687       -35.2     <.0001    -14248.265    -12745.103    [AGE 32] - [AGE 39]
AGE 33          -11624.098 B    381.94538       -30.43    <.0001    -12372.698    -10875.499    [AGE 33] - [AGE 39]
AGE 34          -9663.0656 B    380.31734       -25.41    <.0001    -10408.474    -8917.6572    [AGE 34] - [AGE 39]
AGE 35          -7642.1379 B    377.35068       -20.25    <.0001    -8381.7318    -6902.544     [AGE 35] - [AGE 39]
AGE 36          -5740.0934 B    374.79129       -15.32    <.0001    -6474.671     -5005.5158    [AGE 36] - [AGE 39]
AGE 37          -3900.2108 B    373.80015       -10.43    <.0001    -4632.8458    -3167.5758    [AGE 37] - [AGE 39]
AGE 38          -1949.4369 B    377.73749       -5.16     <.0001    -2689.7889    -1209.0849    [AGE 38] - [AGE 39]
AGE 39          0 B             .               .         .         .             .
STD_YYYY 2002   -16249.473 B    723.77397       -22.45    <.0001    -17668.044    -14830.901    [STD_YYYY 2002] - [STD_YYYY 2019]
STD_YYYY 2003   -13821.109 B    715.58933       -19.31    <.0001    -15223.639    -12418.58     [STD_YYYY 2003] - [STD_YYYY 2019]
STD_YYYY 2004   -11888.852 B    712.23209       -16.69    <.0001    -13284.802    -10492.903    [STD_YYYY 2004] - [STD_YYYY 2019]
STD_YYYY 2005   -11398.111 B    710.29501       -16.05    <.0001    -12790.264    -10005.958    [STD_YYYY 2005] - [STD_YYYY 2019]
STD_YYYY 2006   -9609.6762 B    708.93379       -13.56    <.0001    -10999.161    -8220.1912    [STD_YYYY 2006] - [STD_YYYY 2019]
STD_YYYY 2007   -7473.7668 B    707.82898       -10.56    <.0001    -8861.0864    -6086.4472    [STD_YYYY 2007] - [STD_YYYY 2019]
STD_YYYY 2008   -5956.8543 B    706.88306       -8.43     <.0001    -7342.32      -4571.3887    [STD_YYYY 2008] - [STD_YYYY 2019]
STD_YYYY 2009   -5790.1095 B    706.0041        -8.2      <.0001    -7173.8524    -4406.3666    [STD_YYYY 2009] - [STD_YYYY 2019]
STD_YYYY 2010   -5811.8223 B    705.14232       -8.24     <.0001    -7193.8762    -4429.7685    [STD_YYYY 2010] - [STD_YYYY 2019]
STD_YYYY 2011   -5341.5915 B    704.52192       -7.58     <.0001    -6722.4294    -3960.7536    [STD_YYYY 2011] - [STD_YYYY 2019]
STD_YYYY 2012   -4584.6953 B    703.84243       -6.51     <.0001    -5964.2015    -3205.1892    [STD_YYYY 2012] - [STD_YYYY 2019]
STD_YYYY 2013   -4165.1199 B    703.25081       -5.92     <.0001    -5543.4664    -2786.7733    [STD_YYYY 2013] - [STD_YYYY 2019]
STD_YYYY 2014   -3306.6358 B    702.67385       -4.71     <.0001    -4683.8515    -1929.4201    [STD_YYYY 2014] - [STD_YYYY 2019]
STD_YYYY 2015   -2796.6873 B    700.29877       -3.99     <.0001    -4169.248     -1424.1267    [STD_YYYY 2015] - [STD_YYYY 2019]
STD_YYYY 2016   -2641.5269 B    700.59262       -3.77     0.0002    -4014.6635    -1268.3903    [STD_YYYY 2016] - [STD_YYYY 2019]
STD_YYYY 2017   -2145.8873 B    704.80958       -3.04     0.0023    -3527.289     -764.48564    [STD_YYYY 2017] - [STD_YYYY 2019]
STD_YYYY 2018   -1733.1201 B    722.76102       -2.4      0.0165    -3149.706     -316.53428    [STD_YYYY 2018] - [STD_YYYY 2019]
STD_YYYY 2019   0 B             .               .         .         .             .

Note: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.

 

 

 

 

 

 

The GLMSELECT Procedure

Data Set                       SAMPLE_CHT70
Weight                         C1_WEIGHT
Dependent Variable             EARNINGS
Selection Method               None
Number of Observations Read    5882292
Number of Observations Used    5713157

Class Level Information
Class         Levels  Values
EVENT_TIME    9       -3 -2 0 1 2 3 4 5 -1
AGE           18      22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
STD_YYYY      18      2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019

Dimensions
Number of Effects       4
Number of Parameters    46

The GLMSELECT Procedure
Least Squares Summary
Step  Effect Entered  Number Effects In  Number Parms In  SBC
0     Intercept       1                  1                119229427
1     EVENT_TIME      2                  9                119179530
2     AGE             3                  26               119026637
3     STD_YYYY        4                  44               119018905*
* Optimal Value of Criterion

The GLMSELECT Procedure
Least Squares Model (No Selection)

Analysis of Variance
Source             DF          Sum of Squares  Mean Square    F Value   Pr > F
Model              43          2.40E+14        5.58E+12       5003.28   <.0001
Error              5.71E+06    6.37E+15        1.115E+09
Corrected Total    5.71E+06    6.61E+15

Root MSE          33395
Dependent Mean    22577
R-Square          0.0363
Adj R-Sq          0.0363
AIC               124731467
AICC              124731467
SBC               119018905

Parameter Estimates
Parameter       DF  Estimate        Standard Error  t Value   Pr > |t|
Intercept       1   5106.229845     5578113         0         0.9993
EVENT_TIME -3   1   1151.430482     67.746415       17        <.0001
EVENT_TIME -2   1   1290.485072     63.740967       20.25     <.0001
EVENT_TIME 0    1   -6709.902053    63.258259       -106.07   <.0001
EVENT_TIME 1    1   -16152          65.842206       -245.32   <.0001
EVENT_TIME 2    1   -18193          67.851597       -268.13   <.0001
EVENT_TIME 3    1   -19274          70.616261       -272.94   <.0001
EVENT_TIME 4    1   -20521          73.783725       -278.12   <.0001
EVENT_TIME 5    1   -21161          77.62823        -272.6    <.0001
EVENT_TIME -1   0   0               .               .         .
AGE 22          1   -33011          453.96922       -72.72    <.0001
AGE 23          1   -31030          417.04975       -74.4     <.0001
AGE 24          1   -28456          404.37074       -70.37    <.0001
AGE 25          1   -26188          398.02023       -65.8     <.0001
AGE 26          1   -24185          394.249         -61.34    <.0001
AGE 27          1   -22545          391.54082       -57.58    <.0001
AGE 28          1   -20992          389.47798       -53.9     <.0001
AGE 29          1   -19254          387.72049       -49.66    <.0001
AGE 30          1   -17465          386.09387       -45.23    <.0001
AGE 31          1   -15497          384.83827       -40.27    <.0001
AGE 32          1   -13497          383.4669        -35.2     <.0001
AGE 33          1   -11624          381.94541       -30.43    <.0001
AGE 34          1   -9663.065559    380.31737       -25.41    <.0001
AGE 35          1   -7642.137867    377.35071       -20.25    <.0001
AGE 36          1   -5740.093332    374.79132       -15.32    <.0001
AGE 37          1   -3900.210716    373.80018       -10.43    <.0001
AGE 38          1   -1949.436833    377.73752       -5.16     <.0001
AGE 39          0   0               .               .         .
STD_YYYY 2002   1   36138           5578113         0.01      0.9948
STD_YYYY 2003   1   38566           5578113         0.01      0.9945
STD_YYYY 2004   1   40498           5578113         0.01      0.9942
STD_YYYY 2005   1   40989           5578113         0.01      0.9941
STD_YYYY 2006   1   42778           5578113         0.01      0.9939
STD_YYYY 2007   1   44914           5578113         0.01      0.9936
STD_YYYY 2008   1   46430           5578113         0.01      0.9934
STD_YYYY 2009   1   46597           5578113         0.01      0.9933
STD_YYYY 2010   1   46576           5578113         0.01      0.9933
STD_YYYY 2011   1   47046           5578113         0.01      0.9933
STD_YYYY 2012   1   47803           5578113         0.01      0.9932
STD_YYYY 2013   1   48222           5578113         0.01      0.9931
STD_YYYY 2014   1   49081           5578113         0.01      0.993
STD_YYYY 2015   1   49591           5578113         0.01      0.9929
STD_YYYY 2016   1   49746           5578113         0.01      0.9929
STD_YYYY 2017   1   50241           5578113         0.01      0.9928
STD_YYYY 2018   1   50654           5578113         0.01      0.9928
STD_YYYY 2019   1   52387           5578113         0.01      0.9925

 

 

As you can see in the second table, the year dummies (STD_YYYY) are not omitted, and the intercept and the STD_YYYY estimates all have the same, very large standard error.

I think the problem has something to do with the weight.

I constructed C1_WEIGHT so that the age structure at event_time t=0 matches that of other samples (SAMPLE_CHT80).

Therefore, observations with the same age at event_time t=0 have the same weight value, and I think this is related to the event_time, age, and year dummies.
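
One way to look at how the weights line up with the year dummies is to summarize C1_WEIGHT within each STD_YYYY level. This is only a diagnostic sketch; the choice of statistics is an assumption, not something requested in the thread.

```sas
/* Sketch: count, missing count, range, mean and total of the weight */
/* within each level of the year variable.                           */
proc means data=SAMPLE_CHT70 n nmiss min mean max sum;
   class STD_YYYY;
   var C1_WEIGHT;
run;
```

Replacing STD_YYYY with AGE or EVENT_TIME in the CLASS statement gives the same summary for the other two sets of dummies.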

 

Rick_SAS
SAS Super FREQ
  • What release of SAS are you running? Do you have SAS 9.4 or SAS Viya? Are you running from SAS Studio? Enterprise Guide? Are you typing the code and submitting it yourself or are you using a GUI to build the model in a point-and-click fashion?

 

  • It is VERY DIFFICULT to try to READ the output that you've pasted. It did not get formatted in a tabular form. Perhaps delete the previous posts and try again?

 

  • Please get rid of the CLPARM option in the PROC GLM code. It just makes the table wide and causes it to wrap.

 

  • For the code you pasted:
    • the PROC GLM code uses EARNINGS as the response variable.
    • the PROC GLMSELECT code uses CALC_CPI as the response variable. I assume this was a pasting mistake.

 

For the output you pasted for the GLMSELECT model, the ParameterEstimates table should contain an estimate of 0 for the reference level. That is, the last row of the table should be 

STD_YYYY 2019 0 0 . . .

I can't tell if you made a copy/paste error, or if you are running a version of the code that had a bug, or something else.

 

I've never seen output that contains a column "Expected Value" that has values such as 

   Intercept + [EVENT_TIME -1] + [AGE 39] + [STD_YYYY 2019]

Where did this column come from?

 

I am willing to try one more time. Please carefully display the tables in a readable format and make sure that the PROC code you post goes with the output. 

jhmoon
Obsidian | Level 7
Thank you, Rick.
I will post again soon, once I have made the output easier to read.
