Fixed effects dummy variables industry and time

Posted 03-25-2017 04:47 AM
(4625 views)

Hello,

I have the following code for my logistic regression. However, I need to add industry fixed effects and year fixed effects using dummy variables. Could someone help me with this? I am working with longitudinal data covering 4 years. I used the STRATA statement, but all the dummy variables are dropped because of redundancy. Is that correct? Two of the dummy variables do not change over time, whereas one does.

Thank you very much in advance.

Proc logistic data=Exam.Alltables

plots(only)=(effect oddsratio);

Strata Year Hight_sensitive EPA Emissions_trading;

class High_sensitive (param= ref ref=' No ') EPA (param=ref ref='No ') Emissions_trading (param=ref ref='No ') ;

model disclosure (event= '1 ')= High_sensitive Div_emissions EPA Emissions_trading Assets_LN /

selection=backward;

run;

8 REPLIES


Hi,

Without some sample data, it is difficult to tell for sure whether the variables should be dropping out. However, I did notice one thing in the code that I wanted to ask about.

In the "strata" statement you have the following:

Strata Year Hight_sensitive EPA Emissions_trading;

but in the class and model statements you have:

class High_sensitive ...

Is it possible that a simple typo, "Hight_sensitive" rather than "High_sensitive", is the source of your issues?

Best of luck!

Ryan


Hello Ryan,

Thanks for your answer. Yes, I mistyped the variable High_sensitive here, but not in the SAS program.

In fact, I tried the following code for the year fixed effect after reviewing the book Fixed Effects Regression Methods for Longitudinal Data Using SAS.

Proc logistic data=Exam.Alltables Desc;

class Year /PARAM=REF ;

model disclosure = Year Div_emissions Emissions_trading_num Assets_LN High_sensitive_num EPA_num;

STRATA Year;

run;

However, my concern is the firm fixed effect. There are around 600 companies in each of the 4 years. With the code below, the results are very different: all the variables are dropped.

Proc logistic data=Exam.Alltables Desc;

class Company_ID /PARAM=REF ;

model disclosure = Company_ID Div_emissions Emissions_trading_num Assets_LN High_sensitive_num EPA_num;

STRATA Company_ID;

run;

Please find enclosed sample data


I'm not 100% sure of my answer, so take it with a grain of salt, but I think you should have one fewer dummy variable (k-1) than the number of levels (k) in the variable you want fixed effects for.

I.e. 4 years -> 3 year dummy variables (i.e. the 1 year not assigned a dummy variable is =0,0,0 for the 3 dummies),

600 firms -> 599 firm dummy variables (the 1 firm not assigned a dummy variable has values of 0 for all 599 of the firm dummy variables),

etc.

You would put the names of the dummy variables on both your CLASS line of code and your MODEL line of code.

i.e. if your year dummies are y1 to y3, and firm dummies are f1 to f599:

CLASS y1 y2 y3 f1...f599 (and any other categorical explanatory variables) / PARAM=REF ;

MODEL disclosure = Div_emissions etc. etc. y1 y2 y3 f1...f599 ;

I wouldn't include Company_ID above because that's not an explanatory variable in the regression model. However, the firm fixed effects DO explain, so they ARE included.

I found some code on how to automate the dummy variable creation process in SAS:

https://blogs.sas.com/content/iml/2016/02/24/create-a-design-matrix-in-sas.html
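As a possible shortcut, here is a minimal sketch in which the CLASS statement builds the k-1 reference-coded dummies itself, so they need not be created by hand. The data set and variable names come from the code earlier in this thread, except `Industry`, which is a hypothetical name for whatever variable holds the industry code. With PARAM=REF, each CLASS variable gets one design column per level except the reference level (the last level by default).

```sas
/* Sketch only: Industry is a hypothetical variable name, not from the thread. */
proc logistic data=Exam.Alltables descending;
   /* PARAM=REF creates k-1 dummy columns for each CLASS variable; */
   /* the omitted (reference) level is the last one by default.    */
   class Year Industry / param=ref;
   model disclosure = Year Industry Div_emissions Assets_LN;
run;
```

This treats year and industry as ordinary covariates in an unconditional model. With around 600 firms, doing the same for a firm identifier would mean estimating hundreds of extra parameters, which is the situation the STRATA statement is meant to avoid.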

J.J.

Our lives are enriched by the people around us.


I am interested in your statement that "All the variables are dropped out." Are you using some sort of variable selection method? If so, I am not surprised that all of the variables drop out, as there are more variables than observations per Company_ID: there are only four years per company, but you are trying to fit coefficients for five variables, and the STRATA statement does take effect. This statement in the documentation of the STRATA statement applies:

STRATA variables can also be specified in the MODEL statement as classification or continuous covariates; however, the effects are nondegenerate only when crossed with a nonstratification variable.

In this case the effects are degenerate.
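A minimal sketch of one way to avoid the degenerate effect, assuming the goal is a firm fixed effect through conditional logistic regression: name Company_ID only in the STRATA statement and leave it out of the CLASS and MODEL statements. The variable names are taken from the code earlier in the thread. Which covariates actually vary within a company is an assumption here and should be checked against the data.

```sas
/* Sketch only: conditional logistic regression, one stratum per firm. */
proc logistic data=Exam.Alltables descending;
   class Year / param=ref;      /* year dummies vary within a firm */
   strata Company_ID;           /* firm effect is conditioned out, not estimated */
   /* Company_ID is deliberately absent from MODEL. List only covariates */
   /* that change over time within a firm (an assumption to verify).     */
   model disclosure = Year Div_emissions Emissions_trading_num Assets_LN;
run;
```

In a conditional fit, any covariate that is constant within every company carries no within-stratum information, so its coefficient cannot be estimated. Companies whose disclosure value never changes across the four years likewise contribute nothing to the conditional likelihood.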

SteveDenham


If I understand correctly, we use Stratification __AND / OR__ Matching (like Propensity Score Matching, etc.) to deal with CONFOUNDING, right?

If we're using BOTH Strata __AND__ Matching, it's called a Conditional Logistic regression? Is that correct?

For example, if testing a drug (given/not given--treatment variable) to treat a patient (health improves/doesn't improve--the dependent variable), you could match by smoker (yes/no--1 matching criterion), eats healthy (yes/no--another matching criterion), and exerciser (yes/no--another matching criterion). Is that right? Are these matching criteria also called strata?

And what does being degenerate and non-degenerate mean in this context?

Thanks,

J.J.

Our lives are enriched by the people around us.


I'll take a try at the last part about degenerate/non-degenerate. Suppose you list a group of variables as STRATA variables and include the exact same variables in the MODEL statement (without interactions, etc.). While fitting the model, you would then have a design matrix in which the levels of the STRATA variables make the matrix singular. To find out about this, add the CHECKDEPENDENCY option to the STRATA statement, probably with the =COVARIATES or =ALL keyword. Covariates that are dependent on the strata variables are then eliminated from the analysis, which identifies the degenerate variables.
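As a rough illustration, here is the year model posted earlier in the thread with the option added. This is a sketch, not tested against the actual data: Year appears in both the STRATA and MODEL statements, so the dependency check would be expected to flag the Year effect.

```sas
/* Sketch only: same model as the earlier year example, plus the dependency check. */
proc logistic data=Exam.Alltables descending;
   class Year / param=ref;
   /* CHECKDEPENDENCY=ALL tests dependence among the strata and the covariates; */
   /* CHECKDEPENDENCY=COVARIATES checks the covariates only.                    */
   strata Year / checkdependency=all;
   model disclosure = Year Div_emissions Emissions_trading_num Assets_LN
                      High_sensitive_num EPA_num;
run;
```

With a large number of strata (for example, one per company), the ALL check may slow the procedure noticeably, so it is probably best used as a one-off diagnostic.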

Does that make any sense at all?

SteveDenham
