How do I build a logistic model for each categorical value of some categorical variable of a dataset

Occasional Contributor
Posts: 10

proc logistic data=BTS201506 ; 
class Carrier ; 
model DepDelayInd(Descending) = CRSDepTime seqnum DepDelayLagInd DepDelayLag DepDelayLagCum ArrDelayLagInd ArrDelayLag ArrDelayLagCum DepDelayLag2 ArrDelayLag2;
where cancelled=0 ;
run ;

So basically, Carrier is my categorical variable. I want to partition the dataset by the values of Carrier and then build a separate logistic regression for each value.

But the code above does not work.

Help


Accepted Solutions
Solution
‎04-13-2016 07:09 PM
Grand Advisor
Posts: 10,251

Re: How do I build a logistic model for each categorical value of some categorical variable of a dat

Instead of Class Carrier use By Carrier. This will do a complete separate analysis for each level of Carrier. The data must be sorted by Carrier first though.
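
A minimal sketch of that change, using the data set and variable names from the question. The output data set name BTS201506_sorted is an assumption for illustration. A BY variable may be character, so Carrier needs no conversion; it is left off both the CLASS and MODEL statements here.

```sas
/* Sort so that rows for each carrier are contiguous (required for BY-group processing). */
proc sort data=BTS201506 out=BTS201506_sorted;  /* output name is hypothetical */
   by Carrier;
run;

/* BY Carrier fits one separate model per carrier level. */
proc logistic data=BTS201506_sorted;
   by Carrier;
   where cancelled = 0;
   model DepDelayInd(descending) = CRSDepTime seqnum DepDelayLagInd DepDelayLag
         DepDelayLagCum ArrDelayLagInd ArrDelayLag ArrDelayLagCum
         DepDelayLag2 ArrDelayLag2;
run;
```

The output should then contain one set of model results per value of Carrier.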


All Replies
Occasional Contributor
Posts: 10

Re: How do I build a logistic model for each categorical value of some categorical variable of a dat

But Carrier is a character variable.

Occasional Contributor
Posts: 10

Re: How do I build a logistic model for each categorical value of some categorical variable of a dat

When I use By Carrier, this is the error message:

ERROR: Variable Carrier should be either numeric or specified in the CLASS statement

Trusted Advisor
Posts: 1,203

Re: How do I build a logistic model for each categorical value of some categorical variable of a dat

First sort your data set by Carrier, then use the sorted data set in the logistic regression with:

By Carrier;

Occasional Contributor
Posts: 10

Re: How do I build a logistic model for each categorical value of some categorical variable of a dat

Actually it worked.

My final question is: it seems that the "where cancelled = 0" statement has no effect.

I'm still getting warnings like these in the output window:

"Note: 2898 observations were deleted due to missing values for the response or explanatory variables."

But that shouldn't be the case, because all the missing values are where cancelled = 1.

By stating where cancelled = 1, I am selecting those observations without missing values.

Grand Advisor
Posts: 10,251

Re: How do I build a logistic model for each categorical value of some categorical variable of a dat


Try running this code and see what you get:

proc freq data=BTS201506;
   tables cancelled*(DepDelayInd CRSDepTime seqnum DepDelayLagInd DepDelayLag DepDelayLagCum ArrDelayLagInd ArrDelayLag ArrDelayLagCum DepDelayLag2 ArrDelayLag2) / list missing;
run;

See if you have any rows with cancelled=0 and something else missing. Or possibly you really meant to keep cancelled=1?

Occasional Contributor
Posts: 10

Re: How do I build a logistic model for each categorical value of some categorical variable of a dat

There was a typo in my previous post. I meant "where cancelled = 0", not 1.

However, this was a non-issue and you are right. There are indeed still missing values in rows where the cancelled variable is 0.

Grand Advisor
Posts: 10,251

Re: How do I build a logistic model for each categorical value of some categorical variable of a dat

The procedure, like most of the regression procedures, excludes any record with a missing value for any variable on the MODEL statement. It may be that some variables are mis-coded: for example, a recode of missing to 0 (or similar) that was intended but skipped.
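
One possible way to see how many non-cancelled rows are affected is to count missing values per row across the response and explanatory variables. This is only a sketch: the data set name work.missing_check and the variable n_missing are assumptions made up for illustration, and the variable list is copied from the MODEL statement in the question.

```sas
/* Count, for each non-cancelled row, how many model variables are missing. */
data work.missing_check;   /* hypothetical output data set name */
   set BTS201506;
   where cancelled = 0;
   /* CMISS counts missing arguments, whether numeric or character. */
   n_missing = cmiss(of DepDelayInd CRSDepTime seqnum DepDelayLagInd DepDelayLag
                        DepDelayLagCum ArrDelayLagInd ArrDelayLag ArrDelayLagCum
                        DepDelayLag2 ArrDelayLag2);
run;

/* Rows with n_missing > 0 are the ones the regression would drop. */
proc freq data=work.missing_check;
   tables n_missing;
run;
```

If every row shows n_missing = 0, the deleted observations would have to come from somewhere other than these variables.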

Grand Advisor
Posts: 17,462

Re: How do I build a logistic model for each categorical value of some categorical variable of a dat

@junlue it sounds like you have your question answered. Please mark the appropriate solution as the correct answer.

Grand Advisor
Posts: 17,462

Re: How do I build a logistic model for each categorical value of some categorical variable of a dat

If it doesn't work, please post your code AND log.

You're likely putting something in the wrong place; using a BY statement is the correct answer.

 
