ineedhelpidgi
Calcite | Level 5

Hello, I'm really useless with this stuff but I kind of had to use it so I'm trying to figure things out before I submit my data.

I'm trying to determine whether the (survey)logistic or (survey)freq procedure is more appropriate, along with the appropriate statements to use; however, I'm struggling to understand what I'm reading and what's important.

So far this is my code:

proc surveylogistic data=pain;
  weight weight;
  by sex race;
  class gunownership;
  model suicideinjury(event="injured") = gunownership;
run;

 

proc surveyfreq data=pain;
  weight weight;
  by sex race;
  tables suicideinjury*gunownership / chisq relrisk;
run;

I'm using odds ratios here because the sample size is quite small (some cells are below 5). I'm running into unexpected odds ratios, and I'm wondering whether I've input the data incorrectly, the results are truly just unexpected, or the small sample size is affecting the result. The last two are fine; I just need to make sure it's not my own error, obviously haha. Sample sizes are the same.

My questions:

  1. "Class variable information" shows that "injured" is -1, and "not injured" is 1. This seems counterintuitive to me. I have the variables coded such that all positives are 1 and all negatives are 0 for the 2x2 tables. Should I change my coding or is there something I should do in the class statement for "correct" formatting? Or is this fine?

  2. The surveyfreq and surveylogistic procedures give 2 different ORs, one is below 1 (surveyfreq) and the other is above 1 (surveylogistic). Could someone please explain what's happening here? The "surveyfreq" procedure is the same except without the "event=" statement and obviously instead of "model" I've used "tables".

Thank you in advance for anyone who offers help, I'm sorry for taking up time I just wasn't sure where else to go. 

 
ballardw
Super User

Part of your issue may come from the use of BY groups. The procedure supports BY, but the documentation cautions that BY-group processing analyzes each group completely separately and does not give a statistically valid subpopulation (domain) analysis for a complex sample.

 

If you want to have an analysis in Surveylogistic for subpopulations you want to use DOMAIN instead of BY.

At a guess, you might want to replace your BY sex race with

 

Domain sex race sex*race;

That should provide an analysis within each level of sex and within each level of race; the sex*race term additionally requests an analysis for each combination of the two variables.
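Put together, the original step with DOMAIN in place of BY might look like the sketch below. It assumes the dataset and variable names from the first post and shows no design statements, because none were given in the thread.

```sas
/* Sketch only: names (pain, weight, sex, race, gunownership,
   suicideinjury) are taken from the original post. */
proc surveylogistic data=pain;
  weight weight;
  domain sex race sex*race;   /* subpopulation analysis instead of BY */
  class gunownership;
  model suicideinjury(event="injured") = gunownership;
run;
```

The practical difference is that DOMAIN keeps the whole sample in the variance estimation, whereas BY treats each group as if it were its own independent sample.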

 

 

With Surveyfreq you want to include the subgroup variables in the Tables request instead of in a BY statement. Perhaps

 

tables (sex race sex*race)* ( suicideinjury*gunownership) / chisq relrisk;
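If the grouped request is hard to read, the same idea can be written as separate multiway requests. This is a sketch, assuming the variable names from the first post. In a multiway request the last two variables form the rows and columns and the earlier ones act as layers, so each request should produce one suicideinjury-by-gunownership table per subgroup.

```sas
/* Sketch only: one table request per subgroup definition. */
proc surveyfreq data=pain;
  weight weight;
  tables sex*suicideinjury*gunownership
         race*suicideinjury*gunownership
         sex*race*suicideinjury*gunownership / chisq relrisk;
run;
```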

 

 

What I am not seeing is anything that describes a complex sample design. I would typically expect to see Strata and/or Cluster statements describing the sample structure, and possibly the Rate or Total options on the proc statement.
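As a rough illustration of where those design statements go, here is a sketch. The names design_stratum and design_psu are hypothetical placeholders, since the thread never says which variables (if any) hold the strata and primary sampling units.

```sas
/* Sketch only: design_stratum and design_psu are hypothetical names. */
proc surveylogistic data=pain;
  strata  design_stratum;   /* stratification variable, if the design has one */
  cluster design_psu;       /* primary sampling unit, if the design has one */
  weight  weight;
  class gunownership;
  model suicideinjury(event="injured") = gunownership;
run;
```

If the data really have only a weight and no strata or clusters, these statements are simply left out.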

 

"Class variable information" shows that "injured" is -1, and "not injured" is 1. This seems counterintuitive to me. I have the variables coded such that all positives are 1 and all negatives are 0 for the 2x2 tables. Should I change my coding or is there something I should do in the class statement for "correct" formatting? Or is this fine?

 

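The -1 marks the "base" (reference) level that the other values are compared to. So the odds are "compared to injured". It is a design coding, not a recoding of your data, so you do not need to change your 0/1 values.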
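For what it's worth, the -1/1 pattern is what the default effect parameterization of CLASS variables produces. If 0/1 design values are easier to read, reference coding can be requested as in the sketch below. It assumes the first ordered level of gunownership is the one wanted as the reference, which the thread does not confirm.

```sas
/* Sketch only: ref=first is an assumption about which level is the baseline. */
proc surveylogistic data=pain;
  weight weight;
  class gunownership (ref=first) / param=ref;   /* reference level coded 0 */
  model suicideinjury(event="injured") = gunownership;
run;
```

This changes how the parameter estimates are read (each one becomes a comparison against the reference level), but the odds ratio table should not change with the parameterization.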

 

The surveyfreq and surveylogistic procedures give 2 different ORs, one is below 1 (surveyfreq) and the other is above 1 (surveylogistic). Could someone please explain what's happening here? The "surveyfreq" procedure is the same except without the "event=" statement and obviously instead of "model" I've used "tables".

I don't see an explicit OR option in your Surveyfreq code, though RELRISK should report an odds ratio along with the relative risks. The order of the rows in Surveyfreq may be reversed relative to the Surveylogistic results, which would turn the ratio into its reciprocal (one below 1, the other above 1). If the numbers also differ in the decimals, that often happens because of differences in the internal calculations. If one shows a ratio of 1.333 and the other 1.336 (or similar), consider whether the difference is practically significant.
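To see why a reversed row order can move an odds ratio from one side of 1 to the other, here is a small arithmetic sketch. The counts are made-up numbers for illustration, not values from this thread.

```sas
/* Sketch only: hypothetical 2x2 counts. */
data _null_;
  a = 12;  /* row 1, column 1 */
  b = 30;  /* row 1, column 2 */
  c = 8;   /* row 2, column 1 */
  d = 50;  /* row 2, column 2 */
  or_as_listed    = (a*d) / (b*c);   /* odds ratio with rows in the listed order */
  or_rows_swapped = (c*b) / (d*a);   /* same table with the two rows exchanged */
  product         = or_as_listed * or_rows_swapped;  /* reciprocals multiply to 1, up to rounding */
  put or_as_listed= or_rows_swapped= product=;
run;
```

So if one procedure reports roughly the reciprocal of the other, the two are describing the same association from opposite reference levels.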

 

ineedhelpidgi
Calcite | Level 5

Thank you so much for your response. It was all very helpful and I will adjust my code accordingly to see what happens.

possibly Rate or Total options on the proc statement

I'm not sure what you mean by this however, or what they would result in as data?

ballardw
Super User

The Rate or Total options on the Proc statement give the procedure additional information that describes the population (the sampling rate or the population total). A major result, when they are used properly, is a finite population correction in the variance estimates.
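As a sketch of where the option goes: the value 25000 below is a made-up population size for illustration only. The real figure has to come from the survey documentation, and RATE= would be used instead if the sampling fraction is what is known.

```sas
/* Sketch only: total=25000 is a hypothetical population size. */
proc surveyfreq data=pain total=25000;
  weight weight;
  tables suicideinjury*gunownership / chisq relrisk;
run;
```

The option does not add new output tables; it mainly affects the standard errors and confidence limits through the finite population correction.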
