- Re: Case counts with logistic regression

Posted 10-13-2021 01:44 PM
(982 views)

I am running a logistic regression on each of 1714 variables (PheWAS). I followed this guide (https://blogs.sas.com/content/iml/2017/02/13/run-1000-regressions.html) to run the regressions the "BY way," that is, with one PROC LOGISTIC call and a BY statement.

In my final table, I would like to have the number of cases for each predictor (the predictor/exposure is a SNP (genetic variant) yes/no). In my final logistic table I have removed the reference row. Each row is one logistic regression and unique on varname.

The table that I get:

| Varname | p-value | odds ratio |
|---|---|---|
| _001 | .002 | 10.2 |
| _002 | .6 | 1 |

The table that I want:

| Varname | p-value | odds ratio | cases_SNP_yes | cases_SNP_no |
|---|---|---|---|---|
| _001 | 0.002 | 10.2 | 100 | 5 |
| _002 | 0.6 | 1.0 | 30 | 30 |

I currently get the case counts by running a PROC MEANS step on the input data set (264,000 patients, one row per patient per variable, with a column that indicates exposure) and then merging the result with the logistic output by varname. I then repeat the step to get the number of cases for the other level of the predictor. This takes a long time, and I suspect there is a better way. Is there an option in PROC LOGISTIC that produces these counts?

Sample code is below

```
* Code for how I get my logistic table;
proc logistic data=have alpha=0.00002927; * ALPHA= is a PROC statement option, so no slash;
  by VarName; * this is the "BY way": one model per VarName;
  class SNP;
  model value = SNP / rsq expb;
  ods output ParameterEstimates=model;
run;

* Keep only the SNP row, which contains the p-value;
data model_formated;
  set model (rename=(expest=odds_ratio));
  where variable = 'SNP';
run;

* Number of cases (sum of the 0/1 response) among SNP=1, per VarName;
proc means data=have sum;
  by VarName;
  where SNP = 1;
  var value;
  output out=cases sum=count;
run;

* Attach the case counts to the regression results;
data logistic_with_counts;
  merge model_formated cases(keep=VarName count);
  by VarName;
run;
```

Accepted Solutions

You should always specify the EVENT= response option to be sure you are modeling the probability of the event level and not the nonevent level. For example: model value(event="Yes") = ... . The numbers of cases (events) and nonevents are in the Response Profile table, which is displayed automatically. You can save that table by adding ResponseProfile to your ODS OUTPUT statement.
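
As a minimal sketch of this suggestion applied to the code in the question: it assumes the response `value` is coded 1/0 (as stated later in the thread), so the event level is written as `event='1'`. The data set name `profile` is a hypothetical choice for illustration. Run PROC CONTENTS or PROC PRINT on `profile` to confirm its column names before merging.

```sas
* Sketch only: save the Response Profile table alongside the estimates;
proc logistic data=have alpha=0.00002927;
  by VarName;
  class SNP;
  model value(event='1') = SNP / rsq expb;   * model the probability of value=1;
  ods output ParameterEstimates=model
             ResponseProfile=profile;        * profile is a hypothetical name;
run;

* Inspect the saved table to see how events and nonevents are stored;
proc print data=profile(obs=10);
run;
```

Note that the Response Profile reports the counts of events and nonevents for each BY group as a whole; it does not split them by SNP level.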

6 Replies

In my final table, I would like to have the number of cases for each class (exposure is one of two drugs - studyrx is the column name). In my final logistic table, I have removed the reference row. Each row is one logistic regression and unique on varname.

| Varname | p-value | odds ratio |
|---|---|---|
| _001 | .002 | 10.2 |
| _002 | .6 | 1 |

I ask for clarification here. What do you mean by "number of cases"? What do you mean by "each class"? Can you show us the table you would like, even if the numbers are fake, and explain where the real numbers come from?

As for the overall problem that it takes too long: what are you going to do with these 1714 logistic regression results once you have them? There may be smarter ways to reach that goal than speeding up 1714 regressions.

--

Paige Miller

Paige Miller

I am trying to run a PheWAS (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4666492/) in SAS; for various reasons, I can't run it in R. Running many separate regressions is therefore the procedure.

Each variable has a response 1=yes and 0=no. The number of cases is the number of "yeses". I want to know the number of "yeses" broken down by each predictor. I don't __need__ to know this, but displaying this information is the standard.

Number of cases can be computed via PROC FREQ and then added into the PROC LOGISTIC output.

With >1700 variables, the logistic regressions should take a while, and I am not aware of a method to speed this up, as you are using the fastest method I know of.
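
A minimal sketch of the PROC FREQ approach, offered as one possible way to get both counts in a single pass rather than as a definitive implementation. It assumes `value` and `SNP` are numeric 1/0 variables and that `have` is already sorted by VarName (as the BY statement in PROC LOGISTIC requires). The data set names `case_counts` and `cases_wide` are hypothetical; `model_formated` is the data set from the question.

```sas
* Count cases (value=1) for each SNP level within each VarName;
proc freq data=have noprint;
  by VarName;
  where value = 1;
  tables SNP / out=case_counts(drop=percent);
run;

* Reshape to one row per VarName with one column per SNP level;
* with SNP coded 1/0 the new columns are cases_SNP_1 and cases_SNP_0;
proc transpose data=case_counts
               out=cases_wide(drop=_name_ _label_)
               prefix=cases_SNP_;
  by VarName;
  id SNP;
  var count;
run;

* Attach both counts to the regression results;
data logistic_with_counts;
  merge model_formated cases_wide;
  by VarName;
run;
```

If an SNP level has no cases for a given VarName, that column is missing rather than zero in `cases_wide`, so it may need to be set to 0 after the merge.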

--

Paige Miller

Paige Miller

That is exactly what I was looking for. Thank you.

Consider adding the SIMPLE option to your PROC LOGISTIC statement and then capturing its output in an ODS OUTPUT statement as well.