Solved: Using the estimates from PROC LOGISTIC in another data set

antor82 · Posted 07-22-2019 07:47 AM

Hi all

I would like to automatically use a value from an output data set in a new DATA step.

Here is an example:

proc logistic data=sample plots=none;
model Y(event=1)=X / outroc=rocX;
ods output ParameterEstimates=param;
run;

/*The param data set has two columns: Variable (with the values Intercept and X in this example) and Estimate*/

I would like to automatically insert the estimates for Intercept and X, taken from the param data set, into the following DATA step:

data rocX2(keep=cutoff prob Sensitivity Specificity Youden);
	set rocX;
	logit=log(_prob_/(1-_prob_));
	cutoff=(logit-/*Intercept*/)//*X*/;
	prob=_prob_;
	Sensitivity=_SENSIT_;
	Specificity=1-_1MSPEC_;
	Youden=_SENSIT_+ (1-_1MSPEC_)-1;
run;

Thanks in advance

PaigeMiller · Posted 07-22-2019 07:56 AM

data rocX2(keep=cutoff prob Sensitivity Specificity Youden);
    if _n_=1 then set param;
	set rocX;
	logit=log(_prob_/(1-_prob_));
	cutoff=(logit-/*Intercept*/)//*X*/;
	prob=_prob_;
	Sensitivity=_SENSIT_;
	Specificity=1-_1MSPEC_;
	Youden=_SENSIT_+ (1-_1MSPEC_)-1;
run;

--
Paige Miller

antor82 · Posted 07-22-2019 08:17 AM

Thank you @PaigeMiller!

What should I replace /*Intercept*/ with?

PaigeMiller · Posted 07-22-2019 08:21 AM

@antor82 wrote:

What should I replace /*Intercept*/ with?

I don't know, since I don't know what you want to do. I don't know what the formula for CUTOFF is supposed to do.

--
Paige Miller

antor82 · Posted 07-22-2019 09:15 AM

From the first PROC logistic I derived two parameter estimates for the model Y=X, namely Intercept and X.

I need to put the value of Intercept and X into the formula
cutoff=(logit-valueofintercept)/betaofX

The values of intercept and of beta for X have been saved in
ods output ParameterEstimates=param

I’ve already done it by hand
The formula became
cutoff=(logit-13.043)/-0.537

But I wonder if I could do this automatically
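
For readers wondering where the cutoff formula comes from, here is a short derivation. It assumes the model fitted by the PROC LOGISTIC step in the first post, with one predictor X. The symbols \beta_0 (the Intercept estimate), \beta_1 (the estimate for X), and x_c (the cutoff on the X scale) are notation introduced here, not taken from the thread.

```latex
\begin{aligned}
\operatorname{logit}(p) &= \log\frac{p}{1-p} = \beta_0 + \beta_1 x_c \\
x_c &= \frac{\operatorname{logit}(p) - \beta_0}{\beta_1}
\end{aligned}
```

So each predicted probability _prob_ in the OUTROC= data set is converted back to the value of X that produces it. Note that the subtraction is done before the division, which is why the parentheses around (logit - intercept) matter.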

Reeza · Posted 07-22-2019 12:02 PM

You likely need to transpose your parameter estimates first - if I recall correctly they come with each estimate on a separate line. Then once you merge them in you'll have them on the same line to do your calculations and can refer to the variables instead of typing out the numbers.

I illustrate this in this code where I show how to verify that the output from logistic regression is correct:
https://communities.sas.com/t5/Statistical-Procedures/How-to-determine-logistic-regression-formula-f...

The Neuralgia2 data set is available in the documentation for PROC LOGISTIC so you can get it to run that code if you'd like.
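
A minimal sketch of the transpose-then-combine idea, using the data set names from the first post. It assumes param holds exactly one row per parameter (PROC TRANSPOSE stops with an error if an ID value repeats), and param_wide and beta1 are names chosen here for illustration.

```sas
/* One row, one column per parameter: the ID statement names the */
/* new columns after the values of Variable (Intercept and X).   */
proc transpose data=param out=param_wide;
    id Variable;
    var Estimate;
run;

data rocX2(keep=cutoff prob Sensitivity Specificity Youden);
    /* Read the single row of estimates once; variables read by SET */
    /* are retained, so they stay available for every row of rocX.  */
    if _n_=1 then set param_wide(rename=(X=beta1));
    set rocX;
    logit=log(_prob_/(1-_prob_));
    cutoff=(logit-Intercept)/beta1;
    prob=_prob_;
    Sensitivity=_SENSIT_;
    Specificity=1-_1MSPEC_;
    Youden=_SENSIT_+(1-_1MSPEC_)-1;
run;
```

Naming the columns with ID avoids depending on the row order of param, which a COL1/COL2 rename would rely on.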

PaigeMiller · Posted 07-22-2019 12:45 PM

@antor82 wrote:
From the first PROC logistic I derived two parameter estimates for the model Y=X, namely Intercept and X.

I need to put the value of Intercept and X into the formula
cutoff=(logit-valueofintercept)/betaofX

The values of intercept and of beta for X have been saved in
ods output ParameterEstimates=param

I’ve already done it by hand
The formula became
cutoff=(logit-13.043)/-0.537

But I wonder if I could do this automatically

data rocX2;
    if _n_=1 then merge param(where=(variable='Intercept') rename=(estimate=intercept))
    param(where=(variable='X') rename=(estimate=beta1));
	set rocX;
	logit=log(_prob_/(1-_prob_));
	cutoff=(logit-intercept)/beta1;
run;

--
Paige Miller

antor82 · Posted 07-22-2019 01:42 PM

Thanks!

antor82 · Posted 07-23-2019 05:15 AM

A small problem...

In the data set param, Intercept and X each appear twice (probably because PROC LOGISTIC displays the parameter estimates twice...)
How can I select only one value each for Intercept and X?

Thanks again

PaigeMiller · Posted 07-23-2019 06:36 AM

I have no idea what you mean by "printed twice". Show me.

--
Paige Miller

antor82 · Posted 07-23-2019 07:31 AM

Thanks for the support!

Below is the result of a PROC PRINT of the data set param (VS_LVEF plays the role of X from the previous posts)

Replicate  Variable   Estimate

1          Intercept   11.5259
1          VS_LVEF     -0.2393
1          Intercept   11.5259
1          VS_LVEF     -0.2393
2          Intercept   24.7332
2          VS_LVEF     -0.4754
2          Intercept   24.7332
2          VS_LVEF     -0.4754
3          Intercept    7.8117
3          VS_LVEF     -0.1531
3          Intercept    7.8117
3          VS_LVEF     -0.1531
4          Intercept   27.1324
4          VS_LVEF     -0.5119
4          Intercept   27.1324
4          VS_LVEF     -0.5119
5          Intercept   17.5613
5          VS_LVEF     -0.3322
5          Intercept   17.5613
5          VS_LVEF     -0.3322

The variable Replicate comes from a bootstrap.

As you can see, Replicate 1 has two Estimate rows for the Variable Intercept and two for VS_LVEF, while I need only one Intercept estimate and one VS_LVEF estimate per replicate. The same holds for the other replicates.

Furthermore, when I run the suggested code

data rocX2;
    if _n_=1 then merge param(where=(variable='Intercept') rename=(estimate=intercept))
    param(where=(variable='X') rename=(estimate=beta1));
	set rocX;
	logit=log(_prob_/(1-_prob_));
	cutoff=(logit-intercept)/beta1;
run;

the result is attached to this reply.

Briefly:

1) The first 39 obs are OK. Obs 40 to 79 duplicate the previous obs for Replicate 1;

2) As you can see, the values of Intercept and Beta are the same for ALL replicates!

Thanks a lot!

A
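
Both symptoms have a mechanical explanation. The statement `if _n_=1 then merge ...` reads param only once, so the estimates from its first matching rows are retained for every row of the ROC data set, whatever the replicate. The repeated rows in param can be dropped with a de-duplicating sort. The sketch below assumes the repeated rows are exact copies within each Replicate; param_unique is a name chosen here for illustration.

```sas
/* Keep the first row for each Replicate/Variable combination. */
/* NODUPKEY discards later rows that share the same BY values. */
proc sort data=param out=param_unique nodupkey;
    by Replicate Variable;
run;

/* Check the result: expect one Intercept row and one VS_LVEF row per replicate. */
proc print data=param_unique;
    var Replicate Variable Estimate;
run;
```

With one row per Replicate and Variable, matching the estimates to the ROC rows still needs a merge BY Replicate rather than a one-time read.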

PaigeMiller · Posted 07-23-2019 08:03 AM

Replicate has never been mentioned in this thread before your last post. Please explain.

Whatever code I gave you is not going to work with a Replicate variable.

It seems as if we need to start from scratch, and you need to show me the code (and part of the data) that contains replicate.

--
Paige Miller

antor82 · Posted 07-23-2019 10:06 AM

I handled it this way, and it worked:

/*put the estimates in a single row per replicate:*/
/*one column for the intercept, another for the beta*/
proc sort data=param out=paramsorted;
	by Replicate;
run;

proc transpose data=paramsorted out=transposed name=Variable;
	by Replicate;
	var Estimate;
run;

data transposed;
	set transposed;
	/*KEEP is applied before RENAME, so it must list the old names*/
	keep Replicate COL1 COL2;
	rename COL1=Intercept COL2=Beta;
run;

/*remove the duplicate rows from the ROC output data set*/
/*(its _SOURCE_ column contains both Model and X)*/
data rocX2;
	set rocX2;
	where _SOURCE_='Model';
run;

Attached is a print of the transposed dataset

Then I performed the desired analysis

data rocX3 (keep=cutoff Sensitivity Specificity Youden intercept beta replicate);
	merge transposed rocX2;
	by replicate;
	logit=log(_prob_/(1-_prob_));
	cutoff=(logit-intercept)/beta;
	prob=_prob_;
	Sensitivity=_SENSIT_;
	Specificity=1-_1MSPEC_;
	Youden=_SENSIT_+ (1-_1MSPEC_)-1;
run;

and this worked too.
