FlorianM
Fluorite | Level 6

Good afternoon everyone,

 

Here is my problem: I have data with a binary dependent variable (Yes/No), the response to a treatment, and a categorical explanatory variable, the dose level of the treatment (5 classes: [0-2gr], [2-4gr], [4-6gr], [6-8gr] and [8-10gr]).


A descriptive analysis shows that when the dose increases, the percentage of positive responses increases. This suggests a linear trend between the response to treatment and the dose of treatment.
The Cochran-Armitage test confirms this relationship with a significant p-value (<.0001).
However, I would like to obtain the slope of this linear trend so as to say: With each increase in the dose level, the percentage of positive response increases by X%.


I thought I could get this by using PROC LOGISTIC and declaring the dose level as a continuous variable, but the resulting OR tells me: With each dose increase, the probability of a positive response increases by X%, which is not what I am looking for.
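
For reference, the two analyses described above could be run roughly as follows. This is only a sketch: it assumes the data set is named analyse (as in the PROC GLM call later in the thread), that response is a character variable with values 'Yes' and 'No', and that dose_num is a hypothetical numeric recoding of the dose class (1 = [0-2gr] up to 5 = [8-10gr]).

```sas
/* Cochran-Armitage trend test on the dose-by-response table.   */
/* dose_num is a hypothetical numeric dose class coded 1 to 5.  */
proc freq data=analyse;
  tables dose_num*response / trend;
run;

/* Logistic regression with dose level entered as a continuous  */
/* predictor. The reported odds ratio is per one-level increase */
/* in dose_num and applies to the odds, not the probability.    */
proc logistic data=analyse;
  model response(event='Yes') = dose_num;
run;
```

The odds ratio from the second step describes a multiplicative change in the odds of Yes per dose level, which is why it does not read directly as a change in the percentage of Yes.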

 

I may not be very clear and I apologize for this, but do you have any idea how to solve my problem?

 

Thank you for your help.

 

PaigeMiller
Diamond | Level 26

"I would like to obtain the slope of this linear trend so as to say: With each increase in the dose level, the percentage of positive response increases by X%."

"With each dose increase, the probability of a positive response increases by X%, which is not what I am looking for."

 

I find these two statements very confusing and contradictory. Please explain further. Provide an example.

 

 

--
Paige Miller
FlorianM
Fluorite | Level 6

Thank you for your interest.

 

So as an example:

 

Response (% of Yes)    Dose level
5                      [0-2gr]
10                     [2-4gr]
15                     [4-6gr]
20                     [6-8gr]
25                     [8-10gr]

 

Here, I would like to say: For each increase in the dose level, the percentage of positive response increases by 5%. So the slope of the linear trend would be 5. That is what I want to estimate, but I don't know how.

 

I have done the same analysis with a quantitative dependent variable (dosage of a protein in the blood) and it was simpler.

The program was:

/* Dose level enters as a numeric (continuous) predictor */
proc glm data=analyse;
  model dosage_protein_blood = dose_level;
run;
quit;

The estimate was 0.36, meaning: for each increase in the dose level, the dosage of the protein in the blood increases by 0.36 points. 0.36 is the slope.

 

Florian

PaigeMiller
Diamond | Level 26

@FlorianM wrote:

Thank you for your interest.

So as an example:

Response (% of Yes)    Dose level
5                      [0-2gr]
10                     [2-4gr]
15                     [4-6gr]
20                     [6-8gr]
25                     [8-10gr]

Here, I would like to say: For each increase in the dose level, the percentage of positive response increases by 5%. So the slope of the linear trend would be 5. That is what I want to estimate, but I don't know how.


Simple linear regression, where the x-variable dose level is now an integer (0-2gr represented by the integer 1, etc.) and response is Y.
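
A minimal sketch of that regression, using the five example rows quoted above. The data set name summary and the variable names dose_num and pct_yes are made up for illustration.

```sas
/* Example summary data: one row per dose class.           */
/* dose_num: 1 = [0-2gr], 2 = [2-4gr], ..., 5 = [8-10gr]   */
/* pct_yes:  percentage of Yes responses in that class     */
data summary;
  input dose_num pct_yes;
  datalines;
1 5
2 10
3 15
4 20
5 25
;
run;

/* Simple linear regression of the percentage on the integer dose level */
proc reg data=summary;
  model pct_yes = dose_num;
run;
quit;
```

Because these five points lie exactly on a straight line, the estimated slope for dose_num is 5: five percentage points of Yes per one-level increase in dose.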

--
Paige Miller
FlorianM
Fluorite | Level 6

Thank you for your response.

 

Originally that is what I wanted to do, but I wonder whether I can run a linear regression with a binary dependent variable. Is there any problem with that?

I get a coefficient of 0.0212: does it mean that for each increase in the dose level, the percentage of positive response increases by 2.12%?

 

Florian

 

 

 

 

PaigeMiller
Diamond | Level 26

It is not a binary dependent variable. The values of 5% and 10% and so on are numeric, not binary.

 

"I get a coefficient of 0.0212"

 

So this does not pertain to the example you showed earlier?

 

For the data you have shown, I can't explain this. Show your work. Show the data. Show the code. Show the output.

 

For other data, the 0.0212 means that Y increases 0.0212 for every 1 unit change in X.
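
To illustrate that reading for a Yes/No outcome, here is a rough sketch of a linear regression on individual-level data with the response coded 0/1. It assumes a data set analyse in which response is a character variable ('Yes'/'No') and dose_level is a character variable holding the class labels; analyse_num, resp_num and dose_num are hypothetical names introduced here.

```sas
/* Recode the response to 0/1 and the dose class to an integer 1 to 5. */
data analyse_num;
  set analyse;
  resp_num = (response = 'Yes');      /* 1 for Yes, 0 otherwise */
  select (dose_level);
    when ('[0-2gr]')  dose_num = 1;
    when ('[2-4gr]')  dose_num = 2;
    when ('[4-6gr]')  dose_num = 3;
    when ('[6-8gr]')  dose_num = 4;
    when ('[8-10gr]') dose_num = 5;
    otherwise         dose_num = .;   /* unexpected label: set to missing */
  end;
run;

/* Linear regression of the 0/1 response on the integer dose level. */
proc reg data=analyse_num;
  model resp_num = dose_num;
run;
quit;
```

With Y coded 0/1, the fitted values are estimated proportions of Yes, so a slope of 0.0212 in a fit like this would correspond to an increase of about 2.12 percentage points per dose level. The usual standard errors from such a fit rest on a constant-variance assumption that a binary outcome does not satisfy, so the p-values should be read with caution.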

--
Paige Miller
FlorianM
Fluorite | Level 6

Indeed, it does not correspond to the data from earlier, because I am not authorized to distribute them. So I created a fictitious example, but the problem is the same.

In the fictitious example, the data would be in this form:

 

ID   Response (Yes/No)   Dosage_protein_blood   Dose_level
1    Yes                 0.23mg/l               [8-10gr]
2    Yes                 0.05mg/l               [4-6gr]
3    No                  0.40mg/l               [0-2gr]
4    No                  1.46mg/l               [2-4gr]
5    No                  1.18mg/l               [6-8gr]
6    Yes                 0.17mg/l               [8-10gr]
7    No                  0.98mg/l               [4-6gr]
8    Yes                 0.08mg/l               [6-8gr]

... (there are thousands of IDs)

 

I would run this to obtain the slope of the linear relation between dosage_protein_blood and dose_level.

proc glm data=analyse;
  model dosage_protein_blood = dose_level;
run;
quit;

And I would like to obtain the slope of the linear relation between the response and dose_level.

I hope this is clearer.

 

Florian

PaigeMiller
Diamond | Level 26

But this is substantively different from your earlier example, where the response was a percent between 5 and 25. Now your response is binary. In this case you would want a logistic regression, although logistic regression does not compute a linear slope in the probability of the response being Yes. It fits a model that is linear in the log-odds of the response. This can be converted to a (non-linear) effect on the probability of being Yes. Use PROC LOGISTIC.
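
To make that conversion concrete, the model with a single numeric predictor can be written as below. The symbols are chosen here for illustration: $x$ is the dose level coded as an integer, $p(x)$ is the probability of Yes at that level, and $\beta_0$, $\beta_1$ are the intercept and slope on the log-odds scale.

```latex
\[
  \log\frac{p(x)}{1 - p(x)} = \beta_0 + \beta_1 x
  \quad\Longrightarrow\quad
  p(x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}
\]
```

The odds ratio for a one-level increase, $e^{\beta_1}$, is the same at every dose level, but the change in probability $p(x+1) - p(x)$ depends on $x$, so there is no single "X% per dose level" to read off.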

 

You have earlier described a summarization of the data where the binary Y values are now percents (which are not binary) in each category of X (0-2gr, 2-4gr, etc.) In this summarized data set, you can obtain a linear model of Y using PROC GLM or PROC REG.

 

So, it's your choice, which one fits your problem best.
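
For the second option, one possible way to build the summarized data set and fit the line is sketched below. It assumes an individual-level data set analyse_num with a numeric resp_num (1 = Yes, 0 = No) and a numeric dose_num (1 to 5); those names, and summary, prop_yes and pct_yes, are hypothetical.

```sas
/* One row per dose class, holding the proportion of Yes responses.  */
/* NWAY keeps only the per-class rows and drops the overall row.     */
proc means data=analyse_num noprint nway;
  class dose_num;
  var resp_num;
  output out=summary mean=prop_yes;
run;

/* Express the proportion as a percentage. */
data summary;
  set summary;
  pct_yes = 100 * prop_yes;
run;

/* Linear model of the class percentages on the integer dose level.  */
proc glm data=summary;
  model pct_yes = dose_num;
run;
quit;
```

This unweighted fit gives each dose class the same influence regardless of how many subjects it contains. The output data set from PROC MEANS also carries the class sizes in _FREQ_, which could be supplied in a WEIGHT statement if classes of very different sizes are a concern.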

--
Paige Miller
FlorianM
Fluorite | Level 6

Thank you, I will use PROC GLM.
