FlorianM
Fluorite | Level 6

Good afternoon everyone,

 

Here is my problem: I have data with a binary dependent variable (Yes/No), the response to a treatment, and a categorical explanatory variable, the dose level of the treatment (5 classes: [0-2gr], [2-4gr], [4-6gr], [6-8gr] and [8-10gr]).


A descriptive analysis shows that the percentage of positive responses increases as the dose increases, which suggests a linear trend between the response to treatment and the dose.
The Cochran-Armitage test confirms this relationship with a significant p-value (<.0001).
However, I would like to obtain the slope of this linear trend, so that I can say: with each increase in the dose level, the percentage of positive responses increases by X%.
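
For reference, the Cochran-Armitage trend test mentioned above can be requested with the TREND option of PROC FREQ. This is only a sketch: it assumes one row per subject in a data set called analyse (the name used later in this thread), a character variable response holding 'Yes'/'No' (a hypothetical name), and dose_level holding the five dose classes.

```sas
proc freq data=analyse;
  /* 5 x 2 table: TREND requests the Cochran-Armitage test for trend.
     NOCOL and NOPERCENT leave only the row percentages, i.e. the
     percentage of Yes and No within each dose class. */
  tables dose_level*response / trend nocol nopercent;
run;
```

The row percentages in this table are the per-class response percentages described above. The test gives a p-value for the trend but no slope estimate.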


I thought I could get this by using PROC LOGISTIC and declaring the dose level as a continuous variable, but the resulting odds ratio tells me: with each dose increase, the probability of a positive response increases by X%, which is not what I am looking for.

 

I may not be very clear, and I apologize for this, but do you have any idea how to solve my problem?

 

Thank you for your help.

 

PaigeMiller
Diamond | Level 26

I would like to obtain the slope of this linear trend so as to say: With each increase in the dose level, the percentage of positive response increases by X%.

 

With each dose increase, the probability of a positive response increases by X%, which is not what I am looking for.

 

I find these two statements very confusing and contradictory. Please explain further. Provide an example.

 

 

--
Paige Miller
FlorianM
Fluorite | Level 6

Thank you for your interest.

 

So as an example:

 

Response (% of Yes)    Dose level
5                      [0-2gr]
10                     [2-4gr]
15                     [4-6gr]
20                     [6-8gr]
25                     [8-10gr]

 

Here, I would like to say: for each increase in the dose level, the percentage of positive responses increases by 5 percentage points. So the slope of the linear trend would be 5. That is what I want to estimate, but I don't know how.

 

I have done the same analysis with a quantitative dependent variable (dosage of a protein in the blood), and it was simpler.

The program was:

 

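proc glm data=analyse;
  /* dose_level is not in a CLASS statement, so it is treated as a continuous (numeric) predictor */
  model dosage_protein_blood = dose_level;
run;
quit;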

The estimate was 0.36, meaning: for each increase in the dose level, the dosage of the protein in the blood increases by 0.36 points. 0.36 is the slope.

 

Florian

PaigeMiller
Diamond | Level 26

@FlorianM wrote:

Thank you for your interest.

 

So as an example:

 

Response (% of Yes)    Dose level
5                      [0-2gr]
10                     [2-4gr]
15                     [4-6gr]
20                     [6-8gr]
25                     [8-10gr]

 

Here, I would like to say: for each increase in the dose level, the percentage of positive responses increases by 5 percentage points. So the slope of the linear trend would be 5. That is what I want to estimate, but I don't know how.


Simple linear regression, where the x-variable dose level is now an integer (0-2gr represented by the integer 1, etc.) and response is Y.
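
With the illustrative percentages from the example above, a minimal sketch of this regression could look as follows. The data set name summary and the variable names dose_num and pct_yes are made up for illustration; dose_num codes the five dose classes as 1 to 5.

```sas
/* Summarized example data from this thread: one row per dose class */
data summary;
  input dose_num pct_yes;
  datalines;
1 5
2 10
3 15
4 20
5 25
;
run;

/* Simple linear regression of the percentage of Yes on the integer dose code */
proc reg data=summary;
  model pct_yes = dose_num;
run;
quit;
```

Because the five example points lie exactly on a line, the estimated slope for dose_num should be 5, that is, 5 percentage points per dose level. Real data will not fit perfectly, and the slope then describes the average change per level.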

--
Paige Miller
FlorianM
Fluorite | Level 6

Thank you for your response.

 

Originally that is what I wanted to do, but I wonder whether I can do a linear regression with a binary dependent variable. Is there any problem with that?

 

I get a coefficient of 0.0212: does it mean that for each increase in the dose level, the percentage of positive responses increases by 2.12%?

 

Florian

 

 

 

 

PaigeMiller
Diamond | Level 26

It is not a binary dependent variable. The values of 5% and 10% and so on are numeric, not binary.

 

I get a coefficient of 0.0212

 

So this does not pertain to the example you showed earlier?

 

For the data you have shown, I can't explain this. Show your work. Show the data. Show the code. Show the output.

 

For other data, the 0.0212 means that Y increases 0.0212 for every 1 unit change in X.

--
Paige Miller
FlorianM
Fluorite | Level 6

Indeed, it does not correspond to the data from earlier, because I am not authorized to distribute the real data. So I created a fictitious example, but the problem is the same.

 

In the fictitious example, the data would be in this form:

 

ID   Response (Yes/No)   Dosage_protein_blood   Dose_level
1    Yes                 0.23mg/l               [8-10gr]
2    Yes                 0.05mg/l               [4-6gr]
3    No                  0.40mg/l               [0-2gr]
4    No                  1.46mg/l               [2-4gr]
5    No                  1.18mg/l               [6-8gr]
6    Yes                 0.17mg/l               [8-10gr]
7    No                  0.98mg/l               [4-6gr]
8    Yes                 0.08mg/l               [6-8gr]

... (there are thousands of IDs)

 

I would run this to obtain the slope of the linear relation between dosage_protein_blood and dose_level:

proc glm data=analyse;
  model dosage_protein_blood = dose_level;
run;
quit;

And I would like to obtain the slope of the linear relation between the response and dose_level.

 

I hope this is clearer.

 

Florian

PaigeMiller
Diamond | Level 26

But this is substantively different from your earlier example, where the response was a percent between 5 and 25. Now your response is binary. In this case you would want to do a logistic regression, although logistic regression would not compute a linear slope in the probability of the response being Yes. It fits a linear model on the log-odds of the response. This can be converted to a (non-linear) effect on the probability of being Yes. Use PROC LOGISTIC.
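
A minimal sketch of the logistic approach, assuming the individual-level data set is called analyse, response is a character variable with values 'Yes'/'No', and dose_level holds the class labels exactly as written in this thread. The names analyse_num, dose_num, pred and p_yes are hypothetical. The model is linear in the log-odds, log(p/(1-p)) = b0 + b1*dose_num, so the predicted probability p = 1/(1 + exp(-(b0 + b1*dose_num))) does not change by a constant amount per dose level.

```sas
/* Recode the dose classes as integers 1..5 (hypothetical variable dose_num) */
data analyse_num;
  set analyse;
  select (dose_level);
    when ('[0-2gr]')  dose_num = 1;
    when ('[2-4gr]')  dose_num = 2;
    when ('[4-6gr]')  dose_num = 3;
    when ('[6-8gr]')  dose_num = 4;
    when ('[8-10gr]') dose_num = 5;
    otherwise         dose_num = .;   /* unexpected label: leave missing */
  end;
run;

/* Logistic regression modelling the probability that response is 'Yes' */
proc logistic data=analyse_num;
  model response(event='Yes') = dose_num;
  output out=pred p=p_yes;            /* p_yes = predicted probability of Yes */
run;

/* Predicted probability of Yes at each dose level */
proc means data=pred mean;
  class dose_num;
  var p_yes;
run;
```

Here exp(b1) is the odds ratio per one-step increase in dose level. The change in predicted probability between adjacent dose levels can be read from the PROC MEANS output and will generally differ from step to step.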

 

You have earlier described a summarization of the data where the binary Y values are now percents (which are not binary) in each category of X (0-2gr, 2-4gr, etc.). In this summarized data set, you can obtain a linear model of Y using PROC GLM or PROC REG.
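
One way this summarization and fit could be sketched, under the same assumptions about the data (data set analyse, character response with 'Yes'/'No', dose_level with the labels shown in this thread). The names analyse_flag, yes_flag, dose_num, summary, prop_yes and pct_yes are hypothetical.

```sas
/* 0/1 indicator for Yes and integer dose code 1..5 */
data analyse_flag;
  set analyse;
  yes_flag = (response = 'Yes');      /* 1 if Yes, 0 otherwise (missing counts as 0) */
  select (dose_level);
    when ('[0-2gr]')  dose_num = 1;
    when ('[2-4gr]')  dose_num = 2;
    when ('[4-6gr]')  dose_num = 3;
    when ('[6-8gr]')  dose_num = 4;
    when ('[8-10gr]') dose_num = 5;
    otherwise         dose_num = .;
  end;
run;

/* Proportion of Yes in each dose class: the mean of a 0/1 variable */
proc means data=analyse_flag noprint nway;
  class dose_num;
  var yes_flag;
  output out=summary mean=prop_yes;
run;

/* Express the proportion as a percentage */
data summary;
  set summary;
  pct_yes = 100 * prop_yes;
run;

/* Linear model of the percentage of Yes on the dose code */
proc reg data=summary;
  model pct_yes = dose_num;
run;
quit;
```

The slope for dose_num is then in percentage points per dose level. This fit gives each dose class the same weight regardless of how many subjects it contains. Regressing yes_flag directly on dose_num in the individual rows is an alternative that weights by class size and reports the slope on the 0 to 1 proportion scale, where a coefficient such as 0.0212 would correspond to about 2.12 percentage points.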

 

So, it's your choice, which one fits your problem best.

--
Paige Miller
FlorianM
Fluorite | Level 6

Thank you, I will use PROC GLM.
