Logistic analysis of cumulative percentages

07-08-2016 12:34 PM - edited 07-08-2016 02:18 PM

Hi everyone, I am trying to analyze cumulative percentages and get a fitted curve. Here is what my data look like:

| n | n_cum | pct | time |
|---|---|---|---|
| 0 | 0 | 0 | 15 |
| 2 | 2 | 0.05 | 45.5 |
| 10 | 12 | 0.3 | 75.5 |
| 17 | 29 | 0.725 | 120.5 |
| 4 | 33 | 0.825 | 165.5 |
| 5 | 38 | 0.95 | 225.5 |
| 2 | 40 | 1 | 318 |
| 0 | 40 | 1 | 410 |

data have;
input n n_cum pct time;
datalines;
0 0 0 15
2 2 0.05 45.5
10 12 0.3 75.5
17 29 0.725 120.5
4 33 0.825 165.5
5 38 0.95 225.5
2 40 1 318
0 40 1 410
;

What I would like to do is fit a curve to pct (defined as n_cum/40) over time. My initial attempt was to run a logistic regression in events/trials form, where trials is a variable equal to 40:

data have; set have; trials=40; run;

proc logistic data=have plots(only)=effect;
model n_cum/trials=time / rsquare;
run;

Here is my output.

My question to everyone is this: given that the observations at each time point are not independent (40 separate trials were not performed at each time point; the same 40 subjects are counted cumulatively), is this still a valid way to run the analysis? My initial thought is that the fitted curve is reasonable, but the standard errors, the corresponding confidence intervals, and the inferential tests are not.
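To make the dependence concrete, here is a minimal sketch that rebuilds the cumulative count from the per-interval count n. The data set name have_check and the variables n_cum_check and pct_check are made up for illustration. Each row's cumulative count contains every event already counted in the rows before it, which is why the rows cannot be treated as independent binomial samples of 40 trials each.

```sas
/* Sketch only: have_check, n_cum_check and pct_check are illustrative names. */
data have_check;
  set have;
  n_cum_check + n;              /* sum statement: running total, retained across rows */
  pct_check = n_cum_check / 40; /* 40 = total number of subjects, as in the post */
run;

proc print data=have_check;
  var time n n_cum n_cum_check pct pct_check;
run;
```

The n_cum_check column should reproduce n_cum, confirming that the only new information at each time point is n.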

Thoughts?


Posted in reply to spirto

07-08-2016 11:13 PM

Why would you do that? PROC LOGISTIC is meant for discrete (categorical) responses, not continuous data. If this is not a time series analysis, you can use PROC LOESS, PROC ADAPTIVEREG, or some other nonparametric regression method.
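As a rough sketch of the PROC LOESS suggestion, assuming the have data set from the original post: the smoothing parameter 0.6 is an arbitrary choice for illustration, not a recommendation, and with only eight points the fit will be sensitive to it.

```sas
/* Sketch only: smooth=0.6 is an arbitrary illustrative value. */
proc loess data=have;
  model pct = time / smooth=0.6;
run;
```

Note that a loess fit is not constrained to stay between 0 and 1 or to be monotone increasing, so it describes the shape of the curve without treating pct as a cumulative distribution.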


Posted in reply to spirto

07-09-2016 12:28 AM

You are right. What you are trying to estimate is the distribution of *time*. One way to do such an estimation is survival (or reliability) analysis.

```
data have;
input n n_cum pct time;
datalines;
0 0 0 15
2 2 0.05 45.5
10 12 0.3 75.5
17 29 0.725 120.5
4 33 0.825 165.5
5 38 0.95 225.5
2 40 1 318
0 40 1 410
;
proc lifereg data=have;
model time = / distribution=llogistic;
weight n;
probplot / pupper=99.5 plower=0.5 ppout;
inset scale;
run;
```

You would probably get a better fit if your times were not so coarsely binned.

PG
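One possible way to account for the coarse binning mentioned above is to treat each event as interval-censored rather than as occurring exactly at the recorded time. The sketch below rests on an assumption the thread does not confirm: that the n events recorded at a given time occurred somewhere between the previous recorded time and that time. The data set name intervals and the variables lower and upper are made up for illustration.

```sas
/* Sketch only. ASSUMPTION: the n events in a row happened between the
   previous row's time and this row's time. */
data intervals;
  set have;
  lower = lag(time);  /* previous recorded time; LAG runs on every row */
  upper = time;
  if n > 0;           /* keep only rows that contribute events */
run;

proc lifereg data=intervals;
  /* (lower, upper) syntax requests an interval-censored response */
  model (lower, upper) = / distribution=llogistic;
  weight n;           /* same weighting as in the reply above */
run;
```

If the recorded times are instead bin midpoints or exact event times, the interval bounds would need to be defined differently.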