spirto
Obsidian | Level 7

Hi everyone, I am trying to analyze cumulative percentages and fit a curve to them. Here is what my data looks like:

 

n n_cum pct time
0 0 0 15
2 2 0.05 45.5
10 12 0.3 75.5
17 29 0.725 120.5
4 33 0.825 165.5
5 38 0.95 225.5
2 40 1 318
0 40 1 410

 

data have;
    input n n_cum pct time;
datalines;
0    0    0    15
2    2    0.05    45.5
10    12    0.3    75.5
17    29    0.725    120.5
4    33    0.825    165.5
5    38    0.95    225.5
2    40    1    318
0    40    1    410
;

 

What I would like to do is fit a curve to pct (defined as n_cum/40) over time. My initial attempt was to run a logistic regression, where trials is a variable equal to 40:

 

data have; set have; trials=40; run;

 

proc logistic data=have plots(only)=effect;
    model n_cum/trials=time / rsquare;
run;

 

Here is my output.

 

fit.png

 

My question to everyone: given that the observations at the different time points are not independent (a fresh set of 40 trials was not performed at each time point; the same 40 are accumulated over time), is this still a valid way to run the analysis? My initial thought is that the model fit is correct but the standard errors and the corresponding confidence intervals and inferential tests are not.
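
To make the dependence concrete, here is a minimal sketch that rebuilds the cumulative counts from the per-interval counts n. The data set name check and the variables n_cum_check and pct_check are hypothetical names used only for this illustration; the total of 40 comes from the data above. Each cumulative count contains all of the earlier counts, which is why the rows are not independent binomial samples.

```sas
/* Sketch: rebuild the cumulative counts from the per-interval counts n.   */
/* check, n_cum_check and pct_check are hypothetical names for this demo.  */
data check;
    set have;
    n_cum_check + n;                /* sum statement: running total, retained across rows */
    pct_check = n_cum_check / 40;   /* 40 = total count, taken from the data above */
run;

/* Compare the rebuilt columns with the ones supplied in the data. */
proc print data=check noobs;
    var time n n_cum n_cum_check pct pct_check;
run;
```

If the rebuilt columns match n_cum and pct, every row after the first reuses the events already counted in the previous rows.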

 

Thoughts?

2 REPLIES
Ksharp
Super User
Why would you do that? PROC LOGISTIC is meant for discrete (categorical) responses, not continuous data.
If this is not a time series analysis, you could use PROC LOESS, PROC ADAPTIVEREG, or
some other nonparametric regression model.

PGStats
Opal | Level 21

You are right. What you are trying to estimate is the distribution of time. One way to do such estimation is survival (or reliability) analysis. 

 


data have;
    input n n_cum pct time;
datalines;
0    0    0    15
2    2    0.05    45.5
10    12    0.3    75.5
17    29    0.725    120.5
4    33    0.825    165.5
5    38    0.95    225.5
2    40    1    318
0    40    1    410
;

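/* Fit a log-logistic distribution to the event times (intercept-only model) */
proc lifereg data=have;
    model time = / distribution=llogistic;
    weight n;    /* n = number of events recorded at each time */
    /* Probability plot; PPOUT also prints the plotted cumulative probabilities */
    probplot / pupper=99.5 plower=0.5 ppout;
    inset scale;    /* show the scale estimate on the plot */
run;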

ProbPlot20.png

You would probably get a better fit if your times were not so coarsely binned.
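
For reference, a sketch of the curve this model implies, assuming the usual PROC LIFEREG parameterization in which log(time) is modeled as the intercept plus the scale times a standard logistic error. The symbols are notation introduced here, not output names: $\mu$ stands for the Intercept estimate and $\sigma$ for the Scale estimate. The fitted cumulative proportion at time $t$ would then be:

```latex
F(t) \;=\; \frac{1}{1 + \exp\!\left(-\dfrac{\ln t - \mu}{\sigma}\right)},
\qquad t > 0 .
```

Under this form the median time is $e^{\mu}$, and plotting $F(t)$ against time gives a smooth curve that can be compared with the observed pct values.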

PG
