I am running a conditional logistic regression on matched data using a spline variable. The percentiles from PROC MEANS do not match the percentiles in the PROC LOGISTIC output for the spline variable. PROC MEANS gives p25, p50 and p75 as 29, 56 and 84. PROC LOGISTIC gives the corresponding numbers as 26.5, 51 and 75.5. Shouldn't the two sets of numbers match? Code for sample data is given below.
data Data1; input ID cancer gall hyper @@;
cards;
1 1 0 0 1 0 0 0 2 1 0 0 2 0 0 0
3 1 0 1 3 0 0 1 4 1 0 0 4 0 1 0
5 1 1 0 5 0 0 1 6 1 0 1 6 0 0 0
7 1 1 0 7 0 0 0 8 1 1 1 8 0 0 1
9 1 0 0 9 0 0 0 10 1 0 0 10 0 0 0
11 1 1 0 11 0 0 0 12 1 0 0 12 0 0 1
13 1 1 0 13 0 0 1 14 1 1 0 14 0 1 0
15 1 1 0 15 0 0 1 16 1 0 1 16 0 0 0
17 1 0 0 17 0 1 1 18 1 0 0 18 0 1 1
19 1 0 0 19 0 0 1 20 1 0 1 20 0 0 0
21 1 0 0 21 0 1 1 22 1 0 1 22 0 0 1
23 1 0 1 23 0 0 0 24 1 0 0 24 0 0 0
25 1 0 0 25 0 0 0 26 1 0 0 26 0 0 1
27 1 1 0 27 0 0 1 28 1 0 0 28 0 0 1
29 1 1 0 29 0 0 0 30 1 0 1 30 0 0 0
31 1 0 1 31 0 0 0 32 1 0 1 32 0 0 0
33 1 0 1 33 0 0 0 34 1 0 0 34 0 0 0
35 1 1 1 35 0 1 1 36 1 0 0 36 0 0 1
37 1 0 1 37 0 0 0 38 1 0 1 38 0 0 1
39 1 0 1 39 0 0 1 40 1 0 1 40 0 0 0
41 1 0 0 41 0 0 0 42 1 0 1 42 0 1 0
43 1 0 0 43 0 0 1 44 1 0 0 44 0 0 0
45 1 1 0 45 0 0 0 46 1 0 0 46 0 0 0
47 1 1 1 47 0 0 0 48 1 0 1 48 0 0 0
49 1 0 0 49 0 0 0 50 1 0 1 50 0 0 1
51 1 0 0 51 0 0 0 52 1 0 1 52 0 0 1
53 1 0 1 53 0 0 0 54 1 0 1 54 0 0 0
55 1 1 0 55 0 0 0 56 1 0 0 56 0 0 0
57 1 1 1 57 0 1 0 58 1 0 0 58 0 0 0
59 1 0 0 59 0 0 0 60 1 1 1 60 0 0 0
61 1 1 0 61 0 1 0 62 1 0 1 62 0 0 0
63 1 1 0 63 0 0 0
;
run;
DATA data1; SET data1; call streaminit(123); age= ceil( 100*rand("Uniform") ); run;
proc means n p25 p50 p75 min max;var age;run;
/*conditional logistic regression code defining event of interest*/
proc logistic data = Data1;
strata ID;
effect myspline=spline(age/ basis=tpf(noint) NATURALCUBIC details knotmethod=rangefractions(0.25,0.50,0.75));
model cancer(event = '1') = myspline;
run;
Are you aware that PROC MEANS has five different definitions for computing quantiles such as P25, and two computation methods? The QNTLDEF= and QMETHOD= options set these. Something to ponder on occasion.
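As a rough illustration of the point about quantile definitions, the sketch below reruns PROC MEANS on the sample data once per QNTLDEF= value (1 through 5), so the quartiles of age can be compared side by side. The macro name compare_qntldef is made up for this example, and it assumes the Data1 data set with the age variable created in the question.

```sas
/* Hypothetical helper: print the quartiles of age under each of the   */
/* five quantile definitions that PROC MEANS supports via QNTLDEF=.    */
%macro compare_qntldef;
  %do d = 1 %to 5;
    title "PROC MEANS quartiles of age, QNTLDEF=&d";
    proc means data=Data1 n p25 p50 p75 min max qntldef=&d;
      var age;
    run;
  %end;
  title;  /* clear the title afterwards */
%mend compare_qntldef;

%compare_qntldef
```

Any differences between definitions are usually small. They are unlikely to account for a gap as large as 29 versus 26.5, which suggests looking at how the knots are chosen instead.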
You may want to try running the following PROC LOGISTIC code:
proc logistic data = Data1;
title "Knotmethod Percentiles";
strata ID;
effect myspline=spline(age/ basis=tpf(noint) NATURALCUBIC details knotmethod=percentiles(3));
model cancer(event = '1') = myspline;
run;
title;
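And then carefully re-read the documentation on RANGEFRACTIONS and PERCENTILES for KNOTMETHOD=. RANGEFRACTIONS places knots at fractions of the variable's range (minimum to maximum), not at percentiles of its distribution.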
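To make the difference concrete, here is a minimal sketch that computes where knots at range fractions 0.25, 0.50 and 0.75 would fall, assuming each knot is min + fraction * (max - min). The data set name range_knots and the macro variables age_min and age_max are invented for this example. The values 26.5, 51 and 75.5 reported in the question are what this formula gives for a minimum of 2 and a maximum of 100 (2 + 0.25*98 = 26.5, 2 + 0.50*98 = 51, 2 + 0.75*98 = 75.5). That minimum and maximum are inferred from the three reported numbers, not stated in the thread, so check them against the MIN and MAX columns of the PROC MEANS output.

```sas
/* Store the observed minimum and maximum of age in macro variables. */
proc sql noprint;
  select min(age), max(age)
    into :age_min trimmed, :age_max trimmed
    from Data1;
quit;

/* Knot location for each range fraction: min + fraction * (max - min). */
data range_knots;
  do fraction = 0.25, 0.50, 0.75;
    knot = &age_min + fraction * (&age_max - &age_min);
    output;
  end;
run;

proc print data=range_knots noobs;
run;
```

If the printed knots agree with the knot table in the DETAILS output from the original PROC LOGISTIC run, the mismatch with PROC MEANS comes from using RANGEFRACTIONS where PERCENTILES was intended.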