Hi, I ran PROC PHREG (see below) for stepwise regression. My dataset has 71 observations, but SAS uses only 33 of them. I have checked the dataset and there are no missing values. This is the note from the log:
"NOTE: 38 observations were deleted due either to missing or invalid values for the time, censoring, frequency or explanatory
variables or to invalid operations in generating the values for some of the explanatory variables."
I found that someone else had a similar issue and added this statement:
output out=resOut resmart=resmart;
to the PROC PHREG step (I did the same below), which let me find all 38 rows that SAS did not use. I am not sure why PROC PHREG excluded these 38 observations, and I wonder how I can fix this so that they are included.
proc phreg data=test1;
model &var6 = &var17 &var8 D1 D2 D3/ selection=stepwise slentry=0.05 slstay=0.05 details rl;
output out=resOut resmart=resmart;
run;
proc print data=resOut; where resmart is missing; run;
You're missing values most likely.
Please post the output from the following:
proc freq data=test1;
table &var6. &var17. &var8. D1 D2 D3 / MISSING;
run;
Hi, thanks. There are no missing values, see below:
DDEPPC | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
-1.23 | 1 | 1.41 | 1 | 1.41 |
-1.07 | 1 | 1.41 | 2 | 2.82 |
-0.9611 | 1 | 1.41 | 3 | 4.23 |
-0.9462 | 1 | 1.41 | 4 | 5.63 |
-0.9238 | 1 | 1.41 | 5 | 7.04 |
-0.9123 | 1 | 1.41 | 6 | 8.45 |
-0.8908 | 1 | 1.41 | 7 | 9.86 |
-0.8867 | 1 | 1.41 | 8 | 11.27 |
-0.8598 | 1 | 1.41 | 9 | 12.68 |
-0.7335 | 1 | 1.41 | 10 | 14.08 |
-0.7327 | 1 | 1.41 | 11 | 15.49 |
-0.7321 | 1 | 1.41 | 12 | 16.9 |
-0.7266 | 1 | 1.41 | 13 | 18.31 |
-0.6903 | 1 | 1.41 | 14 | 19.72 |
-0.6788 | 1 | 1.41 | 15 | 21.13 |
-0.6647 | 1 | 1.41 | 16 | 22.54 |
-0.6346 | 1 | 1.41 | 17 | 23.94 |
-0.5943 | 1 | 1.41 | 18 | 25.35 |
-0.5399 | 1 | 1.41 | 19 | 26.76 |
-0.519 | 1 | 1.41 | 20 | 28.17 |
-0.5013 | 1 | 1.41 | 21 | 29.58 |
-0.427 | 1 | 1.41 | 22 | 30.99 |
-0.4238 | 1 | 1.41 | 23 | 32.39 |
-0.4135 | 1 | 1.41 | 24 | 33.8 |
-0.411 | 1 | 1.41 | 25 | 35.21 |
-0.3765 | 1 | 1.41 | 26 | 36.62 |
-0.2412 | 1 | 1.41 | 27 | 38.03 |
-0.2204 | 1 | 1.41 | 28 | 39.44 |
-0.1523 | 1 | 1.41 | 29 | 40.85 |
-0.1156 | 1 | 1.41 | 30 | 42.25 |
-0.1072 | 1 | 1.41 | 31 | 43.66 |
-0.1018 | 1 | 1.41 | 32 | 45.07 |
-0.0858 | 1 | 1.41 | 33 | 46.48 |
-0.0734 | 1 | 1.41 | 34 | 47.89 |
-0.0145 | 1 | 1.41 | 35 | 49.3 |
-0.0121 | 1 | 1.41 | 36 | 50.7 |
-0.0079 | 1 | 1.41 | 37 | 52.11 |
0.0141 | 1 | 1.41 | 38 | 53.52 |
0.0795 | 1 | 1.41 | 39 | 54.93 |
0.1086 | 1 | 1.41 | 40 | 56.34 |
0.1386 | 1 | 1.41 | 41 | 57.75 |
0.1465 | 1 | 1.41 | 42 | 59.15 |
0.1472 | 1 | 1.41 | 43 | 60.56 |
0.1545 | 1 | 1.41 | 44 | 61.97 |
0.2226 | 1 | 1.41 | 45 | 63.38 |
0.2306 | 1 | 1.41 | 46 | 64.79 |
0.2623 | 1 | 1.41 | 47 | 66.2 |
0.2644 | 1 | 1.41 | 48 | 67.61 |
0.2706 | 1 | 1.41 | 49 | 69.01 |
0.2946 | 1 | 1.41 | 50 | 70.42 |
0.3479 | 1 | 1.41 | 51 | 71.83 |
0.3584 | 1 | 1.41 | 52 | 73.24 |
0.3657 | 1 | 1.41 | 53 | 74.65 |
0.4343 | 1 | 1.41 | 54 | 76.06 |
0.4469 | 1 | 1.41 | 55 | 77.46 |
0.475 | 1 | 1.41 | 56 | 78.87 |
0.495 | 1 | 1.41 | 57 | 80.28 |
0.515 | 1 | 1.41 | 58 | 81.69 |
0.5492 | 1 | 1.41 | 59 | 83.1 |
0.562 | 1 | 1.41 | 60 | 84.51 |
0.7407 | 1 | 1.41 | 61 | 85.92 |
0.7527 | 1 | 1.41 | 62 | 87.32 |
0.7548 | 1 | 1.41 | 63 | 88.73 |
0.7624 | 1 | 1.41 | 64 | 90.14 |
0.7959 | 1 | 1.41 | 65 | 91.55 |
0.8133 | 1 | 1.41 | 66 | 92.96 |
0.8181 | 1 | 1.41 | 67 | 94.37 |
0.8266 | 1 | 1.41 | 68 | 95.77 |
0.9507 | 1 | 1.41 | 69 | 97.18 |
1.4172 | 1 | 1.41 | 70 | 98.59 |
1.4174 | 1 | 1.41 | 71 | 100 |
resid01_lag1 | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
. | 1 | 1.41 | 1 | 1.41 |
-2.34387 | 1 | 1.41 | 2 | 2.82 |
-1.92675 | 1 | 1.41 | 3 | 4.23 |
-1.66221 | 1 | 1.41 | 4 | 5.63 |
-1.66076 | 1 | 1.41 | 5 | 7.04 |
-1.51416 | 1 | 1.41 | 6 | 8.45 |
-1.31411 | 1 | 1.41 | 7 | 9.86 |
-1.30036 | 1 | 1.41 | 8 | 11.27 |
-1.17059 | 1 | 1.41 | 9 | 12.68 |
-1.16493 | 1 | 1.41 | 10 | 14.08 |
-1.0268 | 1 | 1.41 | 11 | 15.49 |
-0.98757 | 1 | 1.41 | 12 | 16.9 |
-0.79868 | 1 | 1.41 | 13 | 18.31 |
-0.78508 | 1 | 1.41 | 14 | 19.72 |
-0.74925 | 1 | 1.41 | 15 | 21.13 |
-0.74815 | 1 | 1.41 | 16 | 22.54 |
-0.73537 | 1 | 1.41 | 17 | 23.94 |
-0.73498 | 1 | 1.41 | 18 | 25.35 |
-0.72178 | 1 | 1.41 | 19 | 26.76 |
-0.67466 | 1 | 1.41 | 20 | 28.17 |
-0.66885 | 1 | 1.41 | 21 | 29.58 |
-0.59144 | 1 | 1.41 | 22 | 30.99 |
-0.56058 | 1 | 1.41 | 23 | 32.39 |
-0.49143 | 1 | 1.41 | 24 | 33.8 |
-0.45108 | 1 | 1.41 | 25 | 35.21 |
-0.41997 | 1 | 1.41 | 26 | 36.62 |
-0.378 | 1 | 1.41 | 27 | 38.03 |
-0.30964 | 1 | 1.41 | 28 | 39.44 |
-0.2536 | 1 | 1.41 | 29 | 40.85 |
-0.18538 | 1 | 1.41 | 30 | 42.25 |
-0.13725 | 1 | 1.41 | 31 | 43.66 |
-0.0802 | 1 | 1.41 | 32 | 45.07 |
-0.07967 | 1 | 1.41 | 33 | 46.48 |
-0.07443 | 1 | 1.41 | 34 | 47.89 |
-0.02852 | 1 | 1.41 | 35 | 49.3 |
0.022919 | 1 | 1.41 | 36 | 50.7 |
0.032978 | 1 | 1.41 | 37 | 52.11 |
0.062794 | 1 | 1.41 | 38 | 53.52 |
0.11275 | 1 | 1.41 | 39 | 54.93 |
0.163709 | 1 | 1.41 | 40 | 56.34 |
0.183303 | 1 | 1.41 | 41 | 57.75 |
0.198967 | 1 | 1.41 | 42 | 59.15 |
0.22923 | 1 | 1.41 | 43 | 60.56 |
0.263706 | 1 | 1.41 | 44 | 61.97 |
0.279418 | 1 | 1.41 | 45 | 63.38 |
0.377233 | 1 | 1.41 | 46 | 64.79 |
0.40091 | 1 | 1.41 | 47 | 66.2 |
0.434839 | 1 | 1.41 | 48 | 67.61 |
0.458864 | 1 | 1.41 | 49 | 69.01 |
0.475435 | 1 | 1.41 | 50 | 70.42 |
0.57037 | 1 | 1.41 | 51 | 71.83 |
0.621019 | 1 | 1.41 | 52 | 73.24 |
0.657644 | 1 | 1.41 | 53 | 74.65 |
0.676903 | 1 | 1.41 | 54 | 76.06 |
0.699543 | 1 | 1.41 | 55 | 77.46 |
0.706565 | 1 | 1.41 | 56 | 78.87 |
0.72955 | 1 | 1.41 | 57 | 80.28 |
0.899265 | 1 | 1.41 | 58 | 81.69 |
1.073671 | 1 | 1.41 | 59 | 83.1 |
1.098557 | 1 | 1.41 | 60 | 84.51 |
1.145171 | 1 | 1.41 | 61 | 85.92 |
1.157797 | 1 | 1.41 | 62 | 87.32 |
1.248145 | 1 | 1.41 | 63 | 88.73 |
1.320147 | 1 | 1.41 | 64 | 90.14 |
1.386017 | 1 | 1.41 | 65 | 91.55 |
1.408262 | 1 | 1.41 | 66 | 92.96 |
1.497769 | 1 | 1.41 | 67 | 94.37 |
1.504349 | 1 | 1.41 | 68 | 95.77 |
1.526251 | 1 | 1.41 | 69 | 97.18 |
1.817184 | 1 | 1.41 | 70 | 98.59 |
1.91372 | 1 | 1.41 | 71 | 100 |
DPSDEPPC | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
. | 1 | 1.41 | 1 | 1.41 |
-1.81 | 1 | 1.41 | 2 | 2.82 |
-1.5 | 1 | 1.41 | 3 | 4.23 |
-1.184 | 1 | 1.41 | 4 | 5.63 |
-1.136 | 1 | 1.41 | 5 | 7.04 |
-1.001 | 1 | 1.41 | 6 | 8.45 |
-0.9896 | 1 | 1.41 | 7 | 9.86 |
-0.928 | 1 | 1.41 | 8 | 11.27 |
-0.8819 | 1 | 1.41 | 9 | 12.68 |
-0.8656 | 1 | 1.41 | 10 | 14.08 |
-0.8478 | 1 | 1.41 | 11 | 15.49 |
-0.7794 | 1 | 1.41 | 12 | 16.9 |
-0.6462 | 1 | 1.41 | 13 | 18.31 |
-0.6338 | 1 | 1.41 | 14 | 19.72 |
-0.6045 | 1 | 1.41 | 15 | 21.13 |
-0.5168 | 1 | 1.41 | 16 | 22.54 |
-0.5144 | 1 | 1.41 | 17 | 23.94 |
-0.4742 | 1 | 1.41 | 18 | 25.35 |
-0.4275 | 1 | 1.41 | 19 | 26.76 |
-0.4263 | 1 | 1.41 | 20 | 28.17 |
-0.409 | 1 | 1.41 | 21 | 29.58 |
-0.4078 | 1 | 1.41 | 22 | 30.99 |
-0.3606 | 1 | 1.41 | 23 | 32.39 |
-0.3447 | 1 | 1.41 | 24 | 33.8 |
-0.3137 | 1 | 1.41 | 25 | 35.21 |
-0.3107 | 1 | 1.41 | 26 | 36.62 |
-0.2972 | 1 | 1.41 | 27 | 38.03 |
-0.2877 | 1 | 1.41 | 28 | 39.44 |
-0.2181 | 1 | 1.41 | 29 | 40.85 |
-0.1146 | 1 | 1.41 | 30 | 42.25 |
-0.0914 | 1 | 1.41 | 31 | 43.66 |
-0.0889 | 1 | 1.41 | 32 | 45.07 |
-0.0753 | 1 | 1.41 | 33 | 46.48 |
-0.065 | 1 | 1.41 | 34 | 47.89 |
-0.0635 | 1 | 1.41 | 35 | 49.3 |
-0.0625 | 1 | 1.41 | 36 | 50.7 |
-0.054 | 1 | 1.41 | 37 | 52.11 |
-0.0434 | 1 | 1.41 | 38 | 53.52 |
-0.0358 | 1 | 1.41 | 39 | 54.93 |
-0.0291 | 1 | 1.41 | 40 | 56.34 |
-0.0028 | 1 | 1.41 | 41 | 57.75 |
0.0001 | 1 | 1.41 | 42 | 59.15 |
0.0226 | 1 | 1.41 | 43 | 60.56 |
0.0377 | 1 | 1.41 | 44 | 61.97 |
0.0433 | 1 | 1.41 | 45 | 63.38 |
0.0491 | 1 | 1.41 | 46 | 64.79 |
0.0906 | 1 | 1.41 | 47 | 66.2 |
0.1026 | 1 | 1.41 | 48 | 67.61 |
0.1664 | 1 | 1.41 | 49 | 69.01 |
0.1821 | 1 | 1.41 | 50 | 70.42 |
0.2077 | 1 | 1.41 | 51 | 71.83 |
0.2491 | 1 | 1.41 | 52 | 73.24 |
0.2759 | 1 | 1.41 | 53 | 74.65 |
0.3911 | 1 | 1.41 | 54 | 76.06 |
0.402 | 1 | 1.41 | 55 | 77.46 |
0.4138 | 1 | 1.41 | 56 | 78.87 |
0.4245 | 1 | 1.41 | 57 | 80.28 |
0.4455 | 1 | 1.41 | 58 | 81.69 |
0.4659 | 1 | 1.41 | 59 | 83.1 |
0.5259 | 1 | 1.41 | 60 | 84.51 |
0.5788 | 1 | 1.41 | 61 | 85.92 |
0.7056 | 1 | 1.41 | 62 | 87.32 |
0.7422 | 1 | 1.41 | 63 | 88.73 |
0.7596 | 1 | 1.41 | 64 | 90.14 |
0.8395 | 1 | 1.41 | 65 | 91.55 |
0.8925 | 1 | 1.41 | 66 | 92.96 |
0.9107 | 1 | 1.41 | 67 | 94.37 |
0.9144 | 1 | 1.41 | 68 | 95.77 |
1.037 | 1 | 1.41 | 69 | 97.18 |
1.5387 | 1 | 1.41 | 70 | 98.59 |
1.7481 | 1 | 1.41 | 71 | 100 |
D1 | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
0 | 53 | 74.65 | 53 | 74.65 |
1 | 18 | 25.35 | 71 | 100 |
D2 | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
0 | 53 | 74.65 | 53 | 74.65 |
1 | 18 | 25.35 | 71 | 100 |
D3 | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
0 | 53 | 74.65 | 53 | 74.65 |
1 | 18 | 25.35 | 71 | 100 |
Hi, thanks. The missing observations are due to the lags and differencing in the variables, I believe. Please see the log below:
61 data test1;
SYMBOLGEN: Macro variable TD10 resolves to shortrun_dynamic
62 set &td10;
63 keep date &var6 &var17 &var8 D1 D2 D3;
SYMBOLGEN: Macro variable VAR6 resolves to DDEPPC
SYMBOLGEN: Macro variable VAR17 resolves to resid01_lag1
SYMBOLGEN: Macro variable VAR8 resolves to DPSDEPPC
64 run;
NOTE: There were 71 observations read from the data set WORK.SHORTRUN_DYNAMIC.
NOTE: The data set WORK.TEST1 has 71 observations and 7 variables.
NOTE: Compressing data set WORK.TEST1 increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
65
66
67 proc phreg data=test1;
SYMBOLGEN: Macro variable VAR6 resolves to DDEPPC
68 model &var6 = &var17 &var8 D1 D2 D3/ selection=stepwise slentry=0.05 slstay=0.05 details rl;
SYMBOLGEN: Macro variable VAR17 resolves to resid01_lag1
SYMBOLGEN: Macro variable VAR8 resolves to DPSDEPPC
69 output out=resOut resmart=resmart;
70 run;
NOTE: 38 observations were deleted due either to missing or invalid values for the time, censoring, frequency or explanatory
variables or to invalid operations in generating the values for some of the explanatory variables.
NOTE: The data set WORK.RESOUT has 71 observations and 8 variables.
NOTE: Compressing data set WORK.RESOUT increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE PHREG used (Total process time):
real time 0.10 seconds
cpu time 0.03 seconds
71
72 proc print data=resOut; where resmart is missing; run;
NOTE: There were 37 observations read from the data set WORK.RESOUT.
WHERE resmart is null;
NOTE: At least one W.D format was too small for the number to be printed. The decimal may be shifted by the "BEST" format.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.02 seconds
cpu time 0.03 seconds
73
74
75
Thanks, I had a look at the RESOUT data set, and the only thing that looks strange is that &var6 is negative for all 38 observations. Does that mean anything?
How many records had missing values for one or more of the variables on your model statement?
When one of the variables has no value, the entire record is discarded from the model because the information is incomplete: the relationship between all of the variables cannot be modeled for that record. This is the most common cause of records being dropped from the model.
There are 38 rows that have not been included in the calculation. However, there are no missing values (please see the output from the PROC FREQ above).
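To answer the "how many records" question directly, something like the following could be used. It is only a sketch: it assumes the macro variables &var6, &var17 and &var8 from the original post are still defined and that all six model variables are numeric; the counter name n_incomplete is made up for illustration.

```sas
/* Count records where at least one model variable is missing.        */
/* NMISS returns the number of missing numeric arguments on a record. */
data _null_;
  set test1 end=last;
  /* Sum statement: n_incomplete starts at 0 and is retained across rows */
  if nmiss(&var6, &var17, &var8, D1, D2, D3) > 0 then n_incomplete + 1;
  if last then put "Records with a missing model variable: " n_incomplete;
run;
```

The count is written to the log, so it can be compared with the number in the PHREG note.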
What sort of unit is the variable DDEPPC supposed to be? Have you counted how many negative values you have for that variable? Does it happen to be 37 with the missing values of those other two variables on a record where DDEPPC is positive?
I don't think PHREG expects negative time values. The Failure Time Distribution section in the Details of the PHREG documentation starts with the following, where T is the modeled variable:
Let T be a nonnegative random variable representing the failure time of an individual from a homogeneous population.
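A quick way to check this is to count the negative response values, for example with PROC SQL. This is a sketch that assumes &var6 still resolves to DDEPPC as in the posted log; the column names n_negative and n_missing are only illustrative. The explicit missing() test matters because SAS treats a missing numeric value as smaller than any number, so a plain "< 0" comparison would also count missing values.

```sas
proc sql;
  /* A logical expression evaluates to 1 or 0 in PROC SQL, so SUM counts the matches */
  select sum(not missing(&var6) and &var6 < 0) as n_negative,
         sum(missing(&var6))                    as n_missing
  from test1;
quit;
```

Going by the PROC FREQ table posted above, DDEPPC has 37 values below zero, so n_negative should agree with that.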
I think you are right. After examining the excluded observations, I found that all 37 have negative values of the DDEPPC variable. I also ran PHREG with a non-negative variable and that seemed to work. Is there any workaround for this?
DDEPPC is the difference of another variable, and for this reason it has negative values.
Yes, actually, it is not negative time. It is a reduction in the variable between two periods.
I think it is difficult, for the purpose of my work, to transform the data...
Thank you for your time and support 🙂
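One possible workaround, which nobody in the thread confirmed: since DDEPPC is a period-to-period change rather than a failure time, a linear regression procedure with stepwise selection may be a better fit than PHREG. The sketch below assumes ordinary least squares is appropriate for this data and reuses the same macro variables and entry/stay thresholds. The RL option is dropped because it is specific to PHREG.

```sas
/* Stepwise selection with PROC REG instead of PROC PHREG.            */
/* Negative response values are allowed here; records with a missing  */
/* model variable are still excluded.                                  */
proc reg data=test1;
  model &var6 = &var17 &var8 D1 D2 D3
        / selection=stepwise slentry=0.05 slstay=0.05 details;
run;
quit;
```

With this setup only the record with missing lagged values would be expected to drop out, but whether a linear model suits the analysis is a modeling decision for the poster.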