Posted 11-04-2018 10:29 PM

Hi all,

I am looking for code to split the data for a stepwise regression model into predicted and estimated.

Your valuable help is highly appreciated.

Please post what SAS code you are working with and sample data that will assist in explaining what you want to achieve.

If your question is about regression models, you can search the communities; there should be many examples.

But if you provide a clear statement of what you have (sample data) and what you expect (possibly as sample output data), we can assist you faster because we will not be guessing.

What do you mean by 'predicted' and 'estimated'? Do you mean test/training splits?

and prediction sets. then

a. Evaluate the statistical properties of these data sets.

b. Fit a model involving x1 and x6 to the estimation data. Do the coefficients and fitted values from this model seem reasonable?

c. Use this model to predict the observations in the prediction data set. What is your evaluation of this model's predictive performance?

~~Where do you think it's asking you to split the data set? It doesn't as far as I can tell.~~

My mistake, I believe you're referring to test and validation data sets then. Use PROC SURVEYSELECT to split your data into two different sample sets. They need to be entirely independent.

Another option is the 'manual' way: add a random number to each observation, sort the data by that random number, then take the first portion as the model-fitting (estimation) set and the remainder as the evaluation (prediction) set.
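The manual approach could look roughly like the sketch below. It assumes the input data set is called math and that a 70/30 split is wanted; the seed value, the variable u, and the data set name math_rand are made up for illustration.

```sas
/* Step 1: attach a uniform random number to every observation. */
data math_rand;
    set math;
    call streaminit(12345);   /* hypothetical seed, only so the split is reproducible */
    u = rand('uniform');
run;

/* Step 2: sort by the random number, which shuffles the rows. */
proc sort data=math_rand;
    by u;
run;

/* Step 3: send the first 70% of rows to one data set and the rest to the other. */
data estimation prediction;
    set math_rand nobs=n;     /* n = number of observations in math_rand */
    if _n_ <= round(0.7 * n) then output estimation;
    else output prediction;
    drop u;
run;
```

Because each row goes to exactly one output data set, the two sets do not overlap.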

Yes, exactly. I wrote the code below, but only one table appears in the results. Could you please help me with that?

title1 'problem 2';

data math;

input x1 x2 x3 x4 x5 x6 x7 x8 x9 y;

cards;

350 170 275 8.5 2.56 199.6 72.9 3860 1 17

250 105 185 8.25 2.73 196.7 72.2 3510 1 20

351 143 255 8 3 199.9 74 3890 1 18.25

231 110 175 8 2.56 179.3 65.4 3020 1 22.12

262 110 200 8.5 2.56 179.3 65.4 3180 1 21.47

89.7 70 81 8.2 3.9 155.7 64 1905 0 34.7

96.9 75 83 9 4.3 165.2 65 2320 0 30.4

350 155 250 8.5 3.08 195.4 74.4 3885 1 16.5

85.3 80 83 8.5 3.89 160.6 62.2 2009 0 36.5

171 109 146 8.2 3.22 170.4 66.9 2655 0 21.5

258 110 195 8 3.08 171.5 77 3375 1 19.7

140 83 109 8.4 3.4 168.8 69.4 2700 0 20.3

302 129 220 8 3 199.9 74 3890 1 17.8

500 190 360 8.5 2.73 224.1 79.8 5290 1 14.39

440 215 330 8.2 2.71 231 79.7 5185 1 14.89

350 155 250 8.5 3.08 196.7 72.2 3910 1 17.8

318 145 255 8.5 2.45 197.6 71 3660 1 16.41

231 110 175 8 2.56 179.3 65.4 3050 1 23.54

360 180 290 8.4 2.45 214.2 76.3 4250 1 21.47

96.9 75 83 9 4.3 165.2 61.8 2275 0 31.9

460 223 366 8 3 228 79.8 5430 1 13.27

133.6 96 120 8.4 3.91 171.5 63.4 2535 0 23.9

318 140 255 8.5 2.71 215.3 76.3 4370 1 19.73

351 148 243 8 3.25 215.5 78.5 4540 1 13.9

351 148 243 8 3.26 216.1 78.5 4715 1 13.27

360 195 295 8.25 3.15 209.3 77.4 4215 1 13.77

360 165 255 8.5 2.73 185.2 69 3660 1 16.5

;

run;

proc surveyselect data=math outall out=split

samprate=0.7 seed=90284098 method=SRS;

run;

proc freq data=split;

table selected;

run;

data estimation prediction;

set split;

if selected=1 then output estimation;

else output prediction;

drop selected;

run;

proc reg data=estimation;

model y = x1 x6;

run;
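For part (c), one possible way to predict the observations in the prediction set is to save the fitted coefficients with OUTEST= and apply them with PROC SCORE. This is only a sketch, assuming the estimation and prediction data sets created above and a model with regressors x1 and x6; the model label yhat and the data set names est, pred_scored, and pred_eval are made up for illustration.

```sas
/* Fit the model on the estimation set and save the coefficients. */
proc reg data=estimation outest=est;
    yhat: model y = x1 x6;    /* the label yhat names the scored variable later */
run;
quit;

/* Apply the saved coefficients to the prediction set.
   With TYPE=PARMS and only the regressors on the VAR statement,
   the new variable (named after the model label) holds predicted values. */
proc score data=prediction score=est out=pred_scored type=parms;
    var x1 x6;
run;

/* Compare predictions with the observed y in the prediction set. */
data pred_eval;
    set pred_scored;
    pred_err = y - yhat;      /* prediction error */
    sq_err   = pred_err**2;   /* squared prediction error */
run;

proc means data=pred_eval n mean;
    var pred_err sq_err;      /* mean of sq_err is the mean squared prediction error */
run;
```

Comparing the mean squared prediction error with the residual mean square reported by PROC REG on the estimation set gives a rough idea of how well the model carries over to new data.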

