GeorgeSAS
Lapis Lazuli | Level 10

Hello everyone,

 

I have a dataset with two missing values. I hope to predict these two values from the other rows. The relationship looks like a curve, not a straight line.

So I want to build a regression model (a curved fit rather than a straight line) to predict the missing values when var in (98,99).

 

Please advise.

 

Thank you very much!

data have;
input value var;
cards;
75.447     1
74.628     2
73.737     3
72.846     4
71.964     5
71.064     6
70.173     7
69.273     8
68.382     9
67.482    10
66.582    11
65.691    12
64.800    13
63.900    14
63.009    15
62.118    16
61.227    17
60.345    18
59.463    19
58.590    20
57.735    21
56.871    22
56.007    23
55.134    24
54.252    25
53.388    26
52.506    27
51.624    28
50.742    29
49.860    30
48.969    31
48.087    32
47.214    33
46.332    34
45.450    35
44.577    36
43.695    37
42.822    38
41.940    39
41.067    40
40.185    41
39.330    42
38.466    43
37.602    44
36.747    45
35.883    46
35.028    47
34.173    48
33.318    49
32.490    50
31.653    51
30.816    52
29.988    53
29.160    54
28.359    55
27.540    56
26.730    57
25.929    58
25.119    59
24.309    60
23.517    61
22.752    62
21.969    63
21.195    64
20.457    65
19.710    66
18.954    67
18.207    68
17.469    69
16.776    70
16.065    71
15.381    72
14.697    73
14.049    74
13.401    75
12.771     76
12.168     77
11.565     78
10.980     79
10.404     80
 9.846     81
 9.288     82
 8.784     83
 8.289     84
 7.794     85
 7.335     86
 6.903     87
 6.489     88
 6.102     89
 5.733     90
 5.346     91
 5.040     92
 4.734     93
 4.410     94
 4.077     95
 3.780     96
 3.483     97
 .         98
 .         99
 2.357      100
;
run;

 

 

Here is a starting model:

proc sgscatter data=have;
  plot value*var/ 
        reg=(nogroup clm degree=2) grid ;
run;

I want to use a regression like this fitted curve to estimate the values when var in (98,99). Please help me find out how to do this.

 

Thanks!

9 REPLIES
Ksharp
Super User

PROC ADAPTIVEREG

PROC GAMPL

PROC LOESS

Ksharp
Super User
proc loess data=have;
model value=var;
output out=want;
run;
proc sgscatter data=want;
  plot Predicted*var/ grid ;
run;
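
If the OUTPUT statement is not accepted by an older release of PROC LOESS, one possible workaround is the SCORE statement, which evaluates the fitted curve at points supplied in a separate data set. The following is only a sketch under that assumption. The data set names ScoreMe and LoessPred are hypothetical, and the smoothing parameter is left at whatever the procedure selects.

```sas
/* Hypothetical data set holding only the var values to be scored */
data ScoreMe;
   do var = 98, 99;
      output;
   end;
run;

proc loess data=have;
   model value = var;
   /* Evaluate the fitted curve at var = 98 and 99 and display the result */
   score data=ScoreMe / print;
   /* Also save the score table as a data set (LoessPred is a made-up name) */
   ods output ScoreResults=LoessPred;
run;
```

Both points lie between observed values of var (97 and 100), so this is interpolation rather than extrapolation.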
GeorgeSAS
Lapis Lazuli | Level 10
Thank you Ksharp,
I can't run the program on SAS 9.3. Can you tell me the predicted values when var in (98,99) from your method?

Thanks!
Ksharp
Super User

SAS Output

The SAS System

value var SmoothingParameter DepVar Obs Predicted
75.447 1 0.045918 value 1 75.4556
74.628 2 0.045918 value 2 74.6074
73.737 3 0.045918 value 3 73.7370
72.846 4 0.045918 value 4 72.8486
71.964 5 0.045918 value 5 71.9588
71.064 6 0.045918 value 6 71.0666
70.173 7 0.045918 value 7 70.1704
69.273 8 0.045918 value 8 69.2756
68.382 9 0.045918 value 9 68.3794
67.482 10 0.045918 value 10 67.4820
66.582 11 0.045918 value 11 66.5846
65.691 12 0.045918 value 12 65.6910
64.800 13 0.045918 value 13 64.7974
63.900 14 0.045918 value 14 63.9026
63.009 15 0.045918 value 15 63.0090
62.118 16 0.045918 value 16 62.1180
61.227 17 0.045918 value 17 61.2296
60.345 18 0.045918 value 18 60.3450
59.463 19 0.045918 value 19 59.4656
58.590 20 0.045918 value 20 58.5952
57.735 21 0.045918 value 21 57.7324
56.871 22 0.045918 value 22 56.8710
56.007 23 0.045918 value 23 56.0044
55.134 24 0.045918 value 24 55.1314
54.252 25 0.045918 value 25 54.2572
53.388 26 0.045918 value 26 53.3828
52.506 27 0.045918 value 27 52.5060
51.624 28 0.045918 value 28 51.6240
50.742 29 0.045918 value 29 50.7420
49.860 30 0.045918 value 30 49.8574
48.969 31 0.045918 value 31 48.9716
48.087 32 0.045918 value 32 48.0896
47.214 33 0.045918 value 33 47.2114
46.332 34 0.045918 value 34 46.3320
45.450 35 0.045918 value 35 45.4526
44.577 36 0.045918 value 36 44.5744
43.695 37 0.045918 value 37 43.6976
42.822 38 0.045918 value 38 42.8194
41.940 39 0.045918 value 39 41.9426
41.067 40 0.045918 value 40 41.0644
40.185 41 0.045918 value 41 40.1927
39.330 42 0.045918 value 42 39.3274
38.466 43 0.045918 value 43 38.4660
37.602 44 0.045918 value 44 37.6046
36.747 45 0.045918 value 45 36.7444
35.883 46 0.045918 value 46 35.8856
35.028 47 0.045918 value 47 35.0280
34.173 48 0.045918 value 48 34.1730
33.318 49 0.045918 value 49 33.3257
32.490 50 0.045918 value 50 32.4874
31.653 51 0.045918 value 51 31.6530
30.816 52 0.045918 value 52 30.8186
29.988 53 0.045918 value 53 29.9880
29.160 54 0.045918 value 54 29.1677
28.359 55 0.045918 value 55 28.3538
27.540 56 0.045918 value 56 27.5426
26.730 57 0.045918 value 57 26.7326
25.929 58 0.045918 value 58 25.9264
25.119 59 0.045918 value 59 25.1190
24.309 60 0.045918 value 60 24.3142
23.517 61 0.045918 value 61 23.5247
22.752 62 0.045918 value 62 22.7468
21.969 63 0.045918 value 63 21.9716
21.195 64 0.045918 value 64 21.2053
20.457 65 0.045918 value 65 20.4544
19.710 66 0.045918 value 66 19.7074
18.954 67 0.045918 value 67 18.9566
18.207 68 0.045918 value 68 18.2096
17.469 69 0.045918 value 69 17.4819
16.776 70 0.045918 value 70 16.7708
16.065 71 0.045918 value 71 16.0727
15.381 72 0.045918 value 72 15.3810
14.697 73 0.045918 value 73 14.7073
14.049 74 0.045918 value 74 14.0490
13.401 75 0.045918 value 75 13.4062
12.771 76 0.045918 value 76 12.7787
12.168 77 0.045918 value 77 12.1680
11.565 78 0.045918 value 78 11.5702
10.980 79 0.045918 value 79 10.9826
10.404 80 0.045918 value 80 10.4092
9.846 81 0.045918 value 81 9.8460
9.288 82 0.045918 value 82 9.3035
8.784 83 0.045918 value 83 8.7866
8.289 84 0.045918 value 84 8.2890
7.794 85 0.045918 value 85 7.8043
7.335 86 0.045918 value 86 7.3427
6.903 87 0.045918 value 87 6.9082
6.489 88 0.045918 value 88 6.4967
6.102 89 0.045918 value 89 6.1072
5.733 90 0.045918 value 90 5.7278
5.346 91 0.045918 value 91 5.3692
5.040 92 0.045918 value 92 5.0400
4.734 93 0.045918 value 93 4.7288
4.410 94 0.045918 value 94 4.4074
4.077 95 0.045918 value 95 4.0873
3.780 96 0.045918 value 96 3.7800
3.483 97 0.045918 value 97 3.4830
. 98 0.045918 value 98 3.1084
. 99 0.045918 value 99 2.7337
2.357 100 0.045918 value 100 2.3591
Rick_SAS
SAS Super FREQ

Unfortunately, PROC LOESS drops observations for which ANY variable has a missing value. However, you can use any other nonparametric regression procedure, such as PROC TPSPLINE or PROC TRANSREG, which handle spline fits. 

 

proc transreg data=have;
   model identity(value) = spline(var);
   output out=Want predicted;
run;

proc print data=want;
where Var > 96;
var Var Value PValue;
run;
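
For comparison with the degree=2 curve in the original plot, a plain quadratic fit can also fill in the two rows. This is a minimal sketch, not part of the replies above: PROC GLM leaves observations with a missing response out of the fit but still writes a predicted value for them to the OUTPUT data set, as long as var is not missing. The names QuadPred and QuadFit are made up for illustration.

```sas
proc glm data=have noprint;
   /* Second-degree polynomial in var */
   model value = var var*var;
   /* Rows with missing value are not used in the fit but are still scored */
   output out=QuadPred predicted=QuadFit;
run;
quit;

proc print data=QuadPred;
   where var in (98, 99);
   var var value QuadFit;
run;
```

A single global quadratic is less flexible than a loess or spline fit, so its predictions near the flattening tail of the data may differ noticeably from the ones shown earlier.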

 

 

GeorgeSAS
Lapis Lazuli | Level 10
Thank you!
Rick, may I ask what the theory behind the TRANSREG procedure is? Is it a general linear model, or something else?
Rick_SAS
SAS Super FREQ

It transforms data (nonlinearly) and then fits a linear model to the transformed data. 

For an overview of the different models and transformations, see

http://support.sas.com/documentation/cdl/en/statug/68162/HTML/default/viewer.htm#statug_introreg_sec...
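
The "transform, then fit a linear model" idea can be sketched with a different procedure, which makes the two steps explicit. This is an illustration only and not a description of how PROC TRANSREG works internally: the EFFECT statement expands var into spline basis columns, and the MODEL statement then fits ordinary least squares to those columns. The three equally spaced knots and the names spl, SplinePred, and SplineFit are arbitrary assumptions.

```sas
proc glmselect data=have;
   /* Step 1: nonlinear transformation of var into cubic B-spline basis columns.
      The number and placement of knots here is an arbitrary choice. */
   effect spl = spline(var / basis=bspline degree=3 knotmethod=equal(3));
   /* Step 2: linear model on the transformed columns, with no variable selection */
   model value = spl / selection=none;
   /* Score every row of the input data, including those with missing value */
   score data=have out=SplinePred predicted=SplineFit;
run;

proc print data=SplinePred;
   where var in (98, 99);
   var var value SplineFit;
run;
```

Changing the number of knots changes how closely the curve follows local wiggles in the data, so it is worth plotting the fit before trusting the two predictions.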

 

GeorgeSAS
Lapis Lazuli | Level 10
Thank you Rick,
What is the methodology/theory behind this exact model? Least squares, or something else?

Thank you!
Rick_SAS
SAS Super FREQ

See the TRANSREG doc for specifics.  For regular data analysis, it uses ordinary least squares (OLS). For optimal variable transformations, it iterates between OLS estimates of the parameters and OLS estimates of the transformation parameters (a method that is called the method of alternating least squares).
