jsconte18
Fluorite | Level 6

Hi, I am fairly new to SAS Studio and am struggling with the homework from my 700-level stats class.

 

This is the code/data I currently have:

 

DATA dataset1;
INPUT ID X Y;
DATALINES;
1 1.436 3.1449563
2 2.064 5.3079198
3 0.877 1.9750624
4 0.633 1.6945208
5 2.218 5.7511510
6 2.506 7.6890712
7 0.925 2.1880265
8 1.348 3.0876182
9 2.600 7.5836908
10 1.402 3.2563274
11 2.393 7.2311639
12 2.208 5.4816163
13 1.794 3.7704713
14 0.208 1.0559069
15 0.025 1.1537298
16 2.789 10.6036682
17 0.068 0.9161273
18 1.909 5.3079198
19 1.616 4.2240737
20 1.843 3.7636905
21 1.234 2.3052751
22 2.857 11.4569897
23 2.081 6.2388758
24 1.788 3.5092418
25 1.812 3.5736947
26 0.118 0.9070116
27 0.677 2.0846480
28 2.940 8.5762778
29 0.806 1.5492950
30 1.110 2.9891827
31 2.887 12.4236262
32 0.273 1.0074274
33 0.723 1.4367735
34 2.388 5.4054084
35 0.416 1.7714528
36 0.299 1.6147821
37 0.108 1.4182164
38 2.945 8.0606968
39 0.691 1.3175843
40 2.969 8.0542508
41 0.850 1.4405140
42 2.460 4.9332599
43 2.326 4.4008601
44 1.981 7.3213885
45 1.938 7.2456407
46 2.420 10.6973923
47 2.940 16.3953868
48 0.081 1.7573377
49 1.449 1.7615604
50 1.113 4.4611206
;
PROC PRINT DATA=dataset1;
VAR X Y;
RUN;
PROC MEANS DATA=dataset1 N SUM MEAN STD MIN MAX;
VAR X Y;
RUN;
PROC SGPLOT DATA=dataset1;
SCATTER X=X Y=Y;
RUN;
PROC REG DATA=dataset1;
MODEL X = Y / P CLM CLI CLB;
RUN;
DATA example;
INPUT x y;
x2 = x**2;
x3 = x**3;
overY = 1/y;
logy = log(y);
DATALINES;
1 1.436 3.1449563
2 2.064 5.3079198
3 0.877 1.9750624
4 0.633 1.6945208
5 2.218 5.7511510
6 2.506 7.6890712
7 0.925 2.1880265
8 1.348 3.0876182
9 2.600 7.5836908
10 1.402 3.2563274
11 2.393 7.2311639
12 2.208 5.4816163
13 1.794 3.7704713
14 0.208 1.0559069
15 0.025 1.1537298
16 2.789 10.6036682
17 0.068 0.9161273
18 1.909 5.3079198
19 1.616 4.2240737
20 1.843 3.7636905
21 1.234 2.3052751
22 2.857 11.4569897
23 2.081 6.2388758
24 1.788 3.5092418
25 1.812 3.5736947
26 0.118 0.9070116
27 0.677 2.0846480
28 2.940 8.5762778
29 0.806 1.5492950
30 1.110 2.9891827
31 2.887 12.4236262
32 0.273 1.0074274
33 0.723 1.4367735
34 2.388 5.4054084
35 0.416 1.7714528
36 0.299 1.6147821
37 0.108 1.4182164
38 2.945 8.0606968
39 0.691 1.3175843
40 2.969 8.0542508
41 0.850 1.4405140
42 2.460 4.9332599
43 2.326 4.4008601
44 1.981 7.3213885
45 1.938 7.2456407
46 2.420 10.6973923
47 2.940 16.3953868
48 0.081 1.7573377
49 1.449 1.7615604
50 1.113 4.4611206
;
PROC REG DATA=example;
MODEL logy = x;
MODEL overy = x;
MODEL x2 = y;
MODEL x3 = y;
RUN;

From the original data, you can see a U shape in the scatter plot, so the relationship between X and Y is non-linear. I need to find the best linear model.

I would use the transformations x' = x^2, x' = x^3, y' = log(y), and y' = 1/y.

I thought I had entered the code correctly, but when I run it, the results for these transformations look off.

Can anyone help me see what I am doing wrong and how to get the best results for these model transformations? Thank you.
1 REPLY
PaigeMiller
Diamond | Level 26
PROC REG DATA=example;
MODEL logy = x; 
MODEL overy = x;
MODEL x2 = y;
MODEL x3 = y;
RUN;

 

In the last two models, you have reversed x and y, so that needs to be fixed. In addition, fitting a quadratic would be

 

model y = x x2;

 

and I leave it up to you as a homework assignment to figure out how to fit a cubic to this data.

--
Paige Miller
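
One further point may explain why the transformed values look off. The second DATA step reads the three-column data with INPUT x y, so list input appears to assign the ID column to x and the X column to y, and the real Y values are never read. The first PROC REG (MODEL X = Y) also has the response and regressor swapped in the same way the reply describes. The following is a minimal sketch of one possible fix, assuming dataset1 from the question has already been created; the model labels (log_y, recip_y, quadratic) are illustrative names, not part of the original post. The cubic fit is deliberately left out, as in the reply.

```sas
/* Sketch only: reuses dataset1 from the question, which already holds ID, X and Y. */
DATA example;
    SET dataset1;        /* avoids re-reading the DATALINES with a two-variable INPUT */
    x2    = x**2;        /* regressor for the quadratic term */
    logy  = log(y);      /* natural log; every Y in the posted data is positive */
    overy = 1/y;         /* reciprocal of the response */
RUN;

PROC REG DATA=example;
    /* Y (or a transformation of Y) is always on the left of the equals sign. */
    /* The labels before each MODEL statement are illustrative names.         */
    log_y:     MODEL logy  = x;
    recip_y:   MODEL overy = x;
    quadratic: MODEL y     = x x2;
RUN;
QUIT;
```

Using SET instead of a second DATALINES block keeps a single copy of the data, so the column order cannot drift between the two steps. If the second DATALINES block is kept instead, changing its INPUT statement to INPUT ID x y should have the same effect.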
