Hello all,
I would appreciate some assistance. I am analysing my dataset to check whether the data are normally distributed. If they are not, I have to apply a log or square-root transformation, as I have done below. Because the dataset is very long, I would like a macro that computes the log or square root of the values in column P.
P | SQR | LOG |
4,904 | 2,214 | 0,691 |
4,435 | 2,106 | 0,647 |
4,446 | 2,109 | 0,648 |
3,691 | 1,921 | 0,567 |
3,090 | 1,758 | 0,490 |
3,134 | 1,770 | 0,496 |
4,221 | 2,054 | 0,625 |
4,195 | 2,048 | 0,623 |
3,883 | 1,971 | 0,589 |
Thank you.
Edmund
You do not need a macro. Base SAS is the programming language; use it to process the data. Look at the log() function:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000245909.htm
And the ** (exponentiation) operator.
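As a minimal sketch of what that reply points at (the dataset name `demo` and the variable names are hypothetical, not from the thread): in a data step, `log()` returns the natural logarithm, `log10()` the base-10 logarithm, and `**` raises to a power, so `p**0.5` gives the same value as `sqrt(p)` for non-negative `p`.

```sas
data demo;               /* hypothetical dataset name */
  p = 4.904;             /* first value from the question, written with a decimal point */
  ln_p    = log(p);      /* natural logarithm */
  log10_p = log10(p);    /* base-10 logarithm, the scale used in the LOG column of the question */
  sqrt_p  = sqrt(p);     /* square root */
  pow_p   = p**0.5;      /* same value via the ** operator */
run;

proc print data=demo noobs;
run;
```

Which logarithm to use is a choice for the analyst; the values in the question's LOG column are consistent with base 10.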
No need for complexity. SAS has functions for that.
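data want;
input p commax5.; /* COMMAX reads the comma as the decimal separator */
sqr = sqrt(p); /* square root */
log = log10(p); /* base-10 logarithm */
format _numeric_ commax5.3; /* print with a decimal comma and 3 decimals */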
cards;
4,904
4,435
4,446
3,691
3,090
3,134
4,221
4,195
3,883
;
run;
proc print noobs;
run;
Result:
p | sqr | log |
4,904 | 2,214 | 0,691 |
4,435 | 2,106 | 0,647 |
4,446 | 2,109 | 0,648 |
3,691 | 1,921 | 0,567 |
3,090 | 1,758 | 0,490 |
3,134 | 1,770 | 0,496 |
4,221 | 2,055 | 0,625 |
4,195 | 2,048 | 0,623 |
3,883 | 1,971 | 0,589 |
To assess normality, the output from PROC UNIVARIATE is useful.
Building off @Kurt_Bremser's example:
Proc univariate data=want;
run;
Thanks for the quick responses. My data are a little more complex, so I will explain a little more. From the data below (only a piece of the full data), I need to check whether the data are normally distributed, hence I had to manually add a column each for sqr and log. In the model statement, when I use the manually created SQR_Kconc or LOG_Kconc, it works. But I want a way where I only supply the main data and, if there is a need to take the square root or log, I just use the script.
data Nconc_S2;
input Trt plt block mgt qual quan Kconc SQR_Kconc LOG_Kconc;
datalines;
1 4 1 1 1 2 80.006 8.945 1.903
1 29 2 1 1 2 91.588 9.570 1.962
1 39 3 1 1 2 76.645 8.755 1.884
1 61 4 1 1 2 85.700 9.257 1.933
2 15 1 1 2 2 106.488 10.319 2.027
2 21 2 1 2 2 104.266 10.211 2.018
2 48 3 1 2 2 104.632 10.229 2.020
2 66 4 1 2 2 99.191 9.959 1.996
3 5 1 1 1 1 65.659 8.103 1.817
3 34 2 1 1 1 85.866 9.266 1.934
3 41 3 1 1 1 91.500 9.566 1.961
3 71 4 1 1 1 91.621 9.572 1.962
;
run;
*testing normality;
proc mixed data=Nconc_S2;
class block mgt qual quan;
model Kconc=block mgt qual quan mgt*qual mgt*quan qual*quan mgt*qual*quan/ddfm=KR OUTP=r residual;
random block*qual*quan;
run;
ods graphics on;
proc univariate data=r normal plot;
var studentresid;
qqplot studentresid/normal;
run;
Proc plot data=r;
plot studentresid*pred=mgt;
format mgt mgtgrp. qual qualgrp. quan quangrp. ;
run;
Define manually?
Why not add a data step between yours and the PROC, similar to Kurt's, that calculates the log and sqrt?
*data input step - minus the two manual fields;
*calculate the two extra fields;
data nconc_s2;
set nconc_s2;
SQR_Kconc = sqrt(kconc);
LOG_Kconc=log(kconc); *natural log - log10(kconc) would reproduce the manually entered values;
run;
*rest of your code;
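Putting that suggestion together with the earlier model, a sketch might look like the following. The output dataset name `nconc_s2_t` and the macro variable `resp` are hypothetical conveniences, not part of the thread; `resp` only switches which response is modelled. `log10` is used on the assumption that the manually entered `LOG_Kconc` values (for example 1.903 for 80.006) are base-10 logs.

```sas
/* Assumes Nconc_S2 has been read with Kconc (the manual columns are not needed). */
data nconc_s2_t;                     /* hypothetical output name */
  set nconc_s2;
  SQR_Kconc = sqrt(Kconc);
  LOG_Kconc = log10(Kconc);          /* assumption: base-10, as in the manual column */
run;

%let resp = SQR_Kconc;               /* hypothetical switch: Kconc, SQR_Kconc or LOG_Kconc */

proc mixed data=nconc_s2_t;
  class block mgt qual quan;
  model &resp = block mgt qual quan mgt*qual mgt*quan qual*quan mgt*qual*quan
        / ddfm=KR outp=r residual;
  random block*qual*quan;
run;

/* Same residual check as in the original post, now on the chosen response */
proc univariate data=r normal;
  var studentresid;
  qqplot studentresid / normal;
run;
```

Changing the `%let` line and rerunning is then enough to compare the residual diagnostics for the raw and transformed responses.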
Thanks all for the responses yesterday. I managed to get it working with the code below. As can be seen, I had many variables, so I put them together and computed the square root and log of each in the data step. Once this is set, I believe that any time I write "sqrN" or "logN" in a model statement, it will automatically use the square root or log of the N value?
data Nconc_maize;
input Trt plt block mgt qual quan N P K Ca Mg;
sqrN = sqrt(N);
logN = log10(N);
sqrP = sqrt(P);
logP = log10(P);
sqrK = sqrt(K);
logK = log10(K);
sqrCa = sqrt(Ca);
logCa = log10(Ca);
sqrMg = sqrt(Mg);
logMg = log10(Mg);
datalines;
1 4 1 1 1 2 2.83 2.77 17.37 7.69 104.26
1 29 2 1 1 2 2.10 2.28 29.03 6.13 69.47
;
run;
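Because the transformed variables are created in the data step, they are stored in `Nconc_maize` like any other column and can be named in a later MODEL statement. With many nutrients, the repeated assignments could also be written with arrays. This is a sketch of an alternative, not what the poster used; it assumes `Nconc_maize` already contains N, P, K, Ca and Mg, and the output name `Nconc_maize_t` is hypothetical.

```sas
data Nconc_maize_t;                          /* hypothetical output name */
  set Nconc_maize;
  array raw{5} N P K Ca Mg;                  /* original variables */
  array sq{5}  sqrN sqrP sqrK sqrCa sqrMg;   /* square-root versions */
  array lg{5}  logN logP logK logCa logMg;   /* base-10 log versions */
  do i = 1 to dim(raw);
    if raw{i} >= 0 then sq{i} = sqrt(raw{i});   /* sqrt needs a non-negative value */
    if raw{i} >  0 then lg{i} = log10(raw{i});  /* log10 needs a positive value */
  end;
  drop i;
run;
```

Values that fail the checks (including missing values) are left missing in the transformed columns rather than triggering invalid-argument notes in the log.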
Are you trying to test whether the data are normally distributed?
I would suggest checking that before adding any variables: use PROC UNIVARIATE with the NORMAL option to do the test.
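A minimal sketch of that suggestion, assuming the `Nconc_maize` dataset from the previous post: the NORMAL option on the PROC statement requests the tests for normality, and the HISTOGRAM statement with its own NORMAL option overlays a fitted normal curve for a visual check.

```sas
proc univariate data=Nconc_maize normal;
  var N P K Ca Mg;                   /* untransformed variables */
  histogram N P K Ca Mg / normal;    /* histogram with a fitted normal curve */
run;
```

If a variable looks acceptably normal here, its `sqr` and `log` versions may not be needed at all.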