Hello All,
I would appreciate some assistance. I am analysing my dataset to check whether the data are normally distributed. If they are not, I have to log or square-root transform them, as I have done below. Because the data set is very long, I would like a macro that can take the log or square root of the values in column P.
P | SQR | LOG |
4,904 | 2,214 | 0,691 |
4,435 | 2,106 | 0,647 |
4,446 | 2,109 | 0,648 |
3,691 | 1,921 | 0,567 |
3,090 | 1,758 | 0,490 |
3,134 | 1,770 | 0,496 |
4,221 | 2,054 | 0,625 |
4,195 | 2,048 | 0,623 |
3,883 | 1,971 | 0,589 |
Thank you.
Edmund
You do not need a macro. Base SAS is the programming language; you use it to process data. Look at the log() function:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000245909.htm
And the ** (exponentiation) operator.
No need for complexity. SAS has functions for that.
data want;
input p commax5.;
sqr = sqrt(p);
log = log10(p);
format _numeric_ commax5.3;
cards;
4,904
4,435
4,446
3,691
3,090
3,134
4,221
4,195
3,883
;
run;
proc print noobs;
run;
Result:
p      sqr    log
4,904  2,214  0,691
4,435  2,106  0,647
4,446  2,109  0,648
3,691  1,921  0,567
3,090  1,758  0,490
3,134  1,770  0,496
4,221  2,055  0,625
4,195  2,048  0,623
3,883  1,971  0,589
To establish normality, the output from Proc Univariate is useful.
Building off @Kurt_Bremser's example:
Proc univariate data=want;
run;
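As a rough sketch of how that check might look in practice, the NORMAL option on the PROC statement requests the Tests for Normality table, and a QQPLOT statement gives a visual check. This assumes the want data set created above, with its p, sqr and log columns; the choice of which variables to list is only an illustration.

```sas
/* Sketch only: assumes the data set WANT from the earlier data step */
proc univariate data=want normal;   /* NORMAL adds the Tests for Normality table */
  var p sqr log;                    /* raw and transformed columns side by side */
  qqplot p sqr log / normal(mu=est sigma=est);  /* reference line from estimated mean and std dev */
run;
```

Comparing the test results and plots for the three columns shows whether either transformation brings the values closer to a normal distribution.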
Thanks for the quick responses. My data is a little complex, so I will explain a bit more. From the data below (only a piece of the full data), I need to check whether the data is normally distributed, hence I had to manually add columns for sqr and log. In the model statement, when I use the manually created SQR_Kconc or LOG_Kconc, it works. But I want a way where I only supply the main data and, if there is a need to sqr, I just use the script.
data Nconc_S2;
input Trt plt block mgt qual quan Kconc SQR_Kconc LOG_Kconc;
datalines;
1 4 1 1 1 2 80.006 8.945 1.903
1 29 2 1 1 2 91.588 9.570 1.962
1 39 3 1 1 2 76.645 8.755 1.884
1 61 4 1 1 2 85.700 9.257 1.933
2 15 1 1 2 2 106.488 10.319 2.027
2 21 2 1 2 2 104.266 10.211 2.018
2 48 3 1 2 2 104.632 10.229 2.020
2 66 4 1 2 2 99.191 9.959 1.996
3 5 1 1 1 1 65.659 8.103 1.817
3 34 2 1 1 1 85.866 9.266 1.934
3 41 3 1 1 1 91.500 9.566 1.961
3 71 4 1 1 1 91.621 9.572 1.962
;
run;
*testing normality;
proc mixed data=Nconc_S2;
class block mgt qual quan;
model Kconc=block mgt qual quan mgt*qual mgt*quan qual*quan mgt*qual*quan/ddfm=KR OUTP=r residual;
random block*qual*quan;
run;
ods graphics on;
proc univariate data=r normal plot;
var studentresid;
qqplot studentresid/normal;
run;
Proc plot data=r;
plot studentresid*pred=mgt;
format mgt mgtgrp. qual qualgrp. quan quangrp. ;
run;
Define manually?
Why not add a data step between yours and the Proc, similar to Kurt's, that calculates the log and sqrt?
*data input step - minus the two manual fields;
*calculate the two extra fields;
data nconc_s2;
set nconc_s2;
SQR_Kconc = sqrt(kconc);
LOG_Kconc = log10(kconc); /* log10 matches the manually entered values; log() is the natural log */
run;
*rest of your code;
Thanks all for the responses yesterday. I managed to get it working with the code below. As can be seen, I had many variables that I put together, and I then added statements to sqr and log all of them. Once this is set, I believe that any time I write "sqrN" or "logN" in a model statement, it will automatically use the square root or log of the N value?
data Nconc_maize;
input Trt plt block mgt qual quan N P K Ca Mg;
sqrN = sqrt(N);
logN = log10(N);
sqrP = sqrt(P);
logP = log10(P);
sqrK = sqrt(K);
logK = log10(K);
sqrCa = sqrt(Ca);
logCa = log10(Ca);
sqrMg = sqrt(Mg);
logMg = log10(Mg);
datalines;
1 4 1 1 1 2 2.83 2.77 17.37 7.69 104.26
1 29 2 1 1 2 2.10 2.28 29.03 6.13 69.47
;
run;
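With many variables, the repeated assignments can also be written with arrays. The following is only a sketch under some assumptions: it reads the Nconc_maize data set created above, writes to a hypothetical data set named Nconc_maize_t, and guards against non-positive values, which the original statements do not do. The PROC MIXED step simply reuses the model from the earlier post with logN as the response, and r_logN is a hypothetical output name.

```sas
/* Sketch only: Nconc_maize_t and r_logN are hypothetical names */
data Nconc_maize_t;
  set Nconc_maize;
  array raw{5} N P K Ca Mg;                      /* original columns */
  array sq{5}  sqrN sqrP sqrK sqrCa sqrMg;       /* square-root columns */
  array lg{5}  logN logP logK logCa logMg;       /* log10 columns */
  do i = 1 to dim(raw);
    if raw{i} >= 0 then sq{i} = sqrt(raw{i});    /* otherwise left missing */
    else sq{i} = .;
    if raw{i} > 0 then lg{i} = log10(raw{i});    /* otherwise left missing */
    else lg{i} = .;
  end;
  drop i;
run;

/* The model uses whichever stored column is named as the response */
proc mixed data=Nconc_maize_t;
  class block mgt qual quan;
  model logN = block mgt qual quan mgt*qual mgt*quan qual*quan mgt*qual*quan
        / ddfm=KR outp=r_logN residual;
  random block*qual*quan;
run;
```

Nothing is transformed automatically at model time: sqrN and logN are ordinary columns stored in the data set, so naming logN in the MODEL statement analyses the stored log10 values, and naming N analyses the raw values.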
Are you trying to test if the data is normally distributed?
I would suggest checking that before adding any variables: use Proc Univariate with the NORMAL option to do the test.