Kydanso
Fluorite | Level 6

Hello All,

 

I would appreciate some assistance. I am analyzing my dataset to check whether the data is normally distributed. If it is not, I have to log- or square-root-transform it, as I have done below. Because the dataset is very long, I would like a macro that can take the log or square root of the values in column P.

 

P        SQR      LOG
4,904    2,214    0,691
4,435    2,106    0,647
4,446    2,109    0,648
3,691    1,921    0,567
3,090    1,758    0,490
3,134    1,770    0,496
4,221    2,054    0,625
4,195    2,048    0,623
3,883    1,971    0,589

 

 

Thank you.

 

Edmund

8 REPLIES
RW9
Diamond | Level 26

You do not need a macro. Base SAS is the programming language; you use that to process data. Look at the log() function:

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000245909.htm

 

And the ** (exponentiation) operator.
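As a minimal sketch of both suggestions, a data step can compute the transforms directly. The dataset name have and the variable p are assumptions for illustration, not names from the original post.

```sas
/* Hypothetical input: dataset HAVE with a positive numeric variable p */
data want;
    set have;
    sqr_p   = p ** 0.5;   /* square root via the ** exponentiation operator */
    ln_p    = log(p);     /* log() is the natural logarithm */
    log10_p = log10(p);   /* log10() is the base-10 logarithm */
run;
```

Note that log() and log10() return missing values for zero or negative input, so it may be worth checking the range of p first.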

 

 

Kurt_Bremser
Super User

No need for complexity. SAS has functions for that.

data want;
input p commax5.;
sqr = sqrt(p);
log = log10(p);
format _numeric_ commax5.3;
cards;
4,904
4,435
4,446
3,691
3,090
3,134
4,221
4,195
3,883
;
run;

proc print noobs;
run;

Result:

    p      sqr      log

4,904    2,214    0,691
4,435    2,106    0,647
4,446    2,109    0,648
3,691    1,921    0,567
3,090    1,758    0,490
3,134    1,770    0,496
4,221    2,055    0,625
4,195    2,048    0,623
3,883    1,971    0,589
Reeza
Super User

To establish normality the output from Proc univariate is useful. 

 

Building off @Kurt_Bremser's example:

 

Proc univariate data=want;

run;

Kydanso
Fluorite | Level 6

Thanks for the quick responses. My data is a little complex, so I will explain a little more. In the data below (only a piece of the full data), I need to check whether the data is normally distributed, hence I had to manually add columns for sqr and log. In the model statement, when I use the manually created SQR_Kconc or LOG_Kconc, it works. But I want a way where I only supply the main data and, if there is a need to take the square root or log, I just use the script.

 

data Nconc_S2;
input Trt plt block mgt qual quan Kconc SQR_Kconc LOG_Kconc;
datalines;
1 4 1 1 1 2 80.006 8.945 1.903
1 29 2 1 1 2 91.588 9.570 1.962
1 39 3 1 1 2 76.645 8.755 1.884
1 61 4 1 1 2 85.700 9.257 1.933
2 15 1 1 2 2 106.488 10.319 2.027
2 21 2 1 2 2 104.266 10.211 2.018
2 48 3 1 2 2 104.632 10.229 2.020
2 66 4 1 2 2 99.191 9.959 1.996
3 5 1 1 1 1 65.659 8.103 1.817
3 34 2 1 1 1 85.866 9.266 1.934
3 41 3 1 1 1 91.500 9.566 1.961
3 71 4 1 1 1 91.621 9.572 1.962
;
run;

 

*testing normality;
proc mixed data=Nconc_S2;
class block mgt qual quan;
model Kconc=block mgt qual quan mgt*qual mgt*quan qual*quan mgt*qual*quan/ddfm=KR OUTP=r residual;
random block*qual*quan;
run;

ods graphics on;

proc univariate data=r normal plot;
var studentresid;
qqplot studentresid/normal;
run;

Proc plot data=r;
plot studentresid*pred=mgt;
format mgt mgtgrp. qual qualgrp. quan quangrp. ;
run;
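The requirement described above (supply only the main data, then transform on demand) could be sketched as a small macro. This is only an illustration under stated assumptions: the macro name add_transforms and its parameters data= and var= are hypothetical, and log10 is assumed because it reproduces the LOG_Kconc values in the data above.

```sas
/* Hypothetical helper: adds SQR_<var> and LOG_<var> to an existing dataset */
%macro add_transforms(data=, var=);
    data &data;
        set &data;
        SQR_&var = sqrt(&var);    /* square-root transform */
        LOG_&var = log10(&var);   /* base-10 log transform */
    run;
%mend add_transforms;

/* Example call, assuming Nconc_S2 was read with Kconc only */
%add_transforms(data=Nconc_S2, var=Kconc);
```

After the call, SQR_Kconc and LOG_Kconc exist as ordinary variables and can be named in the model statement. A plain data step does the same job without a macro when only one variable is involved.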

 

 

Reeza
Super User

Define manually? 

 

Why not add a data step, similar to Kurt's, between your data step and the Proc to calculate the log and sqrt?

Reeza
Super User
*data input step - minus the two manual fields;

*calculate the two extra fields;

data nconc_s2;
set nconc_s2;
SQR_Kconc = sqrt(kconc);
*log10 matches the LOG_Kconc values posted above - use log() for the natural log;
LOG_Kconc = log10(kconc);
run;

*rest of your code;
Kydanso
Fluorite | Level 6

Thanks all for the responses yesterday. I managed to get it working with the code below. As can be seen, I had many variables that I put together, and I then computed the sqr and log of each in the data step. Once this is set, I believe that any time I write "sqrN" or "logN" in a model statement, it will automatically use the square root or log of the N value?

 

data Nconc_maize;
input Trt plt block mgt qual quan N P K Ca Mg;
sqrN = sqrt(N);
logN = log10(N);
sqrP = sqrt(P);
logP = log10(P);
sqrK = sqrt(K);
logK = log10(K);
sqrCa = sqrt(Ca);
logCa = log10(Ca);
sqrMg = sqrt(Mg);
logMg = log10(Mg);
datalines;
1 4 1 1 1 2 2.83 2.77 17.37 7.69 104.26
1 29 2 1 1 2 2.10 2.28 29.03 6.13 69.47

;

run;
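The repeated assignments in the data step above could also be written with arrays. The following is a minimal sketch, assuming the Nconc_maize dataset from the post already exists; the output name Nconc_maize2 and the array names are hypothetical.

```sas
/* Sketch: same transforms as in the post, written with arrays */
data Nconc_maize2;                               /* hypothetical output name */
    set Nconc_maize;
    array raw_v{5} N P K Ca Mg;                  /* original variables */
    array sqr_v{5} sqrN sqrP sqrK sqrCa sqrMg;   /* square-root versions */
    array log_v{5} logN logP logK logCa logMg;   /* base-10 log versions */
    do i = 1 to dim(raw_v);
        sqr_v{i} = sqrt(raw_v{i});
        log_v{i} = log10(raw_v{i});
    end;
    drop i;                                      /* loop counter is not needed in the output */
run;
```

Either way, the transformed values are stored as ordinary variables in the dataset, so naming sqrN or logN in a model statement uses the stored square-root or log10 values of N.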

ballardw
Super User

Are you trying to test whether the data is normally distributed?

I would suggest checking that before adding any variables: use Proc Univariate with the NORMAL option to do the test.
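A minimal sketch of that check, assuming the Nconc_S2 dataset and the Kconc variable from the earlier post:

```sas
/* Sketch: normality tests on the raw variable before any transformation */
proc univariate data=Nconc_S2 normal;
    var Kconc;
    /* Q-Q plot against a normal distribution with estimated mean and std dev */
    qqplot Kconc / normal(mu=est sigma=est);
run;
```

The NORMAL option adds a "Tests for Normality" table (Shapiro-Wilk, Kolmogorov-Smirnov, Cramer-von Mises, Anderson-Darling) to the output, which can be read alongside the Q-Q plot.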

