Obtaining the SD per observation

Frequent Learner
Posts: 1





I have a dataset sorted by PID (the ID of each participant, N=77) with several continuous variables.


I want to identify the observations (PIDs) whose values are more than 2 SD above the mean.


I am working with a temporary dataset called cortical_stroke_complete2.


The variable of interest is PCh_CONTRA_entorhinal_tck.


This is the code I have been trying, without success. I tried different variables and always got the same PID.


proc sql noprint;

  /* Store the mean and SD in macro variables */
  select mean(PCh_CONTRA_entorhinal_tck) into :mean from cortical_stroke_complete2;

  select std(PCh_CONTRA_entorhinal_tck) into :std from cortical_stroke_complete2;

quit;

/* Keep only the rows above mean + 2 SD (this replaces cortical_stroke_complete2) */
data cortical_stroke_complete2;

  set cortical_stroke_complete2;

  where PCh_CONTRA_entorhinal_tck > &mean + 2*&std;

run;

proc print; run;



I am a basic SAS programmer.


Thank you so much!

Respected Advisor
Posts: 4,646

Re: Obtaining the SD per observation

No need for macros to do that. Try this:


proc sql;
/* Rows more than 2 SD above the overall mean */
select *
from cortical_stroke_complete2
having PCh_CONTRA_entorhinal_tck - mean(PCh_CONTRA_entorhinal_tck) > 2*std(PCh_CONTRA_entorhinal_tck);
quit;
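
The query above works because PROC SQL remerges the summary statistics (mean and SD) back onto every row, and the log should show a note saying so. As a rough sketch of a variation, the version below catches values on both sides of the mean with abs() and saves the result to a new table. It assumes cortical_stroke_complete2 still holds all 77 participants; the table name "outliers" is made up for illustration.

```sas
proc sql;
  /* "outliers" is a hypothetical output table name */
  create table outliers as
  select PID, PCh_CONTRA_entorhinal_tck
  from cortical_stroke_complete2
  /* abs() keeps rows more than 2 SD from the mean in either direction */
  having abs(PCh_CONTRA_entorhinal_tck - mean(PCh_CONTRA_entorhinal_tck))
         > 2*std(PCh_CONTRA_entorhinal_tck);
quit;
```

Rows with a missing value are not selected here, since the comparison on a missing difference is false.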


Super User
Posts: 5,081

Re: Obtaining the SD per observation

Getting the same PID each time does not mean anything is wrong.  It is possible that only one patient has outlier values.


When looking for values more than 2 standard deviations from the mean, you may have to consider two standard deviations in both directions.


where (PCh_CONTRA_entorhinal_tck > &mean + 2*&std) or (PCh_CONTRA_entorhinal_tck < &mean - 2*&std);
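
For reference, here is how that two-sided WHERE could fit into a complete program, using the variable named in the question. This is only a sketch under some assumptions: the source dataset is untouched and still contains every participant, and "outliers" is a made-up output name chosen so the DATA step does not overwrite the source. The missing-value check is an addition, not part of the original code.

```sas
proc sql noprint;
  /* Store the overall mean and SD in macro variables */
  select mean(PCh_CONTRA_entorhinal_tck), std(PCh_CONTRA_entorhinal_tck)
    into :mean, :std
    from cortical_stroke_complete2;
quit;

/* "outliers" is a hypothetical name; the source dataset is left unchanged */
data outliers;
  set cortical_stroke_complete2;
  /* SAS treats missing as smaller than any number, so exclude it explicitly */
  where PCh_CONTRA_entorhinal_tck is not missing
    and ((PCh_CONTRA_entorhinal_tck > &mean + 2*&std)
      or (PCh_CONTRA_entorhinal_tck < &mean - 2*&std));
run;

proc print data=outliers;
  var PID PCh_CONTRA_entorhinal_tck;
run;
```

Writing to a separate dataset matters when repeating the check for other variables: if the DATA step replaces the source with only the filtered rows, every later run starts from that subset.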


If you suspect this is not giving you the right result, print &mean and &std, and inspect 20 lines of data to see if you can confirm whether the result should be different.
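
One possible way to run those checks is sketched below. It assumes &mean and &std were already created by the PROC SQL step and that the dataset and variable names are the ones from the question. If an earlier DATA step replaced the dataset with only the filtered rows, the statistics shown here will describe that subset rather than all participants.

```sas
/* Write the stored macro variable values to the log */
%put mean=&mean std=&std;

/* Recompute the statistics independently for comparison */
proc means data=cortical_stroke_complete2 n mean std min max;
  var PCh_CONTRA_entorhinal_tck;
run;

/* Inspect the first 20 rows */
proc print data=cortical_stroke_complete2 (obs=20);
  var PID PCh_CONTRA_entorhinal_tck;
run;
```

If N in the PROC MEANS output is far below 77, the dataset has probably been filtered already and needs to be rebuilt from its source.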


Good luck.
