
Obtaining the SD per observation

Frequent Learner
Posts: 1

I have a dataset sorted by PID (the ID of each participant, N=77) with several continuous variables.


I want to find which observations (PIDs) have values more than 2 SD from the mean.


I am working with data in a temporary dataset called cortical_stroke_complete2.


The variable of interest is: PCh_CONTRA_entorhinal_tck


This is the code I have been trying, without success. I tried different variables and always obtained the same PID.


proc sql noprint;

  select mean(PCh_CONTRA_entorhinal_tck) into : mean from cortical_stroke_complete2;

  select std(PCh_CONTRA_entorhinal_tck) into :std from cortical_stroke_complete2;


data cortical_stroke_complete2;

  set cortical_stroke_complete2;

   where PCh_CONTRA_entorhinal_tck >&mean+2*&std ;

proc print;run;



Basic SAS programmer.


Thank you so much!

Esteemed Advisor
Posts: 5,524

Re: Obtaining the SD per observation

Posted in reply to l_gutierrez1808

No need for macro variables to do that. Try this:


proc sql;
  select *
  from cortical_stroke_complete2
  having PCh_CONTRA_entorhinal_tck - mean(PCh_CONTRA_entorhinal_tck) > 2*std(PCh_CONTRA_entorhinal_tck);
quit;
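
If values far below the mean also matter, the same query can be made two-sided with abs(). This is only a sketch: the z_score column name is made up for illustration, and the PID column is assumed to be named as described in the question.

```sas
proc sql;
  /* Summary functions without GROUP BY are remerged with every row,
     so each observation is compared against the overall mean and SD. */
  select PID,
         PCh_CONTRA_entorhinal_tck,
         (PCh_CONTRA_entorhinal_tck - mean(PCh_CONTRA_entorhinal_tck))
           / std(PCh_CONTRA_entorhinal_tck) as z_score  /* hypothetical column name */
  from cortical_stroke_complete2
  /* Keep rows more than 2 SD away from the mean in either direction. */
  having abs(PCh_CONTRA_entorhinal_tck - mean(PCh_CONTRA_entorhinal_tck))
         > 2*std(PCh_CONTRA_entorhinal_tck);
quit;
```

Nothing is written back to cortical_stroke_complete2 here, so the query can be rerun for other variables against the full data.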


Super User
Posts: 6,762

Re: Obtaining the SD per observation

Posted in reply to l_gutierrez1808

Getting the same PID each time does not mean anything is wrong.  It is possible that only one patient has outlier values.


When looking for values more than 2 standard deviations from the mean, you may have to consider two standard deviations in both directions.


where (PCh_CONTRA_entorhinal_tck > &mean + 2*&std) or (PCh_CONTRA_entorhinal_tck < &mean - 2*&std);
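
One more thing that may be worth checking: the DATA step in the question reads and writes the same dataset, so after the first run cortical_stroke_complete2 holds only the rows that passed the WHERE filter, and later runs on other variables would see only those rows. A minimal sketch that leaves the source untouched follows. The output dataset name outliers_tck is made up for illustration, and the missing-value check is an added precaution, since SAS treats missing numeric values as smaller than any number.

```sas
/* Compute the mean and SD once and store them in macro variables. */
proc sql noprint;
  select mean(PCh_CONTRA_entorhinal_tck), std(PCh_CONTRA_entorhinal_tck)
    into :mean trimmed, :std trimmed
    from cortical_stroke_complete2;
quit;

/* Write the flagged rows to a NEW dataset (outliers_tck is a hypothetical name)
   so that cortical_stroke_complete2 keeps all 77 participants. */
data outliers_tck;
  set cortical_stroke_complete2;
  where PCh_CONTRA_entorhinal_tck is not missing
    and (   PCh_CONTRA_entorhinal_tck > &mean + 2*&std
         or PCh_CONTRA_entorhinal_tck < &mean - 2*&std);
run;

proc print data=outliers_tck;
  var PID PCh_CONTRA_entorhinal_tck;
run;
```

If the original dataset has already been overwritten, it would need to be recreated from its source before rerunning.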


If you suspect this is not giving you the right result, print &mean and &std, and inspect 20 lines of data to see if you can confirm whether the result should be different.
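
A quick way to do that check, assuming the macro variables mean and std were created by the PROC SQL step in the question:

```sas
/* Write the macro variable values to the log. */
%put &=mean &=std;

/* Look at the first 20 rows of the variable of interest. */
proc print data=cortical_stroke_complete2 (obs=20);
  var PID PCh_CONTRA_entorhinal_tck;
run;

/* Cross-check the mean and SD, and confirm the row count is still 77. */
proc means data=cortical_stroke_complete2 n mean std min max;
  var PCh_CONTRA_entorhinal_tck;
run;
```

If N in the PROC MEANS output is much smaller than 77, the dataset was probably subset by an earlier run.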


Good luck.
