Hi!
I have a dataset sorted by PID (the ID of each participant, N=77) with several continuous variables.
I want to find which observations (PIDs) have values more than 2 SD from the mean.
The data are in a temporary dataset called cortical_stroke_complete2.
The variable of interest is PCh_CONTRA_entorhinal_tck.
This is the code I have been trying, without success. I tried different variables and always obtained the same PID.
proc sql noprint;
  select mean(PCh_CONTRA_entorhinal_tck) into :mean from cortical_stroke_complete2;
  select std(PCh_CONTRA_entorhinal_tck) into :std from cortical_stroke_complete2;
quit;
data cortical_stroke_complete2;
  set cortical_stroke_complete2;
  where PCh_CONTRA_entorhinal_tck > &mean + 2*&std;
proc print; run;
I am a beginner SAS programmer.
Thank you so much!
There is no need for macro variables to do that. Try this:
proc sql;
  select *
  from cortical_stroke_complete2
  having PCh_CONTRA_entorhinal_tck - mean(PCh_CONTRA_entorhinal_tck) > 2*std(PCh_CONTRA_entorhinal_tck);
quit;
(untested)
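The query above only catches values above the mean. If low values matter too, a variant along these lines might work. It is a sketch, also untested, and it assumes the ID column is named PID as described in the question. It relies on PROC SQL remerging the summary statistics back onto each row, which SAS reports with a note in the log.

```sas
/* Sketch: list participants whose value lies more than 2 SD from  */
/* the mean in either direction. Assumes the ID variable is PID.   */
proc sql;
  select PID, PCh_CONTRA_entorhinal_tck
  from cortical_stroke_complete2
  /* abs() makes the comparison two-sided; a missing value gives a */
  /* missing difference, which never compares greater than 2*SD    */
  having abs(PCh_CONTRA_entorhinal_tck - mean(PCh_CONTRA_entorhinal_tck))
         > 2*std(PCh_CONTRA_entorhinal_tck);
quit;
```

Because this only selects rows and does not create or replace a table, cortical_stroke_complete2 is left unchanged and the query can be rerun for other variables.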
Getting the same PID each time does not mean anything is wrong. It is possible that only one patient has outlier values.
When looking for values more than 2 standard deviations from the mean, you may have to consider two standard deviations in both directions.
where (PCh_CONTRA_entorhinal_tck > &mean + 2*&std) or (PCh_CONTRA_entorhinal_tck < &mean - 2*&std);
If you suspect this is not giving you the right result, print &mean and &std, and inspect 20 lines of data to see if you can confirm whether the result should be different.
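One possible cause worth checking, judging from the posted code: the DATA step reads and writes the same dataset, so after the first run cortical_stroke_complete2 holds only the rows that passed the WHERE filter, and later runs compute the mean and SD from those few rows. The sketch below prints the macro variables to the log and writes the result to a separate dataset instead. The name outliers_entorhinal is made up for illustration, and the code assumes cortical_stroke_complete2 has been rebuilt with all 77 participants.

```sas
/* Compute the mean and SD once and store them in macro variables. */
proc sql noprint;
  select mean(PCh_CONTRA_entorhinal_tck), std(PCh_CONTRA_entorhinal_tck)
    into :mean, :std
    from cortical_stroke_complete2;
quit;

/* Write both values to the log so they can be checked by eye. */
%put mean=&mean std=&std;

/* outliers_entorhinal is a hypothetical output name. Using a new  */
/* name keeps the input dataset intact for the next variable.      */
data outliers_entorhinal;
  set cortical_stroke_complete2;
  /* Missing values sort below every number in SAS, so exclude     */
  /* them explicitly or they would be flagged as low outliers.     */
  where not missing(PCh_CONTRA_entorhinal_tck)
    and (PCh_CONTRA_entorhinal_tck > &mean + 2*&std
      or PCh_CONTRA_entorhinal_tck < &mean - 2*&std);
run;

proc print data=outliers_entorhinal;
  var PID PCh_CONTRA_entorhinal_tck;
run;
```

If the log shows a missing value (.) for std, the input dataset probably contains only one non-missing observation, which would be consistent with it having been overwritten by an earlier run.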
Good luck.