05-03-2015 04:31 PM
I am currently working on a long-form data set and am looking to get an overall mean of a variable, as well as its standard deviation between and within observations.
For example, my data is set up like this:
ID Visit_Number Variable
1 1 3.3
1 2 5.2
1 3 6.3
1 4 2.3
2 1 8.9
2 2 5.8
2 3 6.2
2 4 3.4
I need three numbers: 1. An overall mean for variable. 2. The standard deviation for variable between observations. 3. The standard deviation for variable within observations.
I am trying to use:
proc means data=data
This gives me a mean and standard deviation for each person, but I cannot get an overall standard deviation between and within observations. I am trying to see whether there is greater spread in my variable of interest between observations or within them. Any help would be greatly appreciated!
05-04-2015 09:53 AM
What do you mean by "the standard deviation between observations" and "the standard deviation within observations"?
05-04-2015 10:19 AM
I will try to clarify. I am working on classifying a variable in my data set, which is in the form above. Each person in our study has 4 measures of the variable, and there are about 200 people, so a total of about 800 observations.
By standard deviation within observations, I mean a number that represents the average standard deviation for variable within each person (just looking within each ID).
My statistics background is not strong, so I am not sure if I can just use a class statement on ID to get the standard deviation of variable for each person and then average those standard deviations across the entire data set.
By standard deviation between observations, I mean comparing our variable of interest between IDs (different people), to see how each ID's variable mean and standard deviation compare with another ID's.
We want to see if there is more spread within each woman (high variation from visit to visit) or more spread between women (high variation from woman to woman).
Hope this helps.
05-05-2015 07:53 AM
OK. That makes some sense.
data have;
  input ID Visit_Number Variable;
  cards;
1 1 3.3
1 2 5.2
1 3 6.3
1 4 2.3
2 1 8.9
2 2 5.8
2 3 6.2
2 4 3.4
;
run;

proc sql;
  /* overall mean of Variable across all rows */
  create table a as
    select mean(Variable) as overall_mean from have;
  /* standard deviation of Variable within each ID */
  create table b as
    select id, std(Variable) as with_in_std from have group by id;
  /* average of the per-ID standard deviations (within-person spread) */
  create table c as
    select mean(with_in_std) as avg_within_std from b;
quit;
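The last query above averages the per-ID standard deviations, so it describes spread within each person. For spread between people, one possible reading is the standard deviation of the per-ID means. The sketch below assumes that interpretation; the table names id_means and between_ids and the columns id_mean and between_std are made up for illustration and are not part of the original answer.

```sas
/* Same sample data as in the thread, repeated so this runs on its own */
data have;
  input ID Visit_Number Variable;
  cards;
1 1 3.3
1 2 5.2
1 3 6.3
1 4 2.3
2 1 8.9
2 2 5.8
2 3 6.2
2 4 3.4
;
run;

proc sql;
  /* one row per ID holding that person's mean of Variable */
  create table id_means as
    select ID, mean(Variable) as id_mean
    from have
    group by ID;

  /* assumed definition of "between": SD of the per-ID means */
  create table between_ids as
    select std(id_mean) as between_std
    from id_means;
quit;
```

Comparing between_std with the averaged within-ID standard deviation gives a rough sense of whether values vary more from person to person or from visit to visit. This is a descriptive comparison only, not a formal variance decomposition.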