Help with statistical plan


Let me preface this by stating that I’m new to stats.


I have 3 populations: control, LSS and OA.

I want to answer the following questions:


1. Is there a correlation between functional tests (SF-36 and SSS) and other self-reported measures (ODI, NCOS, KOOS etc.)?


2. Do IMU performance features (gait parameters) correlate with the above, i.e. the self-reported measures (ODI, NCOS, KOOS)?


3. Do IMU performance features (gait parameters) predict functional outcomes (SF-36 and SSS)?


I started by running PROC UNIVARIATE on my continuous variables, but noticed that some of the variables had outliers. For some of the variables the same 2 individuals were consistently identified as outliers, but for other variables the outliers were different observations/individuals.


I read on here that one way to deal with outliers is to use the WINSORIZED= and TRIMMED= options in PROC UNIVARIATE.


My questions are as follows:

1. Since the groups are independent, does it make sense to run PROC UNIVARIATE by group?

I ran this code, for example:


proc sort data=mydata;  /* BY-group processing needs data sorted by group */
  by group;
run;

proc univariate data=mydata plots trimmed=1 .1
                winsorized=.1 robustscale;
  var e_ShoeSize_US Weigh_kg BMI Age PF VT MH SF BP GH;
  by group;
run;


When I run this, I’m able to see the differences in the median values by groups for my continuous variables. 


Can I change the trimmed percent to 5% instead of 10%? I’m worried because the combined sample size is < 100 (30 per group). How does a 5% vs. 10% trim affect the analysis? Which values do I report: the trimmed means, the medians, or Gini's mean difference?
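For reference, the 5% version I have in mind would look like the sketch below. It uses the same data set and a subset of the variables from above; the only change is the proportion passed to TRIMMED= and WINSORIZED=. If I read the documentation correctly, a proportion p removes the smallest integer greater than or equal to n*p observations from each tail, so with 30 per group a 5% trim would drop 2 observations per tail and a 10% trim would drop 3.

```sas
/* Sketch only: 5% trimming/winsorizing, run separately for each group */
proc sort data=mydata;
  by group;
run;

proc univariate data=mydata trimmed=.05 winsorized=.05 robustscale;
  var PF VT MH SF BP GH;
  by group;
run;
```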


2. For my subsequent correlation analysis (PROC CORR), how do I deal with the outliers?
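One option I have come across is a rank-based (Spearman) correlation, which should be less sensitive to a few extreme values than Pearson. A minimal sketch of what I would try is below. The variable names ODI, NCOS and KOOS are assumptions about how the self-reported scores are stored in my data set, and mydata_sorted is just a name I made up for the sorted copy. Would something like this be reasonable?

```sas
/* Sketch only: Spearman correlations of SF-36 subscales with self-reported scores */
proc sort data=mydata out=mydata_sorted;  /* mydata_sorted is a hypothetical name */
  by group;
run;

proc corr data=mydata_sorted spearman;
  var PF RP BP GH VT SF RE MH;  /* SF-36 subscales */
  with ODI NCOS KOOS;           /* assumed variable names for the self-reported measures */
  by group;                     /* separate correlation table for each group */
run;
```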


3. I have categorical data as well: race (5 levels), AB (5 levels), etc. To analyze their association with the continuous variables, can I use this code?


proc npar1way data=mydata wilcoxon;
  class group;
  var PF RE RP VT MH SF BP GH race AB;
run;



4. For my goal: do IMU performance features (gait parameters) predict functional outcomes (SF-36 and SSS)? I’m thinking of doing a multivariate analysis.


SF-36 has 8 subscales: PF, RP, BP, GH, VT, SF, RE, MH.

I have multiple measures of gait parameters (speed, angular velocity, stride length etc.).

Does it make sense to run this?


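proc glm data=mydata;
  class group gender race AB;
  /* gait_parameters is a placeholder for the actual gait variables */
  model PF RP BP GH VT SF RE MH = age gender gait_parameters race AB / ss3 solution clparm;
run; quit;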


Thank you.
