BookmarkSubscribeRSS Feed
Chang
Quartz | Level 8

dear Community members,

Happy 2014! Extracting descriptive statistics is among the very first steps of many statistical analyses. I am trying to automate the routine of getting basic measures, such as mean, median, standard deviation and IQR from PROC UNIVARIATE. Unfortunately I got stuck when attempting to extracting data from an ods object called 'BasicMeasures.' By default you will get the following output when using ods output BasicMeasures=:

ods_output_basic_measures_of_proc_univariate.jpg

where the first few columns are your BY-group variables, followed by four basic measure columns- LocMeasure (the highlighted one), LocValue, VarName and VarValue. I would like to create one subset containing the BY-groups+ LocMeasure+  LocValue and one subset containing BY-groups+ VarName and VarValue. As shown in the screenshot of the column attributes, the column name I want to extract is correctly identified but I got an error message 'ERROR: variable LocMeasure and LocValue are not found in the work file. Same error occurs for the other two variables VarName and VarValue. I suspect this may be due to the conflict between the 3 data set options KEEP=, WHERE= and RENAME= in my SET statement, because my codes are working when RENAME= is removed. But I have to rename the two columns in the two subsets so I can stack them later. Any idea? Here are my codes

*step 1: sort data*;

proc sort data=OA_MT_wide; by scoreType scoreValue scoreOrder; run;

*step 2: output basic measures*;

proc univariate data= OA_MT_wide normal;

    by scoreType scoreValue scoreOrder ;

    class scoreOrder;                           

    var age;

    ods output TestsForNormality=     _normality_age;

    ods output BasicMeasures=        _basic_age;

run;

*step 3: get mean, median, standard deviation and IQR and group them into two columns- statType and statValue*;

data basic_age; set

    _basic_age(keep = scoreType scoreValue scoreOrder LocMeasure LocValue

                in=a

                where=(LocMeasure in ('Mean' 'Median'))

                rename=(LocMeasure= statType LocValue= statValue))

    _basic_age(keep = scoreType scoreValue scoreOrder VarName VarValue

                in=b

                where=(VarName in ('Std Deviation' 'Interquartile Range'))

                rename=(VarName= statType VarValue= statValue))

                ;

RUN;

My data are in the excel worksheet 'data'. My codes are also attached (normality_short.sas) for your convenience. Many thanks in advance.rder

2 REPLIES 2
data_null__
Jade | Level 19

Change your WHERE data set options to use the RENAMEd variable names.  Where=(statType in('Mean' ...

I would look to PROC SUMMARY to create data sets of summary statistics, should be much easier.

Chang
Quartz | Level 8

thank you very much. In proc summary, each basic measure can be explicitly specified and this also produce a wide-format output.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1507 views
  • 3 likes
  • 2 in conversation