I am saving the output from proc means in the OUTPUT dataset. my issue is with STD. In the output dataset, the column for std is created to be &varname._StdDev.
I am running this proc for so many variables and some of the names for the variables are so long that with the addition of _StdDev their length goes above 32.
In the example below, the LEVEL portion of the var name is removed to fit the var name within 32 characters with the addition of _StdDev.
ods output Summary=Output;
proc means data=&dataset std MAXDEC=2;
var &varname;
run;
The MEANS Procedure
Analysis Variable : parent_highest_education_level |
---|
Std Dev |
0.02 |
proc print data=output; run;
The SAS System |
Obs | parent_highest_education__StdDev |
---|---|
1 | 0.02 |
Now that I want to access this column which should have this format &varname._StdDev, it is not accessible. How may I access the variable that is cut to a level to be <=32 characters with the suffix StdDev?
I am using this new var name to RETAIN and change the order of columns in the OUTPUT dataset after merging that with some other columns.
Thanks
Why not just use the real output dataset that the procedure produces?
ODS OUTPUT is a last resort method to generate a dataset that you can use to get a dataset from a procedure that was originally only designed to produce a printed report.
With the OUTPUT statement you can name the variables yourself.
You can have them keep their original names by not specifying any new names.
proc summary data=sashelp.class ;
var _numeric_;
output out=std std= ;
run;
proc print data=std; run;
Results
Obs _TYPE_ _FREQ_ Age Height Weight 1 0 19 1.49267 5.12708 22.7739
Or if you want just the statistics N, MIN, MAX, MEAN and STD then do not request any statistics in the OUTPUT statement. You get a vertical dataset with each statistic on a different observation.
proc summary data=sashelp.class ;
var _numeric_;
output out=want ;
run;
proc print data=want; run;
Obs _TYPE_ _FREQ_ _STAT_ Age Height Weight 1 0 19 N 19.0000 19.0000 19.000 2 0 19 MIN 11.0000 51.3000 50.500 3 0 19 MAX 16.0000 72.0000 150.000 4 0 19 MEAN 13.3158 62.3368 100.026 5 0 19 STD 1.4927 5.1271 22.774
Why don't you refer to the dictionary table and look up the names of the variables in the OUTPUT data set?
In the following code, the macro variable stdname will contain a variable name ending in "_StdDev".
data _null_;
set sashelp.vcolumn;
where memname='OUTPUT' and libname='WORK' and name like '%_StdDev';
call symput('stdname',name);
run;
You could rename your original variables to take no more than 25 characters. Then adding "_STDDEV" would keep the resulting name at 32 or less. This would ensure that you don't have a name which gets truncated for _STDDEV, but not for, say _MEAN. And you wouldn't have to add code just to determine a variable name.
If you want more detail about what a variable holds you do not need to place all of that next into a variable name. SAS Variable labels can provide output usable in most places.
So you could have a variable named "parent_hi_ed" or similar.
Then attach a label
label parent_hi_ed ="Parent's Highest Education Level";
After proc mean you would have a variable parent_hi_ed_stddev and if needed in a report you attach a label like
label parent_hi_ed_stddev ="Parent's Highest Education Level: Standard Deviation";
Now for the obnoxious comment: If your coding for "education" does not have ratio type scale then this is next to useless.
Example: coding 1 to 12 as "grades completed" or similar for first through 12th grade (high school) in typical US fashion is okay. But if 13 is "associates degree" then the 'interval' meaning between 12 and 13 is quite a bit different than between 11 and 12. This gets even less meaning full if you have "Bachelors degree" "Masters Degree" "PHD" and maybe some trade school. If the value is "years of education" then mean and stddev may mean something. But when a numeric rating scheme is only ordinal (low to high) where the intervals between each code do not mean exactly the same thing then these summary statistics are not likely appropriate. I won't even go into "equivalence" of things like GED vs High School Diploma or international school systems.
Why not just use the real output dataset that the procedure produces?
ODS OUTPUT is a last resort method to generate a dataset that you can use to get a dataset from a procedure that was originally only designed to produce a printed report.
With the OUTPUT statement you can name the variables yourself.
You can have them keep their original names by not specifying any new names.
proc summary data=sashelp.class ;
var _numeric_;
output out=std std= ;
run;
proc print data=std; run;
Results
Obs _TYPE_ _FREQ_ Age Height Weight 1 0 19 1.49267 5.12708 22.7739
Or if you want just the statistics N, MIN, MAX, MEAN and STD then do not request any statistics in the OUTPUT statement. You get a vertical dataset with each statistic on a different observation.
proc summary data=sashelp.class ;
var _numeric_;
output out=want ;
run;
proc print data=want; run;
Obs _TYPE_ _FREQ_ _STAT_ Age Height Weight 1 0 19 N 19.0000 19.0000 19.000 2 0 19 MIN 11.0000 51.3000 50.500 3 0 19 MAX 16.0000 72.0000 150.000 4 0 19 MEAN 13.3158 62.3368 100.026 5 0 19 STD 1.4927 5.1271 22.774
Thank you, everyone, for your thoughts and suggestions. So many options! Thanks!
To control how values print using a FORMAT statement.
proc print data=std;
format _numeric_ 5.2 ;
format _type_ _freq_ ;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.