3 weeks ago
Hi, I am currently trying to create a table with rows and columns that have different analyses in them..
I would like to create a row that contains the N number of each column group and wanted to know whether there is an interesting way to accomplish this in SAS using a data step.
This is how I created the first row and column, but I'd like to now start creating tables with N values, 95% CI, things like that.
length rowlabel $100 col_1 col_2 col_3 col_4 $32 ;
Any advice would be very helpful!!
3 weeks ago
For simple statistical analysis like this, do not use the Data Step. Instead use the various procedures available like PROC MEANS, PROC SUMMARY and PROC UNIVARIATE.
A simple google search will get you far in terms of example code.
3 weeks ago
Thank you! But then is it possible to use proc means and proc summary to create a customized table?
One thing you will have to consider is which type of "table" do you mean? Do you mean a data set such as is used for further analysis or reporting or a simple report from existing data?
Procs Mean, Summary, Freq, Tabulate and Report will all do both to some extent.
For example (your should have the SASHELP.CLASS data set available):
proc means data=sashelp.class n mean max min std lclm uclm; class sex; var height weight; output out=work.summary; run;
Will produce a report table in the results window in one format and with the output statement a data set that looks like
Sex _TYPE_ _FREQ_ _STAT_ Height Weight 0 19 N 19.0000 19.000 0 19 MIN 51.3000 50.500 0 19 MAX 72.0000 150.000 0 19 MEAN 62.3368 100.026 0 19 STD 5.1271 22.774 F 1 9 N 9.0000 9.000 F 1 9 MIN 51.3000 50.500 F 1 9 MAX 66.5000 112.500 F 1 9 MEAN 60.5889 90.111 F 1 9 STD 5.0183 19.384 M 1 10 N 10.0000 10.000 M 1 10 MIN 57.3000 83.000 M 1 10 MAX 72.0000 150.000 M 1 10 MEAN 63.9100 108.950 M 1 10 STD 4.9379 22.727
The _type_ variable refers to level(s) of the class variable(s) used; _freq_ is the number of records used, _stat_ indicates which statistic is reported on that line for the variables and the last two columns hold the actual statistic values (excluding the confidence limits as I didn't add all of the syntax needed for that).
In general having a single column with multiple types of data is often considered a poor idea.
The report procedures Tabulate and Report will allow most of the same statistics as Means/Summary and a few others as well plus displaying output in grouped, nested and across behaviors.