- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi, I am currently trying to create a table with rows and columns that have different analyses in them..
I would like to create a row that contains the N number of each column group and wanted to know whether there is an interesting way to accomplish this in SAS using a data step.
This is how I created the first row and column, but I'd like to now start creating tables with N values, 95% CI, things like that.
data row1;
length rowlabel $100 col_1 col_2 col_3 col_4 $32 ;
rowlabel='Score:';
col_1='X';
col_2='Y';
col_3='Total';
col_4='X-Y';
run;
Any advice would be very helpful!!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
For simple statistical analysis like this, do not use the Data Step. Instead use the various procedures available like PROC MEANS, PROC SUMMARY and PROC UNIVARIATE.
A simple google search will get you far in terms of example code.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you! But then is it possible to use proc means and proc summary to create a customized table?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@kmardinian wrote:
Thank you! But then is it possible to use proc means and proc summary to create a customized table?
One thing you will have to consider is which type of "table" do you mean? Do you mean a data set such as is used for further analysis or reporting or a simple report from existing data?
Procs Mean, Summary, Freq, Tabulate and Report will all do both to some extent.
For example (your should have the SASHELP.CLASS data set available):
proc means data=sashelp.class n mean max min std lclm uclm; class sex; var height weight; output out=work.summary; run;
Will produce a report table in the results window in one format and with the output statement a data set that looks like
Sex _TYPE_ _FREQ_ _STAT_ Height Weight
0 19 N 19.0000 19.000
0 19 MIN 51.3000 50.500
0 19 MAX 72.0000 150.000
0 19 MEAN 62.3368 100.026
0 19 STD 5.1271 22.774
F 1 9 N 9.0000 9.000
F 1 9 MIN 51.3000 50.500
F 1 9 MAX 66.5000 112.500
F 1 9 MEAN 60.5889 90.111
F 1 9 STD 5.0183 19.384
M 1 10 N 10.0000 10.000
M 1 10 MIN 57.3000 83.000
M 1 10 MAX 72.0000 150.000
M 1 10 MEAN 63.9100 108.950
M 1 10 STD 4.9379 22.727
The _type_ variable refers to level(s) of the class variable(s) used; _freq_ is the number of records used, _stat_ indicates which statistic is reported on that line for the variables and the last two columns hold the actual statistic values (excluding the confidence limits as I didn't add all of the syntax needed for that).
In general having a single column with multiple types of data is often considered a poor idea.
The report procedures Tabulate and Report will allow most of the same statistics as Means/Summary and a few others as well plus displaying output in grouped, nested and across behaviors.