BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
ybz12003
Rhodochrosite | Level 12

Hello, 

Could someone explain the table statement below? Thanks.

class scrdate_n studysite;
var chart_n enrolled_n;
table chart_n*studysite=' '*(sum*f=8. pctsum<enrolled_n>*f=8.)
	      enrolled_n*studysite*sum, scrdate_n /misstext='0';
1 ACCEPTED SOLUTION

Accepted Solutions
Cynthia_sas
SAS Super FREQ

Hi:

  No, the number of lines in the TABLE statement does not determine the number of TABLES. As I tried (and failed) to explain, the number of COMMAS indicates whether you have a one-dimension, two-dimension, or three-dimension table. A one dimension TABLE has NO commas in the TABLE statement; a two dimension table has 1 comma in the TABLE statement and a three dimension table has 2 commas in the TABLE statement. Your TABLE statement has 2 tables stacked in the row dimension and using SASHELP.SHOES, I came up with a mock-up of what I think your table might resemble:

Cynthia_sas_0-1683676314156.png

 

 

  This is basic PROC TABULATE syntax along with the * for nesting and the space for stacking and the < > brackets for defining a custom denominator for PCTSUM statistic.

  I recommend you play with some simpler examples to understand the TABLE statement and then move to the more advanced features, such as using ='Label" or =' ' to suppress labels and using the *f= to assign a specific format. and using <> for the custom denominator. Here is a different view of your TABLE statement, with the ROW dimension and the COMMA and the COLUMN dimension highlighted. The space in the ROW dimension means to stack the table for the first variable and statistics on top of the second variable.

Cynthia_sas_0-1683665701032.png

 

View solution in original post

13 REPLIES 13
Cynthia_sas
SAS Super FREQ
Hi:
Can you run the code and look at the TABLE statement side-by-side with the table in the output? The CLASS and VAR statements declare HOW the variables in the table will be USED. The TABLE statement specifies how the variables will be arranged in the table dimensions. A PROC TABULATE table can have 3 dimensions -- PAGE,ROW,COLUMN
ROW, COLUMN
COLUMN
With an addition BY statement it is possible to introduce another level, but I don't see a BY statement in your posted code. The trick in a TABLE statement is to take note of the number of commas in the TABLE statement because each comma is how you specify a dimension in a table.
I also see that you are using a custom denominator for your PCTSUM in the TABLE statement. It would be useful if you could post ALL of your code and a sample of the data because in this case, it looks like your data may already be summarized data. So any fake data that people would try to make to run your code might be in the wrong structure or contain values that were not representative of your actual data.
Cynthia

Cynthia
ybz12003
Rhodochrosite | Level 12

I got the code from someone else.  You're right; some variable "_n" were generated from previous PROC SUMMARY.

proc summary data=ARI2123 nway;
	class scrdate_n studysite;
	var c_enrolled;
	output out=c_enrolled(drop=_type_ _freq_) n=enrolled_n;
run;

proc summary data=ARI2123 nway;
	class scrdate_n studysite;
	var chart_review_complete;
	where chart_review_complete^=2;
	output out=chart_review_incomplete(drop=_type_ _freq_) n=chart_n;
run;

data chart_review_incomplete_enroll;
	merge c_enrolled chart_review_incomplete;
	by scrdate_n studysite;
run;

proc tabulate data=chart_review_incomplete_enroll;
	class scrdate_n studysite;
	var chart_n enrolled_n;
	table chart_n*studysite=' '*(sum*f=8. pctsum<enrolled_n>*f=8.)
	      enrolled_n*studysite*sum, scrdate_n /misstext='0';
run;

My main goal is trying to learn the table statement below.

table chart_n*studysite=' '*(sum*f=8. pctsum<enrolled_n>*f=8.)
	      enrolled_n*studysite*sum, scrdate_n /misstext='0';
ballardw
Super User

Any specific question?

SAS publishes an entire book on Proc Tabulate so you can't expect that much as response.

 

* nests variables and statistics. How they behave depends on what type of variable, Class or Var and to an extent the order on the Table statement.

, separates dimensions. STRONGLY recommend placing the commas at either the start or end of a line in the editor so they are easier to find.

( ) group like items. So multiple statistics for combination of nested variables would be (n pctn colpctn) or (n sum colpctsum) for examples. Again, how the variables behave depends on Class or Var type.

 

Consider this example which creates 3 slightly different appearing tables. You should have the SASHELP.CLASS data set so you can run the code to see the result.

proc tabulate data=sashelp.class;
   class sex age;
   var height weight;
   table age,
         weight*sex*mean
   ;
   table age,
         sex*weight*mean
   ;
   table age,
         mean*sex*weight
   ;
run;

Note the differences in the column headings between the 3 tables and the result in the body.

 

ybz12003
Rhodochrosite | Level 12
What's "age," for?
ballardw
Super User

@ybz12003 wrote:
What's "age," for?

If you are asking that with such a simple example you need to spend a lot of time reading the documentation for proc tabulate.

ybz12003
Rhodochrosite | Level 12

Does this mean the code trying to create three tables?

Table 1

chart_n*studysite=' '*(sum*f=8. pctsum<enrolled_n>*f=8.)

Table 2

enrolled_n*studysite*sum,

Table 3

scrdate_n
Cynthia_sas
SAS Super FREQ

Hi:

  No, the number of lines in the TABLE statement does not determine the number of TABLES. As I tried (and failed) to explain, the number of COMMAS indicates whether you have a one-dimension, two-dimension, or three-dimension table. A one dimension TABLE has NO commas in the TABLE statement; a two dimension table has 1 comma in the TABLE statement and a three dimension table has 2 commas in the TABLE statement. Your TABLE statement has 2 tables stacked in the row dimension and using SASHELP.SHOES, I came up with a mock-up of what I think your table might resemble:

Cynthia_sas_0-1683676314156.png

 

 

  This is basic PROC TABULATE syntax along with the * for nesting and the space for stacking and the < > brackets for defining a custom denominator for PCTSUM statistic.

  I recommend you play with some simpler examples to understand the TABLE statement and then move to the more advanced features, such as using ='Label" or =' ' to suppress labels and using the *f= to assign a specific format. and using <> for the custom denominator. Here is a different view of your TABLE statement, with the ROW dimension and the COMMA and the COLUMN dimension highlighted. The space in the ROW dimension means to stack the table for the first variable and statistics on top of the second variable.

Cynthia_sas_0-1683665701032.png

 

ybz12003
Rhodochrosite | Level 12

Thanks, Cynthia.  The PPT is very helpful!!  I looked back that two tables been created. It looks like the first table was "studysite" (ROW) x "scrdate_n" (Column) by the sum (format=8.) and pctsum (format=8.) of "chart_n".   The other table is "studysite" (ROW) x "scrdate_n" (Column) by the sum of "enrolled_n",  Am I right?  

 

What's the code below used for?

studysite=' '

 

ballardw
Super User

@ybz12003 wrote:

Thanks, Cynthia.  The PPT is very helpful!!  I looked back that two tables been created. It looks like the first table was "studysite" (ROW) x "scrdate_n" (Column) by the sum (format=8.) and pctsum (format=8.) of "chart_n".   The other table is "studysite" (ROW) x "scrdate_n" (Column) by the sum of "enrolled_n",  Am I right?  

 

What's the code below used for?

studysite=' '

 

 


Suggestion: run the code with and without the =' '. Observe the difference. You will likely remember better when you do it that some explaining..

Cynthia_sas
SAS Super FREQ
Hi:
Try putting this:
studysite='ss label'
in the quotes and re-run the code. Then run it again with just a space. Basically, you use that syntax to impact the label in the row and column areas.
Cynthia
ballardw
Super User

How many table statements are there in your code? If there is only one TABLE that is how many tables are being requested.

Your code, slighlty reformatted:

proc tabulate data=chart_review_incomplete_enroll;
	class scrdate_n studysite;
	var chart_n enrolled_n;
	table chart_n*studysite=' '*(sum*f=8. pctsum<enrolled_n>*f=8.  
enrolled_n*studysite*sum, scrdate_n /misstext='0'; run;

ONE comma: everything before that comma represents ROWS. After that, until you reach the /, for table options, represent columns.

The first set of Rows will have one row for each formatted level of Studysite with the sum of Chart_n and the pctsum of that total/sum of enrolled_n for the level of studysite. Since this is the ROW dimension each statistic will be in a separate row cell within the level of the Class variable.
There will then be a second block of rows for each formatted level of StudySite with the sum of enrolled_n. Any of the cells that don't have the statistics will have a 0.

Since Scrdate_n is a Class variable there will be one column of statistics for each formatted level of Scrdate_n.

 

In logical terms if a table has a page dimension, which will create displayed output that is not connected and looks like a a different "table" it is still logically one table as far as the procedure output is concerned. One table statement <=> one "table".

ybz12003
Rhodochrosite | Level 12

Thanks for both of your valuable help!  I wish I could give both "Accepted as Solution" to you two!!

ballardw
Super User

FWIW, I spend a lot of time with Proc Tabulate because most of the reports I am requested to make nest rows and columns, sometimes 3 or 4 variables deep. Powerful tool but like many such has a learning curve.

It is important to know that you can have multiple CLASS, VAR, and CLASSLEV (control appearance of values of the row/heading values) statements because you may need different options in both.

A big caveat, that will likely bite you a dozen times before it sticks, is the default behavior for CLASS variables is that missing values mean that the entire observation is excluded from the report [which is why you may need the Missing option on a class variable].

 

You also need to pay attention to which dimension a variable/statistic appears in. One reason is to have the right dimension for your desired table appearance but the second is that you cannot cross statistics. That means the intersection of an N statistic ROW request cannot cross the Mean of a COLUMN request.

 

A common approach to some specific issues is to create numeric 1/0 values for some variables to get around some of the "missing" behavior limits as the SUM of a 1/0 coded variable is the number of 1s and the MEAN is the percent of 1s. So you may find that adding another variable or two will let you do something that Class variables won't do directly.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 2514 views
  • 7 likes
  • 3 in conversation