BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
sasgorilla
Pyrite | Level 9

I set out to create a summary demographics table for a dataset with 6 variables: groups, sex, age, race, ethnicity, and status as below. 

 

 

 

Total

Group A

Group B

Group C

Group D

Group E

 

 

N

%

N

%

N

%

N

%

N

%

N

%

Sex

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Female

 

 

 

 

 

 

 

 

 

 

 

 

 

Male

 

 

 

 

 

 

 

 

 

 

 

 

Age

 

 

 

 

 

 

 

 

 

 

 

 

 

 

<20

 

 

 

 

 

 

 

 

 

 

 

 

 

20-30

 

 

 

 

 

 

 

 

 

 

 

 

 

≥30

 

 

 

 

 

 

 

 

 

 

 

 

Etc...

 

 

 

 

 

 

 

 

 

 

 

 

 

Total

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Using the proc tabulate code below I am pretty close:

proc tabulate data=c;
	class groups sex age race ethnicity status;
	classlev sex age race ethnicity status /s=[JUST=R];
	var newcount;
	table sex="Sex" age="Age" race="Race" ethnicity="Ethnicity" status="Status" ALL="TOTAL", ALL="TOTAL"*(newcount="N"*f=8.0 colpctn="%"*f=8.1) groups*(newcount="N"*f=8,0 colpctn="%"*f=8.1);
run;

 

But, I am having a few issues:

1) There is a "Sum" cell below my "N". How can I remove? 

N % N % N % N % N % N %
Sum Sum Sum Sum Sum Sum

 

2) Though my numbers are right, my column percents are not accurate. I am not sure where they are coming from, but I imagine it has to do with my proc summary or another step below. 

 

As a test, I ran a proc freq of sex*group with weight by newcount, which gets me what should be the correct column percentages (although my frequencies are no longer the table N). I tried weighting my proc tabulate by newcount but that only significantly increases the N without changing the %. 

 

To retrace my steps: 

1)  I ran a proc summary for the variables of interest as below: 

 

proc summary data=a;
	class groups sex age race ethnicity status;
	var count;
	output out=b n= /autoname;
run; 

2)  I ran a data step to get the monthly average since the original data is a year long and I want to present the average monthly demographics. 

data c;
	set b;
	newcount=count_N/12;
	format newcount 8.0 ;
	drop count_n _freq_ ;
run;

3) I ran the proc tabulate as above. 

 

 

I know this probably has a simple solution, but I've been struggling with it for several hours and just can't see what I'm missing. Thanks in advance!

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

Since your analysis variable 's implied statistic is SUM ,not N , therefore you should use COLPCTSUM (not COLPCTN) , and N should be replaced with SUM .

 

libname x v9 "C:\Users\xiakeshan\Downloads\";
options nofmterr;
proc tabulate data=x.c;
	class groups sex age race ethnicity status;
	classlev sex age race ethnicity status /s=[JUST=R];
	var newcount;
	table sex="Sex" age="Age" race="Race" ethnicity="Ethnicity" status="Status" ALL="TOTAL", 
ALL="TOTAL"*(newcount=''*(sum="N"*f=8.0 colpctsum="%"*f=8.1))
groups*(newcount=''*(sum="N"*f=8.0 colpctsum="%"*f=8.1)); run;

Ksharp_0-1739842578766.png

 

View solution in original post

6 REPLIES 6
Ksharp
Super User

You need to post some data to present your problem.

You need KEYLABEL statement to get rid of 'SUM' label.

Here is an example:

 

proc tabulate data=sashelp.heart;
	class  sex status bp_status;
	var ageatstart;
	table sex="Sex" status="Status" ALL="TOTAL", ALL="TOTAL"*(ageatstart="N"*f=8.0 colpctn="%"*f=8.1) bp_status=''*(ageatstart="N"*f=8.0 colpctn="%"*f=8.1);

keylabel sum=' ';
run;

Ksharp_0-1739429582544.png

 

sasgorilla
Pyrite | Level 9

Sorry for the slow response. The keylabel statement did the trick for the "sum" label. Thank you!

 

I've attached a sample randomized dataset I created off of plausible values from the dataset I am using. It is formatted the same way, and produces the same problem (column % does not equal the true percentage%). 

 

This is part of the table produced from the below code, and percentages appear to be off in all columns.

 

sasgorilla_0-1739821374695.png

 

 

I repaste the proc tabulate code I am using here because I noticed an errant comma in my first post. 

proc tabulate data=c;
	class groups sex age race ethnicity status;
	classlev sex age race ethnicity status /s=[JUST=R];
	var newcount;
	table sex="Sex" age="Age" race="Race" ethnicity="Ethnicity" status="Status" ALL="TOTAL", ALL="TOTAL"*(newcount="N"*f=8.0 colpctn="%"*f=8.1) groups*(newcount="N"*f=8.0 colpctn="%"*f=8.1);
run;

 

Can you identify what is happening with the proc tabulate statement to calculate the wrong %? Thank you!

Ksharp
Super User

Since your analysis variable 's implied statistic is SUM ,not N , therefore you should use COLPCTSUM (not COLPCTN) , and N should be replaced with SUM .

 

libname x v9 "C:\Users\xiakeshan\Downloads\";
options nofmterr;
proc tabulate data=x.c;
	class groups sex age race ethnicity status;
	classlev sex age race ethnicity status /s=[JUST=R];
	var newcount;
	table sex="Sex" age="Age" race="Race" ethnicity="Ethnicity" status="Status" ALL="TOTAL", 
ALL="TOTAL"*(newcount=''*(sum="N"*f=8.0 colpctsum="%"*f=8.1))
groups*(newcount=''*(sum="N"*f=8.0 colpctsum="%"*f=8.1)); run;

Ksharp_0-1739842578766.png

 

sasgorilla
Pyrite | Level 9

Thank you! The colpctn was the problem and your changes made it work. 

 

I really appreciate your help!

sasgorilla
Pyrite | Level 9

Thanks @PaigeMiller ! I had tried looking into that macro previous to this post and face similar but different issues in applying it to this data set due to how my data is summarized. It appears the macro runs off of "normally" organized long data. 

 

The reason I used proc summary on my data was due to:

1) a count variable in the dataset which basically allowed a single observation to represent multiple people with the same characteristics (i.e., where count was 3 that meant 3 individuals had that set of characteristics)

2) to present the average monthly sample (from the fluctuating year of data I had). So, once summarized by count I divided the count by 12 to get a monthly average.

 

I'm having trouble thinking of how I could get this result in a long data set, but if you have any ideas then that macro would probably work perfectly. 

sas-innovate-white.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.

 

Early bird rate extended! Save $200 when you sign up by March 31.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 638 views
  • 3 likes
  • 3 in conversation