Solved: Proc tabulate column percentage discrepancy

sasgorilla · Posted 02-13-2025 12:39 AM

I set out to create a summary demographics table for a dataset with 6 variables: groups, sex, age, race, ethnicity, and status as below.

Total

Group A

Group B

Group C

Group D

Group E

N

%

N

%

N

%

N

%

N

%

N

%

Sex

Female

Male

Age

<20

20-30

≥30

Etc...

Total

Using the proc tabulate code below I am pretty close:

proc tabulate data=c;
	class groups sex age race ethnicity status;
	classlev sex age race ethnicity status /s=[JUST=R];
	var newcount;
	table sex="Sex" age="Age" race="Race" ethnicity="Ethnicity" status="Status" ALL="TOTAL", ALL="TOTAL"*(newcount="N"*f=8.0 colpctn="%"*f=8.1) groups*(newcount="N"*f=8,0 colpctn="%"*f=8.1);
run;

But, I am having a few issues:

1) There is a "Sum" cell below my "N". How can I remove?

N	%	N	%	N	%	N	%	N	%	N	%
Sum	%	Sum	%	Sum	%	Sum	%	Sum	%	Sum	%

2) Though my numbers are right, my column percents are not accurate. I am not sure where they are coming from, but I imagine it has to do with my proc summary or another step below.

As a test, I ran a proc freq of sex*group with weight by newcount, which gets me what should be the correct column percentages (although my frequencies are no longer the table N). I tried weighting my proc tabulate by newcount but that only significantly increases the N without changing the %.

To retrace my steps:

1) I ran a proc summary for the variables of interest as below:

proc summary data=a;
	class groups sex age race ethnicity status;
	var count;
	output out=b n= /autoname;
run;

2) I ran a data step to get the monthly average since the original data is a year long and I want to present the average monthly demographics.

data c;
	set b;
	newcount=count_N/12;
	format newcount 8.0 ;
	drop count_n _freq_ ;
run;

3) I ran the proc tabulate as above.

I know this probably has a simple solution, but I've been struggling with it for several hours and just can't see what I'm missing. Thanks in advance!

Ksharp · Posted 02-17-2025 08:36 PM

Since your analysis variable 's implied statistic is SUM ,not N , therefore you should use COLPCTSUM (not COLPCTN) , and N should be replaced with SUM .

libname x v9 "C:\Users\xiakeshan\Downloads\";
options nofmterr;
proc tabulate data=x.c;
	class groups sex age race ethnicity status;
	classlev sex age race ethnicity status /s=[JUST=R];
	var newcount;
	table sex="Sex" age="Age" race="Race" ethnicity="Ethnicity" status="Status" ALL="TOTAL", 
  ALL="TOTAL"*(newcount=''*(sum="N"*f=8.0 colpctsum="%"*f=8.1)) 
  groups*(newcount=''*(sum="N"*f=8.0 colpctsum="%"*f=8.1));
run;

View solution in original post

Ksharp · Posted 02-13-2025 01:53 AM

You need to post some data to present your problem.

You need KEYLABEL statement to get rid of 'SUM' label.

Here is an example:

proc tabulate data=sashelp.heart;
	class  sex status bp_status;
	var ageatstart;
	table sex="Sex" status="Status" ALL="TOTAL", ALL="TOTAL"*(ageatstart="N"*f=8.0 colpctn="%"*f=8.1) bp_status=''*(ageatstart="N"*f=8.0 colpctn="%"*f=8.1);

keylabel sum=' ';
run;

sasgorilla · Posted 02-17-2025 02:52 PM

Sorry for the slow response. The keylabel statement did the trick for the "sum" label. Thank you!

I've attached a sample randomized dataset I created off of plausible values from the dataset I am using. It is formatted the same way, and produces the same problem (column % does not equal the true percentage%).

This is part of the table produced from the below code, and percentages appear to be off in all columns.

I repaste the proc tabulate code I am using here because I noticed an errant comma in my first post.

proc tabulate data=c;
	class groups sex age race ethnicity status;
	classlev sex age race ethnicity status /s=[JUST=R];
	var newcount;
	table sex="Sex" age="Age" race="Race" ethnicity="Ethnicity" status="Status" ALL="TOTAL", ALL="TOTAL"*(newcount="N"*f=8.0 colpctn="%"*f=8.1) groups*(newcount="N"*f=8.0 colpctn="%"*f=8.1);
run;

Can you identify what is happening with the proc tabulate statement to calculate the wrong %? Thank you!

Ksharp · Posted 02-17-2025 08:36 PM

Since your analysis variable 's implied statistic is SUM ,not N , therefore you should use COLPCTSUM (not COLPCTN) , and N should be replaced with SUM .

libname x v9 "C:\Users\xiakeshan\Downloads\";
options nofmterr;
proc tabulate data=x.c;
	class groups sex age race ethnicity status;
	classlev sex age race ethnicity status /s=[JUST=R];
	var newcount;
	table sex="Sex" age="Age" race="Race" ethnicity="Ethnicity" status="Status" ALL="TOTAL", 
  ALL="TOTAL"*(newcount=''*(sum="N"*f=8.0 colpctsum="%"*f=8.1)) 
  groups*(newcount=''*(sum="N"*f=8.0 colpctsum="%"*f=8.1));
run;

sasgorilla · Posted 02-17-2025 10:15 PM

Thank you! The colpctn was the problem and your changes made it work.

I really appreciate your help!

PaigeMiller · Posted 02-13-2025 07:46 AM

You might want to try the %TABLEN macro for this type of table.

https://communities.sas.com/t5/SAS-Communities-Library/Demographic-Table-and-Subgroup-Summary-Macro-...

--
Paige Miller

sasgorilla · Posted 02-17-2025 03:17 PM

Thanks @PaigeMiller ! I had tried looking into that macro previous to this post and face similar but different issues in applying it to this data set due to how my data is summarized. It appears the macro runs off of "normally" organized long data.

The reason I used proc summary on my data was due to:

1) a count variable in the dataset which basically allowed a single observation to represent multiple people with the same characteristics (i.e., where count was 3 that meant 3 individuals had that set of characteristics)

2) to present the average monthly sample (from the fluctuating year of data I had). So, once summarized by count I divided the count by 12 to get a monthly average.

I'm having trouble thinking of how I could get this result in a long data set, but if you have any ideas then that macro would probably work perfectly.

Proc tabulate column percentage discrepancy

Re: Proc tabulate column percentage discrepancy

Re: Proc tabulate column percentage discrepancy

Re: Proc tabulate column percentage discrepancy

Re: Proc tabulate column percentage discrepancy

Re: Proc tabulate column percentage discrepancy

Re: Proc tabulate column percentage discrepancy

Re: Proc tabulate column percentage discrepancy

Registration is open

SAS Training: Just a Click Away