BookmarkSubscribeRSS Feed
monday89
Fluorite | Level 6

Hi, I need help with creating frequency table (count and %)  by treatment groups

 

IDgroupageweightgenderraceethnicitydiabetes
1placebo25150femalewhitehispanicyes
2treatment35123maleasiannot hispanicno
3placebo23153femaleunknownhispanicno
4placebo34124malewhitehispanicyes
5treatment27110malewhitenot hispanicyes
6treatment44135maleasiannot hispanicyes
7treatment23100femaleasiannot hispanicno

 

The denominator for placebo is 3 and treatment is 4. Something like?

 

 placebo_nplacebo_pcttreatment_ntreatment_pct
female266.7%125.0%
male133.3%375.0%
asian  375.0%
unknown133.3%  
white233.3%125.0%



I tried proc freq data=test; by treatment; tables gender*treatment race*treatment;run;

 

3 REPLIES 3
PaigeMiller
Diamond | Level 26

Complex tables like this aren't going to come directly from PROC FREQ and PROC MEANS. You probably ought to consider using PROC REPORT or PROC TABULATE, but even so, I don't think either of those can produce your tables exactly as you show them. You may need to either re-design the tables, or use the results of PROC FREQ and PROC MEANS in a Data step, where you can program the exact appearance of the tables as you want them.

 

I guess this requires more discussion, how exactly do the SAS outputs have to match what you show, and how much work are you willing to put into it to get there.

--
Paige Miller
mkeintz
PROC Star

Proc tabulate can do this:

 

proc format ;
  picture pct low-high='000.0%';
run;

proc tabulate data=have noseps;
  class group  gender race;
  tables gender='00'x
         race='00'x
      ,  group=' '*(N='N'*f=comma6.0 colpctn='Pct'*f=pct7.0)
    /rts=10 ;
run;
  1. The PROC FORMAT establishes a user-defined format, named PCT.  It converts the number 33.3 to 33.3%  (sort of like putting a dollar sign in front of a number for report purposes.

  2. The tables statement lists 2 classification variables (gender and race) prior to the comma.  These are the row identifiers.

  3. If I didn't have the ='00'x then the words GENDER and RACE would appear in the left-most column, with the remainder of the row containing blank cells.  These rows would be followed by one row per value of that variable.

    I could have used =' ', but that would have generated a true blank in the left column.  The row would consequently be all true blank cells, which tabulate would compress out of the final table.  By using the ='00'x   (hexadecimal 00), it will print as a blank, but not be a true blank, so a visually blank row will separate the gender rows from the race rows.

  4. Following the comma is the column specification.  It will have two major columns (for group=placebo and treatment), and each of those will have two minor columns, one with frequencies, and one with percentages (using COLPCTN as the denominator specification).

  5. The "f=" expressions determine formats.  I use a sas-standard format (comma6.0) for the N stat, and my user-defined format (pct) for the percentage.

 

Yes, SAS has its own standard PERCENT format, but it converts .333 to 33.3%, and converts 333 to 3333%.  And when proc tabulate generates percents, they range from 0 to 100, not 0 to 1.  So I had to make my own format (unless someone can show me an equivalent sas-supplied format).

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ed_sas_member
Meteorite | Level 14

Hi @monday89 

 

If you want to create a table you can try the following code:

data test;
	input ID 1 group $ 3-11 age 13-14 weight 16-18 gender $ 20-25 race $ 27-33 ethnicity $ 35-46 diabetes $ 48-51;
	datalines;
1 placebo   25 150 female white   hispanic     yes
2 treatment 35 123 male   asian   not hispanic no
3 placebo   23 153 female unknown hispanic     no
4 placebo   34 124 male   white   hispanic     yes
5 treatment 27 110 male   white   not hispanic yes
6 treatment 44 135 male   asian   not hispanic yes
7 treatment 23 100 female asian   not hispanic no
;
run;

/* compute frequencies */
proc freq data=test;
	tables gender * group / out= group_gender (drop=percent);
	tables race * group   / out= group_race   (drop=percent);
run;

/* compute percentages */
proc sql;
	
	create table group_gender_n_pct as
	select a.gender,a.group,a.count,a.count/b.total as percent format=percent8.2
	from group_gender as a full join
		 (select group, sum(count) as total from group_gender group by group) as b
	on a.group = b.group
	order by gender;
	
	create table group_race_n_pct as
	select a.race,a.group,a.count,a.count/b.total as percent format=percent8.2
	from group_race as a full join
		 (select group, sum(count) as total from group_race group by group) as b
	on a.group = b.group
	order by race;
	
quit;

/* transpose data */

data group_gender_tr;
	length category $20.;
	merge group_gender_n_pct (where=(group="treatment") rename=(count = treatment_n percent=treatment_pct))
		  group_gender_n_pct (where=(group="placebo")   rename=(count = placebo_n percent=placebo_pct));
	by gender;
	category = gender;
	drop group gender;
run;

data group_race_tr;
	length category $20.;
	merge group_race_n_pct (where=(group="treatment") rename=(count = treatment_n percent=treatment_pct))
		  group_race_n_pct (where=(group="placebo")   rename=(count = placebo_n percent=placebo_pct));
	by race;
	category = race;
	drop group race;
run;

/* merge tables related to gender and race : table want */
data want;
	set group_gender_tr group_race_tr;
run;

proc print data=want;
	id category;
	var placebo_n placebo_pct treatment_n treatment_pct;
run;

Capture d’écran 2019-12-15 à 10.43.38.png

 

You can also create a table from PROC TABULATE as described by @mkeintz :

proc format ;
  picture pct low-high='000.0%';
run;

proc tabulate data=test out=have_count noseps;
  class group  gender race;
  tables gender='00'x
         race='00'x
      ,  group=' '*(N='N'*f=comma6.0 colpctn='Pct'*f=pct7.0)
    /rts=10 ;
run;

data have_count_tr (drop= gender race _type_ _page_ _table_ PctN_100);
	format percent percent8.2;
	set have_count;
	length category $20.;
	if gender ne "" then category = gender;
	else category = race;
	percent = PctN_100 / 100;
run;

proc sort data=have_count_tr;
	by category;
run;

data want;
	merge have_count_tr  (where=(group="treatment") rename=(n = treatment_n percent = treatment_pct))
		  have_count_tr  (where=(group="placebo")   rename=(n = placebo_n percent = placebo_pct));
	by category;
	drop group;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 831 views
  • 0 likes
  • 4 in conversation