Hi all, SAS noob here. Any help or tips are greatly appreciated. I have a large dataset that contains information about hundreds of thousands of employees. I did the following to get the percentage of the employees' sexes:
PROC FREQ data=CPAGroups;
tables Sex/ nocum;
RUN;
This is the result:
These are my questions:
1. The 'Sex' variable is coded as '1' for male and '2' for female. How can I change these values within the PROC FREQ step so that it displays the results with the labels 'Male' and 'Female' instead of 1 and 2?
2. How can I delete the 'Frequency' column? I am only interested in the 'Percent' of males/females.
3. Is it possible to insert a statement either in the log or in the Results Viewer that says "Women account for __% of employees.", where the calculated value of Percent for females is inserted into the "__%"?
Thank you so much for your help!
The FORMAT statement assigns a format to a variable, so you need both in the statement:
format sex sexfmt.;
And you're right, the nofreq only works in crosstabulation tables. OTOH, the main purpose of PROC FREQ is the creation of counts, so no big surprise there.
If you want a heavily customized output, you're better off by creating an intermediate dataset and do the report from that. Like I did with PUTting the sentences into the log.
1. For the formatting, there's at least two ways:
- you can replace the sex variable with another (type character) that holds the formatted values
- you can create a format that renders 'male' for 1 and 'female' for 2 and assing that to the sex variable
2. use the "nofreq" option in the tables statement
3. I'd output the results to a dataset and do something like this:
proc freq data=CPAGroups noprint;
tables sex /nocum nofreq out=want;
run;
proc format library=work;
value sex
1 = 'Men'
2 = 'Women'
;
run;
data _null_;
set want;
format sex sex.;
put sex @;
put 'account for '@;
put PERCENT 5.2@;
put '% of employees.';
run;
Hi, I tried to use a PROC FORMAT statement:
PROCformat;
value sexfmt 1="Male" 2="Female";
RUN;
PROCFREQ data=CPAGroups;
tables sex / nofreq nocum;
format sexfmt.;
RUN;
SAS returns the following error in the log:
186 PROC FREQ data=CPAGroups;
187 tables sex / nofreq nocum;
188 format sexfmt.;
-------
22
76
ERROR
22-322: Syntax error, expecting one of the following: a name, _ALL_,
_CHARACTER_, _CHAR_,
_NUMERIC_.
ERROR
76-322: Syntax error, statement will be ignored.
189 RUN;
On top of that, when I run PROC FREQ even without the FORMAT statement, it still shows frequencies despite my use of "nofreq".
Any help with this would be greatly appreciated!
The FORMAT statement assigns a format to a variable, so you need both in the statement:
format sex sexfmt.;
And you're right, the nofreq only works in crosstabulation tables. OTOH, the main purpose of PROC FREQ is the creation of counts, so no big surprise there.
If you want a heavily customized output, you're better off by creating an intermediate dataset and do the report from that. Like I did with PUTting the sentences into the log.
A further option (though remember efficiency is built into SAS functions - this however could be run on the DB) is:
proc sql;
create table WORK.FREQ as
select case when A.SEX="M" then "Male"
else "Female" end as SEX,
A.NUM_OBS / B.NOBS * 100 as PERCENTAGE
from (select SEX,count(SEX) as NUM_OBS from SASHELP.CLASSFIT group by SEX) A
left join (select NOBS from DICTIONARY.TABLES where LIBNAME="SASHELP" and MEMNAME="CLASSFIT") B
on 1=1;
quit;
data a;
input gen;
cards;
1
1
1
1
1
2
2
2
2
2
2
2
2
2
;
proc format ;
value gender 1="Female"
2="Male";
run;
proc freq data=a;
table gen;
format gen gender.;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.