On base SAS 9.4
I have a licensure file, and I need to make a list of demographic percentage per geographic location.
I want a table that looks something like this (except table-y):
zipcode/%male/%white/%black/%over65
29202/50/80/15/68
....
29993/20/65/30/23
I've tried multiple forms of:
proc means data=rfa.data;
var male age race4;
by zipcode;
where activemd in (1,2)and exclude = 0;
output out=rfa.mdgender;
run;
and
proc freq data=rfa.data;
table zipcode / out=rfa.mdgender3 nocol nopercent;
by age;
where activemd in (1,2)and exclude = 0;
run;
and
proc freq data=rfa.data;
table age male race4 / out=rfa.mdgender3 nocol nopercent;
by zipcode;
where activemd in (1,2)and exclude = 0;
run;
and none of them do what I want it to do. Is there a way?
You would make your coding a lot easier if you used integers 0 and 1 (not characters '0' and '1') for sex and other binary variables, and the ages were actual integers instead of character strings. Nevertheless, here is code that seems to provide the table you want
proc import datafile="C:\Users\<username>0\Downloads\fakesas.csv"
dbms=csv out=a; run;
data b;
set a;
if sex=2 then sex=0; /* 1 is male */
if input(scan(agegroup5,1,'-'),2.)>=65 then over65=1;
else over65=0;
if race4=4 then black=1;
else black=0;
if race4=1 then white=1;
else white=0;
run;
proc report data=b;
columns zipcode sex white black over65;
define zipcode/group;
define sex/analysis mean;
define white/analysis mean;
define black/analysis mean;
define over65/analysis mean;
run;
I think we'd need to see a portion of your original data. Depending on how this is set up, you could use PROC MEANS/PROC SUMMARY or even PROC REPORT to get such a table.
I can't upload it because it's proprietary. It's a list of 10,000ish observations with 163 variables: zipcode, age, age categories, race, license number, state they live in, provider specialty, FIPS county code, etc. Each observation is an individual provider.
I wasn't asking you to upload the data, I want to see a portion of it (or a made up portion of it where you type in a few records of made up data), so we can see what the data looks like. We only need to see the data or made up data for the variables you mentioned, we don't need other variables.
Words don't help.
Your SAS install has a number of data sets that should be available. You can use one of those for examples.
What you want may be possible with one of the report procedures directly if the % that you show are percent of counts:
One example with a simple data set:
proc tabulate data=sashelp.class; class sex age; table age, sex*rowpctn ; run;
In the above the variable Age creates one row for each value. I think that is what you want your zipcode to do.
If you have a sex variable with male and female you get a similar percent per column.
Proc tabulate will support a Where statement to subset data.
A specific warning for Proc tabulate: any record with one or more CLASS variables having a missing value is excluded from analysis.
You can force them by including the option \missing on the class statement.
Or and this one reason why @PaigeMiller indicated we need to know what your data looks like, if your variable for sex is numeric such that 1=male and 0=Female you can get percentages by taking appropriate means. Is your "race" field limited to two values black and white? With a name like RACE4 is suspect 4 levels but you only show two in the "desired table". So what would be done with the rest? What denominator do want used for the %white calculation?
Is your age a category like "over 65" or do you have values of individual years?
Okay, sorry, busy week. Attached an Excel sheet with a 20 obs sample of similar but made up data. Sex is a variable where 1 is male, and 2 is female, so I made male and female dichotomous variables and use male in my analysis. Race4 is the commonly used four race categories. _Age is numeric. AgeGroup will be the better one for me to use but I haven't attempted that in proc tabulate yet. zipcode is categorical; zipnum is numeric.
I also did my own research and wound up with this code. However, that gives me the summary of ALL zip codes, which I also attached. It's not by zip code. I need it by each zip code.
ods pdf file="tab.pdf";
proc tabulate data=lib1.data;
class male race4;
var zipnum;
tables zipnum*rowpctsum,male race4 all;
where activemd in (1,2)and exclude = 0;
run;
ods pdf close;
To get age in there, I assume I'd add AgeGroup in class and in the tables statement, but my bigger problem is getting individual zip codes rather than the summary?
Since some of us will not download or open a .xlsx file as it is a potential security violation, could you convert it to plain text or .csv?
Sorry!
You would make your coding a lot easier if you used integers 0 and 1 (not characters '0' and '1') for sex and other binary variables, and the ages were actual integers instead of character strings. Nevertheless, here is code that seems to provide the table you want
proc import datafile="C:\Users\<username>0\Downloads\fakesas.csv"
dbms=csv out=a; run;
data b;
set a;
if sex=2 then sex=0; /* 1 is male */
if input(scan(agegroup5,1,'-'),2.)>=65 then over65=1;
else over65=0;
if race4=4 then black=1;
else black=0;
if race4=1 then white=1;
else white=0;
run;
proc report data=b;
columns zipcode sex white black over65;
define zipcode/group;
define sex/analysis mean;
define white/analysis mean;
define black/analysis mean;
define over65/analysis mean;
run;
That worked! Thanks!!!
Is there a reason why proc tabulate wouldn't work though to create a table with individual zip codes?
I'm sure TABULATE can do the job as well, but I don't use PROC TABULATE, so I can't help you with it.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.