demographic info by geographic location.

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 7
Accepted Solution

demographic info by geographic location.

On base SAS 9.4

 

I have a licensure file, and I need to make a list of demographic percentage per geographic location.

 

I want a table that looks something like this (except table-y):

 

zipcode/%male/%white/%black/%over65

29202/50/80/15/68

....

29993/20/65/30/23

 

I've tried multiple forms of:

proc means data=rfa.data;
var male age race4;
by zipcode;
where activemd in (1,2)and exclude = 0;
output out=rfa.mdgender;
run;

and

proc freq data=rfa.data;
table zipcode / out=rfa.mdgender3 nocol nopercent;
by age;
where activemd in (1,2)and exclude = 0;
run;

and

proc freq data=rfa.data;
table age male race4 / out=rfa.mdgender3 nocol nopercent;
by zipcode;
where activemd in (1,2)and exclude = 0;
run;

and none of them do what I want it to do. Is there a way?


Accepted Solutions
Solution
Thursday
Respected Advisor
Posts: 3,069

Re: demographic info by geographic location.

[ Edited ]
Posted in reply to jesspurse

You would make your coding a lot easier if you used integers 0 and 1 (not characters '0' and '1') for sex and other binary variables, and the ages were actual integers instead of character strings. Nevertheless, here is code that seems to provide the table you want

 

proc import datafile="C:\Users\<username>0\Downloads\fakesas.csv"
     dbms=csv out=a; run;
data b;
	set a;
	if sex=2 then sex=0; /* 1 is male */
	if input(scan(agegroup5,1,'-'),2.)>=65 then over65=1; 
        else over65=0;
	if race4=4 then black=1;
	else black=0;
	if race4=1 then white=1;
	else white=0;
run;

proc report data=b;
	columns zipcode sex white black over65;
	define zipcode/group;
	define sex/analysis mean;
	define white/analysis mean;
	define black/analysis mean;
	define over65/analysis mean;
run;

  

--
Paige Miller

View solution in original post


All Replies
Respected Advisor
Posts: 3,069

Re: demographic info by geographic location.

Posted in reply to jesspurse

I think we'd need to see a portion of your original data. Depending on how this is set up, you could use PROC MEANS/PROC SUMMARY or even PROC REPORT to get such a table.

--
Paige Miller
Occasional Contributor
Posts: 7

Re: demographic info by geographic location.

Posted in reply to PaigeMiller

I can't upload it because it's proprietary. It's a list of 10,000ish observations with 163 variables: zipcode, age, age categories, race, license number, state they live in, provider specialty, FIPS county code, etc. Each observation is an individual provider.

Respected Advisor
Posts: 3,069

Re: demographic info by geographic location.

Posted in reply to jesspurse

I wasn't asking you to upload the data, I want to see a portion of it (or a made up portion of it where you type in a few records of made up data), so we can see what the data looks like. We only need to see the data or made up data for the variables you mentioned, we don't need other variables.

 

Words don't help.

--
Paige Miller
Super User
Posts: 13,584

Re: demographic info by geographic location.

Posted in reply to jesspurse

Your SAS install has a number of data sets that should be available. You can use one of those for examples.

 

What you want may be possible with one of the report procedures directly if the % that you show are percent of counts:

One example with a simple data set:

proc tabulate data=sashelp.class;
   class sex age;
   table age,
         sex*rowpctn
   ;
run;

In the above the variable Age creates one row for each value. I think that is what you want your zipcode to do.

 

If you have a sex variable with male and female you get a similar percent per column.

Proc tabulate will support a Where statement to subset data.

A specific warning for Proc tabulate: any record with one or more CLASS variables having a missing value is excluded from analysis.

You can force them by including the option \missing on the class statement.

 

Or and this one reason why @PaigeMiller indicated we need to know what your data looks like, if your variable for sex is numeric such that 1=male and 0=Female you can get percentages by taking appropriate means. Is your "race" field limited to two values black and white? With a name like RACE4 is suspect 4 levels but you only show two in the "desired table". So what would be done with the rest? What denominator do want used for the %white calculation?

Is your age a category like "over 65" or do you have values of individual years?

Occasional Contributor
Posts: 7

Re: demographic info by geographic location.

Okay, sorry, busy week. Attached an Excel sheet with a 20 obs sample of similar but made up data. Sex is a variable where 1 is male, and 2 is female, so I made male and female dichotomous variables and use male in my analysis. Race4 is the commonly used four race categories. _Age is numeric. AgeGroup will be the better one for me to use but I haven't attempted that in proc tabulate yet. zipcode is categorical; zipnum is numeric.

 

I also did my own research and wound up with this code. However, that gives me the summary of ALL zip codes, which I also attached. It's not by zip code. I need it by each zip code.

 

ods pdf file="tab.pdf";
proc tabulate data=lib1.data;
class male race4;
var zipnum;
tables zipnum*rowpctsum,male race4 all;
where activemd in (1,2)and exclude = 0;
run;
ods pdf close;

To get age in there, I assume I'd add AgeGroup in class and in the tables statement, but my bigger problem is getting individual zip codes rather than the summary?

Respected Advisor
Posts: 3,069

Re: demographic info by geographic location.

Posted in reply to jesspurse

Since some of us will not download or open a .xlsx file as it is a potential security violation, could you convert it to plain text or .csv?

--
Paige Miller
Occasional Contributor
Posts: 7

Re: demographic info by geographic location.

Posted in reply to PaigeMiller

Sorry!

Solution
Thursday
Respected Advisor
Posts: 3,069

Re: demographic info by geographic location.

[ Edited ]
Posted in reply to jesspurse

You would make your coding a lot easier if you used integers 0 and 1 (not characters '0' and '1') for sex and other binary variables, and the ages were actual integers instead of character strings. Nevertheless, here is code that seems to provide the table you want

 

proc import datafile="C:\Users\<username>0\Downloads\fakesas.csv"
     dbms=csv out=a; run;
data b;
	set a;
	if sex=2 then sex=0; /* 1 is male */
	if input(scan(agegroup5,1,'-'),2.)>=65 then over65=1; 
        else over65=0;
	if race4=4 then black=1;
	else black=0;
	if race4=1 then white=1;
	else white=0;
run;

proc report data=b;
	columns zipcode sex white black over65;
	define zipcode/group;
	define sex/analysis mean;
	define white/analysis mean;
	define black/analysis mean;
	define over65/analysis mean;
run;

  

--
Paige Miller
Occasional Contributor
Posts: 7

Re: demographic info by geographic location.

Posted in reply to PaigeMiller

That worked! Thanks!!!

 

Is there a reason why proc tabulate wouldn't work though to create a table with individual zip codes?

Respected Advisor
Posts: 3,069

Re: demographic info by geographic location.

Posted in reply to jesspurse

I'm sure TABULATE can do the job as well, but I don't use PROC TABULATE, so I can't help you with it.

--
Paige Miller
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 85 views
  • 0 likes
  • 3 in conversation