BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
acros
Calcite | Level 5

Hi,

I'm trying to take data that's in this form:

County FIPSSurvey Question
220011 (yes)
220011 (yes)
220012 (no)
220011 (yes)
220039 (refused)
220031 (yes)
220031 (yes)
220051 (yes)
220052 (no)

And get summary statistics (percent that said no for each county) in this form:

County FIPSPercent No
2200125%
220030%
2200550%

I've tried using PROC FREQ in various ways to create different outputs, but I'm not getting this simple and straightforward output - County and % No.

I would greatly appreciate any guidance or suggestions.

Amanda

Sorry, error in original second table

1 ACCEPTED SOLUTION

Accepted Solutions
Steelers_In_DC
Barite | Level 11

I don't come up with the same numbers that you do but here is a solution, there's probably a neater way to do this with proc summary but I'm not too experienced with it:

Data have;

infile cards dsd;

length County_FIPS $5. Survey_Question $11.;

input County_FIPS$ Survey_Question$;

cards;

22001,1 (yes)

22001,1 (yes)

22001,2 (no)

22001,1 (yes)

22003,9 (refused)

22003,1 (yes)

22003,1 (yes)

22005,1 (yes)

22005,2 (no)

;

proc sql;

create table want as

select county_fips,count(survey_question)/counts as percent_no

from (select county_fips,survey_question,count(survey_question) as counts

from have

group by county_fips)

where survey_question contains 'no'

group by county_fips;

If you want to count refused you can change 'contains 'no'' to 'not contains 'yes'' but it's 33%, not 67%.

Hope this helps.

Edit:

proc sql;

create table want as

select county_fips,count(survey_question)/counts as Percent_No format = percent.

from (select county_fips,survey_question,count(survey_question) as counts

from have

group by county_fips)

where survey_question not contains 'yes'

group by county_fips;

Message was edited by: Mark Johnson

View solution in original post

7 REPLIES 7
Steelers_In_DC
Barite | Level 11

I don't come up with the same numbers that you do but here is a solution, there's probably a neater way to do this with proc summary but I'm not too experienced with it:

Data have;

infile cards dsd;

length County_FIPS $5. Survey_Question $11.;

input County_FIPS$ Survey_Question$;

cards;

22001,1 (yes)

22001,1 (yes)

22001,2 (no)

22001,1 (yes)

22003,9 (refused)

22003,1 (yes)

22003,1 (yes)

22005,1 (yes)

22005,2 (no)

;

proc sql;

create table want as

select county_fips,count(survey_question)/counts as percent_no

from (select county_fips,survey_question,count(survey_question) as counts

from have

group by county_fips)

where survey_question contains 'no'

group by county_fips;

If you want to count refused you can change 'contains 'no'' to 'not contains 'yes'' but it's 33%, not 67%.

Hope this helps.

Edit:

proc sql;

create table want as

select county_fips,count(survey_question)/counts as Percent_No format = percent.

from (select county_fips,survey_question,count(survey_question) as counts

from have

group by county_fips)

where survey_question not contains 'yes'

group by county_fips;

Message was edited by: Mark Johnson

acros
Calcite | Level 5

Thank you, Mark! Your code gave me exactly what I wanted! Perfect!

Rick_SAS
SAS Super FREQ

I don't understand your results for FIPS=22003. Shouldn't it be 0%? Or do you want to count 9=refuced as a "no"?

Anyway:

1) sort the data by FIPS

2) Create a format which has the value "No" for the answers that you consider to be negative.

3) Use PROC FREQ to tabulate the binary formatted variable BY FIPS

4) Keep only the observations that have the formatted value "No"

Here's an example where I take the sashelp.cars data and make "No" mean that the cars do not have 6 cylinders:


proc sort data=sashelp.cars out=cars;
by origin;
run;

proc format;
value YESNO 6="Yes"
            other="No ";
run;

proc freq data=cars noprint;
format Cylinders YESNO.;
by origin;
tables Cylinders / missing out=FreqOut;
run;

proc print data=FreqOut;
where put(cylinders,YESNO.)="No";
run;

acros
Calcite | Level 5

Yes, I corrected my table. Would this give me percentages, not just counts? Mark's solution worked perfectly so I can use that. Thanks.

ballardw
Super User

This does raw count percentages:

proc freq data=have noprint;

     tables fips*survey_question/ list out=want(where=(survey_question= "2 (no)"));

run;

proc print data=want noobs;

var  fips percent;

label fips='County FIPS'

     percent = "Percent No"

     ;

run;

Or another way: When interested in a specific level of a variable you can create a dichotomous 1/0 for yes/no values and summarize that.

One question: are refused to be considered in denominator or not?

If you want refused (or don't know or other possible not applicable responses) in denominator:

If not missing(survey_question) then SQ_d = (survey_question = "2 (no)");

The not missing is to avoid creating a record with a code when the question has no response.

To exclude the refused or others, something like:

If not missing(survey_question) and survey_question ne "9 (refused)"  then SQ_d = (survey_question = "2 (no)");

The mean of the SQ_D variable would be a percent No for those answered in decimal. 

There are questions then:

Are there weights to be applied? Then you need to apply the weights using an appropriate procedure with a weight statement. Proc means or summary would work with Fips as a class variable.

And if so are they complex? Now you need to move to Proc SurveyFreq or Surveymeans.

May I say that whoever is providing data as 1 (yes) isn't doing any favors? A value of 1,2,9 and knowledge of the coding works well. Values of "yes", "no", "refused" works well. Both adds way to much text and likelihood of coding issues with extra () and getting quotes in the right spots.

Reeza
Super User

That seems like a straight proc freq unless I'm missing something:

Data have;

infile cards dsd;

length County_FIPS $5. Survey_Question $11.;

input County_FIPS$ Survey_Question$;

cards;

22001,1 (yes)

22001,1 (yes)

22001,2 (no)

22001,1 (yes)

22003,9 (refused)

22003,1 (yes)

22003,1 (yes)

22005,1 (yes)

22005,2 (no)

;

proc freq data=have;

where Survey_Question ne '9 (refused)';

table county_fips*survey_question/list outpct out=want(where=(survey_question='2 (no)')) sparse;

run;

proc print data= want;

var county_fips pct_row;

run;

Ksharp
Super User

Code: Program

Data have;
infile cards dsd;
length County_FIPS $5. Survey_Question $11.;
input County_FIPS$ Survey_Question$;
cards;
22001,1 (yes)
22001,1 (yes)
22001,2 (no)
22001,1 (yes)
22003,9 (refused)
22003,1 (yes)
22003,1 (yes)
22005,1 (yes)
22005,2 (no)
;
proc sql;
select County_FIPS,sum(Survey_Question contains 'no')/count(*) as per format=percent8.2
from have
  group by County_FIPS;
quit;


Log: Program

Notes (3)

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;

53 

54 Data have;

55 infile cards dsd;

56 length County_FIPS $5. Survey_Question $11.;

57 input County_FIPS$ Survey_Question$;

58 cards;

NOTE: The data set WORK.HAVE has 9 observations and 2 variables.

NOTE: DATA statement used (Total process time):

  real time 0.00 seconds

  cpu time 0.00 seconds

  

68 ;

69 proc sql;

70 select County_FIPS,sum(Survey_Question contains 'no')/count(*) as per format=percent8.2

71 from have

72 group by County_FIPS;

73 quit;

NOTE: PROCEDURE SQL used (Total process time):

  real time 0.02 seconds

  cpu time 0.03 seconds

  

74 

75 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;

85 


Results: Program

2200125.00%
220030.00%
2200550.00%

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 1732 views
  • 2 likes
  • 6 in conversation