Am I going crazy? Percentages in PROC TABULATE are...

05-25-2017 10:00 AM

```
class syear;
weight annualwt;
where employed=1;
table syear*annualwt*(sumwgt pctsum);
```

Results in these incorrect percentages (also occurs for rowpctn and all other options)!

Posted in reply to fieldsa83

05-25-2017 10:02 AM - edited 05-25-2017 10:03 AM

So without your data or any sample data, results, output or expected output I'm going to say no, the results are correct.

They're not what you want, but we have no idea of what you want....

Edit: I don't feel qualified to answer the first part of your question but I could guess...

Posted in reply to fieldsa83

05-25-2017 10:42 AM

You will need to show some input data and desired result.

I suspect that you are not understanding which variable count or sum is being used for the denominator and numerator.

Putting the weight variable in the TABLE statement is not really how weight is intended to work.

```
data junk;
   input syear weight;
datalines;
2000 15.3
2001 12.9
2003 10
;
run;

proc tabulate data=junk;
   class syear;
   weight weight;
   table syear*weight, sum pctsum / style=[Pretext='when weight is in table'];
run;
```

results

Count when weight is in table

| syear | | Sum | PctSum |
|---|---|---|---|
| 2000 | weight | 234.09 | 46.77 |
| 2001 | weight | 166.41 | 33.25 |
| 2003 | weight | 100.00 | 19.98 |

With some insight, 234.09 is 15.3 times 15.3: the WEIGHT statement in effect counts each value WEIGHT times for the sum.

Moral of the story: statistics other than N requested on the weight variable are likely not to be what you think they are.
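The arithmetic behind those numbers can be checked by hand. The following is only a sketch, assuming the same `junk` data set as in the example; the column names `weighted_sum` and `pct` are made up for illustration. It multiplies each value by itself (because the analysis variable and the weight are the same variable here) and divides by the grand total.

```sas
/* Same three observations as in the example above */
data junk;
   input syear weight;
datalines;
2000 15.3
2001 12.9
2003 10
;
run;

/* Weighted sum per row = value * weight; here both are WEIGHT. */
/* weighted_sum and pct are illustrative names, not TABULATE output names. */
proc sql;
   select syear,
          weight*weight as weighted_sum format=8.2,
          100 * calculated weighted_sum
              / (select sum(weight*weight) from junk) as pct format=8.2
   from junk;
quit;
```

The three products are 234.09, 166.41 and 100.00, which add up to 500.50, so the percentages come out as 46.77, 33.25 and 19.98, matching the Sum and PctSum columns above.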

Tabulate is going to expect another variable to SUM, which should be on a VAR statement.

I have no idea what you are actually attempting so example data and desired output for that example is really needed.

Posted in reply to ballardw

05-25-2017 12:20 PM

After trying a few options, I have to use SUMWGT instead of SUM (otherwise the results are completely off). The PCTSUM seems to use the 'completely off' numbers from SUM. So it's almost like I need a PCTSUMWGT (which doesn't work--I tried!).

Posted in reply to fieldsa83

05-25-2017 01:30 PM

**Example data;**

**Desired results for that example.**

And I'll reiterate: you don't actually have a variable you are summing properly, as Tabulate expects. So of course SUM is "wrong", and of course pctsum, rowpctsum or colpctsum use such a sum. If you have an actual analysis VAR variable you **may** be able to get the percentage you need using pctsum<varname>, which uses the sums of the stated variable. But it really should be a VAR variable.

Posted in reply to ballardw

05-25-2017 01:55 PM

Sorry, I'm completely lost. If I delete the WEIGHT line and instead put VAR ANNUALWT, it actually works 100%. Is this what you're recommending: just get rid of the WEIGHT line altogether?
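For readers following along, here is a minimal sketch of the change described in this post: the weight goes on a VAR statement and is summed directly, with no WEIGHT statement. The data set `demo` and its four rows are invented for illustration; only the names `syear` and `annualwt` come from the thread.

```sas
/* Hypothetical data: two respondents per year, fractional weights allowed */
data demo;
   input syear annualwt;
datalines;
2015 0.8
2015 1.2
2016 2.5
2016 0.5
;
run;

proc tabulate data=demo;
   class syear;
   var annualwt;                        /* weight used as the analysis variable */
   table syear, annualwt*(sum pctsum);  /* sum of weights and its share of the total */
run;
```

With these made-up values the weights sum to 2.0 for 2015 and 3.0 for 2016, so PCTSUM should show 40 and 60. Because a VAR variable is simply summed, weights below 1 still contribute.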

Posted in reply to fieldsa83

05-25-2017 07:06 PM

The WEIGHT statement is to provide other information about the data. Please see this brief example:

```
data junk;
   input syear weight value;
datalines;
2000 15.3 18
2001 12.9 22
2003 10 12
;
run;

proc tabulate data=junk;
   class syear;
   weight weight;
   var value;
   table syear, value*(sum pctsum);
   title "with weight statement";
run;
title;

proc tabulate data=junk;
   class syear;
   var value;
   table syear, value*(sum pctsum);
   title "without weight statement";
run;
title;
```

Weights are often the inverse of a probability of selection, so that adjustments can be made using a WEIGHT statement. For instance, if the probability of selecting a person of type X from a population is 0.1, then the weight might be 10. So when we combine the responses of different people, the weight information helps provide a better view of what's going on overall.

Posted in reply to ballardw

05-26-2017 10:33 AM

Perhaps it is my inexperience with different types of weights. The data I'm using is from a sample survey, and each respondent has a weight that is a proportion of the target population.

So here is an example of the data structure and where the weight works with sumwgt but not with pctsum:

```
data new;
input year personid $ province $ personweight;
datalines;
2016 Person1 Ontario 10
2016 Person2 Ontario 15
2016 Person3 Ontario 15
2016 Person4 Ontario 5
2016 Person5 Ontario 15
2016 Person6 Quebec 10
2016 Person7 Quebec 10
2016 Person8 Quebec 5
2016 Person9 Quebec 10
2016 Person10 Quebec 5
;
run;

proc tabulate data=new;
class year province;
weight personweight;
table province,year*personweight*(sumwgt pctsum);
Title "Proper sumwgt but incorrect pctsum";
run;
```

Posted in reply to fieldsa83

05-26-2017 11:12 AM

Perhaps you really want to use the FREQ statement instead of the WEIGHT statement for that type of data?

```
proc tabulate data=new;
class year province;
freq personweight;
table province,year*(N pctn);
run;
```

Results

```
---------------------------------------------------
|                       |          year           |
|                       |-------------------------|
|                       |          2016           |
|                       |-------------------------|
|                       |     N      |    PctN    |
|-----------------------+------------+------------|
|province               |            |            |
|-----------------------|            |            |
|Ontario                |       60.00|       60.00|
|-----------------------+------------+------------|
|Quebec                 |       40.00|       40.00|
---------------------------------------------------
```

05-26-2017 02:58 PM

OK, this is great thanks; it appears to work for the most part.

However, this does have a limitation, which I found in the documentation:

"a FREQ variable value is missing or is less than 1 | does not use that observation to calculate statistics | no alternative"

For the monthly data, this is fine. But for annual averages, I've been dividing the personweight by 12, which has been working. However, in many cases this results in an annualweight of less than 1, so those observations appear to be excluded, and it is lowering the N by quite a bit. Any way to deal with this?

Posted in reply to fieldsa83

05-26-2017 08:34 PM - edited 05-26-2017 08:35 PM

Again: please provide sample data and a sample of the desired output for what you want our help with (that is, data for your annual averages). Having such data gives us much more clarity about the problem, gets you better answers, and means less guesswork and time spent for us to provide you with tested code samples.

Posted in reply to fieldsa83

05-26-2017 09:07 PM

Change the order of operations: divide by 12 after you calculate your final values.
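One way this could look, as a rough sketch only: let PROC TABULATE count with the undivided weights, write the counts to a data set with the OUT= option, and divide in a DATA step afterwards. The data set names `counts` and `annual` and the variable `annual_avg` are made up here, and the sketch assumes the count column in the OUT= data set is named `N` (worth confirming with PROC CONTENTS). Note also that FREQ truncates non-integer values, so this only helps if the undivided weights are at least 1.

```sas
/* Same sample data as posted earlier in the thread */
data new;
   input year personid $ province $ personweight;
datalines;
2016 Person1 Ontario 10
2016 Person2 Ontario 15
2016 Person3 Ontario 15
2016 Person4 Ontario 5
2016 Person5 Ontario 15
2016 Person6 Quebec 10
2016 Person7 Quebec 10
2016 Person8 Quebec 5
2016 Person9 Quebec 10
2016 Person10 Quebec 5
;
run;

/* Step 1: count with the original (undivided) weights, save the cells */
proc tabulate data=new out=counts;
   class year province;
   freq personweight;
   table province, year*N;
run;

/* Step 2: divide the finished counts by 12 (assumes the column is named N) */
data annual;
   set counts;
   annual_avg = N / 12;
run;

proc print data=annual;
   var province year N annual_avg;
run;
```

The division happens outside PROC TABULATE, so the result is a plain listing rather than a formatted table.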

Posted in reply to Reeza

05-26-2017 10:11 PM

Yes, I tried doing (N/12) but that didn't work. Is there a way to do it in PROC TABULATE?

Posted in reply to fieldsa83

05-26-2017 11:06 PM

Not AFAIK.

PS. If you're working with PUMF files, the documentation for most of the ones I've seen illustrates how to handle the weights with example SAS code.