turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- ODS and Base Reporting
- /
- How to calculate right percentage using across fun...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-25-2009 05:16 AM

Dear All

This is the code...................

/***************************************/

data test;

input state $11. city $9. area $2. people 2. earning 3.;

datalines;

PUNJAB NABHA A 20 14

PUNJAB NABHA B 10 5

PUNJAB NABHA C 5 2

PUNJAB NABHA A 41 26

PUNJAB NABHA B 15 10

PUNJAB NABHA C 14 6

PUNJAB NABHA A 52 24

PUNJAB NABHA B 35 32

PUNJAB NABHA C 42 40

PUNJAB NABHA A 74 50

PUNJAB NABHA B 25 12

PUNJAB NABHA C 42 30

PUNJAB PATIALA A 21 15

PUNJAB PATIALA B 11 5

PUNJAB PATIALA C 6 3

PUNJAB PATIALA A 42 30

PUNJAB PATIALA B 16 10

PUNJAB PATIALA C 15 10

PUNJAB PATIALA A 53 41

PUNJAB PATIALA B 36 32

PUNJAB PATIALA C 43 14

MAHARSHTRA MUMBAI A 75 70

MAHARSHTRA MUMBAI B 21 15

MAHARSHTRA MUMBAI C 11 8

MAHARSHTRA MUMBAI A 6 3

MAHARSHTRA MUMBAI B 42 30

MAHARSHTRA MUMBAI C 16 10

MAHARSHTRA KHANDALA A 15 10

MAHARSHTRA KHANDALA B 53 50

MAHARSHTRA KHANDALA C 36 30

MAHARSHTRA KHANDALA A 43 40

MAHARSHTRA KHANDALA B 75 70

MAHARSHTRA KHANDALA C 26 20

MAHARSHTRA KHANDALA A 43 40

MAHARSHTRA KHANDALA B 22 20

MAHARSHTRA KHANDALA C 12 10

MAHARSHTRA KHANDALA A 7 3

MAHARSHTRA KHANDALA B 43 40

MAHARSHTRA KHANDALA C 17 16

MAHARSHTRA KHANDALA A 16 15

MAHARSHTRA KHANDALA B 54 50

MAHARSHTRA KHANDALA C 37 36

;

RUN;

DATA EARNIN;

SET TEST;

FORMAT EARNING_PER PERCENT10.2;

EARNING_PER = EARNING/PEOPLE;

RUN;

PROC REPORT DATA = EARNIN;

COLUMN STATE CITY AREA,(PEOPLE EARNING EARNING_PER);

DEFINE STATE/GROUP;

DEFINE CITY/GROUP;

DEFINE AREA/ACROSS;

DEFINE PEOPLE/ANALYSIS;

DEFINE EARNING/ANALYSIS;

DEFINE EARNING_PER/ANALYSIS;

BREAK AFTER STATE/SUMMARIZE;

RBREAK AFTER/SUMMARIZE;

RUN;

/***************************************/

Here in the output of proc report, the EARNING_PER column which shows the percentage is also getting add up. But instead of that i want that column to be the exact percentage i.e EARNING/PEOPLE and also in break and rbreak,i need the same thing...e.g In EARNING_PER for A it should show

87.09% i.e 108/124 against KHANDALA,

90.12% i.e 73/81 against MUMBAI

/**********************************/

So please can anyone help me out on this?

This is the code...................

/***************************************/

data test;

input state $11. city $9. area $2. people 2. earning 3.;

datalines;

PUNJAB NABHA A 20 14

PUNJAB NABHA B 10 5

PUNJAB NABHA C 5 2

PUNJAB NABHA A 41 26

PUNJAB NABHA B 15 10

PUNJAB NABHA C 14 6

PUNJAB NABHA A 52 24

PUNJAB NABHA B 35 32

PUNJAB NABHA C 42 40

PUNJAB NABHA A 74 50

PUNJAB NABHA B 25 12

PUNJAB NABHA C 42 30

PUNJAB PATIALA A 21 15

PUNJAB PATIALA B 11 5

PUNJAB PATIALA C 6 3

PUNJAB PATIALA A 42 30

PUNJAB PATIALA B 16 10

PUNJAB PATIALA C 15 10

PUNJAB PATIALA A 53 41

PUNJAB PATIALA B 36 32

PUNJAB PATIALA C 43 14

MAHARSHTRA MUMBAI A 75 70

MAHARSHTRA MUMBAI B 21 15

MAHARSHTRA MUMBAI C 11 8

MAHARSHTRA MUMBAI A 6 3

MAHARSHTRA MUMBAI B 42 30

MAHARSHTRA MUMBAI C 16 10

MAHARSHTRA KHANDALA A 15 10

MAHARSHTRA KHANDALA B 53 50

MAHARSHTRA KHANDALA C 36 30

MAHARSHTRA KHANDALA A 43 40

MAHARSHTRA KHANDALA B 75 70

MAHARSHTRA KHANDALA C 26 20

MAHARSHTRA KHANDALA A 43 40

MAHARSHTRA KHANDALA B 22 20

MAHARSHTRA KHANDALA C 12 10

MAHARSHTRA KHANDALA A 7 3

MAHARSHTRA KHANDALA B 43 40

MAHARSHTRA KHANDALA C 17 16

MAHARSHTRA KHANDALA A 16 15

MAHARSHTRA KHANDALA B 54 50

MAHARSHTRA KHANDALA C 37 36

;

RUN;

DATA EARNIN;

SET TEST;

FORMAT EARNING_PER PERCENT10.2;

EARNING_PER = EARNING/PEOPLE;

RUN;

PROC REPORT DATA = EARNIN;

COLUMN STATE CITY AREA,(PEOPLE EARNING EARNING_PER);

DEFINE STATE/GROUP;

DEFINE CITY/GROUP;

DEFINE AREA/ACROSS;

DEFINE PEOPLE/ANALYSIS;

DEFINE EARNING/ANALYSIS;

DEFINE EARNING_PER/ANALYSIS;

BREAK AFTER STATE/SUMMARIZE;

RBREAK AFTER/SUMMARIZE;

RUN;

/***************************************/

Here in the output of proc report, the EARNING_PER column which shows the percentage is also getting add up. But instead of that i want that column to be the exact percentage i.e EARNING/PEOPLE and also in break and rbreak,i need the same thing...e.g In EARNING_PER for A it should show

87.09% i.e 108/124 against KHANDALA,

90.12% i.e 73/81 against MUMBAI

/**********************************/

So please can anyone help me out on this?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-25-2009 08:57 AM

Suggest the following Google advanced search to find technical conference papers on the topic:

proc report calculate percentage column site:sas.com

Scott Barry

SBBWorks, Inc.

proc report calculate percentage column site:sas.com

Scott Barry

SBBWorks, Inc.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-25-2009 09:14 AM

Hi:

The default statistic is the SUM statistic, when you define an item on PROC REPORT as "ANALYSIS" usage. So you are, by default, asking for the SUM statistic. Other statistics are available (such as PCTN and PCTSUM), as described here:

http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/a001353244.htm (under the topic Descriptive Statistics).

ACROSS is not a function. It is a PROC REPORT usage specification -- just like ANALYSIS, another usage specification, tells PROC REPORT that you want a specific statistic, ACROSS as a usage specification, tells PROC REPORT that you want a certain variable's values to make unique columns going across the top of the report. GROUP and ORDER and DISPLAY are also PROC REPORT usage specifications. In addition to all these usage specifications is the COMPUTED usage specification. When would you use COMPUTED as a usage specification???

Generally speaking, items on a PROC REPORT report row come from the data. I can see from this INPUT statement that you have the following variables in your data: STATE CITY AREA PEOPLE and EARNING.

[pre]

input state $11. city $9. area $2. people 2. earning 3.;

[/pre]

Then, you have computed EARNING_PER in a DATA step program. One of the most powerful usages in PROC REPORT is the COMPUTED usage, which allows you to compute report items within PROC REPORT instead of needing to run a separate DATA step program. For more information about COMPUTE blocks, which are used to calculate values for REPORT items, refer to:

http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/a002473626.htm

My recommendation would be to read the documentation about how PROC REPORT works and about requesting statistics versus using a COMPUTE block and experiment with your program to see whether using PCTSUM, PCTN or a COMPUTE block will generate the percent that you want.

cynthia

The default statistic is the SUM statistic, when you define an item on PROC REPORT as "ANALYSIS" usage. So you are, by default, asking for the SUM statistic. Other statistics are available (such as PCTN and PCTSUM), as described here:

http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/a001353244.htm (under the topic Descriptive Statistics).

ACROSS is not a function. It is a PROC REPORT usage specification -- just like ANALYSIS, another usage specification, tells PROC REPORT that you want a specific statistic, ACROSS as a usage specification, tells PROC REPORT that you want a certain variable's values to make unique columns going across the top of the report. GROUP and ORDER and DISPLAY are also PROC REPORT usage specifications. In addition to all these usage specifications is the COMPUTED usage specification. When would you use COMPUTED as a usage specification???

Generally speaking, items on a PROC REPORT report row come from the data. I can see from this INPUT statement that you have the following variables in your data: STATE CITY AREA PEOPLE and EARNING.

[pre]

input state $11. city $9. area $2. people 2. earning 3.;

[/pre]

Then, you have computed EARNING_PER in a DATA step program. One of the most powerful usages in PROC REPORT is the COMPUTED usage, which allows you to compute report items within PROC REPORT instead of needing to run a separate DATA step program. For more information about COMPUTE blocks, which are used to calculate values for REPORT items, refer to:

http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/a002473626.htm

My recommendation would be to read the documentation about how PROC REPORT works and about requesting statistics versus using a COMPUTE block and experiment with your program to see whether using PCTSUM, PCTN or a COMPUTE block will generate the percent that you want.

cynthia

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-26-2009 12:40 AM

Dear Cynthia

Thanks for your reply and i know how to use the computed column in this case but the thing is that the data which i had sent you is the sample data and in my actual data there will be aound 70-80 columns in the final report and out of that 20-30 will be there to calculate the percentage.and it will be a tediuos task to use computed then. And also i have already used PCTN but it does not give desired result..

Thanks for your reply and i know how to use the computed column in this case but the thing is that the data which i had sent you is the sample data and in my actual data there will be aound 70-80 columns in the final report and out of that 20-30 will be there to calculate the percentage.and it will be a tediuos task to use computed then. And also i have already used PCTN but it does not give desired result..

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-26-2009 12:38 PM

Hi:

I thought it possible that PCTN or PCTSUM would not suit your need. But I don't think calculating EARNING_PER in a DATA step is going to work either. I really do think the COMPUTE block is the way to go. And, you're right...the fact is that it -is- tedious to hard-code computed items, since you have to use absolute column numbers when you are combining COMPUTED items with ACROSS items.

But, since the computed items fall into a predictable pattern, it is easy to write a macro program to generate the COMPUTE block logic for your program, as shown in this paper on pages 12, 13 and 14:

http://support.sas.com/rnd/papers/sgf07/sgf2007-report.pdf

For example, given the data and variables that you showed in your original post, your COMPUTE block would look something like this:

[pre]

PROC REPORT DATA = test nowd;

COLUMN STATE CITY AREA,(PEOPLE EARNING EARNING_PER);

DEFINE STATE/GROUP;

DEFINE CITY/GROUP;

DEFINE AREA/ACROSS;

DEFINE PEOPLE/sum;

DEFINE EARNING/sum;

DEFINE EARNING_PER/computed f=PERCENT10.2 'Earning Per';

BREAK AFTER STATE/SUMMARIZE;

RBREAK AFTER/SUMMARIZE;

compute earning_per;

** Use Absolute column numbers in COMPUTE block for ACROSS items;

_c5_ = _c4_ / _c3_;

_c8_ = _c7_ / _c6_;

_c11_ = _c10_ / _c9_;

endcomp;

RUN;

[/pre]

You can see the pattern that the items fall into. Conceptually, State and City are the first and second column on the report. That means that for Area A, column 3 is PEOPLE, column 4 is EARNING and column 5 is the calculated EARNING_PER. Then, Area B starts... Proc REPORT assigns a special number variables which are under ACROSS items. The column numbers are: _C??_ where the ?? would be replaced by the variable's position on the report row, as calculated from the number of ACROSS items and the number of variables under each ACROSS item. For your data, the pattern is:

_C3_, _C6_ and _C9_ represent the PEOPLE value for areas A, B and C

_C4_, _C7_ and _C10_ represent the EARNING value for areas A, B and C

_C5_, _c8_ and _C11_ represent the computed earning_per for areas A, B and C

As it explains in the paper, once you know how many items will be -under- each ACROSS item (in your case, 3 items under each across item) and you know how many items are on the report row -before- the ACROSS items start (in your case, there are 2 items BEFORE Area A starts so the first absolute column number is _c3_), you can then write a macro program that will generate as many statements in the COMPUTE block as you need. You don't even have to know how many ACROSS items there are -- you can have the macro program do that for you.

Otherwise, your choice would be to presummarize all the data using a different procedure and then calculate the percent manually-- but on the summarized data.

cynthia

I thought it possible that PCTN or PCTSUM would not suit your need. But I don't think calculating EARNING_PER in a DATA step is going to work either. I really do think the COMPUTE block is the way to go. And, you're right...the fact is that it -is- tedious to hard-code computed items, since you have to use absolute column numbers when you are combining COMPUTED items with ACROSS items.

But, since the computed items fall into a predictable pattern, it is easy to write a macro program to generate the COMPUTE block logic for your program, as shown in this paper on pages 12, 13 and 14:

http://support.sas.com/rnd/papers/sgf07/sgf2007-report.pdf

For example, given the data and variables that you showed in your original post, your COMPUTE block would look something like this:

[pre]

PROC REPORT DATA = test nowd;

COLUMN STATE CITY AREA,(PEOPLE EARNING EARNING_PER);

DEFINE STATE/GROUP;

DEFINE CITY/GROUP;

DEFINE AREA/ACROSS;

DEFINE PEOPLE/sum;

DEFINE EARNING/sum;

DEFINE EARNING_PER/computed f=PERCENT10.2 'Earning Per';

BREAK AFTER STATE/SUMMARIZE;

RBREAK AFTER/SUMMARIZE;

compute earning_per;

** Use Absolute column numbers in COMPUTE block for ACROSS items;

_c5_ = _c4_ / _c3_;

_c8_ = _c7_ / _c6_;

_c11_ = _c10_ / _c9_;

endcomp;

RUN;

[/pre]

You can see the pattern that the items fall into. Conceptually, State and City are the first and second column on the report. That means that for Area A, column 3 is PEOPLE, column 4 is EARNING and column 5 is the calculated EARNING_PER. Then, Area B starts... Proc REPORT assigns a special number variables which are under ACROSS items. The column numbers are: _C??_ where the ?? would be replaced by the variable's position on the report row, as calculated from the number of ACROSS items and the number of variables under each ACROSS item. For your data, the pattern is:

_C3_, _C6_ and _C9_ represent the PEOPLE value for areas A, B and C

_C4_, _C7_ and _C10_ represent the EARNING value for areas A, B and C

_C5_, _c8_ and _C11_ represent the computed earning_per for areas A, B and C

As it explains in the paper, once you know how many items will be -under- each ACROSS item (in your case, 3 items under each across item) and you know how many items are on the report row -before- the ACROSS items start (in your case, there are 2 items BEFORE Area A starts so the first absolute column number is _c3_), you can then write a macro program that will generate as many statements in the COMPUTE block as you need. You don't even have to know how many ACROSS items there are -- you can have the macro program do that for you.

Otherwise, your choice would be to presummarize all the data using a different procedure and then calculate the percent manually-- but on the summarized data.

cynthia

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-02-2009 02:04 AM

Thanks a lot.......Cynthia for your help