turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- Creating a new variable that shows percentage of t...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-19-2015 12:16 PM

Hello all! I have just started a new job and have been teaching myself to use SAS. Progress is slow since I have very little experience with programming languages. I would be incredibly appreciative of any help with this problem!

I have a large raw macro data set. Based on this raw data, a former co-worker used SAS to make an Excel spreadsheet. In the spreadsheet, there are 3 columns: Group, Subgroup, and the Number of Observations of Subgroup (i.e. the number of times the subgroup was observed in the data). Specifically, the Groups are things like “Administrative Services”, and the Subgroups include specific jobs like “Receptionists”, “Assistants”, and “Managers”. Below I have made an example of what the table looks like. In reality, there aren’t just 2 Groups, there are hundreds!

I have imported the Excel data into SAS. My question is: **How can I make the “Share of Group” column in SAS?**

Group | Subgroup | Number of Observations | Share of Group |

Administrative Services | Receptionists | 5 | 41.6% |

Assistants | 4 | 33.3% | |

Managers | 3 | 25% | |

Computer Science | Data Analysts | 10 | 71.4% |

IT Support | 4 | 28.5% |

What I would like to do is the following: For each Subgroup, calculate its percentage share of the entire group based on the Number of Observations. For example, I would like to be able to calculate a new variable that will show that Receptionists were 41.6% of all observed Administrative Services employees, Assistants were 33.3%, etc.

Since I have a large number of Groups, I am guessing that the best way to do this for each Group is to make a macro, but I’m not sure what to tell SAS to do in the macro.

Thank you so much for reading this! I really appreciate any help or tips I can get.

Accepted Solutions

Solution

05-19-2015
12:58 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-19-2015 12:58 PM

I think this is what you are looking for. Two sql queries will get the job done. Run this:

data have;

infile cards dsd dlm=',';

length group subgroup $15.;

input group $ subgroup $ num_obs;

cards;

Admin,Receptionists,5

Admin,Assistants,4

Admin,Managers,3

CompSci,DataAnalysts,10

CompSci,ITSupport,4

;

run;

proc sql;

create table want as

select group,subgroup,num_obs/tot_obs as share format=percent7.1

from(select *,sum(num_obs) as tot_obs

from have

group by group);

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-19-2015 12:24 PM

PROC FREQ, with BY GROUP; with the OUT= option will give you this information

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-19-2015 12:39 PM

proc tabulate data=have;

class group subgroup;

table group*subgroup, n pctn<subgroup>;

run;

Will make a summary table directly similar to what you want.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-19-2015 12:54 PM

Thanks so much for your reply! Unfortunately this didn't work. SAS counted how many times the name of the Subgroup appeared in the Group, and did not use the Number of Observations variable instead. So my table ended up looking like

Receptionists 1 0.33

Assistants 1 0.33

Managers 1 0.33

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-20-2015 09:34 AM

I like **ballardw** 's code. Question is what form of data you would like ? report or table ?

data have; infile cards dsd dlm=','; length group subgroup $15.; input group $ subgroup $ num_obs; cards; Admin,Receptionists,5 Admin,Assistants,4 Admin,Managers,3 CompSci,DataAnalysts,10 CompSci,ITSupport,4 ; run; proc tabulate data=have; class group subgroup; var num_obs; table group*subgroup, num_obs*(sum pctsum); run;

Xia Keshan

Solution

05-19-2015
12:58 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-19-2015 12:58 PM

I think this is what you are looking for. Two sql queries will get the job done. Run this:

data have;

infile cards dsd dlm=',';

length group subgroup $15.;

input group $ subgroup $ num_obs;

cards;

Admin,Receptionists,5

Admin,Assistants,4

Admin,Managers,3

CompSci,DataAnalysts,10

CompSci,ITSupport,4

;

run;

proc sql;

create table want as

select group,subgroup,num_obs/tot_obs as share format=percent7.1

from(select *,sum(num_obs) as tot_obs

from have

group by group);

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-19-2015 01:07 PM

Thank you so much! You, sir, have just made my day so much easier.