## Creating a new variable that shows percentage of total observations in a group

Solved
Occasional Contributor
Posts: 16

# Creating a new variable that shows percentage of total observations in a group

Hello all! I have just started a new job and have been teaching myself to use SAS. Progress is slow since I have very little experience with programming languages. I would be incredibly appreciative of any help with this problem!

I have a large raw macro data set. Based on this raw data, a former co-worker used SAS to make an Excel spreadsheet. In the spreadsheet, there are 3 columns: Group, Subgroup, and the Number of Observations of Subgroup (i.e. the number of times the subgroup was observed in the data). Specifically, the Groups are things like “Administrative Services”, and the Subgroups include specific jobs like “Receptionists”, “Assistants”, and “Managers”. Below I have made an example of what the table looks like. In reality, there aren’t just 2 Groups, there are hundreds!

I have imported the Excel data into SAS. My question is: How can I make the “Share of Group” column in SAS?

 Group Subgroup Number of Observations Share of Group Administrative Services Receptionists 5 41.6% Assistants 4 33.3% Managers 3 25% Computer Science Data Analysts 10 71.4% IT Support 4 28.5%

What I would like to do is the following: For each Subgroup, calculate its percentage share of the entire group based on the Number of Observations. For example, I would like to be able to calculate a new variable that will show that Receptionists were 41.6% of all observed Administrative Services employees, Assistants were 33.3%, etc.

Since I have a large number of Groups, I am guessing that the best way to do this for each Group is to make a macro, but I’m not sure what to tell SAS to do in the macro.

Thank you so much for reading this! I really appreciate any help or tips I can get.

Accepted Solutions
Solution
‎05-19-2015 12:58 PM
Valued Guide
Posts: 864

## Re: Creating a new variable that shows percentage of total observations in a group

I think this is what you are looking for.  Two sql queries will get the job done.  Run this:

data have;

infile cards dsd dlm=',';

length group subgroup \$15.;

input group \$ subgroup \$ num_obs;

cards;

CompSci,DataAnalysts,10

CompSci,ITSupport,4

;

run;

proc sql;

create table want as

select group,subgroup,num_obs/tot_obs as share format=percent7.1

from(select *,sum(num_obs) as tot_obs

from have

group by group);

All Replies
Posts: 3,054

## Re: Creating a new variable that shows percentage of total observations in a group

PROC FREQ, with BY GROUP; with the OUT= option will give you this information

--
Paige Miller
Super User
Posts: 13,582

## Re: Creating a new variable that shows percentage of total observations in a group

proc tabulate data=have;

class group subgroup;

table group*subgroup, n pctn<subgroup>;

run;

Will make a summary table directly similar to what you want.

Occasional Contributor
Posts: 16

## Re: Creating a new variable that shows percentage of total observations in a group

Thanks so much for your reply! Unfortunately this didn't work. SAS counted how many times the name of the Subgroup appeared in the Group, and did not use the Number of Observations variable instead. So my table ended up looking like

Receptionists 1 0.33

Assistants 1 0.33

Managers 1 0.33

Super User
Posts: 10,784

## Re: Creating a new variable that shows percentage of total observations in a group

I like ballardw 's code. Question is what form of data you would like ? report or table ?

```
data have;
infile cards dsd dlm=',';
length group subgroup \$15.;
input group \$ subgroup \$ num_obs;
cards;
CompSci,DataAnalysts,10
CompSci,ITSupport,4
;
run;
proc tabulate data=have;
class group subgroup;
var num_obs;
table group*subgroup, num_obs*(sum pctsum);
run;
```

Xia Keshan

Solution
‎05-19-2015 12:58 PM
Valued Guide
Posts: 864

## Re: Creating a new variable that shows percentage of total observations in a group

I think this is what you are looking for.  Two sql queries will get the job done.  Run this:

data have;

infile cards dsd dlm=',';

length group subgroup \$15.;

input group \$ subgroup \$ num_obs;

cards;

CompSci,DataAnalysts,10

CompSci,ITSupport,4

;

run;

proc sql;

create table want as

select group,subgroup,num_obs/tot_obs as share format=percent7.1

from(select *,sum(num_obs) as tot_obs

from have

group by group);

Occasional Contributor
Posts: 16