Solved
Contributor
Posts: 23

# Creating rows in a data set based on sums of existing variables

I need to write code create rows inside an existing dataset based on groupings. I have the following existing dataset:

 Year Area Profit 2001 A 1 2002 A 2 2001 B 1 2001 C 3 2002 C 1 2001 E 4 2002 E 2 2001 F 3 2002 F 4

I need to add in more rows corresponding to two more areas. Area D consists of A,B,C and Area G consists of E and F. I need to sum over the groups and years and add them into the existing dataset so it looks as such:

 Year Area Profit 2001 A 1 2002 A 2 2001 B 1 2001 C 3 2002 C 1 2001 E 4 2002 E 2 2001 F 3 2002 F 4 2001 D 5 2002 D 3 2001 G 7 2002 G 6

Accepted Solutions
Solution
2 weeks ago
Super User
Posts: 23,342

## Re: Creating rows in a data set based on sums of existing variables

Try MultiLabel Formats with PROC MEANS.

``````data have;
input Year    Area \$    Profit;
cards;
2001    A   1
2002    A   2
2001    B   1
2001    C   3
2002    C   1
2001    E   4
2002    E   2
2001    F   3
2002    F   4
;;;;
run;

proc format;
value \$ area_fmt (multilabel)
'A' = 'A'
'B' = 'B'
'C' = 'C'
'A', 'B', 'C' = 'D'
'E' = 'E'
'F' = 'F'
'E', 'F' = 'G';
run;

proc means data=have noprint nway;
class year area / mlf;
format area \$area_fmt.;
var profit;
output out=want sum(profit)=profit;
run;

``````

All Replies
Super User
Posts: 13,358

## Re: Creating rows in a data set based on sums of existing variables

I generally discourage have summaries inside data as you may end up getting things counted twice: sum of profit for the entire data set.

How will you be using that data set with the summaries included?

Contributor
Posts: 23

## Re: Creating rows in a data set based on sums of existing variables

It will be used to display graphs on profit. No additional calculations will be made

Solution
2 weeks ago
Super User
Posts: 23,342

## Re: Creating rows in a data set based on sums of existing variables

Try MultiLabel Formats with PROC MEANS.

``````data have;
input Year    Area \$    Profit;
cards;
2001    A   1
2002    A   2
2001    B   1
2001    C   3
2002    C   1
2001    E   4
2002    E   2
2001    F   3
2002    F   4
;;;;
run;

proc format;
value \$ area_fmt (multilabel)
'A' = 'A'
'B' = 'B'
'C' = 'C'
'A', 'B', 'C' = 'D'
'E' = 'E'
'F' = 'F'
'E', 'F' = 'G';
run;

proc means data=have noprint nway;
class year area / mlf;
format area \$area_fmt.;
var profit;
output out=want sum(profit)=profit;
run;

``````
☑ This topic is solved.