Help using Base SAS procedures

Set One row Category to the Sum of Another

Reply
Valued Guide
Posts: 613

Set One row Category to the Sum of Another

I have two sets of rows in this table.  The general rows and a redundant total row.  

 

Is there a simple way to do this?  I can think of a few that involve 10 lines of code.

 

data merged_SAL; set merged_SAL;
if CONTRACT_LENGTH = ('9 10 11 12') then do;
SALARY_OUTLAYS = /*sum of salary_outlays where CONTRACT_LENGTH in ('9', '10', '11', '12') */
end;
run;

 

 

Valued Guide
Posts: 613

Re: Set One row Category to the Sum of Another

Posted in reply to DavidPhillips2

Note in an example like this:

data test;
infile cards;
input A1 A2 A3 A4 A5;
cards;
2 2 5 1 0
;
run;

data sums;
 set test;

 /* method 1: array of all numerics */
 array n {*} _numeric_;
 sum_allnum=sum(of n[*]);

 /*method 2: variable range in data set */
 sum_range=sum(of a1-a5);
run;

The author is summing across columns.  I am suming across rows.

Super User
Posts: 3,909

Re: Set One row Category to the Sum of Another

Posted in reply to DavidPhillips2

Here is one way with SQL. You haven't really explained your problem completely - it would help if you supplied sample input and output data - so I'm guessing this is close to what you want. If you don't want sums by contract_length take out the GROUP BY.

 

proc sql;
  create table want as
  select A.*
           ,B.salary_outlays_sum
  from merged_sal as A
  left join
  (select contract_length
             ,sum(salary_outlays) as salary_outlays_sum
   from  merged_sal
  where contract_length in ('9','10','11','12')
  group by contract_length
  ) as B
  on A.contract_length = B.contract_length
  ;
quit;
Trusted Advisor
Posts: 1,337

Re: Set One row Category to the Sum of Another

[ Edited ]
Posted in reply to DavidPhillips2

You are summing over rows.  That much is obvious.  Are you saying you want to PRODUCE a redundant ROW, which is the sum of the contract_lengths between 9 and 12?  Or are you producing a new column with row sums for contact lengths 9, 10, 11, and 12, respectively.  I.e. if the individual contract is 9, you want to add a SALARY_OUTLAYS_SUM variable summing all length 9 contracts.  That's what @SASKiwi did in SQL.  Here's how in a data step:

 

data want;
  set sal_merged (in=prelim where=(9<=contract_length<=12))
      sal_merged (in=inkeep);
  array tmpsum {9:12} _temporary_;
  if prelim then tmpsum{contract_length}+salary_outlays;
  if inkeep;
  if (9<=contract_length<=12) then salary_outlays_sum=tmpsum{contract_length};
run;

I assume that contract_length is a numeric variable.  The SET statement reads the data set twice, first to generate sums in a temporary array, and the second to output while retrieve sums from the same array.

 

Valued Guide
Posts: 613

Re: Set One row Category to the Sum of Another

[ Edited ]

Sample data is below:

 

 

 

data sal_merged;

infile datalines;

input identifer contract_length salary_outlays ;

return;

datalines;

1 9 10000

2 9 20000

3 10 10000

4 11 30000

5 12 10000

6 9 20000

7 11 30000

8 10 30000

9 12 20000

10 13 40000

11 13 40000

12 13 20000

;

run;

 

I’m trying to sum the values of contact lengths 9-12 as the value for contract length 13.

Valued Guide
Posts: 613

Re: Set One row Category to the Sum of Another

Posted in reply to DavidPhillips2

This works.  Is there a shorter solution?

 

data sal_merged;
infile datalines;
input identifer contract_length salary_outlays gender $1.;
return;
datalines;
1 9 10000 M
2 9 20000 M
3 10 10000 M
4 11 30000 M
5 12 10000 M
6 9 20000 F
7 11 30000 F
8 10 30000 F
9 12 20000 F
10 13 . F
11 13 . M
;
run;

proc sql;
create table tempSalaryOutlays as
select gender
,sum(salary_outlays) as salary_outlays_sum
from sal_merged
where contract_length in (9,10,11,12)
group by gender;

quit;

proc sort data=sal_merged;
by gender;
run;
proc sort data=tempSalaryOutlays;
by gender;
run;

data sal_merged; merge sal_merged (in=x) tempSalaryOutlays;
by gender;
if contract_length = 13 then salary_outlays = salary_outlays_sum;
run;

Trusted Advisor
Posts: 1,337

Re: Set One row Category to the Sum of Another

Posted in reply to DavidPhillips2

@DavidPhillips2   Your program is not the shortest, but more importantly, it must have a problem somewhere.  I ran it and it produces a sum of salaries of 80,000 for contract length 9, but I see only 3 observations with contract length 9  (10000, 20000, and 20000).

 

As to shortest, I believe my single data step qualifies.  And now, then I edited it, it actually works.

 

 

Valued Guide
Posts: 613

Re: Set One row Category to the Sum of Another

[ Edited ]

The total is correct.  Note, the program updates salary outlays.  You can drop the salary_outlays_sum column at the end if you want to.  I left it in for edit check reasons.

 

I modified your program slightly to update contract length 13 only.  I'm trying to figure out how to use your program to do the same split out by gender like the code I posted earlier.  

 

At the end of the below code is a dataset that includes gender.

 

data sal_merged;

infile datalines;

input identifer contract_length salary_outlays ;

return;

datalines;

1 9 10000

2 9 20000

3 10 10000

4 11 30000

5 12 10000

6 9 20000

7 11 30000

8 10 30000

9 12 20000

10 13 40000

11 13 40000

12 13 20000

;

run;

data want;
set sal_merged (in=prelim where=(9<=contract_length<=12))
sal_merged (in=inkeep);
array tmpsum {9:12} _temporary_;
if prelim then tmpsum{contract_length}+salary_outlays;
if inkeep;
if (contract_length=13) then salary_outlays=tmpsum{9} + tmpsum{10} + tmpsum{11} +tmpsum{12};
run;


data sal_merged;
infile datalines;
input identifer contract_length salary_outlays gender $1.;
return;
datalines;
1 9 10000 M
2 9 20000 M
3 10 10000 M
4 11 30000 M
5 12 10000 M
6 9 20000 F
7 11 30000 F
8 10 30000 F
9 12 20000 F
10 13 . F
11 13 . M
;

Trusted Advisor
Posts: 1,337

Re: Set One row Category to the Sum of Another

Posted in reply to DavidPhillips2

@DavidPhillips2   Ah, it was my misinterpretation.  Thanks.

 

As to modification for gender ... If the dataset is sorted by gender, you can utilize the BY statement (untested):

 

data want;

  set salmerged (where=(9<=contract_length<=12) in=prelim)

        salmerged;

  by descending gender;

  array tmpsum{9:12} _temporary_;

  if first.gender then call missing(of tmpsum{*});

 

  if prelim then tmpsum{contract_length}+salary_outlays;

  if inkeep;
  if (contract_length=13) then salary_outlays=sum(of tmpsum{*});
run;

 

Ask a Question
Discussion stats
  • 8 replies
  • 239 views
  • 0 likes
  • 3 in conversation