Data calculation

Accepted Solution Solved
Reply
Contributor
Posts: 25
Accepted Solution

Data calculation

Hi all,

 

Please see the following description.

 

Data Have:

 

Obs   Priciple        Cnts    Balance

 1      38244.52     516     80790384.18
 2     107503.57       .             .
 3     158943.74       .             .


What I want to do is sum up 38244.52 + 107503.57 + 158943.74 = 304691.83.
The no of rows in the dataset will dynamically change..in this case I just gave 3 rows.
Cnts and Balance remain the same i.e. they are read from another dataset and cobmbined with principle dataset actualliy(is there a better way?)

I want to deduct total principle from the balance i.e. 80790384.18 - 304691.83 = 80485692.35

and cnts = 516 - 3(no of rows) = 513 in this case.


Data Want:

 

Obs cnts    Balance
1      513     80485692.35

 

Thanks.


Accepted Solutions
Solution
‎11-09-2015 10:08 AM
Super User
Posts: 5,085

Re: Data calculation

As was mentioned, PROC SUMMARY will get you totals:

 

proc summary data=principles;

   var principle;

   where principle > .;

   output out=totals (keep=tot_principle _freq_) sum=tot_principle;

run;

 

The bigger question about there being a better way ... almost certainly.  Why are the two data sources already combined, with variables being repeated on every observation?  It would be better to keep the PRINCIPLE values in one data set and the COUNT and BALANCE values in a separate data set.  AFTER  you compute the total principle, then combine them:

 

data want;

set totals;

set balances;

new_balance = balance - tot_principle;

new_count = count - _freq_;

run;

 

I realize that there may be a variable you are omitting, such as CUSTOMER_ID.  But the same principles would still apply ... don't combine the data sets until you need to combine them.

View solution in original post


All Replies
Trusted Advisor
Posts: 1,621

Re: Data calculation

[ Edited ]

Proc summary will sum the principal values, count how many you have, ignore missing values, and provide an output data set for you.

 

Then you can subtract the principal from the balance and adjust the cnts variable.

Solution
‎11-09-2015 10:08 AM
Super User
Posts: 5,085

Re: Data calculation

As was mentioned, PROC SUMMARY will get you totals:

 

proc summary data=principles;

   var principle;

   where principle > .;

   output out=totals (keep=tot_principle _freq_) sum=tot_principle;

run;

 

The bigger question about there being a better way ... almost certainly.  Why are the two data sources already combined, with variables being repeated on every observation?  It would be better to keep the PRINCIPLE values in one data set and the COUNT and BALANCE values in a separate data set.  AFTER  you compute the total principle, then combine them:

 

data want;

set totals;

set balances;

new_balance = balance - tot_principle;

new_count = count - _freq_;

run;

 

I realize that there may be a variable you are omitting, such as CUSTOMER_ID.  But the same principles would still apply ... don't combine the data sets until you need to combine them.

Trusted Advisor
Posts: 1,204

Re: Data calculation

proc sql;
select cnts-count(*) as cnts, balance-sum(priciple) as Balance format=12.2 from have;
quit;

Contributor
Posts: 25

Re: Data calculation

Now the data is like below.

 

Obs cnts    Balance
1      513     80485692.35

 

I want to stick these values in a middle of  a mainframe dataset which is of 1000 length.

 

data _null_;
  set originalfile;
  file xxxx;
      put  @0001    data    $char1000.
             @0225    cnts    zd11.2
             @0234    balance zd9.

 ;

is anything wrong with this? I do not see the values in the mainframe dataset.

 

Thanks to all of you for your time and effort.

 

Neal.

Super User
Posts: 5,085

Re: Data calculation

Well, without visibility into the entire programming flow, I don't want to question your strategy.  Assuming you have a one-record data set named BALANCES, you could add this line to your existing program:

 

if _n_=1 then set balances;

 

That line could be the second line in your program, immediately following data _null_;

 

As long as ORIGINALFILE does not contain CNTS or BALANCE, this will work.

 

Good luck.

Contributor
Posts: 25

Re: Data calculation

After I stick cnts, balance in mainframe dataset :


Original:
****************************
00000051F08079038418
****************************

what I get:

***************************
.0000051F.8079038418
***************************

what I want:

***************************
00000051308048569238
***************************

Super User
Posts: 5,085

Re: Data calculation

That's a separate question, that centers around how to use a PUT statement.  As far as I can tell, your original question about how to get the data prepared is answered.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 415 views
  • 1 like
  • 4 in conversation