Running Total with Missing Values

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 140
Accepted Solution

Running Total with Missing Values

Hello,

 

I am trying to create a data set that includes a running total.  In some observations, the value that is being added to the running total is missing.  Instead of keeping the previous running total, the new running total value for the record that is missing is also missing.

 

I have been researching all day.  I have tried the retain statement, just adding the running total to the next value (assuming that a missing value would act as Running Total=0) and nothing is working.  I couldn't get it to work with proc sql either.

 

Any help would be appreciated.

 

Here is the code and current results I am getting:

 

SASCode.JPG

 

RunningTotal.JPG


Accepted Solutions
Solution
‎01-04-2016 09:09 AM
Super User
Posts: 17,963

Re: Running Total with Missing Values

Sort first and then calculate your running total?

View solution in original post


All Replies
Super Contributor
Posts: 490

Re: Running Total with Missing Values

Your code and the results are not matching. How the first retianed variable is bigger than the first not missing values???

and many unnessesary statment is included like a=a;

 

The summing function and your code logic should handle the missing values?

Frequent Contributor
Posts: 140

Re: Running Total with Missing Values

Exactly my problem. I have no idea why they are not matching and why the summing function is not working. I tried that as well. I am hoping that someone can help me figure out what I am doing wrong. The running total should start with the first observation (earliest FALLOUT_DATE) and grow from there.
Super Contributor
Posts: 490

Re: Running Total with Missing Values

do you have any other proc or data step before or after this code in the workflow??

Frequent Contributor
Posts: 140

Re: Running Total with Missing Values

Just the proc print I posted below

Super User
Posts: 10,552

Re: Running Total with Missing Values

I would first try only using one RETAIN for each variable. Multiples, especially when you have one initializing values and the other not may not be quite what you want.

 

Also, please post code, not pictures of code as we can't copy and paste elements for correction or suggestion.

 

I am not at all sure why there are statements like:

Fallout_date=Fallout_date;

Note_balance_20yr=Note_balance_20yr;

 

Waste of space.

 

Frequent Contributor
Posts: 140

Re: Running Total with Missing Values

Here is the code. 

 

As far as the waste of space issue, I was naming the variable for my new dataset (i.e Fallout_date=Fallout_date). 

 

DATA FLOWFRED.FMFLOW_LOANSDROPPED_TOT_RPT;
    SET FLOWFRED.FMFLOW_LOANSDROPPED_RUN_TOT;
    RETAIN RUNNING_NOTE_BALANCE_15YR;
    RETAIN RUNNING_NOTE_BALANCE_20YR;
    RETAIN RUNNING_NOTE_BALANCE_30YR;
        FALLOUT_DATE=FALLOUT_DATE;
        NOTE_BALANCE_15YR=NOTE_BALANCE_15YR;
        retain RUNNING_NOTE_BALANCE_15YR 0;
        RUNNING_NOTE_BALANCE_15YR=SUM(RUNNING_NOTE_BALANCE_15YR,NOTE_BALANCE_15YR);
        NOTE_BALANCE_20YR=NOTE_BALANCE_20YR;
        retain RUNNING_NOTE_BALANCE_20YR 0;
        RUNNING_NOTE_BALANCE_20YR=SUM(RUNNING_NOTE_BALANCE_20YR,NOTE_BALANCE_20YR);
        NOTE_BALANCE_30YR=NOTE_BALANCE_30YR;
        retain RUNNING_NOTE_BALANCE_30YR 0;
        RUNNING_NOTE_BALANCE_30YR=SUM(RUNNING_NOTE_BALANCE_30YR,NOTE_BALANCE_30YR);
run;

proc print data=flowfred.fmflow_loansdropped_tot_rpt;
run;

 

 

Super Contributor
Posts: 490

Re: Running Total with Missing Values

Can you post snapshot from the first rows of the original dataset "FLOWFRED.FMFLOW_LOANSDROPPED_RUN_TOT"?

Frequent Contributor
Posts: 140

Re: Running Total with Missing Values

Sure, thanks.

 

As you can see... the running total works in the original data set (created with proc sql).  The problem I have is that the running total doesn't carry forward if there is a missing value in NOTE_BALANCE_15YR.  For example, on row 7 there is a missing value for NOTE_BALANCE_15YR so the RUNNING_NOTE_BALANCE_15YR is also missing.  I need the RUNNING_NOTE_BALANCE_15YR from row 6 (FALLOUT_DATE 8/12/2015) to carry forward until the next record with a value occurs (row 12, FALLOUT_DATE 8/25/2015).

 

RunningTotal.JPG

Super Contributor
Posts: 490

Re: Running Total with Missing Values

[ Edited ]

Now i understand your problem.

The retained variables are already existing in your original data set and also contains data. 

DATA FLOWFRED.FMFLOW_LOANSDROPPED_TOT_RPT;
    SET FLOWFRED.FMFLOW_LOANSDROPPED_RUN_TOT;
    RETAIN _RUNNING_NOTE_BALANCE_15YR;
    RETAIN _RUNNING_NOTE_BALANCE_20YR;
    RETAIN _RUNNING_NOTE_BALANCE_30YR;
        _RUNNING_NOTE_BALANCE_15YR=SUM(_RUNNING_NOTE_BALANCE_15YR,NOTE_BALANCE_15YR);
        _RUNNING_NOTE_BALANCE_20YR=SUM(_RUNNING_NOTE_BALANCE_20YR,NOTE_BALANCE_20YR);
        _RUNNING_NOTE_BALANCE_30YR=SUM(_RUNNING_NOTE_BALANCE_30YR,NOTE_BALANCE_30YR);
run;

Run the above code and you find the correct result in the new created variables starting with _ .

Super User
Posts: 10,552

Re: Running Total with Missing Values

Does your input data set FLOWFRED.FMFLOW_LOANSDROPPED_RUN_TOT have a variable already named Running_Note_Balance_15Yr? If so, then each time the input set is read it pulls that value from the input set.

If that is the case then either DROP the existing Running total variables on the set statement (data set option DROP)

   SET FLOWFRED.FMFLOW_LOANSDROPPED_RUN_TOT (drop = RUNNING_NOTE_BALANCE_15YR  RUNNING_NOTE_BALANCE_20YR RUNNING_NOTE_BALANCE_30YR );

or use entirely new variables in your Retain statement and calculations. And use only one retain statement per variable. And all of the variables could be on one statement:

 

Retain RUNNING_NOTE_BALANCE_15YR  RUNNING_NOTE_BALANCE_20YR RUNNING_NOTE_BALANCE_30YR 0;

 

Frequent Contributor
Posts: 140

Re: Running Total with Missing Values

Thanks  mohamed_zaki

 

Frequent Contributor
Posts: 140

Re: Running Total with Missing Values

Thanks again, this worked great.  However, now that I have the running totals, I would like to have everything sorted in descending order with the most recent date.  That means I would like the running total descending along with the date.  I can get the date sorted descending easily but the running total will not sort that way with the date.  Any ideas?

Solution
‎01-04-2016 09:09 AM
Super User
Posts: 17,963

Re: Running Total with Missing Values

Sort first and then calculate your running total?
Frequent Contributor
Posts: 140

Re: Running Total with Missing Values

Thanks Reeza.  After taking the sort out of my initial query this worked.  I appreciate all of the answers everyone!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 14 replies
  • 650 views
  • 2 likes
  • 4 in conversation