12-31-2015 04:32 PM
I am trying to create a data set that includes a running total. In some observations, the value that is being added to the running total is missing. Instead of keeping the previous running total, the new running total value for the record that is missing is also missing.
I have been researching all day. I have tried the retain statement, just adding the running total to the next value (assuming that a missing value would act as Running Total=0) and nothing is working. I couldn't get it to work with proc sql either.
Any help would be appreciated.
Here is the code and current results I am getting:
12-31-2015 04:59 PM
Your code and the results are not matching. How the first retianed variable is bigger than the first not missing values???
and many unnessesary statment is included like a=a;
The summing function and your code logic should handle the missing values?
12-31-2015 05:02 PM
12-31-2015 05:02 PM
I would first try only using one RETAIN for each variable. Multiples, especially when you have one initializing values and the other not may not be quite what you want.
Also, please post code, not pictures of code as we can't copy and paste elements for correction or suggestion.
I am not at all sure why there are statements like:
Waste of space.
12-31-2015 05:06 PM
Here is the code.
As far as the waste of space issue, I was naming the variable for my new dataset (i.e Fallout_date=Fallout_date).
retain RUNNING_NOTE_BALANCE_15YR 0;
retain RUNNING_NOTE_BALANCE_20YR 0;
retain RUNNING_NOTE_BALANCE_30YR 0;
proc print data=flowfred.fmflow_loansdropped_tot_rpt;
12-31-2015 05:19 PM
As you can see... the running total works in the original data set (created with proc sql). The problem I have is that the running total doesn't carry forward if there is a missing value in NOTE_BALANCE_15YR. For example, on row 7 there is a missing value for NOTE_BALANCE_15YR so the RUNNING_NOTE_BALANCE_15YR is also missing. I need the RUNNING_NOTE_BALANCE_15YR from row 6 (FALLOUT_DATE 8/12/2015) to carry forward until the next record with a value occurs (row 12, FALLOUT_DATE 8/25/2015).
12-31-2015 05:27 PM - edited 12-31-2015 05:30 PM
Now i understand your problem.
The retained variables are already existing in your original data set and also contains data.
DATA FLOWFRED.FMFLOW_LOANSDROPPED_TOT_RPT; SET FLOWFRED.FMFLOW_LOANSDROPPED_RUN_TOT; RETAIN _RUNNING_NOTE_BALANCE_15YR; RETAIN _RUNNING_NOTE_BALANCE_20YR; RETAIN _RUNNING_NOTE_BALANCE_30YR; _RUNNING_NOTE_BALANCE_15YR=SUM(_RUNNING_NOTE_BALANCE_15YR,NOTE_BALANCE_15YR); _RUNNING_NOTE_BALANCE_20YR=SUM(_RUNNING_NOTE_BALANCE_20YR,NOTE_BALANCE_20YR); _RUNNING_NOTE_BALANCE_30YR=SUM(_RUNNING_NOTE_BALANCE_30YR,NOTE_BALANCE_30YR); run;
Run the above code and you find the correct result in the new created variables starting with _ .
12-31-2015 05:34 PM
Does your input data set FLOWFRED.FMFLOW_LOANSDROPPED_RUN_TOT have a variable already named Running_Note_Balance_15Yr? If so, then each time the input set is read it pulls that value from the input set.
If that is the case then either DROP the existing Running total variables on the set statement (data set option DROP)
SET FLOWFRED.FMFLOW_LOANSDROPPED_RUN_TOT (drop = RUNNING_NOTE_BALANCE_15YR RUNNING_NOTE_BALANCE_20YR RUNNING_NOTE_BALANCE_30YR );
or use entirely new variables in your Retain statement and calculations. And use only one retain statement per variable. And all of the variables could be on one statement:
Retain RUNNING_NOTE_BALANCE_15YR RUNNING_NOTE_BALANCE_20YR RUNNING_NOTE_BALANCE_30YR 0;
01-01-2016 12:38 PM
Thanks again, this worked great. However, now that I have the running totals, I would like to have everything sorted in descending order with the most recent date. That means I would like the running total descending along with the date. I can get the date sorted descending easily but the running total will not sort that way with the date. Any ideas?
Need further help from the community? Please ask a new question.