Hi Everyone
Using the lag function, I am trying to create a declining balance in a numeric column based on the value from the previous record, assuming it meets one criteria. Below is the code for this. However, the result is actually accumulating rather than declining. I have tried various iterations of this and it does not work. However, I have lag in the past without issue. Does anyone know if this is logically wrong or a syntax issue?
Attached is part of the result also.
Paul
data test2;
set &a3;
by &a6;
if first.&a7 then CNumber=CohortYearTotalAge;
if ExitMonthCategory > .5 then do;
CNumber=lag(CNumber)-DistributiveNumber;
end;
run;
Maybe this?
proc sort data=test out=test1 (drop=CNumber);
by county year decending CohortYearTotalAge ExitMonthCategory;
run;
data test2;
set test1;
by county year descending CohortYearTotalAge;
retain CNumber;
if first.CohortYearTotalAge then CNumber=CohortYearTotalAge;
CNumber = ifn( ExitMonthCategory > .5, (CNumber-DistributiveNumber), (CNumber) );
run;
Does this do what you want?
data test2;
set &a3;
by &a6;
retain cnumber;
if first.&a7 then CNumber=CohortYearTotalAge;
if ExitMonthCategory > .5 then do;
CNumber=cnumber-DistributiveNumber;
end;
run;
Thanks Reeza, but no, it is the same result actually.
Paul
Then I think your logic is actually what's incorrect. Post a Have dataset and want. Preferably in here, rather than in CSV files.
It is simply this (the only difference is the CNumber column):
Input:
County | Year | ExitMonthCategory | CohortYearTotalAge | DistributiveNumber | CNumber |
1 - New York State | 2006 | 0.5 | 4632 | 0 | |
1 - New York State | 2006 | 1 | 4632 | 2 | |
1 - New York State | 2006 | 2 | 4632 | 0 | |
1 - New York State | 2006 | 3 | 4632 | 0 | |
1 - New York State | 2006 | 4 | 4632 | 0 | |
1 - New York State | 2006 | 5 | 4632 | 0 | |
1 - New York State | 2006 | 6 | 4632 | 2 | |
1 - New York State | 2006 | 12 | 4632 | 2 | |
1 - New York State | 2006 | 18 | 4632 | 6 | |
1 - New York State | 2006 | 24 | 4632 | 4 | |
1 - New York State | 2006 | 30 | 4632 | 2 | |
1 - New York State | 2006 | 36 | 4632 | 7 | |
1 - New York State | 2006 | 0.5 | 2350 | 0 | |
1 - New York State | 2006 | 1 | 2350 | 2 | |
1 - New York State | 2006 | 2 | 2350 | 0 | |
1 - New York State | 2006 | 3 | 2350 | 0 | |
1 - New York State | 2006 | 4 | 2350 | 0 | |
1 - New York State | 2006 | 5 | 2350 | 0 | |
1 - New York State | 2006 | 6 | 2350 | 2 | |
1 - New York State | 2006 | 12 | 2350 | 2 | |
1 - New York State | 2006 | 18 | 2350 | 6 | |
1 - New York State | 2006 | 24 | 2350 | 4 | |
1 - New York State | 2006 | 30 | 2350 | 2 | |
1 - New York State | 2006 | 36 | 2350 | 7 |
End:
County | Year | ExitMonthCategory | CohortYearTotalAge | DistributiveNumber | CNumber |
1 - New York State | 2006 | 0.5 | 4632 | 0 | 4632 |
1 - New York State | 2006 | 1 | 4632 | 2 | 4630 |
1 - New York State | 2006 | 2 | 4632 | 0 | 4630 |
1 - New York State | 2006 | 3 | 4632 | 0 | 4630 |
1 - New York State | 2006 | 4 | 4632 | 0 | 4630 |
1 - New York State | 2006 | 5 | 4632 | 0 | 4630 |
1 - New York State | 2006 | 6 | 4632 | 2 | 4628 |
1 - New York State | 2006 | 12 | 4632 | 2 | 4626 |
1 - New York State | 2006 | 18 | 4632 | 6 | 4620 |
1 - New York State | 2006 | 24 | 4632 | 4 | 4616 |
1 - New York State | 2006 | 30 | 4632 | 2 | 4614 |
1 - New York State | 2006 | 36 | 4632 | 7 | 4607 |
1 - New York State | 2006 | 0.5 | 2350 | 0 | 2350 |
1 - New York State | 2006 | 1 | 2350 | 2 | 2348 |
1 - New York State | 2006 | 2 | 2350 | 0 | 2348 |
1 - New York State | 2006 | 3 | 2350 | 0 | 2348 |
1 - New York State | 2006 | 4 | 2350 | 0 | 2348 |
1 - New York State | 2006 | 5 | 2350 | 0 | 2348 |
1 - New York State | 2006 | 6 | 2350 | 2 | 2346 |
1 - New York State | 2006 | 12 | 2350 | 2 | 2344 |
1 - New York State | 2006 | 18 | 2350 | 6 | 2338 |
1 - New York State | 2006 | 24 | 2350 | 4 | 2334 |
1 - New York State | 2006 | 30 | 2350 | 2 | 2332 |
1 - New York State | 2006 | 36 | 2350 | 7 | 2325 |
Drop CNumber from your original code and try it again.
data test2;
set &a3 (drop= Cnumber);
by &a6;
retain cnumber;
if first.&a7 then CNumber=CohortYearTotalAge;
if ExitMonthCategory > .5 then do;
CNumber=cnumber-DistributiveNumber;
end;
run;
Nothing prints for CNumber after the initial value in the 'first' statement (if first.&a7 then CNumber=CohortYearTotalAge;).
Paul
What are the values of &a6 and &a7 ?
Unless there is a typo in the sample data you posted, it looks like the file is sorted by County CohortYearTotalAge ExitMonthCategory??? Or should year for second half be 2007, so that it would be sorted by County YEar ExitMonthCategory?
Regardless, for the sample data (assuming you are happy with the current sort order), it looks like when ExitMonthCategory=.5 you want to set Cnumber to the value of CohortYearTotalAge. For any other value of ExitMonthCategory you want to decrement Cnumber by the value of DistributiveNumber. If I've got the right logic, I think you could do that with:
data want ;
set have (drop=Cnumber) ;
if ExitMonthCategory < 1 then Cnumber = CohortYearTotalAge ;
else Cnumber +- DistributiveNumber ;
run;
So that's using the Sum statement (typically count+1 or some such), which automatically retains Cnumber, and in this case is used to decrement rather than increment.
HTH
Hi Quentin
You have the logic right and that is what I have done. However the resulting data set just adds a negative to the DistributiveNumber value and does not subtract it from the CohortYearTotalAge value:
County | Year | ExitMonthCategory | CohortYearTotalAge | DistributiveNumber | CNumber |
1 - New York State | 2006 | 0.5 | 4632 | 0 | 4632 |
1 - New York State | 2006 | 1 | 4632 | 2 | -2 |
1 - New York State | 2006 | 2 | 4632 | 0 | 0 |
1 - New York State | 2006 | 3 | 4632 | 0 | 0 |
1 - New York State | 2006 | 4 | 4632 | 0 | 0 |
1 - New York State | 2006 | 5 | 4632 | 0 | 0 |
1 - New York State | 2006 | 6 | 4632 | 2 | -2 |
1 - New York State | 2006 | 12 | 4632 | 2 | -2 |
1 - New York State | 2006 | 18 | 4632 | 6 | -6 |
1 - New York State | 2006 | 24 | 4632 | 4 | -4 |
1 - New York State | 2006 | 30 | 4632 | 2 | -2 |
1 - New York State | 2006 | 36 | 4632 | 7 | -7 |
1 - New York State | 2006 | 42 | 4632 | 0 | 0 |
1 - New York State | 2006 | 48 | 4632 | 0 | 0 |
1 - New York State | 2006 | 54 | 4632 | 3 | -3 |
1 - New York State | 2006 | 60 | 4632 | 0 | 0 |
1 - New York State | 2006 | 66 | 4632 | 0 | 0 |
1 - New York State | 2006 | 72 | 4632 | 0 | 0 |
1 - New York State | 2006 | 78 | 4632 | 0 | 0 |
1 - New York State | 2006 | 84 | 4632 | 0 | 0 |
1 - New York State | 2006 | 87 | 4632 | 0 | 0 |
1 - New York State | 2006 | 0.5 | 4632 | 0 | 4632 |
1 - New York State | 2006 | 1 | 4632 | 0 | 0 |
1 - New York State | 2006 | 2 | 4632 | 0 | 0 |
1 - New York State | 2006 | 3 | 4632 | 0 | 0 |
1 - New York State | 2006 | 4 | 4632 | 1 | -1 |
1 - New York State | 2006 | 5 | 4632 | 1 | -1 |
1 - New York State | 2006 | 6 | 4632 | 3 | -3 |
1 - New York State | 2006 | 12 | 4632 | 7 | -7 |
1 - New York State | 2006 | 18 | 4632 | 26 | -26 |
1 - New York State | 2006 | 24 | 4632 | 23 | -23 |
1 - New York State | 2006 | 30 | 4632 | 22 | -22 |
1 - New York State | 2006 | 36 | 4632 | 18 | -18 |
1 - New York State | 2006 | 42 | 4632 | 15 | -15 |
1 - New York State | 2006 | 48 | 4632 | 8 | -8 |
Hi,
Do you have Cnumber + - DistributiveNumber? That does it for me:
data have; input County Year ExitMonthCategory CohortYearTotalAge DistributiveNumber; cards; 1 2006 0.5 4632 0 1 2006 1 4632 2 1 2006 2 4632 0 1 2006 3 4632 0 1 2006 0.5 4632 0 1 2006 1 4632 7 1 2006 2 4632 3 1 2006 3 4632 0 ; run; data want ; set have; if ExitMonthCategory < 1 then Cnumber = CohortYearTotalAge ; else Cnumber +- DistributiveNumber ; put (_all_)(=); run; *log: ; 121 data want ; 122 set have; 123 if ExitMonthCategory < 1 then Cnumber = CohortYearTotalAge ; 124 else Cnumber +- DistributiveNumber ; 125 126 put (_all_)(=); 127 run; County=1 Year=2006 ExitMonthCategory=0.5 CohortYearTotalAge=4632 DistributiveNumber=0 Cnumber=4632 County=1 Year=2006 ExitMonthCategory=1 CohortYearTotalAge=4632 DistributiveNumber=2 Cnumber=4630 County=1 Year=2006 ExitMonthCategory=2 CohortYearTotalAge=4632 DistributiveNumber=0 Cnumber=4630 County=1 Year=2006 ExitMonthCategory=3 CohortYearTotalAge=4632 DistributiveNumber=0 Cnumber=4630 County=1 Year=2006 ExitMonthCategory=0.5 CohortYearTotalAge=4632 DistributiveNumber=0 Cnumber=4632 County=1 Year=2006 ExitMonthCategory=1 CohortYearTotalAge=4632 DistributiveNumber=7 Cnumber=4625 County=1 Year=2006 ExitMonthCategory=2 CohortYearTotalAge=4632 DistributiveNumber=3 Cnumber=4622 County=1 Year=2006 ExitMonthCategory=3 CohortYearTotalAge=4632 DistributiveNumber=0 Cnumber=4622 NOTE: There were 8 observations read from the data set WORK.HAVE. NOTE: The data set WORK.WANT has 8 observations and 6 variables.
Yes, I tried it exactly as you have it and tried with quotes as well. Neither worked. The only item I don't have is your statement below:
put (_all_)(=);
Paul
Not sure what's happening.
My last post had sample data creating HAVE, and code creating WANT, and I think the results are what you are looking for, as shown in the log. Right? If you run that code, do you get the same results I get?
If so, then I would double check that your code is really the same as what I posted.
If not, then perhaps you've found some odd bug (rarely the case...)
--Q.
Maybe this?
proc sort data=test out=test1 (drop=CNumber);
by county year decending CohortYearTotalAge ExitMonthCategory;
run;
data test2;
set test1;
by county year descending CohortYearTotalAge;
retain CNumber;
if first.CohortYearTotalAge then CNumber=CohortYearTotalAge;
CNumber = ifn( ExitMonthCategory > .5, (CNumber-DistributiveNumber), (CNumber) );
run;
Nolaness, for some reason, that actually worked!! THANK YOU.
It didn't work until I added the drop statement in the prior Sort, but did thereafter. I have no idea why this worked and the 10+ variations of the code posted here and my own did not.
Paul
data want;
set have;
by descending CohortYearTotalAge ;
if first.CohortYearTotalAge then cnumber=CohortYearTotalAge;
else if ExitMonthCategory > .5 then cnumber+-DistributiveNumber;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.