## Lag statement

Solved
Regular Contributor
Posts: 216

# Lag statement

Hi Everyone

Using the lag function, I am trying to create a declining balance in a numeric column based on the value from the previous record, assuming it meets one criteria. Below is the code for this. However, the result is actually accumulating rather than declining. I have tried various iterations of this and it does not work. However, I have lag in the past without issue. Does anyone know if this is logically wrong or a syntax issue?

Attached is part of the result also.

Paul

data test2;

set &a3;

by &a6;

if first.&a7 then CNumber=CohortYearTotalAge;

if ExitMonthCategory > .5 then do;

CNumber=lag(CNumber)-DistributiveNumber;

end;

run;

Accepted Solutions
Solution
‎01-24-2014 11:41 AM
New Contributor
Posts: 4

## Re: Lag statement

Maybe this?

proc sort data=test out=test1 (drop=CNumber);

by county year decending CohortYearTotalAge ExitMonthCategory;

run;

data test2;

set test1;

by county year descending CohortYearTotalAge;

retain CNumber;

if first.CohortYearTotalAge then CNumber=CohortYearTotalAge;

CNumber = ifn( ExitMonthCategory > .5, (CNumber-DistributiveNumber), (CNumber) );

run;

All Replies
Super User
Posts: 23,773

## Re: Lag statement

Does this do what you want?

data test2;

set &a3;

by &a6;

retain cnumber;

if first.&a7 then CNumber=CohortYearTotalAge;

if ExitMonthCategory > .5 then do;

CNumber=cnumber-DistributiveNumber;

end;

run;

Regular Contributor
Posts: 216

## Re: Lag statement

Thanks Reeza, but no, it is the same result actually.

Paul

Super User
Posts: 23,773

## Re: Lag statement

Then I think your logic is actually what's incorrect. Post a Have dataset and want. Preferably in here, rather than in CSV files.

Regular Contributor
Posts: 216

## Re: Lag statement

It is simply this (the only difference is the CNumber column):

Input:

 County Year ExitMonthCategory CohortYearTotalAge DistributiveNumber CNumber 1 - New York State 2006 0.5 4632 0 1 - New York State 2006 1 4632 2 1 - New York State 2006 2 4632 0 1 - New York State 2006 3 4632 0 1 - New York State 2006 4 4632 0 1 - New York State 2006 5 4632 0 1 - New York State 2006 6 4632 2 1 - New York State 2006 12 4632 2 1 - New York State 2006 18 4632 6 1 - New York State 2006 24 4632 4 1 - New York State 2006 30 4632 2 1 - New York State 2006 36 4632 7 1 - New York State 2006 0.5 2350 0 1 - New York State 2006 1 2350 2 1 - New York State 2006 2 2350 0 1 - New York State 2006 3 2350 0 1 - New York State 2006 4 2350 0 1 - New York State 2006 5 2350 0 1 - New York State 2006 6 2350 2 1 - New York State 2006 12 2350 2 1 - New York State 2006 18 2350 6 1 - New York State 2006 24 2350 4 1 - New York State 2006 30 2350 2 1 - New York State 2006 36 2350 7

End:

 County Year ExitMonthCategory CohortYearTotalAge DistributiveNumber CNumber 1 - New York State 2006 0.5 4632 0 4632 1 - New York State 2006 1 4632 2 4630 1 - New York State 2006 2 4632 0 4630 1 - New York State 2006 3 4632 0 4630 1 - New York State 2006 4 4632 0 4630 1 - New York State 2006 5 4632 0 4630 1 - New York State 2006 6 4632 2 4628 1 - New York State 2006 12 4632 2 4626 1 - New York State 2006 18 4632 6 4620 1 - New York State 2006 24 4632 4 4616 1 - New York State 2006 30 4632 2 4614 1 - New York State 2006 36 4632 7 4607 1 - New York State 2006 0.5 2350 0 2350 1 - New York State 2006 1 2350 2 2348 1 - New York State 2006 2 2350 0 2348 1 - New York State 2006 3 2350 0 2348 1 - New York State 2006 4 2350 0 2348 1 - New York State 2006 5 2350 0 2348 1 - New York State 2006 6 2350 2 2346 1 - New York State 2006 12 2350 2 2344 1 - New York State 2006 18 2350 6 2338 1 - New York State 2006 24 2350 4 2334 1 - New York State 2006 30 2350 2 2332 1 - New York State 2006 36 2350 7 2325
Super User
Posts: 23,773

## Re: Lag statement

Drop CNumber from your original code and try it again.

data test2;

set &a3 (drop= Cnumber);

by &a6;

retain cnumber;

if first.&a7 then CNumber=CohortYearTotalAge;

if ExitMonthCategory > .5 then do;

CNumber=cnumber-DistributiveNumber;

end;

run;

Regular Contributor
Posts: 216

## Re: Lag statement

Nothing prints for CNumber after the initial value in the 'first' statement (if first.&a7 then CNumber=CohortYearTotalAge.

Paul

PROC Star
Posts: 1,460

## Re: Lag statement

What are the values of &a6 and &a7 ?

Unless there is a typo in the sample data you posted, it looks like the file is sorted by County CohortYearTotalAge ExitMonthCategory???   Or should year for second half be 2007, so that it would be sorted by County YEar ExitMonthCategory?

Regardless, for the sample data (assuming you are happy with the current sort order), it looks like when ExitMonthCategory=.5 you want to set Cnumber to the value of CohortYearTotalAge.  For any other value of ExitMonthCategory you want to decrement Cnumber by the value of DistributiveNumber.  If I've got the right logic, I think you could do that with:

data want ;

set have (drop=Cnumber) ;

if ExitMonthCategory < 1 then Cnumber = CohortYearTotalAge ;

else Cnumber +- DistributiveNumber ;

run;

So that's using the Sum statement (typically count+1 or some such), which automatically retains Cnumber, and in this case is used to decrement rather than increment.

HTH

Regular Contributor
Posts: 216

## Re: Lag statement

Hi Quentin

You have the logic right and that is what I have done. However the resulting data set just adds a negative to the DistributiveNumber value and does not subtract it from the CohortYearTotalAge value:

 County Year ExitMonthCategory CohortYearTotalAge DistributiveNumber CNumber 1 - New York State 2006 0.5 4632 0 4632 1 - New York State 2006 1 4632 2 -2 1 - New York State 2006 2 4632 0 0 1 - New York State 2006 3 4632 0 0 1 - New York State 2006 4 4632 0 0 1 - New York State 2006 5 4632 0 0 1 - New York State 2006 6 4632 2 -2 1 - New York State 2006 12 4632 2 -2 1 - New York State 2006 18 4632 6 -6 1 - New York State 2006 24 4632 4 -4 1 - New York State 2006 30 4632 2 -2 1 - New York State 2006 36 4632 7 -7 1 - New York State 2006 42 4632 0 0 1 - New York State 2006 48 4632 0 0 1 - New York State 2006 54 4632 3 -3 1 - New York State 2006 60 4632 0 0 1 - New York State 2006 66 4632 0 0 1 - New York State 2006 72 4632 0 0 1 - New York State 2006 78 4632 0 0 1 - New York State 2006 84 4632 0 0 1 - New York State 2006 87 4632 0 0 1 - New York State 2006 0.5 4632 0 4632 1 - New York State 2006 1 4632 0 0 1 - New York State 2006 2 4632 0 0 1 - New York State 2006 3 4632 0 0 1 - New York State 2006 4 4632 1 -1 1 - New York State 2006 5 4632 1 -1 1 - New York State 2006 6 4632 3 -3 1 - New York State 2006 12 4632 7 -7 1 - New York State 2006 18 4632 26 -26 1 - New York State 2006 24 4632 23 -23 1 - New York State 2006 30 4632 22 -22 1 - New York State 2006 36 4632 18 -18 1 - New York State 2006 42 4632 15 -15 1 - New York State 2006 48 4632 8 -8
PROC Star
Posts: 1,460

## Re: Lag statement

Hi,

Do you have Cnumber + - DistributiveNumber?  That does it for me:

```data have;
input County Year ExitMonthCategory CohortYearTotalAge DistributiveNumber;
cards;
1 2006 0.5 4632 0
1 2006 1   4632 2
1 2006 2   4632 0
1 2006 3   4632 0
1 2006 0.5 4632 0
1 2006 1   4632 7
1 2006 2   4632 3
1 2006 3   4632 0

;
run;

data want ;
set have;
if ExitMonthCategory < 1 then Cnumber = CohortYearTotalAge ;
else Cnumber +- DistributiveNumber ;

put (_all_)(=);
run;

*log: ;

121  data want ;
122  set have;
123  if ExitMonthCategory < 1 then Cnumber = CohortYearTotalAge ;
124  else Cnumber +- DistributiveNumber ;
125
126  put (_all_)(=);
127  run;

County=1 Year=2006 ExitMonthCategory=0.5 CohortYearTotalAge=4632 DistributiveNumber=0 Cnumber=4632
County=1 Year=2006 ExitMonthCategory=1 CohortYearTotalAge=4632 DistributiveNumber=2 Cnumber=4630
County=1 Year=2006 ExitMonthCategory=2 CohortYearTotalAge=4632 DistributiveNumber=0 Cnumber=4630
County=1 Year=2006 ExitMonthCategory=3 CohortYearTotalAge=4632 DistributiveNumber=0 Cnumber=4630
County=1 Year=2006 ExitMonthCategory=0.5 CohortYearTotalAge=4632 DistributiveNumber=0 Cnumber=4632
County=1 Year=2006 ExitMonthCategory=1 CohortYearTotalAge=4632 DistributiveNumber=7 Cnumber=4625
County=1 Year=2006 ExitMonthCategory=2 CohortYearTotalAge=4632 DistributiveNumber=3 Cnumber=4622
County=1 Year=2006 ExitMonthCategory=3 CohortYearTotalAge=4632 DistributiveNumber=0 Cnumber=4622
NOTE: There were 8 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 8 observations and 6 variables.

```
Regular Contributor
Posts: 216

## Re: Lag statement

Yes, I tried it exactly as you have it and tried with quotes as well. Neither worked. The only item I don't have is your statement below:

put (_all_)(=);

Paul

PROC Star
Posts: 1,460

## Re: Lag statement

Not sure what's happening.

My last post had sample data creating HAVE, and code creating WANT, and I think the results are what you are looking for, as shown in the log. Right? If you run that code, do you get the same results I get?

If so, then I would double check that your code is really the same as what I posted.

If not, then perhaps you've found some odd bug (rarely the case...)

--Q.

Solution
‎01-24-2014 11:41 AM
New Contributor
Posts: 4

## Re: Lag statement

Maybe this?

proc sort data=test out=test1 (drop=CNumber);

by county year decending CohortYearTotalAge ExitMonthCategory;

run;

data test2;

set test1;

by county year descending CohortYearTotalAge;

retain CNumber;

if first.CohortYearTotalAge then CNumber=CohortYearTotalAge;

CNumber = ifn( ExitMonthCategory > .5, (CNumber-DistributiveNumber), (CNumber) );

run;

Regular Contributor
Posts: 216

## Re: Lag statement

Nolaness, for some reason, that actually worked!! THANK YOU.

It didn't work until I added the drop statement in the prior Sort, but did thereafter. I have no idea why this worked and the 10+ variations of the code posted here and my own did not.

Paul

Frequent Contributor
Posts: 106

## Re: Lag statement

data want;

set have;

by descending CohortYearTotalAge ;

if first.CohortYearTotalAge then cnumber=CohortYearTotalAge;

else if ExitMonthCategory > .5 then cnumber+-DistributiveNumber;

run;

🔒 This topic is solved and locked.