BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Paul_NYS
Obsidian | Level 7

Hi Everyone

Using the lag function, I am trying to create a declining balance in a numeric column based on the value from the previous record, assuming it meets one criteria. Below is the code for this. However, the result is actually accumulating rather than declining. I have tried various iterations of this and it does not work. However, I have lag in the past without issue. Does anyone know if this is logically wrong or a syntax issue?

Attached is part of the result also.

Paul

data test2;

set &a3;

by &a6;

if first.&a7 then CNumber=CohortYearTotalAge;

if ExitMonthCategory > .5 then do;

CNumber=lag(CNumber)-DistributiveNumber;

end;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
nolaness
Calcite | Level 5

Maybe this?

proc sort data=test out=test1 (drop=CNumber);

by county year decending CohortYearTotalAge ExitMonthCategory;

run;

data test2;

  set test1;

  by county year descending CohortYearTotalAge;

  retain CNumber;

  if first.CohortYearTotalAge then CNumber=CohortYearTotalAge;

  CNumber = ifn( ExitMonthCategory > .5, (CNumber-DistributiveNumber), (CNumber) );

run;

View solution in original post

17 REPLIES 17
Reeza
Super User

Does this do what you want?

data test2;

set &a3;

by &a6;

retain cnumber;

if first.&a7 then CNumber=CohortYearTotalAge;

if ExitMonthCategory > .5 then do;

CNumber=cnumber-DistributiveNumber;

end;

run;

Paul_NYS
Obsidian | Level 7

Thanks Reeza, but no, it is the same result actually.

Paul

Reeza
Super User

Then I think your logic is actually what's incorrect. Post a Have dataset and want. Preferably in here, rather than in CSV files.

Paul_NYS
Obsidian | Level 7

It is simply this (the only difference is the CNumber column):

Input:

CountyYearExitMonthCategoryCohortYearTotalAgeDistributiveNumberCNumber
1 - New York State20060.546320
1 - New York State2006146322
1 - New York State2006246320
1 - New York State2006346320
1 - New York State2006446320
1 - New York State2006546320
1 - New York State2006646322
1 - New York State20061246322
1 - New York State20061846326
1 - New York State20062446324
1 - New York State20063046322
1 - New York State20063646327
1 - New York State20060.523500
1 - New York State2006123502
1 - New York State2006223500
1 - New York State2006323500
1 - New York State2006423500
1 - New York State2006523500
1 - New York State2006623502
1 - New York State20061223502
1 - New York State20061823506
1 - New York State20062423504
1 - New York State20063023502
1 - New York State20063623507

End:

CountyYearExitMonthCategoryCohortYearTotalAgeDistributiveNumberCNumber
1 - New York State20060.5463204632
1 - New York State20061463224630
1 - New York State20062463204630
1 - New York State20063463204630
1 - New York State20064463204630
1 - New York State20065463204630
1 - New York State20066463224628
1 - New York State200612463224626
1 - New York State200618463264620
1 - New York State200624463244616
1 - New York State200630463224614
1 - New York State200636463274607
1 - New York State20060.5235002350
1 - New York State20061235022348
1 - New York State20062235002348
1 - New York State20063235002348
1 - New York State20064235002348
1 - New York State20065235002348
1 - New York State20066235022346
1 - New York State200612235022344
1 - New York State200618235062338
1 - New York State200624235042334
1 - New York State200630235022332
1 - New York State200636235072325
Reeza
Super User

Drop CNumber from your original code and try it again.

data test2;

set &a3 (drop= Cnumber);

by &a6;

retain cnumber;

if first.&a7 then CNumber=CohortYearTotalAge;

if ExitMonthCategory > .5 then do;

CNumber=cnumber-DistributiveNumber;

end;

run;

Paul_NYS
Obsidian | Level 7

Nothing prints for CNumber after the initial value in the 'first' statement (if first.&a7 then CNumber=CohortYearTotalAge;).

Paul

Quentin
Super User

What are the values of &a6 and &a7 ?

Unless there is a typo in the sample data you posted, it looks like the file is sorted by County CohortYearTotalAge ExitMonthCategory???   Or should year for second half be 2007, so that it would be sorted by County YEar ExitMonthCategory?

Regardless, for the sample data (assuming you are happy with the current sort order), it looks like when ExitMonthCategory=.5 you want to set Cnumber to the value of CohortYearTotalAge.  For any other value of ExitMonthCategory you want to decrement Cnumber by the value of DistributiveNumber.  If I've got the right logic, I think you could do that with:

data want ;

set have (drop=Cnumber) ;

if ExitMonthCategory < 1 then Cnumber = CohortYearTotalAge ;

else Cnumber +- DistributiveNumber ;

run;

So that's using the Sum statement (typically count+1 or some such), which automatically retains Cnumber, and in this case is used to decrement rather than increment.

HTH

BASUG is hosting free webinars Next up: Jane Eslinger presenting PROC REPORT and the ODS EXCEL destination on Mar 27 at noon ET. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
Paul_NYS
Obsidian | Level 7

Hi Quentin

You have the logic right and that is what I have done. However the resulting data set just adds a negative to the DistributiveNumber value and does not subtract it from the CohortYearTotalAge value:

CountyYearExitMonthCategoryCohortYearTotalAgeDistributiveNumberCNumber
1 - New York State20060.5463204632
1 - New York State2006146322-2
1 - New York State20062463200
1 - New York State20063463200
1 - New York State20064463200
1 - New York State20065463200
1 - New York State2006646322-2
1 - New York State20061246322-2
1 - New York State20061846326-6
1 - New York State20062446324-4
1 - New York State20063046322-2
1 - New York State20063646327-7
1 - New York State200642463200
1 - New York State200648463200
1 - New York State20065446323-3
1 - New York State200660463200
1 - New York State200666463200
1 - New York State200672463200
1 - New York State200678463200
1 - New York State200684463200
1 - New York State200687463200
1 - New York State20060.5463204632
1 - New York State20061463200
1 - New York State20062463200
1 - New York State20063463200
1 - New York State2006446321-1
1 - New York State2006546321-1
1 - New York State2006646323-3
1 - New York State20061246327-7
1 - New York State200618463226-26
1 - New York State200624463223-23
1 - New York State200630463222-22
1 - New York State200636463218-18
1 - New York State200642463215-15
1 - New York State20064846328-8
Quentin
Super User

Hi,

Do you have Cnumber + - DistributiveNumber?  That does it for me:

data have;
  input County Year ExitMonthCategory CohortYearTotalAge DistributiveNumber;
  cards;
1 2006 0.5 4632 0
1 2006 1   4632 2
1 2006 2   4632 0
1 2006 3   4632 0
1 2006 0.5 4632 0
1 2006 1   4632 7
1 2006 2   4632 3
1 2006 3   4632 0

;
run;

data want ;
set have;
if ExitMonthCategory < 1 then Cnumber = CohortYearTotalAge ;
else Cnumber +- DistributiveNumber ;

put (_all_)(=);
run;

*log: ;

121  data want ;
122  set have;
123  if ExitMonthCategory < 1 then Cnumber = CohortYearTotalAge ;
124  else Cnumber +- DistributiveNumber ;
125
126  put (_all_)(=);
127  run;

County=1 Year=2006 ExitMonthCategory=0.5 CohortYearTotalAge=4632 DistributiveNumber=0 Cnumber=4632
County=1 Year=2006 ExitMonthCategory=1 CohortYearTotalAge=4632 DistributiveNumber=2 Cnumber=4630
County=1 Year=2006 ExitMonthCategory=2 CohortYearTotalAge=4632 DistributiveNumber=0 Cnumber=4630
County=1 Year=2006 ExitMonthCategory=3 CohortYearTotalAge=4632 DistributiveNumber=0 Cnumber=4630
County=1 Year=2006 ExitMonthCategory=0.5 CohortYearTotalAge=4632 DistributiveNumber=0 Cnumber=4632
County=1 Year=2006 ExitMonthCategory=1 CohortYearTotalAge=4632 DistributiveNumber=7 Cnumber=4625
County=1 Year=2006 ExitMonthCategory=2 CohortYearTotalAge=4632 DistributiveNumber=3 Cnumber=4622
County=1 Year=2006 ExitMonthCategory=3 CohortYearTotalAge=4632 DistributiveNumber=0 Cnumber=4622
NOTE: There were 8 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 8 observations and 6 variables.



BASUG is hosting free webinars Next up: Jane Eslinger presenting PROC REPORT and the ODS EXCEL destination on Mar 27 at noon ET. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
Paul_NYS
Obsidian | Level 7

Yes, I tried it exactly as you have it and tried with quotes as well. Neither worked. The only item I don't have is your statement below:

put (_all_)(=);

Paul

Quentin
Super User

Not sure what's happening.

My last post had sample data creating HAVE, and code creating WANT, and I think the results are what you are looking for, as shown in the log. Right? If you run that code, do you get the same results I get?

If so, then I would double check that your code is really the same as what I posted.

If not, then perhaps you've found some odd bug (rarely the case...)

--Q.

BASUG is hosting free webinars Next up: Jane Eslinger presenting PROC REPORT and the ODS EXCEL destination on Mar 27 at noon ET. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
nolaness
Calcite | Level 5

Maybe this?

proc sort data=test out=test1 (drop=CNumber);

by county year decending CohortYearTotalAge ExitMonthCategory;

run;

data test2;

  set test1;

  by county year descending CohortYearTotalAge;

  retain CNumber;

  if first.CohortYearTotalAge then CNumber=CohortYearTotalAge;

  CNumber = ifn( ExitMonthCategory > .5, (CNumber-DistributiveNumber), (CNumber) );

run;

Paul_NYS
Obsidian | Level 7

Nolaness, for some reason, that actually worked!! THANK YOU.

It didn't work until I added the drop statement in the prior Sort, but did thereafter. I have no idea why this worked and the 10+ variations of the code posted here and my own did not.

Paul

pradeepalankar
Obsidian | Level 7

data want;

set have;

by descending CohortYearTotalAge ;

if first.CohortYearTotalAge then cnumber=CohortYearTotalAge;

else if ExitMonthCategory > .5 then cnumber+-DistributiveNumber;

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 17 replies
  • 1653 views
  • 0 likes
  • 5 in conversation