BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
adil256
Quartz | Level 8

Hi everyone,

 

I'm facing a problem that I can't resolve in one step.

So I would like to output the sum of volume for each year with using by statement. However I can't do that because the variable year doesn't exist in the input set.

I know I can do it in 2 step but I don't want to because with the real database I'm using it will be too heavy. 

 

 

Data have;
format date ddmmyy8.;
Date="01mar2015"d;
duration=24;
Volume=10;
run;

Data want;
set have;
by year;
retain Volume_cumul 0;
Volume_cumul=Volume;

do i=0 to Duration;
Year=year(intnx('month',date,i,'s'));
Month=month(intnx('month',date,i,'s'));
Volume_cumul=Volume_cumul+1;
if last.year=1 then output;
end;

run;

Thx for your help

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

The simple answer is NO.  You can simulate the FIRST. flag by using LAG.

  x=<some function>;
  first_x  = x ne lag(x);

But to calculate LAST. flag you have to know what the next value of X, that you have not calculated yet, is going to be.

 

Why not generate the derived dataset as a view and then use the BY statement when processing the VIEW?

 

But if your real use case is as simple as your example why not just test if MONTH=12?

do i=0 to Duration;
  Year=year(intnx('month',date,i,'s'));
  Month=month(intnx('month',date,i,'s'));
  Volume_cumul=Volume_cumul+1;
  if Month=12 or i=duration then output;
end;

View solution in original post

11 REPLIES 11
PaigeMiller
Diamond | Level 26

Don't write your own DATA step code to compute sums over groups. SAS has already done this work for you and it is called either PROC MEANS or PROC SUMMARY.

 

proc summary data=have nway;
    class date;
    format date year4.;
    var volume;
    output out=_sums_ sum=;
run;

 

 

--
Paige Miller
adil256
Quartz | Level 8

Hi thank u for u answer but your solution doesn't meet my needs. 

Actually I'm performing a lot of calculation and I need to get the cumulative sum at the end of year this way. 

 

PaigeMiller
Diamond | Level 26

"cumulative sum at the end of the year" and sum from the PROC SUMMARY code are identical and equal and have no difference. Or you haven't explained what you want clearly enough.

--
Paige Miller
maguiremq
SAS Super FREQ

Can you please post what you ultimately want? I run your code but can't really tell what we're trying to get at.

adil256
Quartz | Level 8

Hi,

If u run the code I posted earlier , u'll have an error message saying that the By variable Year does not exist in the input data set.

 

I'd like to get the dataset want2 in one step.

Data have;
format date ddmmyy8.;
Date="01mar2015"d;
duration=24;
Volume=10;
run;

Data want;
set have;
*by year;
retain Volume_cumul 0;
Volume_cumul=Volume;

do i=0 to Duration;
Year=year(intnx('month',date,i,'s'));
Month=month(intnx('month',date,i,'s'));
Volume_cumul=Volume_cumul+1;
output;*if last.year=1 then output;
end;

run;
proc sort data=want;
by year;
run;
Data want2;
set want;
by year;
if last.year=1 then output;
run;
maguiremq
SAS Super FREQ

Yes, I understand that, and I did see that error when I ran it. It makes sense that it didn't run.

 

What I am asking for, however, is your final data set -- what will it look like?

 

Providing that ensures that we are able to get exactly what you want.

adil256
Quartz | Level 8

I'd like to get the dataset Want2

Data have;
format date ddmmyy8.;
Date="01mar2015"d;
duration=24;
Volume=10;
run;

Data want;
set have;
*by year;
retain Volume_cumul 0;
Volume_cumul=Volume;

do i=0 to Duration;
Year=year(intnx('month',date,i,'s'));
Month=month(intnx('month',date,i,'s'));
Volume_cumul=Volume_cumul+1;
output;*if last.year=1 then output;
end;

run;
proc sort data=want;
by year;
run;

Data want2;
set want;
by year;
if last.year=1 then output;
run;
adil256
Quartz | Level 8

Yeah technically it does the same job but I'll need to output the entire database in first step before taking the sum which I don't.

 

Finally I just want to use the statement last.variable with a variable that doesn't exist in the input dataset. 

PaigeMiller
Diamond | Level 26

@adil256 wrote:

Yeah technically it does the same job but I'll need to output the entire database in first step before taking the sum which I don't.


I just don't understand how "output the entire database" relates to the rest of the problem you have described.

 

You were previously asked to SHOW US what you want (or a small portion of the output), and that hasn't been done either.

--
Paige Miller
Tom
Super User Tom
Super User

The simple answer is NO.  You can simulate the FIRST. flag by using LAG.

  x=<some function>;
  first_x  = x ne lag(x);

But to calculate LAST. flag you have to know what the next value of X, that you have not calculated yet, is going to be.

 

Why not generate the derived dataset as a view and then use the BY statement when processing the VIEW?

 

But if your real use case is as simple as your example why not just test if MONTH=12?

do i=0 to Duration;
  Year=year(intnx('month',date,i,'s'));
  Month=month(intnx('month',date,i,'s'));
  Volume_cumul=Volume_cumul+1;
  if Month=12 or i=duration then output;
end;
adil256
Quartz | Level 8

😁 I absolutely wanted to use the last.variable statement to output the data that  I didn't think at all at this solution.

 

Thx for ur help

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 1446 views
  • 1 like
  • 4 in conversation