Hello,
How to get the running total of a variable upto n records
group by some variable?
I want to create a new variable by doing the sum of first 4 records order by date value and group by color.
data have; input date : mmddyy10. color $ value ; format date mmddyy10.; cards; 3/17/2014 white 3 3/18/2014 white 2 3/19/2014 white 1 3/20/2014 white 2 3/21/2014 white 0 3/24/2014 white 38 3/25/2014 white 38 3/26/2014 white 38 3/17/2014 black 1 3/18/2014 black 3 3/19/2014 black 2 3/20/2014 black 1 3/21/2014 black 0 3/24/2014 black 6 3/25/2014 black 7 3/26/2014 black 6 ; run; data want; set have; if color ne lag(color) then do; n=0;sum=0;end; n+1;sum+value; if n in (1:4); drop n; run;
Xia Keshan
You can use the data step with set, by, retain and first.-logic to accomplish this.
Judging by the language you use to describe the problem, you are more accustomed to using SQL than a DATA step. SQL will not be a reliable tool for this application, as it does not guarantee the order of extracted records. Instead, you can try:
proc sort data=have;
by color date;
run;
data want;
set have;
by color date;
if first.color then do;
total = value;
record_count=1;
end;
else do;
total + value;
record_count + 1;
end;
if record_count = 4;
run;
This approach outputs nothing for colors that have fewer than 4 records. Slight changes to the logic can change that, but this DATA step illustrates the basic tools you will need to start with.
data have; input date : mmddyy10. color $ value ; format date mmddyy10.; cards; 3/17/2014 white 3 3/18/2014 white 2 3/19/2014 white 1 3/20/2014 white 2 3/21/2014 white 0 3/24/2014 white 38 3/25/2014 white 38 3/26/2014 white 38 3/17/2014 black 1 3/18/2014 black 3 3/19/2014 black 2 3/20/2014 black 1 3/21/2014 black 0 3/24/2014 black 6 3/25/2014 black 7 3/26/2014 black 6 ; run; data want; set have; if color ne lag(color) then do; n=0;sum=0;end; n+1;sum+value; if n in (1:4); drop n; run;
Xia Keshan
proc sort data=have;
by color date;
run;
data want;
set have;
if mod(_n_,4)=1 then grp+1;
run;
proc sql;
select color, sum(value) from want
where grp in (1,3)
group by color,grp;
quit;
If you have SAS/ETS licensed look into proc expand.
Thanks all for your suggestion. Is it possible to get the same result using PROC SQL?
No, it's not possible using PROC SQL. SQL by its nature does not guarantee the order of the records that it retrieves. It has no guaranteed way of determining which are the four records that you want for each COLOR.
Perhaps I shouldn't say it's impossible. Perhaps you can devise a method to identify the records that you want to retrieve. But nobody that I know has ever accomplished that. There is an undocumented feature (MONOTONIC) that could help in theory ... if only it were guaranteed to work every time. But it is not guaranteed so even if you got it to work properly once, it might not work properly the next time.
You just might be stuck having to learn to program in a DATA step for some types of problems.
Yes, but not recommended.
If you're familiar with SQL and working with a SQL database you can use pass-through logic and program as if you were in a native DB.
Otherwise create a row counter, via a data step and then merge via the row counter. More steps and less efficient.
You can try this. As mentioned in above responses MONOTONIC ( ) function is undocumented so results should be interpreted carefully.
proc sql;
create table one as
select *, monotonic () as cnt
from have
order by color,date;
create table want as
select *, (cnt - min(cnt) + 1 ) as cnt_grp
from one
group by color
order by cnt;
select color,sum(value) as total_4 from want
where cnt_grp<5
group by color;
quit;
if you have to resort to Proc SQL, there is a documented solution to get the sequence number:
ods _all_ close;
ods output sql_results=want;
proc sql number;
select * from have;
quit;
ods output close;
ods listing;
Haikuo
Thanks Haikuo.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.