BookmarkSubscribeRSS Feed
Edoedoedo
Pyrite | Level 9

Hi,

how to calculate the classic window aggregations (like cumulative sum, moving average, moving stddev, ....) in CAS on CAS tables with CAS actions only?

Datastep or fedsql are forbidden as answers 😁, I don't want to reinvent the wheel and write the code for this, I need a configurable action to execute these standard calculations (like the old proc expand did).

 

EDIT: 

Perhaps with the action "aggregation.aggregate"? Not sure what it does, the documentation is very unclear and there are no useful examples (https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=v_009&docsetId=casanpg&docsetTarget=cas-ag...)

 

Thank you very much

Regards

4 REPLIES 4
SASJedi
SAS Super FREQ

Here is a little bit of code to get you started:

 

cas mySession;
libname casuser cas caslib="casuser" sessref="mySession";

data casuser.test;
	do i=1 to 12;
		do y=2020 to 2021;
			date=mdy(i, 1, y);
			total=round(rand("uniform", 1, 10));
			output;
		end;
	end;
	format date date9.;
	drop i y;
run;

proc cas;
	tbl={name="test", caslib="casuser"};
	table.fetch / table=tbl, index=FALSE;
	aggregation.aggregate / 
        table=tbl || {groupBy="Date"}
	   ,varSpecs={{name="Total", subset="MEAN"} }
       ,ID="Date"
       ,Interval="MONTH"
       ,windowInt="QTR"
       ,casOut={name="summary", caslib="casuser", replace=TRUE}
    ;
	table.fetch / 
		 table={name="summary", caslib="casuser", orderBy={"date"}}
		,index=FALSE
     ;
quit;

proc expand data=casuser.test 
            out=Stats;
   id  Date;
   convert Total=_total_Summary_Mean_ / transout=(movave 3  trimleft 2);
run;

proc compare base=stats compare=casuser.summary brief transpose;
  id date;
run;

PROC COMPARE results show minute, insignificant differences in the calculated values for a few rows, but otherwise results are identical:

The COMPARE Procedure
Comparison of WORK.STATS with CASUSER.SUMMARY
(Method=EXACT)
Comparison Results for Observations

date=01OCT2020:
Variable Base Value Compare Diff. % Diff
_total_Summary_Mean_ 5.000000 5 -8.88178E-16 -1.77636E-14

date=01MAR2021:
Variable Base Value Compare Diff. % Diff
_total_Summary_Mean_ 4.333333 4.3333333333 8.881784E-16 2.049643E-14

date=01MAY2021:
Variable Base Value Compare Diff. % Diff
_total_Summary_Mean_ 4.333333 4.3333333333 -8.88178E-16 -2.04964E-14

date=01DEC2021:
Variable Base Value Compare Diff. % Diff
_total_Summary_Mean_ 6.000000 6 2.664535E-15 4.440892E-14

NOTE: Values of the following 1 variables compare unequal: _total_Summary_Mean_

 

 

Check out my Jedi SAS Tricks for SAS Users
mkeintz
PROC Star

I have no idea of whether the CAS approach is more or less efficient than proc expand for sorted data from a sas data set.

 

But it's certainly more flexible.  A major shortcoming (IMO) of PROC EXPAND, is that it only provides univariate statistics for windows.  An equivalent ordinary data step (and I presume an equivalent PROC CAS) would be much superior to proc expand in generating rolling window bivariate/multivariate stats - for example generating rolling windows sums-of-squares-and-cross-products (SSCP).  A rolling window SSCP is ready-made for rolling-window regressions.   PROC EXPAND can't do that unless you pre-process the data to generate observation-by-observation individual cross-products.

 

So @SASJedi , may I shift @Edoedoedo 's goal posts a bit.  Might we see CAS actions to generate rolling SSCP?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
SASJedi
SAS Super FREQ

I'm going to admit right here that I am not statistically inclined, and CSSP is not one of the statistics I'm familiar with - nor is it supported by the aggregation.aggregate CAS action.  But your post suggests you may know how to accomplish what you need with DATA step code. If so, give the dataStep.runCode action a whirl 🙂

 

Check out my Jedi SAS Tricks for SAS Users
PaalNavestad
Pyrite | Level 9

Since there is a set of really good cumulative functions in VA I guess there must be a use of the aggregate function. What I really want is the cumulative month to date average for all days in a month and a YTD  cumulative for all the days in a year. 

 

Traditionally I've always done these sort of tasks with a datastep. That need the data sorted and retain. But I would love to sse how this have been achieved in VA

 

Pål N

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1775 views
  • 0 likes
  • 4 in conversation