Hi all,
I know that SAS has functions to get 'mean' and 'std' directly, as the following codes:
/*Rolling 24-month window*/
proc sql;
create table roll as
select
h2.cusip,
h2.monthDate as periodEndDate format=yymmd7.,
h2.ret,
mean(h1.ret) as meanRet,
std(h1.ret) as stdRet
from
have as h1 inner join
have as h2 on h1.cusip=h2.cusip and
intck("MONTH", h1.monthDate, h2.monthDate) between 1 and 24
group by h2.cusip, h2.monthDate, h2.ret
having count(h2.cusip)=24;
quit;
How about the function for 'skewness' or 'idiosyncractic skewness'?
Thanks
ZZb is there a reason that you must code in SQL? Do you have a third party DBMS? When you have a third-party DBMS wich one? SQL aggregated functions are as standard as babylonian languages (old tst) for the advanced ones.
I your question the same as: https://communities.sas.com/message/192004
I believe proc Univariate gives Skewness statistics for a given variable.
This wat the SAS SQL supports: SAS(R) 9.4 SQL Procedure User's Guide (summary-function) this is standard SQL not some dialect.
The SAS procedures have there origin in Stats see: Base SAS(R) 9.4 Procedures Guide, Second Edition (Keywords and Formulas)
When you are coming form some RDBMS that have implemented statistics by grouping it are native dialects.
In those cases you could benefit that SAS will do a translation (as that shoudl word) for in-database processing.
As indicated by Jaap, there isn't a SKEW function for the SAS SQL implementation.
Also, it seems weird to include your analysis variable (ret) in your Group BY statement.
You can use proc means or proc univariate to get the skewness.
proc sort data=sashelp.class out=class; by sex; run;
proc means data=class noprint;
by sex;
output out=summary mean(weight)=avg_ret std(weight)=std_ret skew(weight)=skew_ret;
run;
EDIT: to supply working example
I want to get the rolling skewness. Since there is no SKEW function in SQL, I have to think about other ways.
ret meanRet stdRet SkewRet
36110 | 1985-01 | 0.16865 | 0.034400 | 0.083940 |
36110 | 1985-02 | -0.02326 | 0.041494 | 0.087867 |
36110 | 1985-03 | -0.10714 | 0.040525 | 0.088471 |
36110 | 1985-04 | -0.10027 | 0.029867 | 0.090268 |
36110 | 1985-05 | 0.11940 | 0.019866 | 0.090858 |
36110 | 1985-06 | 0.01333 | 0.015292 | 0.082216 |
36110 | 1985-07 | 0.15868 | 0.014435 | 0.082121 |
36110 | 1985-08 | 0.03429 | 0.022072 | 0.086726 |
No it doesn't. Back to basics: compute your own :
data test;
do x = 2,4,6,3,1;
output;
end;
run;
/* Expected : skewness(2,4,6,3,1) = 0.5901286564 */
proc sql;
select count(x) * (sum(x**3) -3*mean(x)*uss(x)+3*mean(x)**2*sum(x)-count(x)*mean(x)**3) /
((count(x)-1)*(count(x)-2)*STD(x)**3) as skewness format=14.10
from test;
quit;
PG
Use PROC MEANS or PROC UNIVARIATE if you want to calculate statistics. You can define your current 24 month merge as a view if you want avoid creating a large intermediate table.
proc sql ;
create view roll24 as
select h2.cusip
, h2.monthDate as periodEndDate format=yymmd7.
, h2.ret
from have as h1 inner join have as h2
on h1.cusip=h2.cusip
and intck("MONTH", h1.monthDate, h2.monthDate) between 1 and 24
group by h2.cusip, h2.monthDate, h2.ret
having count(h2.cusip)=24
;
quit;
proc means data=roll24 ;
by cuspid periodenddate ;
var ret ;
output out=roll mean=meanRet std=stdRet skew=skewRet ;
run;
I tried your code. It does not seem to work.
The MEANS will calculate the statistics properly, you just need to fix the SQL so that it includes the values that you want to include in your calculations. It seems dubious to me to group by RET and also include RET in the returned values? Perhaps you want to group by H2.RET (so you can still see the actual value for that date) and also include H1.RET as another variable so that it cab be fed into PROC MEANS. A small example of what you want would help (perhaps with 3 or 5 instead of 24 months so that the number of records is smaller).
data have ;
do cusip=1,2 ;
rec+1;
baseDate = '01JAN2000'd ;
do month=0 to 26 ;
monthDate = intnx('month',baseDate,month);
ret = ranuni(0);
output;
end;
end;
run;
proc sql ;
create view roll24 as
select h2.cusip
, h2.monthDate as periodEndDate format=yymmd7.
, h2.ret
, h1.ret as lagret
from have as h1
, have as h2
where h1.cusip=h2.cusip
and intck("MONTH", h1.monthDate, h2.monthDate) between 1 and 24
group by h2.cusip, h2.monthDate, h2.ret
having count(h2.cusip)=24
;
quit;
proc means data=roll24 noprint ;
by cusip periodenddate ret ;
var lagret ;
output out=roll mean=meanRet std=stdRet skew=skewRet ;
run;
proc print data=roll ;
run;
period
Obs cusip EndDate ret _TYPE_ _FREQ_ meanRet stdRet skewRet
1 1 2002-01 0.22396 0 24 0.50622 0.29944 -0.26692
2 1 2002-02 0.23832 0 24 0.51477 0.28762 -0.21262
3 1 2002-03 0.64083 0 24 0.49023 0.28492 -0.04201
4 2 2002-01 0.59923 0 24 0.52899 0.28713 0.13578
5 2 2002-02 0.91479 0 24 0.52256 0.28359 0.19230
6 2 2002-03 0.41203 0 24 0.52664 0.28867 0.20478
ZZb is there a reason that you must code in SQL? Do you have a third party DBMS? When you have a third-party DBMS wich one? SQL aggregated functions are as standard as babylonian languages (old tst) for the advanced ones.
I your question the same as: https://communities.sas.com/message/192004
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.