SAS Support Communities

Edoedoedo · ‎12-22-2023

Hi, could you please verify whether this is the correct way to implement a SARIMAX model with PROC SSM, especially the exogenous part?

    proc ssm data=casuser.series optimizer(technique=ACTIVESET maxiter=10) like=DIFFUSE;
        id d interval=day;
        trend sarima(arima(d=1 sd=1 p=1 q=1 sq=1 s=52));
        *trend arimaTrend(arma(d=1 sd=1 p=1 q=1 sq=1 s=52)) ar=0.8550 ma=-0.9781 -0.9990;
        model v = sarima x;
        output outfor=casuser.For;
    run;

    proc ssm data=casuser.series optimizer(technique=ACTIVESET maxiter=10) like=DIFFUSE;
        id d interval=week;
        trend sarima(arima(d=1 sd=1 p=1 q=1 sq=1 s=52));
        model v = sarima x;
        output outfor=casuser.For;
    run;

The Series dataset has 3 columns (d=date, v=endogenous variable, x=exogenous variable). I define an ARIMA trend with those parameters, and I model v as that trend plus x, which is simply added to the MODEL statement. Is this the correct formulation for the SARIMAX model?

Thank you

Edoedoedo · ‎09-27-2023

Thanks, so the caveat was simply not to apply "read convey" at the SASContent level, so that in the subfolder the authenticated users do not have the read permission by default. That seems much cleaner.

Edoedoedo · ‎09-22-2023

Hi, I have a folder in SASContent like /SASContent/Projects/SecretProject. I need to restrict all grants for this specific path to the group "SpecialGroup" only. I can add a new rule that grants "SpecialGroup" everything for that path without problems. But the "Authenticated Users" principal is present everywhere, with a READ grant on /SASContent plus convey, and hence the SecretProject folder inherits the READ grant for "Authenticated Users". I cannot deny that grant, otherwise I would lock everyone out.

How can I secure that path, and only that path, from "Authenticated Users", while keeping the general READ grant with convey on all of /SASContent for "Authenticated Users" EXCEPT on the SecretProject folder?

Using Viya 2023.x

Edoedoedo · ‎09-14-2023

Consider this scenario:

- We have a table X in the caslib DB1 on the CAS server cas-shared-default-1 on the Viya server server1.ondemand.sas.com
- We have a table Y in the caslib DB2 on the CAS server cas-shared-default-2 on the Viya server server2.ondemand.sas.com

I'm working in SAS Studio, e.g. on the server2 environment. I want to write an operation which involves both servers, i.e. a data step which concatenates the tables X and Y:

    data DB2.Z;
        set DB1.X DB2.Y;
    run;

That requires a working connection to both CAS servers at the same time. The concatenation should be computed on server2, since I'm working there; server1 should only be used to read the table. Clearly this should be performed entirely in memory, without "tricks" like downloading the table X to SAS 9.4, making a signon between the two SAS 9.4 sessions, and reloading the SAS dataset into memory again on server2, which would be a complete waste of resources.

How would you suggest implementing this, making sure that everything keeps working exclusively in the two CAS servers?

Thank you

Edoedoedo · ‎01-13-2023

Thanks, I'm trying to change, for example, the optimizer and the maximum number of iterations. Sometimes I get a little bit closer, sometimes not. So far I haven't succeeded in finding the corresponding configuration, but I'll keep trying 🙂

Edoedoedo · ‎01-09-2023

Thank you VERY much for your answer, that seems to be the key to understanding the difference between the two models. Just one question: the coefficients you got

    ar=0.8550 ma=-0.9781 -0.9990

are indeed very close to the coefficients I get with statsmodels (and other libraries, e.g. the above-mentioned darts); however, how did you find them? Just running

    trend arimaTrend(arma(d=1 sd=1 p=1 q=1 sq=1 s=52));

leads to different coefficients, which are very close to the original proc carima results (in fact I get the same prediction as proc carima, not the state space one). What am I missing?

Thanks a lot again,
Regards

Edoedoedo · ‎12-27-2022

Yes, of course it may be some mysterious trick in the statsmodels implementation 😁 I tried to read their documentation as well and to dig into their code, since it is open source, but so far I haven't figured out anything special.

Naturally I doubt neither the SAS nor the Python implementation, I believe they are both very robust. I just need to find the reason behind the difference between the models so that I can explain it to my business. When I ported their models to SAS I didn't even check the results, since I believed they would be equal by default, but then they pointed out that there are several series where the difference is quite strong (measured e.g. with RMSE, the Python ones are smaller). So I took this series as an example and started to dig deeper, isolating the code to try to figure out why.

To compare the coefficients as you suggest, I inserted one p,d,q term at a time in both models and compared the coefficient estimates each time. I found that up to the simple model (1,0,1) they are approximately the same; with (1,1,1) they begin to differ, and with seasonality added, (1,0,1)x(0,0,1)52, they also differ; mixing differencing with seasonality to build the final model (1,1,1)x(0,1,1)52, they differ a lot.

Another test I tried: rewriting the model with proc tsmodel. The results are identical to proc carima/uniTimeSeries.arima.

Thanks a lot for your time and your help, very much appreciated.
Regards

Edoedoedo · ‎12-27-2022

Thank you for your answer. Yes, afaik proc carima and uniTimeSeries.arima are the same; I've been using the latter because in the end I will call it through the API, but proc carima code is totally OK (I will translate it later).

So I tried your new model specification, and it gets even worse ("sas2" is this new model):

[Image: forecast comparison including "sas2"]

Changing some parameters just to experiment, I see that "maxiter" and "converge" have no effect on the result, while "noint" set to false produces the "sas2" forecast and set to true produces the "sas" forecast. So the model without the intercept seems closer to Python (I saw that Python SARIMAX doesn't use the intercept by default).

It seems so weird: the particular shape you can see at 450 is present in both Python and SAS, so they seem to be doing the same calculation, but the SAS one has this mysterious upward "trend" or "offset"...

Edoedoedo · ‎12-27-2022

Thanks, with your model specification it is getting a little bit closer, but it is still much worse than Python, and I really can't figure out why:

[Images: SAS output; Python output]

Here's a comparison on a 52-step forecast:

[Image: 52-step forecast comparison, SAS vs Python]

As you can see, the Python and SAS predictions seem to have a similar shape, but SAS is off by some kind of weird offset (around -3, but it varies rather than being constant). Also, the SAS 95% confidence interval is way broader than the Python one (at step 52, [11.6813, 98.6936] vs [37.640213, 66.402516]).

About the interval, it doesn't really matter: yes, the data are weekly (hence the '52'), but they are already processed (averaged, missing values filled, etc.), so it is just a sequence of numbers, and neither the dates nor the interval matter. In fact, in Python I didn't even use the fake date column, while SAS needs the interval= parameter, so I just provided a fake incremental date with interval=day.

What do you think?

Thanks a lot again,
Regards

Edoedoedo · ‎12-22-2022

Hi, I have a very simple time series (attached) which I need to model and forecast with SARIMA. The model is already defined: p,d,q = (1,1,1), the seasonality is 52, and the seasonal p,d,q = (0,1,1).

I fit the model with the Python statsmodels library SARIMAX:

    import pandas as pd
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    df = pd.read_csv('series.csv', sep=";")
    model_training = SARIMAX(df.v, order=(1, 1, 1), seasonal_order=(0, 1, 1, 52)).fit(disp=False)
    model_training.get_forecast(1).summary_frame()

and I get for example the first forecast value:

    mean       mean_se   mean_ci_lower  mean_ci_upper
    55.683232  3.071487  49.663228      61.703237

Then I fit the very same model with the very same data in CAS using uniTimeSeries.arima:

    data series;
        infile "series.csv" dsd truncover firstobs=2 delimiter=";";
        input d $ v;
        date = input(d,ddmmyy10.);
        format date yymmdd10.;
        drop d;
        rename date=d;
    run;

    cas session;
    libname CASUSER cas caslib="CASUSER";

    data CASUSER.series;
        set series;
    run;

    proc cas;
        uniTimeSeries.arima /
            table={name="series", caslib="CASUSER"}
            timeId={name="d"}
            interval="day"
            outFor={name="for", replace=True}
            outEst={name="est", replace=True}
            series={{name='v', model={{estimate={p={{factor={1}}} q={{factor={1,52}}},diff={1, 52},noint=True}, forecast={{lead=1}}}} }} ;
    run;
    quit;

and I get for example the first forecast value:

    Forecast  Std Error  95% Confidence Limits
    56.4786   3.5467     49.5272  63.4301

As you can see, the forecast value is different, and the SAS model is generally worse: the true value is closer to 55, and the confidence limits are also broader. It gets worse when requesting more forecast leads, while the Python model stays more accurate.

My question is: why? How is it possible that the SAS model gives a different result? I tried changing the noint parameter and the convergence criteria to see whether some defaults differ, but no matter what I change, the Python model is always better. What am I missing?

Just to contextualize: in the project I'm following, the team has already experimented successfully with Python statsmodels on some time series. Now they want to apply the model to thousands of time series, exploiting CAS parallelism, but they are very disappointed because the models they knew to be good are not good anymore in SAS, and I really don't understand why.

Note: my final model must run in CAS, so please do not provide Base SAS code.

Thanks a lot!
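As a sanity check on the two outputs above, both sets of limits appear to be plain normal 95% intervals (mean ± 1.96 × standard error), so the broader SAS limits follow directly from its larger standard error rather than from a different interval definition. A minimal Python sketch, assuming a normal approximation; the helper name `normal_ci` is hypothetical and only the numbers are taken from the outputs above:

```python
def normal_ci(mean, std_error, z=1.959964):
    """Two-sided normal interval: mean +/- z * std_error (z ~ 1.96 for 95%)."""
    half_width = z * std_error
    return mean - half_width, mean + half_width


# One-step-ahead forecast and standard error reported by statsmodels
py_lo, py_hi = normal_ci(55.683232, 3.071487)

# One-step-ahead forecast and standard error reported by uniTimeSeries.arima
sas_lo, sas_hi = normal_ci(56.4786, 3.5467)

print("python:", round(py_lo, 4), round(py_hi, 4))
print("sas:   ", round(sas_lo, 4), round(sas_hi, 4))
```

If the reconstructed limits agree with the reported ones, the difference between the two tools lies in the point forecast and in the estimated standard error, not in how the interval is built.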

Edoedoedo · ‎07-12-2021

Thank you, as you suggested I opened a ticket with Technical Support. I also thought that the problem might be related to missing values, but after removing all missing values the result is still 719, so I guess missing values are handled correctly and the problem is not there. I'll share any useful news as soon as I have some.

Regards

Edoedoedo · ‎07-09-2021

Hi, I have a time series with its prediction (attached). I calculate the SMAPE with:

    cas session;
    libname D cas caslib="CASUSER";

    proc tsmodel data=D.TESTSMAPE outobj=(utlstat=D.OUTSTATS(replace=YES));
        id DATE interval=day;
        var MEASURED SARIMA;
        require utl;
        submit;
            declare object utlstat(utlstat);
            rc = utlstat.Collect(MEASURED, SARIMA);
        endsubmit;
    run;

In D.OUTSTATS I get SMAPE = 719.77. That is very weird, since the SMAPE metric is bounded in [0%, 200%]. I tried the same calculation by hand using the standard SMAPE formula

    100%/N * SUM [ |PREDICTED-MEASURED| / ((|MEASURED|+|PREDICTED|)/2) ]

and I get SMAPE = 119.43, which seems correct, as shown in the attached Excel file.

So what is SAS doing? Where does that 719.77 value come from?

Thank you
Regards
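For reference, the hand calculation above can be sketched in Python as follows. This is only a minimal sketch of the standard formula quoted in the post, not a description of what utlstat does internally. The function name `smape` and the sample values are made up for illustration, and skipping pairs where both values are zero is an assumption of this sketch.

```python
def smape(measured, predicted):
    """Standard SMAPE in percent: 100/N * sum(|p - m| / ((|m| + |p|) / 2))."""
    if len(measured) != len(predicted):
        raise ValueError("series must have the same length")
    terms = []
    for m, p in zip(measured, predicted):
        denom = (abs(m) + abs(p)) / 2
        if denom == 0:
            # Assumption: a pair where both values are zero is skipped,
            # since the term is undefined (0/0).
            continue
        terms.append(abs(p - m) / denom)
    if not terms:
        raise ValueError("no valid pairs to evaluate")
    return 100.0 * sum(terms) / len(terms)


# Hypothetical sample values, not the attached series
measured = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(round(smape(measured, predicted), 2))

# Worst case: every prediction has the opposite sign, each term equals 2
print(smape([1.0, -1.0], [-1.0, 1.0]))
```

Each term is at most 2 because |p - m| <= |m| + |p|, so with this definition the result cannot exceed 200, which is why a value like 719.77 cannot come from the same formula.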

Edoedoedo · ‎06-17-2021

Thank you, I completely forgot to check the periodicity. That time series is part of a large set of time series on which I perform the stationarity test, and I didn't notice that most of them were daily but some were not. That explains the calculation failure, thank you very much!

Edoedoedo · ‎06-17-2021

Hi, consider this code:

    proc tsmodel data=MYCAS.ZZZ outscalar=MYCAS.AAA;
        id DATE interval=day;
        var VALORE;
        outscalars pvalue rc;
        require tsa;
        submit;
            declare object tsa(tsa);
            rc = tsa.stationaritytest(VALORE,,,,'SSM',pvalue);
            rc = rc;
            pvalue = pvalue;
        endsubmit;
    run;

I want to calculate the stationarity test pvalue for the time series (attached as csv). However, in the results the pvalue is missing and the return code is 0, which according to the documentation means "Time series is stationary with the default significance level of 0.05", which is clearly false here.

So what's wrong? Can you help me figure out what is happening? Moreover, what kind of test does stationaritytest perform? It is unclear from the documentation.

Note: I must use only CAS-enabled procedures.

Thanks
Regards

Edoedoedo · ‎06-09-2021

Hi, I need to run the KPSS stationarity test on Viya, but I found no information about KPSS in the Time Series Analysis Package, only ADF. How could it be done?

Note: I'm dealing with thousands of very long time series, so it needs to be done in CAS; SAS 9.4 is not an option.

Thank you
Regards

Date Last Visited	‎12-22-2023 09:35 AM


proc ssm and sarimax

Re: Allow access only to a given group for a SASContent folder and den...

Allow access only to a given group for a SASContent folder and deny to...

Working with caslibs in different cas servers

Re: uniTimeSeries.arima results are different (and worse) than statsmo...

Re: uniTimeSeries.arima results are different (and worse) than statsmo...

Re: uniTimeSeries.arima results are different (and worse) than statsmo...

Re: uniTimeSeries.arima results are different (and worse) than statsmo...

Re: uniTimeSeries.arima results are different (and worse) than statsmo...

uniTimeSeries.arima results are different (and worse) than statsmodels...

Re: Allow access only to a given group for a SASContent folder and den...

Re: uniTimeSeries.arima results are different (and worse) than statsmo...

Re: SMAPE calculation in tsmodel/utlstat seems to give wrong result

Re: TSA STATIONARITYTEST gives null pvalue

Re: TSA STATIONARITYTEST gives null pvalue

Re: Recove from error

Action to list all tables in a given caslib (loaded tables, not files)

Re: uniTimeSeries.arima results are different (and worse) than statsmo...

Macro for loop hangs exactly at 27th cycle

Re: Reuse SASLogon user/pass authentication to get an oauth token for ...


Re: SMAPE calculation in tsmodel/utlstat seems to give wrong result

SMAPE calculation in tsmodel/utlstat seems to give wrong result

Re: TSA STATIONARITYTEST gives null pvalue

TSA STATIONARITYTEST gives null pvalue

KPSS stationarity test in Viya
