About andrea_magatti

andrea_magatti · ‎07-21-2022

Thanks! Working!

andrea_magatti · ‎07-21-2022

Even stranger now: after applying your suggestion, I'm still getting the same error:

    NOTE: CALL EXECUTE generated line.
    1 + proc fedsql sessref=EG_Session; select distinct id_remi from GNC.REMI_VOL_2008_2021_FINAL where id_class ="MD";
    1 + quit;
    ERROR: Column "MD" not found or cannot be accessed
    ERROR: The action stopped due to errors.
    ERROR: The FedSQL action was not successful.

It still says that it can't find the column, despite the QUOTE function being used:

    data _null_;
      a = "MD";
      set GNC.GNC_CAUSES_NO_EDGE_BACK(obs=1);
      call execute('proc fedsql sessref=EG_Session; select distinct id_remi from GNC.REMI_VOL_2008_2021_FINAL where id_class ='||quote(a)||'; quit;');
    run;

Many thanks for considering my request.
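A likely explanation, offered as a sketch rather than a confirmed diagnosis: FedSQL follows the ANSI convention in which double quotation marks delimit identifiers (such as column names) and single quotation marks delimit character literals. By default QUOTE(a) wraps the value in double quotation marks, so the generated statement still refers to a column named MD. Passing a single quotation mark as the second argument of QUOTE should generate id_class = 'MD' instead. The SET statement from the post is left out here only to keep the example self-contained, and the variable stmt is an illustrative name.

```sas
data _null_;
  length stmt $ 300;
  a = "MD";
  /* quote(strip(a), "'") wraps the value in single quotes: 'MD' */
  stmt = 'proc fedsql sessref=EG_Session; ' ||
         'select distinct id_remi from GNC.REMI_VOL_2008_2021_FINAL ' ||
         'where id_class = ' || quote(strip(a), "'") || '; quit;';
  put stmt=;                  /* inspect the generated statement in the log first */
  call execute(strip(stmt));  /* then submit it */
run;
```

Writing the statement to the log with PUT before calling CALL EXECUTE makes it easy to confirm which kind of quotation mark ends up in the generated code.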

andrea_magatti · ‎07-21-2022

Hi all, a simple question: while trying to run PROC FEDSQL inside a CALL EXECUTE loop, I'm getting a strange error:

    data _null_;
      a = "MD";
      set GNC.GNC_CAUSES_NO_EDGE_BACK(obs=1);
      call execute('proc fedsql sessref=EG_Session; select distinct id_remi from GNC.REMI_VOL_2008_2021_FINAL where id_class ='||a||'; quit;');
    run;

BTW: the dataset lives in a CAS library. This is the log:

    NOTE: There were 1 observations read from the data set GNC.GNC_CAUSES_NO_EDGE_BACK.
    NOTE: DATA statement used (Total process time):
          real time 0.01 seconds
          cpu time 0.01 seconds
    NOTE: CALL EXECUTE generated line.
    1 + proc fedsql sessref=EG_Session; select distinct id_remi from GNC.REMI_VOL_2008_2021_FINAL where id_class =MD;
    1 + quit;
    ERROR: Column "MD" not found or cannot be accessed
    ERROR: The action stopped due to errors.
    ERROR: The FedSQL action was not successful.
    NOTE: PROC FEDSQL has set option NOEXEC and will continue to prepare statements.
    NOTE: PROCEDURE FEDSQL used (Total process time):
          real time 0.00 seconds
          cpu time 0.01 seconds

I've tested the same code using PROC SQL with no problem, but since the main table is huge, I would like to use FedSQL (much faster). I'm on Viya 3.5, executing from a local SAS session.

andrea_magatti · ‎06-27-2022

Thanks!

andrea_magatti · ‎06-24-2022

How do I set allowXCMD in a Viya programming-only environment?

andrea_magatti · ‎03-07-2022

Hi, community,

I want to describe a couple of situations where Viya (3.5) behaves differently from SAS 9.4. The cases are: (1) dataSciencePilot.featureMachine; (2) PROC PARTITION vs. PROC SURVEYSELECT.

In the first case, I encountered strange behavior with a simple dataset of 15k observations and around 120 features. On the first run, on a machine with 512 GB of RAM and 80 cores, I added some date variables to the specific input list. To my surprise, the action set used all the available RAM and the swap, causing the CAS process to be killed by the OS (Red Hat). After that, I realized that the date variables were not helpful for the model I was going to build, so I dropped them. With that change, the process took 30 seconds to complete, so I assume that the distribution of the dates causes the issue. My question is: why didn't Viya provide any warning in the log?

In the second case, while sampling a dataset needed for the TSNE analysis, I first used the SURVEYSELECT procedure for a stratified sampling approach. By mistake, I added the logical key of the dataset to the BY group. The procedure log reported (correctly):

    ERROR: The number of strata, 14551, is greater than the total sample size, 1456.

That's fine! I recognized my mistake and, once it was corrected, I got the results I needed. Then I tried the same (erroneous) approach with the PARTITION procedure and obtained the same result I encountered with dataSciencePilot.featureMachine: the action set consumed both the RAM and the swap filesystem without any warning in the log.

Could you explain this behavior? I appreciate any help you can provide.
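Since the PARTITION procedure gave no warning in this situation, one possible safeguard is to count the strata before sampling and compare the count with the requested sample size, which is the check that SURVEYSELECT reported. The following is a minimal sketch, not a documented SAS feature: the table work.input_data, the variable strat_var, and the macro variable sample_n are hypothetical names, and the sample size 1456 is taken from the error message in the post.

```sas
%let sample_n = 1456;   /* requested total sample size (hypothetical macro variable) */

proc sql noprint;
  /* count the distinct strata; list all BY/STRATA variables in the inner SELECT */
  select count(*) into :n_strata trimmed
  from (select distinct strat_var from work.input_data);
quit;

data _null_;
  n_strata = input(symget('n_strata'), best32.);
  sample_n = input(symget('sample_n'), best32.);
  if n_strata > sample_n then
    put 'WARNING: more strata than sampled rows: ' n_strata= sample_n=;
  else
    put 'NOTE: strata check passed: ' n_strata= sample_n=;
run;
```

If the warning appears, the stratification variables probably include a key-like column, as happened in the case described above.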

andrea_magatti · ‎08-11-2021

Hi all, I'm writing a final note about PROC GAMSELECT and the gam.gamSelect action. Both produce precisely the same results. On the positive side, the action is much faster than the procedure (0.6 seconds vs. 6 seconds) and also makes it possible to separate the model-building phase from the scoring stage. This is a natural enhancement to the GAM family that gives us much more flexibility in a production environment. Thanks again!

andrea_magatti · ‎08-11-2021

Thanks to both! Since the data provided come from 12 different forecast models, I used GAMSELECT as an ensemble (by the way, with excellent results). I'm quite wary of using the "allobs" option, since the spline parameters would then be built using even future data. I'm choosing a different approach that identifies extreme forecasts and replaces their values with the mean of all the other models. Thanks

andrea_magatti · ‎08-11-2021

Hi all, while running this code:

    proc gamselect data=casuser.gam_test_gamselect seed=220870 plots;
      model target = spline(P_AR_GLO P_AR_sem / degree=2 difforder=1 details df=7)
                     spline(P_B_glo P_B_sem / degree=3 difforder=1 details df=7)
                     spline(P_F_sem P_F_tri / degree=3 difforder=2 details df=6)
                     spline(P_O_glo P_O_sta / degree=3 difforder=2 details df=6)
                     spline(P_S_sem P_S_tri / degree=3 difforder=2 details df=6)
                     spline(P_T_sem P_T_sta / degree=3 difforder=2 details df=6)
                     / distribution=normal link=id;
      displayout SplineDetails=splinedet;
      partition rolevar=_role_(TRAIN='train' VALIDATE='valid' TEST='test');
      selection method=boosting(choose=VALIDATE maxiter=500 STEPSIZE=0.10);
      output out=casuser.forecast_gamsel copyvars=(data target _role_);
    run;

I'm getting this message:

    NOTE: One observation with the validation role was omitted due to values outside of the interior knot ranges.
    NOTE: 481393 bytes were written to the table "gam_model" in the caslib "CASUSER".

As far as I know, I could increase the number of knots for any spline term using the MAXKNOTS= option on each term (currently each spline uses 10 interior knots, which should be the default). But even when I increase the number to a huge (and useless) value, I get the same message. Inspecting the data, the omitted points are "almost" the maximum or minimum values in each partition. Any suggestion is welcome. Thanks

andrea_magatti · ‎06-14-2021

Thanks, Taiyenong. I know that TSMODEL runs on a single machine, but while monitoring the machine I saw that it was using 2 CPUs at 100% while the remaining ones were idle. With your suggestion about YWINSISE, I cut the time needed by almost 50%. But still, I can't understand why the DTW algorithm is not fully parallelized, since it just calculates all the pairwise combinations of the provided time series. Thanks again!

andrea_magatti · ‎06-09-2021

Hi all, I need to cluster many time series. In the past (SAS 9.4), I used PROC TIMESERIES, which was constrained to a single process, so measuring the distance between all the time series provided took a long time. Now I'm on Viya 3.5, and I've tried both the TSMODEL approach with the TSD package and a rewrite of the code using the timeData.runTimeCode action. But I'm still in the same situation, even though I'm running the code on a Viya 3.5 machine with 80 physical CPUs. I've got over 8k series, with daily data spanning from Jan 2008 to Apr 2021.

Here is the code with TSMODEL:

    proc tsmodel data=casuser.gnc_vol_t_tra_impute_all
                 outlog=casuser.outlog
                 outobj=(of=casuser.outtsddist(replace=YES));
      var _TR1_SI_30:;
      id data interval=day;
      require tsd;
      submit;
        declare object f(DTW);
        declare object of(OUTTSD);
        rc=f.Initialize();
        rc=f.SetTarget(&si_remi_30);
        rc=f.SetOption("METRIC", "RSQRDEV", "NORMALIZE", "STD", "TRIM", "BOTH");
        rc=f.Run();
        if rc < 0 then stop;
        rc=of.Collect(f);
        if rc < 0 then stop;
      endsubmit;
      print outlog;
    run;

And here is the code with PROC CAS:

    %macro cmpcode();
      declare object f(DTW);
      declare object of(OUTTSD);
      rc=f.Initialize();
      rc=f.SetTarget(&batch1_p);
      rc=f.SetOption('METRIC', 'RSQRDEV', 'NORMALIZE', 'STD', 'TRIM', 'BOTH');
      rc=f.Run();
      if rc < 0 then stop;
      rc=of.Collect(f);
      if rc < 0 then stop;
    %mend;

    proc cas;
      * like proc contents or SQL on Dictionary libref ;
      table.columnInfo result=allvars / table={name="gnc_vol_t_tra_impute_all"};
      run;
      saveresult allvars casout="myallvars";

      * reading vars with a custom filter for interval vars;
      table.fetch result=selectedVars /
        table={name='myallvars',
               where=" Column not like '_TR1_tot_%' and Column not in ('_TR1_period', '_TR1_residuo', '_TR1_settimana', 'data', 'int_conf', '_NAME_', '_PGNC_') "},
        fetchvars={{name='Column'}}
        to=&limit maxrows=&limit;
      run;

      * array creation for runtimecode and DST object ;
      varList=${};
      oth_varlist=${};
      do row over selectedVars.Fetch;
        singleVar=compress(row.Column);
        /* varList[row._Index_]= "{name="||quote(singleVar) || "}"; */
        varList[row._Index_]= singleVar;
      end;
      print varList;

      cmpcode="%cmpcode()";
      timeData.runTimeCode result=run /
        table={name="gnc_vol_t_tra_impute_all"}
        logControl={{keep=TRUE, sev="ERROR"}}
        require={{pkg="TSD"}}
        series=varList
        timeid="data"
        interval="day"
        objOut={ {objRef="of", table={name='outtsddist' replace=TRUE}} }
        logout ={name="TSMODEL_LOG" replace=True}
        code=cmpcode;
      run;
    quit;

And here are the run times I measured:

    N. of vars    Seconds
            10       3.60
            20      14.40
            40      57.60
            80     230.40
           160     921.60
           320   3,686.40

If I project the time needed for the 8k time series, I will need over 40 days of calculation. My question is: has SAS implemented some faster algorithm like MASS (https://www.cs.unm.edu/~mueen/FastestSimilaritySearch.html)? If not, any suggestion is welcome!
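The reported timings quadruple each time the number of series doubles, which is what a pairwise distance computation would produce, since n series give n(n-1)/2 distinct pairs. A minimal sketch of the projection follows, assuming the run time is exactly c·n² with c taken from the first measurement (3.60 seconds for 10 series). The dataset name projection is illustrative.

```sas
data projection;
  /* assumption: secs = c * n**2, with c fitted from the first measurement */
  c = 3.60 / 10**2;
  do n = 10, 20, 40, 80, 160, 320, 8000;
    pairs = n * (n - 1) / 2;   /* number of distinct series pairs */
    secs  = c * n**2;          /* projected run time in seconds */
    days  = secs / 86400;      /* projected run time in days */
    output;
  end;
  drop c;
run;

proc print data=projection noobs;
  format pairs comma14. secs comma14.2 days 8.2;
run;
```

Under this assumption, exactly 8,000 series come out at about 26.7 days. The estimate grows with the square of the series count, so a 40-day figure would correspond to roughly 9,800 series under the same model.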

andrea_magatti · ‎05-17-2021

Thank you so much! I've adapted much of your code to my specific case, which uses external forecasts. I'm posting my code for future users in search of solutions!

    options casdatalimit=ALL;
    cas mysession sessopts=(caslib=casuser timeout=1800);
    caslib _all_ assign;
    caslib query datasource=(srctype="path") path="/sasdata/Stl_analysis/query/" local;
    libname query cas caslib=query;

    proc cas;
      table.fileInfo result=fileList / kbytes=true;
      do row over fileList.FileInfo;
        print row.name;
        table.loadtable /
          path=row.name,
          caslib='query',
          casout={caslib='query', name=substr(row.name, 1, index(row.name, '.')-1), replace=true};
      end;
    quit;

    /* test DS */
    data casuser.test;
      set query.nn_rciv_g_preds_hist;
      keep date pred_: ae_: stl_tgt_ricciv_all P_rciv_ens;
      rename P_rciv_ens=prediction;
      rename stl_tgt_ricciv_all = target;
    run;

    proc tsmodel data=casuser.test
                 outlog=casuser.outlog
                 logcontrol=(error=keep warning=keep note=keep none=keep)
                 outobj=(outfor=casuser.forecast outstat=casuser.stats outest=casuser.estimation);
      id date interval=day setmissing=missing;
      var ae_: pred_:;
      print outlog;
      require atsm;
      require tsm;
      submit;
        /* Initialize the data frame with Y-series and independent variables */
        declare object dataFrame(tsdf);
        rc = dataFrame.initialize();
        rc = dataFrame.AddY(pred_globale_m1_1g);
        rc = dataFrame.AddY(pred_globale_m1_2g);
        rc = dataFrame.AddY(ae_globale_m1_1g);
        rc = dataFrame.AddY(ae_globale_m1_2g);
        rc = dataFrame.addX(date,'REQUIRED','YES');

        /* objects for collecting results */
        declare object outstat(outstat);
        declare object outfor(outfor);
        declare object outest(outest);

        /* Specify external models to be imported */
        declare object ext_glob_m1_1g(exmspec);
        declare object ext_glob_m1_2g(exmspec);

        /* object responsible for combining models */
        declare object ensemble(COMBSPEC);

        /* object responsible for running the forecast */
        declare object makeForecast(FORENG);

        /* TSM objects populated with external series */
        declare object tsm_glob_m1_1g(tsm);
        declare object tsm_glob_m1_2g(tsm);

        /* for each external model we define the array */
        /* containing Prediction and Absolute Error    */
        rc = ext_glob_m1_1g.Open();
        rc = ext_glob_m1_1g.SetOption('Predict', 'pred_globale_m1_1g', 'Error', 'ae_globale_m1_1g');
        rc = ext_glob_m1_1g.Close();
        rc = ext_glob_m1_2g.Open();
        rc = ext_glob_m1_2g.SetOption('Predict', 'pred_globale_m1_2g', 'Error', 'ae_globale_m1_2g');
        rc = ext_glob_m1_2g.Close();

        /* Initialization and run of first model */
        rc = tsm_glob_m1_1g.Initialize(ext_glob_m1_1g);
        rc = tsm_glob_m1_1g.AddExternal(pred_globale_m1_1g, 'predict');
        rc = tsm_glob_m1_1g.AddExternal(ae_globale_m1_1g, 'error');
        rc = tsm_glob_m1_1g.Run();

        /* Initialization and run of second model */
        rc = tsm_glob_m1_2g.Initialize(ext_glob_m1_2g);
        rc = tsm_glob_m1_2g.AddExternal(pred_globale_m1_2g, 'predict');
        rc = tsm_glob_m1_2g.AddExternal(ae_globale_m1_2g, 'error');
        rc = tsm_glob_m1_2g.Run();

        /* Combspec object preparation */
        rc = ensemble.Open(2);
        rc = ensemble.AddFrom(tsm_glob_m1_1g);
        rc = ensemble.AddFrom(tsm_glob_m1_2g);
        rc = ensemble.SetOption('WEIGHT', 'RANKWGT');
        rc = ensemble.Close();

        /* FORENG object execution */
        rc = makeForecast.initialize(dataFrame);
        rc = makeForecast.addFrom(ensemble);
        rc = makeForecast.SetOption('CRITERION', 'MAE');
        rc = makeForecast.run();
        rc = outfor.collect(makeForecast, 'ALL');
        rc = outest.collect(makeForecast);
        /* the original post collected outest twice here; outstat was presumably intended */
        rc = outstat.collect(makeForecast);
      endsubmit;
    quit;

    /* Print the results */
    proc print data=casuser.estimation; title 'OUTEST='; run;
    proc print data=casuser.forecast; title 'OUTFOR='; run;

As you can see, I've used objects from both the ATSM package and the TSM package to deal with external models (neural net models). Thanks again

andrea_magatti · ‎05-12-2021

Thank you so much! I have access to the atsm package. I have read something like what you are talking about, and a simple example is really appreciated. Thanks again!

andrea_magatti · ‎05-11-2021

Thanks @GriloCanibal I'm responding now... Better late than never. Thanks for your suggestion. Everything works fine. Just one more doubt: after using the Run method, I'm able to extract the combined forecast, but is it possible to extract the weights used by the specific Criterion that has been used? Thanks

andrea_magatti · ‎07-21-2020

Hi all, I would like to use the TSM package of the TSMODEL procedure, and in particular the CFC object, with forecasts from external models.

The scenario is: I have 6 NNET models, each one forecasting the same variable with a different approach in terms of data and neural network architecture. I want to use some methods of the CFC object to calculate a weighted average of the six forecasts. As far as I understand, I can use EXMSPEC to build a TSM object usable in CFC. The code runs with no errors, but it does not seem to produce any forecast. What am I missing?

    proc tsmodel data=casuser.test
                 outobj=(fitstats=casuser.ts_comb_fit forecast=casuser.ts_comb_for)
                 outarray=casuser.predict
                 outscalar=casuser.inputs;
      id data_gas_G1 interval=day;
      /* inscalars &AE_list; */
      /* inscalars &P_List; */
      var Target;
      require tsm;
      outarrays weighted_forecast;
      outscalars p_fu p_ter;
      submit;
        declare object fitstats(tsmstat);
        declare object forecast(tsmfor);

        /* Specify external models */
        declare object ext_full(exmspec);
        declare object ext_terna(exmspec);
        declare object newdafne(CFC);
        declare object tsm_full(tsm);
        declare object tsm_terna(tsm);

        rc = ext_full.Open();
        rc = ext_full.SetOption('Predict', 'P_Full', 'Error', 'AE_Full');
        rc = ext_full.Close();
        rc = ext_terna.Open();
        rc = ext_terna.SetOption('Predict', 'P_Terna', 'Error', 'AE_Terna');
        rc = ext_terna.Close();

        array p_fu[1] / nosymbols;
        array ae_fu[1] / nosymbols;
        array p_ter[1] / nosymbols;
        array ae_ter[1] / nosymbols;
        p_fu[1] = P_Full;
        ae_fu[1] = AE_Full;
        p_ter[1] = P_Terna;
        ae_ter[1] = AE_Terna;

        rc = tsm_full.Initialize(ext_full);
        rc = tsm_full.AddExternal(p_fu, 'predict');
        rc = tsm_full.AddExternal(ae_fu, 'error');
        rc = tsm_terna.Initialize(ext_terna);
        rc = tsm_terna.AddExternal(p_ter, 'predict');
        rc = tsm_terna.AddExternal(ae_ter, 'error');

        rc = newdafne.Initialize();
        rc = newdafne.SetY(Target);
        rc = newdafne.AddModel(tsm_full);
        rc = newdafne.AddModel(tsm_terna);
        rc = newdafne.AddPredict(p_fu, ae_fu, 'Full');
        rc = newdafne.AddPredict(p_ter, ae_ter, 'Terna');
        rc = newdafne.Criterion('Fit');
        rc = newdafne.SetOption('weight', 'ols');
        rc = newdafne.Run();
        rc = newdafne.GetForecast('Predict', weighted_forecast);
        /* nfor = newdafne.nfor(); */
        /* put nfor=; */
        rc = fitstats.Collect(newdafne);
        rc = forecast.Collect(newdafne);
      endsubmit;
    run;

Online Status	Offline
Date Last Visited	‎05-30-2024 06:03 AM

Re: Calling FEDSQL from call execute loop

Re: Calling FEDSQL from call execute loop

Calling FEDSQL from call execute loop

Re: SAS Viya 3.5: SAS Studio and SAS Compute Server non-functional imp...

Re: SAS Viya 3.5: SAS Studio and SAS Compute Server non-functional imp...

Feature engineering with autopilot actions, and samplig (surveyselect ...

Re: GAMSELECT and values outside of the interior knot ranges

Re: GAMSELECT and values outside of the interior knot ranges

GAMSELECT and values outside of the interior knot ranges

Re: Time series distances issue with time needed for over 8k series

Re: PROC MCMC how to specify a spike and slab prior

Re: Rename all variables at once

Re: log transformation

Re: GAMSELECT and values outside of the interior knot ranges

Re: GAMSELECT and values outside of the interior knot ranges

Re: TSMODEL: using the CFC object with external forecast

TSMODEL: using the CFC object with external forecast

Snam Business Unit Asset - Monitoring of the aggressiveness of the ter...

Time series distances issue with time needed for over 8k series

Re: TSMODEL: using the CFC object with external forecast

Re: TSMODEL: using the CFC object with external forecast

Re: TSMODEL: using the CFC object with external forecast

TSMODEL: using the CFC object with external forecast

SAS Hacker's Hub

SAS Inner Circle Panel

SAS Analytics Explorers