Hi,
I'm currently waiting for access to SAS Viya, but I have a test case that compares SAS EG with SAS Viya, and I just want to verify the SAS Viya code. For bootstrapping in SAS EG I have this code:
proc surveyselect data=data out=BootSamples noprint seed=25 reps=2000 method=urs samprate=1 outhits; run;
Am I right that this would be the equivalent code in SAS Viya, or am I missing something?
/* Start CAS session and load data into CAS */
cas mysess sessopts=(caslib='casuser');
libname mylib cas sessref=mysess;
/* Load example data */
proc casutil;
load data=data casout="sample" replace;
run;
/* Perform bootstrap resampling using the sampling action set */
proc cas;
action sampling.srs result=r /
table={caslib='casuser', name='sample'}
output={casout={caslib='casuser', name='BootSamples', replace=true}}
samppct=100 /* Sampling rate of 100% for bootstrap */
method='URS' /* Unrestricted random sampling with replacement */
seed=25 /* Seed for reproducibility */
reps=2000; /* Number of bootstrap replicates */
selection={name='Freq', includeFreq=true}; /* Include frequency counts in the output */
quit;
/* Fetch and display some of the bootstrap samples (Optional) */
proc cas;
table.fetch / table={caslib='casuser', name='BootSamples'} to=10;
quit;
/* End CAS session */
cas mysess terminate;
Hello,
Method='URS' is not an existing parameter of the sampling.srs action. Where did you get that from?
See the "CAS Bootstrapping" thread in the SAS Data Science community:
https://communities.sas.com/t5/SAS-Data-Science/CAS-Bootstrapping/m-p/931229#M10828
Koen
Thanks for the link, @sbxkoenk. So if I understand correctly, if I'm using SAS Viya 4 I should use method = SRS instead?
@sbxkoenk, is this code more appropriate?
/* Perform bootstrap resampling using the sampling action set */
proc cas;
action loadActionSet / actionSet='sampling';
quit;
proc cas;
action sampling.srs result=r /
table={caslib='casuser', name='sample'}
output={casOut={caslib='casuser', name='BootSamples', replace=true}}
samppct=100 /* Sampling rate of 100% for bootstrap */
replace=True /* Sampling with replacement */
seed=25 /* Seed for reproducibility */
reps=2000; /* Number of bootstrap replicates */
selection={name='Freq', includeFreq=true}; /* Include frequency counts in the output */
quit;
I am currently unable to test it on SAS Viya, but I would really appreciate it if someone could verify that it more accurately reflects SAS Viya 4 functionality, so I can compare and contrast it with SAS 9.4.
The code is NOT correct.
The below sampling.srs parameters do not exist:
replace=True /* Sampling with replacement */
reps=2000; /* Number of bootstrap replicates */
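One way to check which parameters an action really accepts, before relying on them, is to ask the CAS server itself. A minimal sketch, assuming an active CAS session and that the sampling action set is available on your deployment (neither is confirmed in this thread):

```sas
proc cas;
   /* Load the action set so its actions are visible in this session */
   loadactionset "sampling";
   /* Write the parameter list of sampling.srs to the log */
   builtins.help / actionSet="sampling" action="srs";
quit;
```

Any parameter that does not appear in that log listing (such as method, replace or reps above) is not accepted by the action.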
Here are a few links to get you started.
Please learn about the APPLYROWORDER procedure-option (e.g. PROC PARTITION) and the ADDROWID data set option, as you will need these to ensure reproducibility.
Last but not least, PROC SURVEYSELECT runs on the compute server (SPRE) in SAS Viya 4, so you can still use this SAS/STAT procedure (it's just not CAS-enabled).
You can then turn to the CAS engine (distributed computing) for the remainder of your calculations.
Koen
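The suggestion above (resample on the compute server, then continue in CAS) could look roughly like the following. This is only a sketch under assumptions: sashelp.cars stands in for the real data, MPG_City is an arbitrary analysis variable, and the names BootStats, MeanMPG and BootCI are made up for illustration. Whether the summary step is actually pushed down to CAS depends on the procedure and the Viya release.

```sas
/* 1. Draw the bootstrap resamples on the compute server, as in SAS 9.4 */
proc surveyselect data=sashelp.cars out=work.BootSamples noprint
                  seed=25 method=urs samprate=1 reps=2000 outhits;
run;

/* 2. Start a CAS session and load the resamples into the active caslib */
cas mysess sessopts=(caslib='casuser');
libname mylib cas sessref=mysess;

proc casutil;
   load data=work.BootSamples casout="BootSamples" replace;
quit;

/* 3. Compute the statistic once per replicate, reading the CAS table */
proc means data=mylib.BootSamples noprint nway;
   class Replicate;              /* Replicate is created by REPS= */
   var MPG_City;
   output out=work.BootStats mean=MeanMPG;
run;

/* 4. Percentile bootstrap interval from the 2000 replicate means */
proc univariate data=work.BootStats noprint;
   var MeanMPG;
   output out=work.BootCI pctlpts=2.5 97.5 pctlpre=CI_;
run;
```

Because step 1 is the same PROC SURVEYSELECT call with the same seed, the resamples should be directly comparable with the SAS 9.4 run.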
This is basically the code that I need. Thanks for your help
/* Initialize CAS Session */
cas mysess sessopts=(caslib='casuser');
libname mylib cas sessref=mysess;
/* Load Dataset into CAS */
proc casutil;
load data=sashelp.cars casout="sample" replace;
quit;
/* Load and Define Action Set */
proc cas;
builtins.actionSetFromTable / table={caslib="Public" name="resampleActionSet.sashdat"} name="resample";
quit;
/* Bootstrap in CAS with Timing */
%let _time1=%sysfunc(time());
proc cas;
resample.bootstrap /
intable='sample'
B=200
seed=12345
Bpct=1
case='ID'
strata='none'
strata_table='none';
datastep.runcode result=t / code='data sample_bs; set sample_bs; host=_hostname_; threadid=_threadid_; run;';
simple.crossTab / table={name="sample_bs" where="bag=1"} row="bsid" col="host" aggregator="N";
simple.crossTab / table={name="sample_bs" where="bag=1"} row="bsid" col="threadid" aggregator="N";
run;
%let _time2=%sysfunc(time());
%let elapsed = %sysevalf(&_time2 - &_time1);
%let minutes = %sysfunc(floor(%sysevalf(&elapsed/60)));
%let seconds = %sysevalf(&elapsed - (&minutes*60));
%put CAS Bootstrap Time: &minutes minutes &seconds seconds;
/* Clear CAS Session */
*cas mysess clear;
Indeed.
Just a reminder ... so you (and others) are aware.
resample.bootstrap is not a SAS-supplied action. It's a custom action programmed by the owner of the GitHub repository.
So, you also need the code to define the action in your SAS Viya environment.
BR, Koen
@sbxkoenk Just to clarify: was the resampleActionSet action set already available in the Public CAS library, without having to download or clone it from the GitHub repository? Is that correct? I guess what I need to understand is how SAS Viya is able to use the .sashdat file to load the action set from the GitHub repository. I hope I'm making sense. I know that when I run this code
/* Load and Define Action Set */
proc cas;
builtins.actionSetFromTable / table={caslib="Public" name="resampleActionSet.sashdat"} name="resample";
quit;
/* List Available Action Sets */
proc cas;
action builtins.actionSetInfo;
run;
I am able to see resample, and that it is user defined.
@smackerz1988 wrote:
@sbxkoenk Just to clarify the resampleActionSet action was already available in the public cas library file without having to download or clone directly from the github repository? Is that correct?
Not correct.
You have to make the custom action set -- that you defined -- available in the PUBLIC CAS library (as a *.sashdat file) and then load from there.
(No need to clone repository ... just download the *.sas program and run or copy/paste and run)
Here's how to proceed (see also the README of the GitHub repository):
Run the code in resample - defineActionSet.sas. Some lines that may need changing:
To use the actions you will need to load the user defined actions with:
builtins.actionSetFromTable / table={caslib="Public" name="resampleActionSet.sashdat"} name="resample";
BR,
Koen
Ah I understand now! Thanks for clarifying that makes sense!
Hi Koen,
I'm running on Viya V.04.00M0P051324, and the sampling action set does not seem to honor the replace=True option for sampling with replacement. I also can't seem to get the GitHub link for the resample action set to work.
Is there a bootstrap action I should be using for this version, or do I need to figure out why I can't find the .sas program that creates the resample action set?
thanks,
Ryan
Hello,
Good luck,
Koen
@sbxkoenk wrote:
- You can do the Unrestricted Random Sample (URS) with a datastep as well, but the datastep uses statements (like retain) that will not work on the CAS engine. See Implement five sampling methods in the SAS DATA step - The DO Loop.
If the RETAIN statement was an obstacle (although it is used in CAS example code), it could be omitted, as it appears to be totally redundant in that URS sampling program.
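For reference, a DATA step version of unrestricted random sampling for the compute server can be written without RETAIN at all. This is a minimal sketch, not the program from the linked blog post: sashelp.cars, the seed and the replicate count are placeholders, and it relies on the POINT= option for random access, which is assumed here not to carry over to a DATA step running in CAS.

```sas
data work.BootSamples;
   call streaminit(25);                         /* seed for reproducibility */
   do Replicate = 1 to 2000;                    /* one pass per bootstrap replicate */
      do _i = 1 to nobs;                        /* draw as many rows as the input has */
         _p = ceil(nobs * rand('uniform'));     /* random row number in 1..nobs */
         set sashelp.cars point=_p nobs=nobs;   /* read that row directly */
         output;
      end;
   end;
   stop;                                        /* required: POINT= never hits end of file */
   drop _i;
run;
```

NOBS= is assigned at compile time, so it can be used in the loop bound before the SET statement executes. Each output row is one draw, so the result has the same shape as PROC SURVEYSELECT with OUTHITS, although the random streams differ and the selected rows will not match draw for draw.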