☑ This topic is solved.
smackerz1988
Pyrite | Level 9

Hi,

I'm waiting for access to SAS Viya, but I have a test case that compares SAS Enterprise Guide (EG) with SAS Viya, and I want to verify the SAS Viya code before I can run it. For bootstrapping in SAS EG I have this code:

proc surveyselect data=data
    out=BootSamples noprint
    seed=25
    reps=2000
    method=urs
    samprate=1
    outhits;
run;

Am I right that this would be the equivalent code in SAS Viya, or am I missing something?

/* Start CAS session and load data into CAS */
cas mysess sessopts=(caslib='casuser');
libname mylib cas sessref=mysess;

/* Load example data */
proc casutil;
    load data=data casout="sample" replace;
run;

/* Perform bootstrap resampling using the sampling action set */
proc cas;
    action sampling.srs result=r /
        table={caslib='casuser', name='sample'}
        output={casout={caslib='casuser', name='BootSamples', replace=true}}
        samppct=100 /* Sampling rate of 100% for bootstrap */
        method='URS' /* Unrestricted random sampling with replacement */
        seed=25 /* Seed for reproducibility */
        reps=2000; /* Number of bootstrap replicates */
        selection={name='Freq', includeFreq=true}; /* Include frequency counts in the output */
quit;

/* Fetch and display some of the bootstrap samples (Optional) */
proc cas;
    table.fetch / table={caslib='casuser', name='BootSamples'} to=10;
quit;

/* End CAS session */
cas mysess terminate;

 

sbxkoenk
SAS Super FREQ

Hello,

 

method='URS' is not an existing parameter of the sampling.srs action. Where did you get that from?

 

See the CAS Bootstrapping thread in the SAS Data Science community:
https://communities.sas.com/t5/SAS-Data-Science/CAS-Bootstrapping/m-p/931229#M10828

 

Koen

smackerz1988
Pyrite | Level 9

Thanks for the link, @sbxkoenk. So, if I understand correctly, if I'm using SAS Viya 4 I should use method=SRS instead?

smackerz1988
Pyrite | Level 9

@sbxkoenk is this code more appropriate? 

/* Perform bootstrap resampling using the sampling action set */
proc cas;
    action loadActionSet / actionSet='sampling';
quit;

proc cas;
    action sampling.srs result=r /
        table={caslib='casuser', name='sample'}
        output={casOut={caslib='casuser', name='BootSamples', replace=true}}
        samppct=100 /* Sampling rate of 100% for bootstrap */
        replace=True /* Sampling with replacement */
        seed=25 /* Seed for reproducibility */
        reps=2000; /* Number of bootstrap replicates */
        selection={name='Freq', includeFreq=true}; /* Include frequency counts in the output */
quit;

I can't test it on SAS Viya at the moment, but I would really appreciate confirmation that it more accurately reflects SAS Viya 4 functionality, so I can compare and contrast it with SAS 9.4.

sbxkoenk
SAS Super FREQ

The code is NOT correct.

The below sampling.srs parameters do not exist:

        replace=True /* Sampling with replacement */
        reps=2000; /* Number of bootstrap replicates */
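
For comparison, here is a minimal sketch of a sampling.srs call that keeps only the parameters from the code above that were not flagged (table, samppct, seed, output). The copyVars="ALL" setting, the 50% rate, and the SrsSample table name are assumptions added for illustration and have not been tested in this thread. Note that SRS is simple random sampling without replacement, so this draws one subsample and is not a bootstrap.

```sas
/* Sketch only: assumes an active CAS session and a CAS table casuser.sample. */
proc cas;
    loadactionset "sampling";

    /* One simple random sample WITHOUT replacement (not a bootstrap). */
    sampling.srs result=r /
        table={caslib='casuser', name='sample'}
        samppct=50                      /* assumed rate: keep about 50% of the rows */
        seed=25                         /* fixed seed for reproducibility */
        output={casOut={caslib='casuser', name='SrsSample', replace=true},
                copyVars="ALL"};        /* assumption: copy all input columns */

    print r;                            /* show the action's result tables */
quit;
```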

 

Here are a few links to get you started.

  1. https://github.com/statmike/Resampling-Methods-in-SAS-Viya : fast, easy resampling methods using the SAS Viya CAS engine (bootstrap, double bootstrap, jackknife)
  2. Bootstrap Resampling At Scale: Part 1 (of 3), 3 March 2020
    https://statmike.com/blog/sgf2020p1
  3. Bootstrap Resampling At Scale: Part 2 (of 3), 4 March 2020
    https://statmike.com/blog/sgf2020p2
  4. Bootstrap Resampling At Scale: Part 3 (of 3), 5 March 2020
    https://statmike.com/blog/sgf2020p3

Please learn about the APPLYROWORDER procedure-option (e.g. PROC PARTITION) and the ADDROWID data set option, as you will need these to ensure reproducibility.

Last but not least, PROC SURVEYSELECT still runs on the compute server (SPRE) in SAS Viya 4. You can keep using this SAS/STAT procedure; it is just not CAS-enabled.
You can then turn to the CAS engine (distributed computing) for the remainder of your calculations.
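
As a rough sketch of that workflow: draw the bootstrap samples with PROC SURVEYSELECT on the compute server, then load the result into CAS for the downstream work. WORK.MYDATA is a hypothetical input data set, and the session and table names simply follow the code earlier in this thread.

```sas
/* Sketch only: WORK.MYDATA is a hypothetical input data set. */

/* 1. Draw the bootstrap samples on the compute server (not CAS-enabled). */
proc surveyselect data=work.mydata out=work.BootSamples noprint
    seed=25        /* fixed seed for reproducibility */
    reps=2000      /* number of bootstrap replicates */
    method=urs     /* unrestricted random sampling, i.e. with replacement */
    samprate=1     /* each replicate has as many rows as the input */
    outhits;       /* one output row per selection */
run;

/* 2. Start a CAS session and load the replicates into CAS. */
cas mysess sessopts=(caslib='casuser');

proc casutil;
    load data=work.BootSamples outcaslib="casuser" casout="BootSamples" replace;
quit;
```

PROC SURVEYSELECT adds a Replicate variable when REPS= is used, so later per-replicate calculations in CAS can group on it.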

Koen

smackerz1988
Pyrite | Level 9

This is basically the code that I need. Thanks for your help 

/* Initialize CAS Session */
cas mysess sessopts=(caslib='casuser');
libname mylib cas sessref=mysess;

/* Load Dataset into CAS */
proc casutil;
    load data=sashelp.cars casout="sample" replace;
quit;

/* Load and Define Action Set */
proc cas;
    builtins.actionSetFromTable / table={caslib="Public" name="resampleActionSet.sashdat"} name="resample";
quit;

/* Bootstrap in CAS with Timing */
%let _time1=%sysfunc(time());

proc cas;
    resample.bootstrap / 
        intable='sample' 
        B=200 
        seed=12345 
        Bpct=1 
        case='ID' 
        strata='none' 
        strata_table='none';
    datastep.runcode result=t / code='data sample_bs; set sample_bs; host=_hostname_; threadid=_threadid_; run;';
    simple.crossTab / table={name="sample_bs" where="bag=1"} row="bsid" col="host" aggregator="N";
    simple.crossTab / table={name="sample_bs" where="bag=1"} row="bsid" col="threadid" aggregator="N";
run;

%let _time2=%sysfunc(time());
%let elapsed = %sysevalf(&_time2 - &_time1);
%let minutes = %sysfunc(floor(%sysevalf(&elapsed/60)));
%let seconds = %sysevalf(&elapsed - (&minutes*60));
%put CAS Bootstrap Time: &minutes minutes &seconds seconds;

/* Clear CAS Session */
*cas mysess clear; 

 

sbxkoenk
SAS Super FREQ

Indeed.

Just a reminder ... so you (and others) are aware.

resample.bootstrap is not a SAS-supplied action. It's a custom action programmed by the owner of the GitHub repository.

So, you also need the code to define the action in your SAS Viya environment.

BR, Koen

smackerz1988
Pyrite | Level 9

@sbxkoenk Just to clarify: was the resampleActionSet action set already available in the Public CAS library, without my having to download or clone it from the GitHub repository? I guess what I need to understand is how SAS Viya is able to use the .sashdat file to load the action set from the GitHub repository. I hope I'm making sense. I know that when I run this code:

/* Load and Define Action Set */
proc cas;
    builtins.actionSetFromTable / table={caslib="Public" name="resampleActionSet.sashdat"} name="resample";
quit;

/* List Available Action Sets */
proc cas;
    action builtins.actionSetInfo;
run;

I am able to see resample listed, and that it is user-defined.


sbxkoenk
SAS Super FREQ

@smackerz1988 wrote:

@sbxkoenk Just to clarify: was the resampleActionSet action set already available in the Public CAS library, without my having to download or clone it from the GitHub repository?


Not correct.

You have to make the custom action set -- that you defined -- available in the PUBLIC CAS library (as a *.sashdat file) and then load from there.

(No need to clone repository ... just download the *.sas program and run or copy/paste and run)

 

Here's how to proceed (see also the README of the GitHub repository):

Setting up the actions in your environment

Run the code in resample - defineActionSet.sas. Some lines that may need changing:

  • line 1: connects to a CAS session
  • To save the actions for future sessions and for use by other users:
    • line 174: creates an in-memory table of the action set
    • line 175: persists the in-memory table as a .sashdat file; here it points to caslib="Public"
  • If you need to remove the action set, uncomment and use:
    • line 177: removes the persisted in-memory table

Actions Instructions

To use the actions you will need to load the user defined actions with:

builtins.actionSetFromTable / table={caslib="Public" name="resampleActionSet.sashdat"} name="resample";
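
To make the define, persist, and reload cycle concrete, here is a minimal sketch with a toy action set. The names demo, hello, who, and demoActionSet are hypothetical and are not part of the GitHub repository; the sketch also assumes an active CAS session and write access to the Public caslib.

```sas
/* Sketch only: "demo", "hello", "who" and "demoActionSet" are hypothetical names. */
proc cas;
    /* 1. Define a user-defined action set with one toy action. */
    builtins.defineActionSet /
        name="demo"
        actions={
            {name="hello"
             desc="Prints a greeting"
             parms={{name="who" type="string" required=TRUE}}
             definition="print 'Hello, ' || who;"}
        };

    /* 2. Copy the action set definition into an in-memory CAS table. */
    builtins.actionSetToTable /
        actionSet="demo"
        casOut={caslib="Public" name="demoActionSet" replace=TRUE};

    /* 3. Persist that table as a .sashdat file for later sessions and other users. */
    table.save /
        caslib="Public" name="demoActionSet.sashdat"
        table={caslib="Public" name="demoActionSet"}
        replace=TRUE;
quit;

/* 4. In a later session: restore the action set from the file and call it. */
proc cas;
    builtins.actionSetFromTable /
        table={caslib="Public" name="demoActionSet.sashdat"} name="demo";
    demo.hello / who="world";
quit;
```

Steps 2 and 3 correspond to what the README describes for lines 174 and 175 of the repository's program; step 4 is the same load call shown above, pointed at the toy file.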

BR,
Koen

smackerz1988
Pyrite | Level 9

Ah, I understand now! Thanks for clarifying, that makes sense!
