phopkinson
Calcite | Level 5

Is there any way of replicating PROC SURVEYSELECT in CAS for bootstrapping datasets? I have found the sampling.srs action but can't seem to get the code working. As a second point, PROC SURVEYSELECT returns the number of times a row has been selected; can we do the same with sampling.srs?

sbxkoenk
SAS Super FREQ

Hello,

It's important to know which Viya version you are using, because in Viya 3.5 it is difficult (but possible) to always get the same bootstrap re-samples (across memory purges). In Viya 4 it's easy, as there is a "reproducibility button" (figuratively speaking).

Please submit:

%put &=sysvlong4;
%put &=SYSVIYARELEASE;
%put &=SYSVIYAVERSION;

... and tell us about the results (see the log)!

Also, in PROC SURVEYSELECT you probably use method=URS (sampling with equal probability and with replacement), while the sampling.srs action corresponds to method=SRS (simple random sampling, i.e. equal probability without replacement).
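
For reference, the PROC SURVEYSELECT bootstrap that the question is trying to replicate usually looks something like the sketch below. The input table (sashelp.class), the seed and the number of replicates are assumptions chosen for illustration; none of them come from the original post. With method=URS and without the OUTHITS option, the output table has one row per distinct selected observation and a NumberHits column that records how many times that row was drawn.

```sas
/* Sketch only: bootstrap re-sampling with PROC SURVEYSELECT.          */
/* This runs on the SAS compute server, not in CAS.                    */
/* Assumed for illustration: sashelp.class, 200 replicates, seed 12345 */
proc surveyselect data=sashelp.class
                  out=work.boot   /* one row per distinct selected observation */
                  method=urs      /* unrestricted random sampling: with replacement */
                  samprate=1      /* as many draws per replicate as there are input rows */
                  reps=200
                  seed=12345
                  noprint;
run;

/* NumberHits is the selection count, so it can serve as a frequency   */
/* variable when computing a statistic per bootstrap replicate.        */
proc means data=work.boot noprint;
   by Replicate;
   freq NumberHits;
   var Height;
   output out=work.boot_means mean=mean_height;
run;
```

Adding the OUTHITS option to the PROC SURVEYSELECT statement would instead write one output row per draw, which is the layout to aim for if a CAS-side replacement cannot return hit counts.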

 

The posts below cover what you need (proper bootstrap re-sampling in CAS), but we may need to assist you further to ensure reproducibility and repeatability (first tell us your Viya release before I go into details here).

  1. Bootstrap Resampling At Scale: Part 1 (of 3) dd. 03 March 2020
    https://statmike.com/blog/sgf2020p1
  2. Bootstrap Resampling At Scale: Part 2 (of 3) dd. 04 March 2020
    https://statmike.com/blog/sgf2020p2
  3. Bootstrap Resampling At Scale: Part 3 (of 3) dd. 05 March 2020
    https://statmike.com/blog/sgf2020p3

BR, Koen

sbxkoenk
SAS Super FREQ

A note on the above (see my earlier reply).

I understand that permanent reproducibility and repeatability may be desired, but in the case of bootstrapping we should not exaggerate its importance either. After all, if you get completely different results with slightly different re-samples, that is a good signal that something is wrong with the bootstrapping. You then probably need more and/or bigger re-samples. Once you have set up reliable bootstrapping, the actual samples (almost) don't matter anymore: the results will agree to several decimal places.
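
One way to check this in practice is to run the same bootstrap with two different seeds and compare the summaries. The sketch below is only an illustration: the macro name boot_se, the sashelp.class input, the seeds and the default of 1000 replicates are all assumptions, and it uses PROC SURVEYSELECT on the compute server rather than a CAS action.

```sas
/* Sketch only: compare bootstrap results across two seeds.            */
/* boot_se is a hypothetical helper macro; the input table, seeds and  */
/* number of replicates are assumed for illustration.                  */
%macro boot_se(seed=, reps=1000);
   /* Draw the bootstrap re-samples (with replacement) */
   proc surveyselect data=sashelp.class out=work.boot_&seed
                     method=urs samprate=1 reps=&reps seed=&seed noprint;
   run;

   /* Mean of Height within each replicate, weighted by hit count */
   proc means data=work.boot_&seed noprint;
      by Replicate;
      freq NumberHits;
      var Height;
      output out=work.means_&seed mean=mean_height;
   run;

   /* The standard deviation of the replicate means is the bootstrap   */
   /* standard error of the mean.                                      */
   proc means data=work.means_&seed n mean std;
      var mean_height;
      title "Bootstrap distribution of mean(Height), seed=&seed";
   run;
   title;
%mend boot_se;

%boot_se(seed=111)
%boot_se(seed=222)
```

If the two reports differ by more than you are comfortable with, raising reps (or using larger re-samples where that makes sense) is the first thing to try.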

 

Koen
