Proc Survey Select - Stratified Random Sampling - non-reproducible samples

Contributor
Posts: 46


Dear all,

I have had to do stratified random sampling in recent months and have posted about Proc Survey Select before.

In the past, I ran into a problem where, if the input data differed even slightly in the sort order of the stratification variables, I did not get the same sampled data set back, even with a fixed seed value.

Now I am facing a new problem: my system recently changed from SAS 9.4 to Enterprise Guide 5.1. With the same seed value as before, I no longer get the same output samples. I sort the data by the stratification variables as before, and I wonder whether SAS EG and SAS 9.4 use different sorting algorithms. Given that I cannot find any change in the input data (including its sort order), is the difference in the output data sets due to some difference between SAS and Enterprise Guide?

I'm very confused.

Thanks for any help.

SAS Super FREQ
Posts: 4,241

Re: Proc Survey Select - Stratified Random Sampling - non-reproducible samples

SAS Enterprise Guide does not have its own algorithms/procedures. It merely generates code and submits it to SAS.  If you are getting different values, could your version of SAS have changed? Or perhaps your data are now stored and retrieved by using a file system (like Hadoop) that does not guarantee order. At any rate, I'm going to guess that the problem is not EG, but something else that changed.
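To illustrate why row order matters even with a fixed seed, here is a minimal sketch in Python (not SAS, and not the algorithm Proc Survey Select uses). The function `stratified_sample`, the `region` column, and the seed value are all made up for the illustration. The assumption is only that the sampler picks rows by position within each stratum, so the same seed applied to differently ordered input selects different rows.

```python
import random
from collections import defaultdict


def stratified_sample(rows, stratum_key, n_per_stratum, seed):
    """Draw n_per_stratum rows per stratum, keeping rows in input order.

    Hypothetical helper for illustration only; it does not reproduce
    the random number stream of any SAS procedure.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[row[stratum_key]].append(row)
    sample = []
    # Visit strata in a fixed order so only the within-stratum row order varies.
    for stratum in sorted(strata):
        sample.extend(rng.sample(strata[stratum], n_per_stratum))
    return sample


# 40 made-up rows split evenly across two strata, "N" and "S".
rows = [{"id": i, "region": "NS"[i % 2]} for i in range(40)]
reordered = list(reversed(rows))

same_order_a = stratified_sample(rows, "region", 3, seed=12345)
same_order_b = stratified_sample(rows, "region", 3, seed=12345)
other_order = stratified_sample(reordered, "region", 3, seed=12345)

# Sorting by a unique key restores the original order before sampling.
resorted = stratified_sample(
    sorted(reordered, key=lambda r: r["id"]), "region", 3, seed=12345
)

print(same_order_a == same_order_b)  # same seed, same order
print(same_order_a == other_order)   # same seed, different order
print(same_order_a == resorted)      # same seed, order restored
```

In this sketch, sorting by the stratification variable alone would not be enough, because rows within a stratum could still arrive in a different order. Sorting by a key that is unique per row is what makes the input order, and therefore the sample, repeatable.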

Contributor
Posts: 46

Re: Proc Survey Select - Stratified Random Sampling - non-reproducible samples

Thanks so much for your reply!

I will probably have to investigate the issue more - what software changes have taken place, or if there are any data changes that were made that I was not aware of. (These datasets were created in August-December last year). This definitely is not a very fun problem to have :-(

I will post again if I make any findings.

Super User
Posts: 13,538

Re: Proc Survey Select - Stratified Random Sampling - non-reproducible samples


With EG, does that mean you are running code on a server? If so, does the server have the same operating system and processor as your previous 9.4 install?

If the server is different, especially if it runs a different OS, then I would not expect the same seed to return the same results, as the processors are going to behave differently. Changes in the SAS release could also result in minor differences in the algorithms used to generate the random number sequences.

Changes to order or size of the data will result in different outcomes as well. Check the data sets' created and last modified dates to see if such changes are likely.
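Beyond checking dates, one way to tell whether the input changed in content or only in row order is to compare two fingerprints of an exported copy of the data. The sketch below is Python, not SAS, and the helper names `ordered_fingerprint` and `content_fingerprint` and the sample rows are hypothetical. It assumes the old and new inputs can both be read as lists of rows.

```python
import hashlib


def ordered_fingerprint(rows):
    """Hash the rows in the order given; any reordering changes the result."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def content_fingerprint(rows):
    """Hash the rows after sorting them, so only content changes matter."""
    return ordered_fingerprint(sorted(rows, key=repr))


# Made-up rows: (stratum, value)
old = [("A", 1), ("A", 2), ("B", 3)]
reordered = [("A", 2), ("A", 1), ("B", 3)]
changed = [("A", 1), ("A", 2), ("B", 4)]

# Same content, different order: only the ordered fingerprint differs.
print(ordered_fingerprint(old) == ordered_fingerprint(reordered))
print(content_fingerprint(old) == content_fingerprint(reordered))

# Different content: both fingerprints differ.
print(content_fingerprint(old) == content_fingerprint(changed))
```

If both fingerprints match between the old and new inputs, the data and its order are unchanged, which points toward the software or platform rather than the data.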

 

Contributor
Posts: 46

Re: Proc Survey Select - Stratified Random Sampling - non-reproducible samples

Thanks so much for your reply, Ballardw.

I was indeed running "Server" SAS EG, and this turned out to be SAS EG Version 5.1. However, the datasets were created using Proc Survey Select in Base SAS (non-Server) Version 9.4. I have been told that, in our department, only EG Version 7.1 is compatible with SAS 9.4, and when I ran the code on SAS EG 7.1, I got the same results as before.

(Btw, the IT people also found an OS issue, since my system was upgraded to Windows 10. I can't say I fully understood the issue, so I won't comment on it.)

Seems like there is a lot that goes on in the background behind these versions of SAS!

Thanks again!