I am new to SAS programming and am having difficulty performing a certain task. I want to generate 4 random numbers from a given range, without replacement, based on given probabilities that each number will be chosen, and I want to perform 2000 repetitions of this. I have been searching for a way to do this, but everywhere just says to use rand("normal") or rand("uniform"), which will not perform my intended task. I also know the theta and sigma for each of the 4 numbers, based on 1800 actual past observations. Any help on how I might do this would be great.
Thanks
P.S. I am using SAS University Edition.
What kind of range do you need? Please provide an example.
What is your theta supposed to represent, and from which distribution? If it is from a normal distribution, you can specify the parameters:
x=rand('NORMAL',theta,sigma); if theta is the mean and sigma is the standard deviation of the distribution you want to sample from.
There is also rand('TABLE'), to which you provide a list of probabilities.
x=rand('TABLE', 0.1,0.2,0.5,0.2); would return a 1 with probability .1, a 2 with probability .2, a 3 with probability .5, or a 4 with probability .2.
p= 1/6;
x=rand('TABLE',p,p,p,p,p,p); does a good job of simulating a 6-sided die.
There are multiple ways to map the 1, 2, 3, ... to other values if needed.
If you need results in a particular range, sometimes you have to recalculate the values, but the specific approach depends on what you are attempting.
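One way to do the mapping mentioned above is to keep the outcome values in a second array and use the index returned by rand('TABLE') to look them up. This is only a minimal sketch: the values 15, 20, 30, 40, the seed, and the data set name Draws are made up for illustration.

```sas
/* Sketch: map RAND('TABLE') indices onto arbitrary values.           */
/* The outcome values and probabilities below are hypothetical.       */
data Draws(keep=rep idx x);
   call streaminit(12345);                       /* fixed seed so results repeat   */
   array vals[4] _temporary_ (15 20 30 40);      /* hypothetical outcome values    */
   array p[4]    _temporary_ (0.1 0.2 0.5 0.2);  /* matching probabilities         */
   do rep = 1 to 2000;
      idx = rand('TABLE', of p[*]);              /* idx is 1, 2, 3, or 4           */
      x   = vals[idx];                           /* look up the value for that idx */
      output;
   end;
run;

/* The observed shares of x should be close to the probabilities in p */
proc freq data=Draws;
   tables x;
run;
```

Note that each call to rand('TABLE') is an independent draw, so on its own this samples with replacement.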
I have created a table with the range of numbers available to be chosen for the first number listed in one column and the probability of each of those numbers being chosen in a second column. I have done this for each of the 4 drawings. If I used rand('table', 0.1,0.2,0.5,0.2), for example, but the numbers available start at 15 and go to 40, would this return 15 with a probability of 0.1, etc.?
I feel like rand('Normal',theta,sigma) would not work for this since the distribution is skewed and not perfectly normal, but please correct me if I'm wrong.
Thanks
Are you talking about a mixture of normal distributions? In a mixture, the parameters are chosen with a specified probability, then a random value is drawn from the appropriate normal distribution. See "Generate a random sample from a mixture distribution" for a discussion and SAS code.
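For readers unfamiliar with the idea, a mixture draw can be sketched in a DATA step as follows. This is not the code from the referenced article; the two components, their mixing probabilities, means, and standard deviations are all made-up values for illustration.

```sas
/* Sketch of a two-component normal mixture; all parameters are hypothetical */
data Mixture(keep=component x);
   call streaminit(54321);
   array prob[2]  _temporary_ (0.3 0.7);   /* mixing probabilities          */
   array mu[2]    _temporary_ (20 32);     /* component means               */
   array sigma[2] _temporary_ (2 4);       /* component standard deviations */
   do i = 1 to 2000;
      component = rand('TABLE', of prob[*]);               /* choose a component     */
      x = rand('NORMAL', mu[component], sigma[component]); /* draw from that normal  */
      output;
   end;
run;
```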
If the i_th ball was drawn k_i times, then the empirical probability for that ball is p_i = k_i / (5*1800).
You should create a SAS data set that has two columns: the ball number and the empirical probability.
You then want to draw a sample (without replacement) of size 5 with those (unequal) probabilities.
See the article "Four essential sampling methods in SAS" which gives the syntax for using PROC SURVEYSELECT or PROC IML to sample according to this scheme. See the upper right corner of the table in the article for the syntax.
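The steps above might be sketched as follows with PROC SURVEYSELECT. The History data set here is simulated purely as a stand-in for the 1800 real draws, and the 69-ball range, data set names, and seeds are assumptions. METHOD=PPS selects without replacement with inclusion probability proportional to the SIZE variable, and it requires that no ball's relative size exceed 1/n.

```sas
/* Hypothetical stand-in for the real history: 1800 draws of 5 balls from 69 */
data AllBalls;
   do Ball = 1 to 69;
      output;
   end;
run;

proc surveyselect data=AllBalls out=History noprint
                  method=srs n=5 reps=1800 seed=2024;
run;

/* k_i = number of times ball i appeared, in any position */
proc freq data=History noprint;
   tables Ball / out=Counts(keep=Ball Count);
run;

/* p_i = k_i / (5*1800): two columns, ball number and empirical probability */
data Balls;
   set Counts;
   p = Count / (5*1800);
   keep Ball p;
run;

/* 2000 simulated draws of 5 distinct balls with unequal probabilities */
proc surveyselect data=Balls out=Sim noprint
                  method=pps n=5 reps=2000 seed=12345;
   size p;                /* selection probability proportional to p */
run;
```

The output data set Sim has a Replicate variable that identifies each of the 2000 simulated draws.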
This is what is giving me a hard time. I have to break my original data set into subsamples based on which ball is being drawn to get the probabilities correct. Then I want to draw a sample of 5 balls, the first being based on the probabilities I have found for ball 1, the second being based on the probabilities of ball 2 etc.
I am sorry; I feel like an idiot for having such a hard time understanding this.
- Are you an undergraduate or graduate student?
- How experienced are you with SAS DATA step programming?
- How experienced are you at SAS/IML (PROC IML) programming?
2. Beginner
3. I have only seen SAS/IML in your blog
4. When is the project due?
If I may offer some friendly advice, I suggest
1. Talk to your advisor/professor. He/She wants you to succeed.
2. Consider changing to the model I proposed. In that model, you would count how many times each ball has EVER appeared (regardless of whether it was the first, second, ... or fifth ball). That is a standard probability model in which the probability of drawing each ball is constant and the draws are independent.
If you attempt the simpler project, it will still be challenging and you will still learn a lot about SAS programming and simulation. However, the simpler problem will be more tractable for someone with your level of experience.
Good luck!
Thank you a lot for the help.
How many records do you have from the lottery where the rules did not change? I am thinking of the PowerBall, where they have increased the number of balls in both the main set and the Power ball set. Since the total number of recorded results is so much smaller than the current 69-choose-5 possibilities, I would be very surprised if many of the individual numbers have selection rates near 1/69.
This is likely to be an interesting exercise.
I think the rule you state here, "1 has a 9 percent chance of being a 1 and a 0 percent chance of being a 45", arises because you are examining the ordered results reported in summaries.
If the balls in a lottery like the PowerBall are drawn in the order seen on TV, say 23, 7, 18, 2, 53, the summary reported in the data I have would be 2, 7, 18, 23, 53. So the 45 ball has a very small chance of being reported in the first position of the ordered tuple.
So the question is: are you concerned with combinations or permutations (without or with order, respectively)? The process you describe seems to be somewhat permutation-like but uses the probability of appearance in a combination.
I would suggest starting with a smaller problem, such as 10 balls and picking 2, where you can look at all of the possibilities and see the results more easily.
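As a rough illustration of that small case, the sketch below lists every sorted pair from 10 balls and tabulates which ball lands in each reported position. The data set and variable names are made up for this example.

```sas
/* Enumerate all 45 sorted pairs (first < second) from balls 1 to 10 */
data AllPairs;
   do first = 1 to 9;
      do second = first + 1 to 10;
         output;
      end;
   end;
run;

/* How often does each ball occupy each position of the sorted pair? */
proc freq data=AllPairs;
   tables first second;
run;
```

In this enumeration ball 1 fills the first position in 9 of the 45 pairs (20%), while ball 10 never does, even though every pair is equally likely. That is the ordering effect described above.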
Thanks