I have a dataset with multiple subjects, and each subject has multiple entries. I need to select a random subset of subjects and keep all entries of each selected subject in my output.
E.g., the 'subject' variable in the original dataset looks like:
1 1 1 1 2 2 2 3 3 4 4 4 4 5 5 5 5 5 5
If I need to get 2 random subjects and their corresponding entries in the output file, say subject 1 and 4, then the output file should have the 'subject' variable with
You can use the ranuni function to generate a random number and then use the POINT= option in a SET statement to read the observation with that number.
data random(keep=subject);
  sampsize=2;
  do i=1 to sampsize;
    pickit=ceil(ranuni(11111)*totobs);
    set work.table point=pickit nobs=totobs;
    output;
  end;
  stop;
run;
First reorganise your data in a way that you have one subject per observation.
The following code then creates a sample with a given number of observations, selects every observation with the same likelihood, and never selects the same observation twice. The code is as provided by SAS.
data work.rsubset(drop=obsleft sampsize);
if ranuni(0) lt sampsize / obsleft then do;
set sasuser.revenue point=pickit
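The excerpt above is incomplete as posted, so here is a rough sketch of how the whole approach could look for the original question: reduce the data to one row per subject, draw the subjects without replacement, then pull back every entry of the drawn subjects. This is only a sketch under assumptions, not the exact code from SAS. work.table and subject are the names used earlier in this thread; work.subjects, work.picked, work.random_subjects and the sample size of 2 are made up for illustration.

```sas
/* 1. One observation per subject.
      work.subjects is a hypothetical helper table. */
proc sort data=work.table(keep=subject) out=work.subjects nodupkey;
  by subject;
run;

/* 2. Walk through the subjects once and take each one with
      probability (subjects still needed) / (subjects still left).
      No subject can be taken twice, because every observation
      number is visited at most once. */
data work.picked(keep=subject);
  sampsize = 2;              /* number of subjects wanted (assumed) */
  obsleft  = totobs;         /* NOBS= is filled in at compile time  */
  do while (sampsize > 0 and obsleft > 0);
    pickit + 1;              /* sum statement: next observation number */
    if ranuni(0) < sampsize / obsleft then do;
      set work.subjects point=pickit nobs=totobs;
      output;
      sampsize = sampsize - 1;
    end;
    obsleft = obsleft - 1;
  end;
  stop;                      /* required with POINT=, otherwise the step loops forever */
run;

/* 3. Keep every entry of the selected subjects. */
proc sql;
  create table work.random_subjects as
  select t.*
  from work.table as t
  where t.subject in (select subject from work.picked);
quit;
```

With the example data from the question, work.random_subjects would hold all entries of the two drawn subjects. The extra obsleft > 0 check is my own addition so the loop stops cleanly if more subjects are requested than exist.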
Vasile: Your code can pick the same observation more than once (e.g. if ranuni returns 0.8... on one iteration and 0.7... on another, both values can map to the same observation number).