SAS Gurus,
Could you please help me populate an empty dataset with random data? For example:
data class;
set sashelp.class (obs=0);
run;
Since I have the empty dataset class, how can I populate it with 1000 random observations? Please don't recommend proc iml (it doesn't work for me and I don't want to download the University Edition of SAS).
Thanks a lot in advance.
How does the source data come into play? Do you want 1000 records that are repeat entries of CLASS, or something else? Can you show a small example of what your input data looks like and what you expect as output?
The ideal situation is to have a mix (some repeats and some non-repeats), but if that isn't possible then all observations could be different.
Thanks Reeza
What do you mean by "random data"? Do you mean to randomly sample 1,000 draws from sashelp.class (which only has 19 observations)? If so, then you will certainly get repeats.
Or, I guess there are 58,140 possible combinations of the values present in sashelp.class (19 NAMEs * 2 SEXs * 6 AGEs * 17 HEIGHTs * 15 WEIGHTs). Do you want to sample without replacement from those 58,140?
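If the first interpretation is the one wanted (1,000 draws with replacement from the 19 rows), a minimal data step sketch could look like the following. It avoids proc iml; the output name WANT, the seed, and the helper variables _i and _p are arbitrary choices for illustration, not something specified in this thread.

```sas
/* Sketch: 1,000 draws with replacement from sashelp.class */
data want;
  call streaminit(12345);                  /* arbitrary seed, for repeatability */
  do _i = 1 to 1000;
    _p = ceil(nobs * rand('uniform'));     /* random row number in 1..nobs */
    set sashelp.class point=_p nobs=nobs;  /* read that row by direct access */
    output;
  end;
  stop;                                    /* needed because POINT= never hits end of file */
  drop _i;
run;
```

With only 19 source rows, every row will appear many times in the 1,000 observations.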
Alphabetic List of Variables and Attributes
#  Variable  Type  Len
3  Age       Num   8
4  Height    Num   8
1  Name      Char  8
2  Sex       Char  1
5  Weight    Num   8
The sashelp.class data you used as an example has a combination of numeric and character fields. Please define what you mean by populating random values for each type.
A few examples:
Do you have limits for the numeric data (for example, values between 100 and 220 for weight)?
Do you have a rule for generating the character data (can Sex be any value from A to Z, or only M or F)?
Sorry guys for the hazy explanation. Yes, I want weights between 50 and 300 lbs, sex either M or F, age between 18 and 100, any name (no specification), and height between 2 and 7 feet.
Thanks
You want a sample with no duplicates, taken from an array with values ranging as you specified. Think of the array as 4-dimensional.
First dimension (sex) lower bound 1, upper bound 2
2nd dimension (age) 18:100
3rd dimension (height) 24:84 (in inches)
4th dimension (weight) 50:300
That's a total of 2,541,626 elements.
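So, after initializing the array to all zeros, the program below repeats the following 100 times: draw random values until they land on a cell that is still 0, mark that cell as 1, and output the observation.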
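data want (drop=_:);
  if 0 then set sashelp.class;   /* never executes; only copies the variable attributes */
  call streaminit(102985688);    /* fixed seed so the sample is reproducible */
  /* one cell per sex/age/height/weight combination; 0 = not yet sampled */
  array smpl {2,18:100,24:84,50:300} _temporary_ (2541626*0);
  do _n_=1 to 100;
    name='NAME_' || put(_n_,z3.);
    /* redraw until the combination has not been used before */
    do until (smpl{_sx,age,height,weight}=0);
      _sx=rand('integer',1,2);
      age=rand('integer',18,100);
      height=rand('integer',24,84);
      weight=rand('integer',50,300);
    end;
    smpl{_sx,age,height,weight}=1;   /* mark the combination as used */
    sex=char('MF',_sx);              /* 1 -> M, 2 -> F */
    output;
  end;
  stop;
run;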
Note that the definition of the SMPL array has 4 dimensions. But instead of just specifying the SIZE of each dimension, it specifies the range. That makes it easy to generate a random number for age between 18 and 100 and use it directly as a subscript of the SMPL array.
Thanks mkeintz . Could you please explain how you got this element number (2,541,626)? Below is the error I am getting.
Here is the Log:
@buddha_d wrote:
Thanks mkeintz . Could you please explain how you got this element number (2,541,626)? Below is the error I am getting.
The value arrays include
2 sexes
83 ages (18:100)
61 heights (24:84) - in inches
251 weights (50:300)
Although the layout is treated as a 4-dimensional array (read "matrix") the total number of elements is the product: 2,541,626. So the array statement declared the size and upper/lower bounds for each of the 4 dimensions, and then told sas to initialize all 2,541,626 cells to zero.
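The count can also be checked in SAS itself. The short step below is only a sketch: it declares the same array and multiplies the size of each dimension as reported by the DIM function. The variable names n_sex, n_age, n_height, n_weight and total are made up for this illustration.

```sas
data _null_;
  array smpl {2,18:100,24:84,50:300} _temporary_;
  n_sex    = dim(smpl,1);   /* 2 */
  n_age    = dim(smpl,2);   /* 100 - 18 + 1 = 83 */
  n_height = dim(smpl,3);   /* 84 - 24 + 1 = 61 */
  n_weight = dim(smpl,4);   /* 300 - 50 + 1 = 251 */
  total    = n_sex * n_age * n_height * n_weight;
  put n_sex= n_age= n_height= n_weight= total= comma12.;
run;
```

The log should show a total of 2,541,626, which is the repetition count used in the initial-value list (2541626*0).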
As to the error messages, I suppose your version of sas (mine is 9.4 TS1M5) hasn't added the "integer" distribution for the RAND function. But you do have the "uniform" distribution. So instead of:
_sx=rand('integer',1,2);
age=rand('integer',18,100);
height=rand('integer',24,84);
weight=rand('integer',50,300);
use
_sx=ceil(rand('uniform',0,2));
age=ceil(rand('uniform',17,100));
height=ceil(rand('uniform',23,84));
weight=ceil(rand('uniform',49,300));
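If your release also complains about the bounds passed to the 'uniform' distribution, another way to write the same draws is sketched below. It assumes only that rand('uniform') with no extra arguments returns a value strictly between 0 and 1, and it scales that value to each integer range by hand. The loop counter i and the five test draws are just for checking the ranges in the log.

```sas
data _null_;
  call streaminit(102985688);
  do i = 1 to 5;
    /* low + floor(count * u) yields an integer in low .. low+count-1 */
    _sx    = 1  + floor(2   * rand('uniform'));   /* 1..2    */
    age    = 18 + floor(83  * rand('uniform'));   /* 18..100 */
    height = 24 + floor(61  * rand('uniform'));   /* 24..84  */
    weight = 50 + floor(251 * rand('uniform'));   /* 50..300 */
    put _sx= age= height= weight=;
  end;
run;
```

The multipliers 2, 83, 61 and 251 are the same dimension sizes as in the SMPL array, so the four assignment statements can replace the ones inside the DO UNTIL loop.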