Apply distribution to dataset

New Contributor
Posts: 3

I have a dataset with a column in which some values are filled in and some are blank. I have an expected distribution for these values (e.g., '01' 10%, '02' 8%, etc.). I need to fill in the blank values so that the distribution of this column matches the expected distribution, while keeping the values that are already filled in as they are. How can I do this?

Thanks,

Amarpal

Grand Advisor
Posts: 17,325

Re: Apply distribution to dataset

You can use the RAND function with the "Table" distribution to randomly assign the values.

http://blogs.sas.com/content/iml/2011/07/13/simulate-categorical-data-in-sas.html
New Contributor
Posts: 3

Re: Apply distribution to dataset

How can I use this while keeping values that are already filled out? I have a dataset like this:

row    _2_digit_code
1      50
2
3
4      55
5
6      80
etc.

This table has 53k rows.

The distribution table looks like this:

_2_digit_code    freq
01               10%
02                8%
etc.

SAS Super FREQ
Posts: 3,406

Re: Apply distribution to dataset

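From your example, it looks like _2_digit_code is a character variable. Using the notation from the link that Reeza provided, let PROB be an array of probabilities and let VALUE be a character array of the corresponding codes. Then draw a random index only for the rows where the code is blank, which leaves the existing values untouched:

if _2_digit_code = " " then do;     /* missing value */
   k = rand("Table", of prob[*]);   /* random index 1..n, chosen with the given probabilities */
   _2_digit_code = value[k];        /* look up the code for that index */
end;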
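
For context, here is a minimal sketch of how that fragment might sit inside a complete DATA step. The data set names HAVE and WANT, the seed, and the three codes with their probabilities are placeholders that I made up for illustration; substitute the codes and proportions from your own distribution table.

```sas
data want;
   set have;

   /* Hypothetical distribution: replace with your own codes and proportions. */
   /* The probabilities should sum to 1.                                      */
   array prob[3]     _temporary_ (0.5 0.2 0.3);
   array value[3] $2 _temporary_ ('01' '02' '03');

   /* Set the random number seed once (arbitrary value) for reproducible results */
   if _n_ = 1 then call streaminit(12345);

   /* Only blank rows are assigned; filled-in rows pass through unchanged */
   if _2_digit_code = " " then do;
      k = rand("Table", of prob[*]);   /* random index into the arrays */
      _2_digit_code = value[k];
   end;

   drop k;
run;
```

Two caveats. If the probabilities sum to less than 1, RAND("Table") can return an index one past the end of the array, so check the sum before running. Also, this draws only the blank rows from the expected distribution. The column as a whole will match that distribution only to the extent that the values already filled in follow it as well.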