11-30-2015 10:42 AM
I have a dataset with a column in which some values are filled in and some are blank. I have an expected distribution for these values (e.g. '01' 10%, '02' 8%, etc.). I need to apply this distribution to the dataset so that the distribution of this column matches the expected distribution, while keeping the values that are already filled in as they are. How can I do this?
11-30-2015 01:10 PM
How can I use this while keeping values that are already filled out? I have a dataset like...
This table has 53k rows.
Then the distribution table is like
11-30-2015 01:19 PM - edited 11-30-2015 01:20 PM
From your example, it looks like _2_digit_code is a character variable. Using the notation from the link that Reeza provided, let PROB be an array of probabilities and let VALUES be a (character) array of the values to use. Then:
if _2_digit_code = " " then do;   /* missing value */
   k = rand("Table", of prob[*]);
   _2_digit_code = values[k];
end;
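For context, here is a minimal sketch of how that fragment might sit inside a complete DATA step. The dataset names HAVE and WANT, the seed, and the catch-all code '99' with its probability are hypothetical placeholders; only '01' at 10% and '02' at 8% come from the question, and the real arrays would be filled from the distribution table.

```sas
/* Sketch only: HAVE, WANT, the seed, and the '99' entry are assumed names/values */
data want;
   set have;

   /* Expected distribution; the probabilities should sum to 1 */
   array prob[3] _temporary_ (0.10 0.08 0.82);
   array values[3] $2 _temporary_ ('01' '02' '99');

   /* Fix the random number stream so the result is reproducible */
   if _n_ = 1 then call streaminit(12345);

   /* Fill in only the blank rows; existing values are left untouched */
   if missing(_2_digit_code) then do;
      k = rand("Table", of prob[*]);   /* index 1..3 drawn with probabilities PROB */
      _2_digit_code = values[k];
   end;

   drop k;
run;
```

Note that this draws the blank rows from the expected distribution itself. The full column will match that distribution only to the extent that the values already filled in follow it too; if they do not, the probabilities used for the blank rows would need to be adjusted for the counts already present.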