04-11-2018 07:31 AM

Folks,

I wonder if anyone could provide some guidance on the following query, please?

I have a dataset for 10,000 households on income.

I would like to see the impact that an increase in income of between 500 and 1000 for a random 200 households would have on headline figures such as the percentage of the sample at risk of poverty.

Would anyone know the best way to write a command to increase a variable by between 500 and 1000 for a random 200 observations?

Any help would be most welcome.

Kind regards,

Sean

Posted in reply to Sean_OConnor

04-11-2018 08:16 AM

Well, you can use the random number generator:

https://blogs.sas.com/content/iml/2011/08/24/how-to-generate-random-numbers-in-sas.html

Something like:

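data A;
  call streaminit(123); /* set random number seed */
  do i = 1 to 200;
    u = ceil(rand("Uniform") * 10000); /* whole observation number from 1 to 10000 */
    output;
  end;
run;

This would give you 200 random observation numbers, though some may repeat, so sort A by u and drop duplicates before merging. Merge this with your data by u, where u holds the observation number (_n_ is only an automatic variable, so you will need to store it in an actual variable), then apply an if. Note that this is just pseudocode; I haven't time to test anything right now: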

data want;
  merge have (in=a) a (in=b);
  by u;
  if b then do;
    /* set addition here */
  end;
run;
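
If SAS/STAT is licensed, a different route to the same result is to let PROC SURVEYSELECT pick the rows, which should give exactly 200 distinct households without the merge. A minimal untested sketch, where the dataset have, the variables hh_id and income, the dataset flagged, and the seeds are all made-up placeholders:

```sas
/* Hypothetical stand-in for the real data: 10,000 households with an income variable */
data have;
  call streaminit(1);
  do hh_id = 1 to 10000;
    income = round(20000 + 15000 * rand("Uniform"));
    output;
  end;
run;

/* Simple random sample of 200 rows without replacement.
   OUTALL keeps every row and adds a 0/1 flag named Selected. */
proc surveyselect data=have out=flagged method=srs sampsize=200
                  seed=123 outall noprint;
run;

data want;
  set flagged;
  if _n_ = 1 then call streaminit(456); /* seed for the size of the increase */
  if Selected = 1 then
    income = income + 500 + 500 * rand("Uniform"); /* uniform increase between 500 and 1000 */
  drop Selected;
run;
```

Changing either seed gives a different set of households or different increases, so it is worth rerunning with a few seeds to see how much the headline figures move.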

Posted in reply to Sean_OConnor

04-11-2018 08:29 AM

This can be done in 1 step, but the statistical theory proving that this is in fact a random selection is complex.

data want;
  set have nobs=denominator;
  retain numerator 200;
  if ranuni(12345) < numerator / denominator then do;
    value = value + 500; /* how do you know how much to add?? */
    numerator = numerator - 1;
  end;
  denominator = denominator - 1;
run;
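
To get a feel for why every row has the same chance: the first row is taken with probability 200/10000, and the second with probability (200/10000)(199/9999) + (9800/10000)(200/9999), which also works out to 200/10000, and so on down the file. Below is a rough, untested sketch of the same one-step idea with a random increase between 500 and 1000 and a flag to check the count. The stand-in dataset and the names total, needed, remaining and picked are assumptions for illustration only, and the uniform increase is just one possible choice:

```sas
/* Hypothetical stand-in for the real data: 10,000 rows with a variable named value */
data have;
  call streaminit(1);
  do id = 1 to 10000;
    value = round(20000 + 15000 * rand("Uniform"));
    output;
  end;
run;

data want;
  set have nobs=total;
  retain needed 200 remaining;      /* needed = rows still to pick */
  if _n_ = 1 then do;
    remaining = total;              /* remaining = rows not yet examined */
    call streaminit(12345);
  end;
  if rand("Uniform") < needed / remaining then do;
    value = value + 500 + 500 * rand("Uniform"); /* increase between 500 and 1000 */
    picked = 1;
    needed = needed - 1;
  end;
  else picked = 0;
  remaining = remaining - 1;
  drop needed remaining;
run;

/* Check how many rows were changed */
proc freq data=want;
  tables picked;
run;
```

Once the rows still needed equal the rows left, the ratio reaches 1 and every remaining row is taken, so the PROC FREQ table should show 200 rows with picked = 1.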

Posted in reply to Sean_OConnor

04-11-2018 09:44 AM

Rick Wicklin discusses using random number generators in a number of places. Here's an SGF paper that discusses the topic.

Tom

https://www.lexjansen.com/scsug/2016/SAS-Ten-Tips.pdf