Folks,
I wonder could anyone provide some guidance for the following query, please?
I have a dataset on income for 10,000 households.
I would like to see what impact an increase in income of between 500 and 1000 for 200 randomly chosen households would have on headline figures such as the percentage of the sample at risk of poverty.
Would anyone know the best way to write a command to increase a variable by between 500 and 1000 for a random 200 observations?
Any help would be most welcome.
Kind regards,
Sean
Well, you can use the random number generator:
https://blogs.sas.com/content/iml/2011/08/24/how-to-generate-random-numbers-in-sas.html
Something like:
data A;
call streaminit(123); /* set random number seed */
do i = 1 to 200;
u = ceil(rand("Uniform") * 10000); /* whole row number from 1 to 10000 */
output;
end;
run;
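This would give you 200 random observation numbers, although duplicates are possible, so you may end up with slightly fewer than 200 distinct rows. Merge this with your data on u = _n_ (the row number will need to be stored in an actual variable), then apply an if. Note that this is just pseudocode; I haven't time to test anything right now: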
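proc sort data=a nodupkey; by u; run; /* merge needs sorted, unique keys */
data have_numbered; /* illustrative name: HAVE plus a row-number variable */
set have;
u = _n_;
run;
data want;
merge have_numbered (in=a) a (in=b keep=u);
by u;
if b then do;
/* set addition here */
end;
run;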
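If SAS/STAT is licensed, PROC SURVEYSELECT is another way to get exactly 200 distinct households without building the row numbers by hand. This is only a sketch of a different technique from the merge above, and it is untested against your data: the variable name income, the dataset name flagged, and both seeds are assumptions.

```sas
/* Flag a simple random sample of 200 rows; OUTALL keeps every row */
/* and adds a 0/1 variable named Selected.                         */
proc surveyselect data=have out=flagged method=srs n=200
                  seed=123 outall noprint;
run;

data want;
  set flagged;
  if _n_ = 1 then call streaminit(456); /* assumed seed for the increments */
  /* income is an assumed variable name; adds between 500 and 1000 */
  if Selected then income = income + 500 + 500 * rand("Uniform");
run;
```

Because Selected stays in the output, you can compare the headline figures for the changed and unchanged households afterwards.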
This can be done in 1 step, but the statistical theory proving that this is in fact a random selection is complex.
data want;
set have nobs=denominator;
retain numerator 200;
if ranuni(12345) < numerator / denominator then do;
value = value + 500; /* how do you know how much to add?? */
numerator = numerator - 1;
end;
denominator = denominator - 1;
run;
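To answer the "how much to add" comment with the 500-1000 range from the original question, the same one-step idea could draw a uniform increment for each selected row. A minimal sketch, assuming the variable is called income and that a uniform draw over the range is what is wanted (the names needed, remaining and selected are made up for illustration):

```sas
data want;
  set have nobs=remaining;   /* remaining starts at the row count of HAVE */
  retain needed 200;         /* households still to be picked */
  if _n_ = 1 then call streaminit(12345);
  selected = 0;
  /* pick this row with probability needed / remaining */
  if rand("Uniform") < needed / remaining then do;
    income = income + 500 + 500 * rand("Uniform"); /* adds between 500 and 1000 */
    selected = 1;
    needed = needed - 1;
  end;
  remaining = remaining - 1;
  drop needed;
run;
```

The selected flag is kept so you can check that exactly 200 rows were changed, for example with PROC FREQ on selected.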
Rick Wicklin discusses using random number generators in a number of places. Here's an SGF paper that discusses the topic.
Tom
https://www.lexjansen.com/scsug/2016/SAS-Ten-Tips.pdf