Folks,
I wonder could anyone provide some guidance for the following query, please?
I have a dataset for 10,000 households on income.
I would like to see what impact an increase in income of between 500 and 1,000 for 200 randomly chosen households would have on headline figures such as the percentage of the sample at risk of poverty.
Would anyone know the best way to write a command that increases a variable by between 500 and 1,000 for 200 random observations?
Any help would be most welcome.
Kind regards,
Sean
Well, you can use the random number generator:
https://blogs.sas.com/content/iml/2011/08/24/how-to-generate-random-numbers-in-sas.html
Something like:
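data A;
  call streaminit(123); /* set random number seed */
  do i = 1 to 200;
    u = ceil(rand("Uniform") * 10000); /* whole observation number, 1 to 10000 */
    output;
  end;
run;
This would give you 200 random observation numbers. The draws are independent, so the same number can come up more than once and you may end up with slightly fewer than 200 distinct households. Merge this onto your data by observation number (you will need an actual variable for that, e.g. u = _n_ created in a prior step), then apply an if. Note that this is just pseudocode; I haven't time to test anything right now: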
/* both datasets must contain u and be sorted by u */
data want;
  merge have (in=a) a (in=b);
  by u;
  if b then do;
    /* set addition here */
  end;
run;
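For reference, the two steps above could be put together roughly as follows. This is an untested sketch, not a definitive solution: the dataset name have, the variable income, and the helper names have_numbered, picks and u are all assumptions on my part. PROC SORT with NODUPKEY drops repeated draws, so slightly fewer than 200 households may end up selected.

```sas
/* Add an observation number to the household data (assumed dataset: have) */
data have_numbered;
  set have;
  u = _n_;
run;

/* Draw 200 random observation numbers between 1 and 10000 */
data picks;
  call streaminit(123);
  do i = 1 to 200;
    u = ceil(rand("Uniform") * 10000);
    output;
  end;
  drop i;
run;

/* Sort the draws and remove duplicates so the merge is one-to-one */
proc sort data=picks nodupkey;
  by u;
run;

/* Add a random amount between 500 and 1000 to income (assumed variable) */
data want;
  if _n_ = 1 then call streaminit(456);
  merge have_numbered (in=inhave) picks (in=picked);
  by u;
  if inhave;
  if picked then income = income + 500 + 500 * rand("Uniform");
  drop u;
run;
```

have_numbered is already in order of u because u is just the row number, so only picks needs sorting.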
This can be done in 1 step, but the statistical theory proving that this is in fact a random selection is complex.
data want;
  set have nobs=denominator;
  retain numerator 200;
  if ranuni(12345) < numerator / denominator then do;
    value = value + 500; /* how do you know how much to add?? */
    numerator = numerator - 1;
  end;
  denominator = denominator - 1;
run;
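If the amount to add should itself be random, as in the original question, a variation on the one-step approach might look like the sketch below. It is untested, and the names income, total, k and n are assumptions for illustration. The row count is copied into a separate retained counter before it is decremented, and the added amount is drawn uniformly between 500 and 1000. Provided the dataset has at least 200 rows, exactly 200 are changed: once the rows left to read equal the rows still to pick, the ratio is 1 and every remaining row is taken, and once k reaches 0 nothing further is taken.

```sas
data want;
  set have nobs=total;
  retain k 200 n;   /* k = households still to pick, n = rows still to read */
  if _n_ = 1 then do;
    n = total;
    call streaminit(12345);
  end;
  /* pick this row with probability k / n */
  if rand("Uniform") < k / n then do;
    income = income + 500 + 500 * rand("Uniform"); /* add 500 to 1000 */
    k = k - 1;
  end;
  n = n - 1;
  drop k n;
run;
```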
Rick Wicklin discusses using random number generators in a number of places. Here's an SGF paper that discusses the topic.
Tom
https://www.lexjansen.com/scsug/2016/SAS-Ten-Tips.pdf