I want to create a synthetic dataset that is representative of a population by age, sex, region and education. The original table looks like this (see attached for the full file):
agegr | edu | SEX | region | Population
0 | e1 | 0 | AD_rural | 2180000
0 | e1 | 0 | AD_urban | 1084307
0 | e1 | 0 | AN_rural | 9476
0 | e1 | 0 | AN_urban | 5178
0 | e1 | 0 | AR_rural | 58663
0 | e1 | 0 | AR_urban | 13887
… | … | … | … | …
100 | E6 | 1 | WB_rural | 23
In the synthetic dataset, the weight of each individual should not exceed 10,000. This means that I should create 218 individuals with the first set of variables (agegr=0, edu=e1, SEX=0, region=AD_rural), each with a weight of 10,000.
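To make the arithmetic concrete, here is a small sketch of how many individuals a cell needs under the 10,000 cap. It assumes that a cell whose population is not an exact multiple of 10,000 gets one extra individual carrying the remainder; that handling, and the names cap, n_full, remainder and n_rows, are illustrative assumptions rather than part of the original request.

```sas
data _null_;
  cap = 10000;                              /* maximum weight per individual */
  do population = 2180000, 1084307;         /* two cells from the table above */
    n_full    = floor(population / cap);    /* individuals at the full cap */
    remainder = mod(population, cap);       /* population left over, if any */
    n_rows    = n_full + (remainder > 0);   /* one extra individual for the remainder */
    put population= n_full= remainder= n_rows=;
  end;
run;
```

For AD_rural this gives 218 individuals with no remainder; for AD_urban it gives 108 individuals at the cap plus one with weight 4307, so 109 in total.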
Thank you
Does this do it?
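data have;
  input agegr edu $ SEX :$1. region :$15. Population;
  datalines;
0 e1 0 AD_rural 2180000
0 e1 0 AD_urban 1084307
0 e1 0 AN_rural 9476
0 e1 0 AN_urban 5178
0 e1 0 AR_rural 58663
0 e1 0 AR_urban 13887
;

data want;
  set have;
  /* first individual: weight capped at 10,000 */
  weight = min(10000, population);
  gross_weight = weight;              /* running total of weight assigned so far */
  do while (gross_weight < population);
    output;                           /* write the individual just weighted */
    /* next individual takes the cap or whatever population is left */
    weight = min(10000, population - gross_weight);
    gross_weight = gross_weight + weight;
  end;
  output;                             /* write the last individual of the cell */
  keep agegr edu SEX region weight;
run;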
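One way to check the result is to confirm that the weights in each cell add back up to the original Population and that no weight exceeds 10,000. This is a minimal sketch, assuming that agegr, edu, SEX and region together identify a single row of have; the column aliases n_rows, total_weight and max_weight are illustrative names only.

```sas
proc sql;
  select h.agegr, h.edu, h.SEX, h.region, h.Population,
         count(*)      as n_rows,        /* synthetic individuals in the cell */
         sum(w.weight) as total_weight,  /* should equal Population */
         max(w.weight) as max_weight     /* should not exceed 10000 */
  from have as h
       inner join want as w
         on  h.agegr  = w.agegr
         and h.edu    = w.edu
         and h.SEX    = w.SEX
         and h.region = w.region
  group by h.agegr, h.edu, h.SEX, h.region, h.Population;
quit;
```

Any row where total_weight differs from Population, or where max_weight is above 10000, would point to a problem in the expansion step.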
I don't see a weight variable?