I'm trying to use census (aggregated) data to simulate the population of a city (~800,000 people). The desired end result is a separate record for each individual with their simulated gender, age, and family structure. I've managed the gender and age part, but I'm running into difficulties with the family structure part.
As a simple example, I know from the census that, in a small defined area, there are 'm' married couples. I have simulated their ages, and now I want to assign spouses. So I've randomly selected 'm' males aged 18 and over to represent my married men. Now, from the females, I need to randomly select appropriate matching spouses. By appropriate, I mean that the matches need to follow this distribution of age differences:
- 50% of the female spouses are either the same age as the male or up to a maximum of 5 years younger
- 26% of the female spouses are older than their male spouses, by a maximum of 10 years (higher probability to be closer in age than farther)
- 24% of the female spouses are more than five and up to twelve years younger than the male, again with a simple inverse linear relationship (i.e. higher probability to be closer in age than farther). A uniform distribution would be simplest and is also okay; I would just state it under "assumptions".
- The overall mean age difference is 2.5 years (males older)
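To make the rules above concrete, here is a minimal sketch of them written as a probability table over the age gap (husband's age minus wife's age). Several things in it are assumptions rather than census facts: the function name `age_gap_pmf` is made up, the 50% band is taken as uniform over gaps 0 to 5, the "older wife" band runs from 1 to 10 years, the "younger wife" band runs from 6 to 12 years so it does not overlap the first band, and "inverse linear" is read as weights that drop by one for each extra year of gap.

```python
def age_gap_pmf():
    """Return {gap: probability}, where gap = husband's age - wife's age.

    The band shapes are illustrative assumptions, not census values.
    """
    pmf = {}

    # 50%: wife the same age or up to 5 years younger (assumed uniform, gaps 0..5)
    for gap in range(0, 6):
        pmf[gap] = 0.50 / 6

    # 26%: wife older by 1..10 years; weight 10 for 1 year down to 1 for 10 years
    older = {-k: 11 - k for k in range(1, 11)}
    older_total = sum(older.values())
    for gap, weight in older.items():
        pmf[gap] = 0.26 * weight / older_total

    # 24%: wife 6..12 years younger; weight 7 for 6 years down to 1 for 12 years
    younger = {gap: 13 - gap for gap in range(6, 13)}
    younger_total = sum(younger.values())
    for gap, weight in younger.items():
        pmf[gap] = 0.24 * weight / younger_total

    return pmf


PMF = age_gap_pmf()
total_probability = sum(PMF.values())
mean_gap = sum(gap * p for gap, p in PMF.items())
print(round(total_probability, 6), round(mean_gap, 2))
```

Under these particular shapes the mean gap works out to about 2.13 years, a little short of the 2.5 target, so the weights inside the bands would probably need some adjusting to hit all four conditions at once.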
Any thoughts, suggestions, code, solutions... anything at all would be much appreciated. I'm sure this will just keep getting more complicated, but one step at a time!
Thank you for the post. I think I need to clarify, though. I'm not trying to simulate the spouse ages from scratch based on the requirements. Rather, I have two tables: one with all the simulated males, and one with all the simulated females. Now I need to match records across the two tables within those rules. I hope that makes sense?
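To illustrate the kind of matching meant here, below is a rough sketch of one possible greedy approach, not a definitive solution. For each married man (in random order) it looks at the ages still available among the unmatched women, weights each age by the probability of the resulting age gap, and draws one. Everything named here is hypothetical: `gap_pmf`, `match_spouses`, and the `(person_id, age)` tuple layout stand in for the real tables, and the gap probabilities use the same assumed band shapes as described in the rules (uniform for the 50% band, linearly falling weights for the other two). Because the pool of women shrinks as pairs are made, the realised gap distribution will only approximate the target.

```python
import random
from collections import defaultdict


def gap_pmf():
    """Assumed probabilities for gap = husband's age - wife's age."""
    pmf = {gap: 0.50 / 6 for gap in range(0, 6)}                      # 0..5 years younger
    pmf.update({-k: 0.26 * (11 - k) / 55 for k in range(1, 11)})      # wife 1..10 years older
    pmf.update({gap: 0.24 * (13 - gap) / 28 for gap in range(6, 13)}) # 6..12 years younger
    return pmf


def match_spouses(males, females, pmf, rng):
    """Greedily pair males with females; both are lists of (person_id, age).

    Returns a list of (male_id, female_id) pairs.
    """
    # Group unmatched females by age so each draw is over ages, not individuals.
    pool = defaultdict(list)
    for female_id, age in females:
        pool[age].append(female_id)
    for ids in pool.values():
        rng.shuffle(ids)

    order = list(males)
    rng.shuffle(order)  # avoid always serving the same men first

    pairs = []
    for male_id, male_age in order:
        ages = [age for age, ids in pool.items() if ids]
        if not ages:
            break  # no unmatched females left
        weights = [pmf.get(male_age - age, 0.0) for age in ages]
        if sum(weights) > 0:
            wife_age = rng.choices(ages, weights=weights, k=1)[0]
        else:
            # Fallback (an assumption): nobody fits the rules, take the closest age.
            wife_age = min(ages, key=lambda age: abs(male_age - age))
        pairs.append((male_id, pool[wife_age].pop()))
    return pairs


# Tiny made-up example for one small area.
rng = random.Random(42)
married_men = [(1, 34), (2, 51), (3, 27)]
married_women = [(101, 30), (102, 49), (103, 29)]
print(match_spouses(married_men, married_women, gap_pmf(), rng))
```

Running this separately per small area keeps each pool small. It would be worth tabulating the realised age gaps afterwards to see how far they drift from the 50/26/24 split and the 2.5-year mean.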