I have a national dataset of fathers and their kids, and I want to create a variable FAMILY that identifies the members of the same family. For example, if the POSTALCODE, CITY, STATE, and LAST NAME values are the same, then FAMILY = j (where j runs from 1 to the number of families in my dataset). I also want to order the members of each family from oldest to youngest. For example, FAMILY = 1 and FAMILY_ORDER = 1 for the father, FAMILY_ORDER = 2 for the oldest kid, FAMILY_ORDER = 3 for the next kid, and so on until all members of the family are covered. How can I create these two variables?
Thanks
It would be helpful if you could show us a portion of this data set (perhaps with made up family names). Can we assume that all the people in one family are consecutive in the data set, or are they scattered throughout the data set?
Thanks, @PaigeMiller for your thoughts. The family members are scattered in the data.
So if there are multiple families named SMITH living in the same postalcode, how could we tell them apart?
Good question @PaigeMiller . There is more specific information, phone number (the father reported his phone number for the kids), email address of the father for all family members, and race.
Without seeing your data, here's a guess:
1. Sort by Family Name, Phone Number and descending age
2. Each time the phone number changes, add 1 to the family number
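The two steps above could be sketched roughly as follows. This is only a guess at the shape of the data: the data set name PEOPLE and the variable names LAST_NAME, PHONE, and AGE are assumptions, not names from the original post, and the sketch assumes the father is always the oldest member of the family.

```sas
/* Assumed input: WORK.PEOPLE with LAST_NAME, PHONE, and AGE (hypothetical names) */

/* Step 1: sort by family name, phone number, and descending age */
proc sort data=people out=people_sorted;
    by last_name phone descending age;
run;

/* Step 2: start a new family whenever the name/phone combination changes */
data families;
    set people_sorted;
    by last_name phone;
    if first.phone then do;
        family + 1;          /* sum statement: FAMILY is retained across rows */
        family_order = 0;    /* restart the within-family counter */
    end;
    family_order + 1;        /* 1 = oldest member, 2 = next oldest, ... */
run;
```

Because FIRST.PHONE is set whenever LAST_NAME or PHONE changes, two families that share a last name but report different phone numbers get different FAMILY values. Ties in AGE (twins, for example) are ordered arbitrarily unless another sort key is added.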
@Emma_at_SAS wrote:
Good question @PaigeMiller . There is more specific information, phone number (the father reported his phone number for the kids), email address of the father for all family members, and race.
Be very careful assuming phone numbers or email addresses do not change.
Another potential headache with father/child data: if the father is attempting to skip out on support, name changes might pop up depending on the data sources. "Robert" may become "Bob" or "Bobbie" or similar, or he may switch to using a middle name instead of his first.
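One way to get a feel for how often this happens is to look for father email addresses that appear with more than one phone number before using either as a family key. A minimal sketch, again assuming a hypothetical data set PEOPLE with variables EMAIL and PHONE:

```sas
/* Assumed input: WORK.PEOPLE with EMAIL and PHONE (hypothetical names) */
proc sql;
    create table email_phone_conflicts as
    select email,
           count(distinct phone) as n_phones
    from people
    where email is not missing
    group by email
    having count(distinct phone) > 1;   /* same email, several phone numbers */
quit;
```

Any rows in EMAIL_PHONE_CONFLICTS are candidates for manual review. The check does not catch name variants such as "Robert" versus "Bob", which would need separate handling.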