Creating an indicator from information in a cluster

Occasional Contributor
Posts: 14

I would like to create a variable called 'household composition' using survey data. It is based on four indicators:

1) household ID (HID)

2) father ID (FID)

3) mother ID (MID)

4) spouse ID (SID)

Here is an example of the data.  PID is participant ID. 

PID   HID   FID   MID   SID
101   1     .     .     102
102   1     .     .     101
103   1     101   102   .
201   2     .     .     .
202   2     201   .     .
301   3     .     .     302
302   3     .     .     301
401   4     .     .     .
501   5     .     .     502
502   5     .     .     501

I would like to say:

- the household composition of household 1 (hid=1) is a two parent family [mom is 102, dad is 101, child is 103]

- the household composition of household 2 (hid=2) is a single parent family [dad is 201, kid is 202]

- the household composition of household 3 (hid=3) is a couple (301 and 302 are together, no kids). Household 5 is also a couple.

- the household composition of household 4 (hid=4) is single person (id=401)

How do I do this and keep data in a long format?

Thanks!!!
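
For anyone who wants to try the answers in this thread, here is a minimal sketch that reads the sample rows above into a SAS dataset. The dataset name have is borrowed from the reply further down; list input with a period for missing values is assumed to match how the real survey file is laid out.

```sas
/* Sketch: load the sample rows from the question into a dataset named HAVE. */
/* A single period is read as a missing numeric value by list input.         */
data have;
    input pid hid fid mid sid;
    datalines;
101 1 . . 102
102 1 . . 101
103 1 101 102 .
201 2 . . .
202 2 201 . .
301 3 . . 302
302 3 . . 301
401 4 . . .
501 5 . . 502
502 5 . . 501
;
run;
```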

Super User
Posts: 11,343

Will you need this code to be applied to all records of each household or only one? If the former, there will be one step to summarize the data and a second to merge it back into the original.

What should your composition variable look like? A single character or digit? Do you need to distinguish between single parent families headed by a father and those headed by a mother, or is just "single parent" enough?

It appears that father IDs always end in 1, mother IDs end in 2, and anything else is a child. Is that correct?

And I wouldn't want to use survey software that outputs data that way if the household is a single response to a survey...

Occasional Contributor
Posts: 14

The code needs to be applied to all individuals in the house, not just the household itself.

The composition variable will be a digit; it does not matter if a single parent household is headed by a father or a mother, just that it's a single parent household.

Unfortunately, fathers' codes do not always end in 01. 

Super User
Posts: 11,343

So how do you tell that a PID is a parent and not a child?

Super User
Posts: 19,789

Is the data sorted this way?

Occasional Contributor
Posts: 14

Yes, the data are sorted as I wrote them. Thanks!

Super User
Posts: 19,789

You'll have to merge this back in with the original data.

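data want;
    length composition $20;
    set have;
    by hid notsorted;
    retain flag_child flag_married;

    /* Reset the household-level flags at the start of each household */
    if first.hid then do;
        call missing(flag_child, flag_married);
    end;

    /* A record with a father or mother ID means the household has a child */
    if (not missing(fid)) or (not missing(mid)) then flag_child=1;
    /* A record with a spouse ID means the household has a couple */
    if not missing(sid) then flag_married=1;

    /* Classify once the last record of the household has been read */
    if first.hid and last.hid then composition='Single Person';
    else if last.hid and flag_child=1 and flag_married=. then composition='Single Parent';
    else if last.hid and flag_child=1 and flag_married=1 then composition='Two Parent Family';
    else if last.hid and flag_child=. and flag_married=1 then composition='Couple';

    /* Write one row per household */
    if last.hid then output;
    /* Uncomment to keep only the household ID and its label */
    *keep hid composition;
run;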
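
The merge back onto the original data could look something like the sketch below. It assumes both have and want are in the same hid order, as confirmed earlier in the thread; the output dataset name have_with_composition is made up for illustration. The keep= option restricts want to the two needed columns so the per-person values in have are not overwritten.

```sas
/* Sketch: attach the household-level label to every person in the household. */
/* HAVE_WITH_COMPOSITION is a hypothetical name; HAVE must be sorted by HID.   */
data have_with_composition;
    merge have (in=in_have)
          want (keep=hid composition);
    by hid;
    if in_have;   /* keep every original person-level record */
run;
```

This keeps the data in long format, with one row per participant and the same composition value repeated within each household.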

Super User
Posts: 10,028

What is your final output?
