- Average age of household adults and children betwe...

10-06-2016 02:12 AM

Hi there.

Can I please ask your advice on how to find out whether households in group X are made up of older or younger adults and children than households in group Y? For example, are adults in X households 50 years old on average, while adults in Y households are 40?

I would do an ANOVA between X and Y but I am stuck on how I should aggregate my data at the household level. Which approach is appropriate? Or are they both incorrect?

Your insight is very much appreciated.

| Household | MemberID | Membership | Age | Group | Average Household Adult Age (Approach A) | Average Household Children Age (Approach A) | Total Adults' Age (Approach B) | Total Children's Age (Approach B) |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | Adult | 56 | X | 42.8 | 10 | 214 | 10 |
| 1 | 2 | Adult | 21 | X | | | | |
| 1 | 3 | Adult | 70 | X | | | | |
| 1 | 4 | Adult | 23 | X | | | | |
| 1 | 5 | Child | 10 | X | | | | |
| 1 | 6 | Adult | 44 | X | | | | |
| ... | ... | ... | ... | ... | | | | |
| 256 | 1 | Adult | 88 | X | 88 | 11 | 88 | 22 |
| 256 | 2 | Child | 7 | X | | | | |
| 256 | 3 | Child | 15 | X | | | | |
| 100 | 1 | Adult | 53 | Y | 37 | 0 | 112 | 0 |
| 100 | 2 | Adult | 34 | Y | | | | |
| 100 | 3 | Adult | 25 | Y | | | | |
| ... | ... | ... | ... | ... | | | | |
| 300 | 1 | Adult | 34 | Y | 32 | 4 | 64 | 4 |
| 300 | 2 | Adult | 30 | Y | | | | |
| 300 | 3 | Child | 4 | Y | | | | |

Approach A:

The total number of households in group X is 256.

Hence, the average for households in group X is:

Adult: (42.8+…+88)/256

Children: (10+…+11)/256

Approach B:

Hence, the average for households in group X is:

Adult: (214+…+88)/256

Children: (10+…+22)/256
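As an illustration only, restricting to the two group X households shown in the table (household 1 with five adults, household 256 with one adult), the candidate summaries for adult age work out as follows. The third line, which pools over individual adults, is not one of the two approaches and is included purely for comparison.

```latex
\begin{align*}
\text{Approach A (mean of household means):}\quad & \frac{42.8 + 88}{2} = 65.4 \\
\text{Approach B (mean of household totals):}\quad & \frac{214 + 88}{2} = 151 \\
\text{Pooled over individual adults:}\quad & \frac{214 + 88}{5 + 1} = \frac{302}{6} \approx 50.3
\end{align*}
```

Approach B gives a total per household rather than an age, so it grows with the number of adults in a household.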


10-06-2016 02:21 AM

```
proc sql;
  /* mean age per group; the alias gives the result column a usable name */
  create table want as
  select group, mean(age) as mean_age from have group by group;
quit;
```

---------------------------------------------------------------------------------------------

Maxims of Maximally Efficient SAS Programmers



10-06-2016 02:26 AM

Is your metric of interest the average household adult age, or the average age of adults across all households?

That's the difference between your two metrics.

One of them is correct for your purposes, but which one depends on your purpose.

Also, maybe you should use a different type of analysis that can handle the variable number of adults per household.
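To make the two metrics concrete, here is a minimal sketch. It assumes the long-format data set is called `have` and has the columns `household`, `membership`, `age` and `group` from the table in the question; the output table and column names (`hh_means`, `hh_mean_age`, and so on) are made up for illustration.

```sas
proc sql;
  /* Step 1: one row per household and membership type (household averages) */
  create table hh_means as
  select group, household, membership,
         mean(age)  as hh_mean_age,
         count(age) as n_members
  from have
  group by group, household, membership;

  /* Metric 1: average of the household averages (each household counts once) */
  create table metric_household as
  select group, membership,
         mean(hh_mean_age)   as mean_of_hh_means,
         stderr(hh_mean_age) as se_of_hh_means
  from hh_means
  group by group, membership;

  /* Metric 2: average age over all individuals (larger households count more) */
  create table metric_individual as
  select group, membership,
         mean(age) as pooled_mean_age
  from have
  group by group, membership;
quit;

/* Two-sample comparison of X vs Y on the household-level adult averages */
proc ttest data=hh_means;
  where membership = 'Adult';
  class group;
  var hh_mean_age;
run;
```

With this layout, a household without children simply has no Child row in `hh_means`, rather than contributing an average of 0 as household 100 does in the question's table, so childless households do not pull the children's average down.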


10-06-2016 12:22 PM - edited 10-06-2016 12:25 PM

Sorry @Reeza. I am not too sure.

However, my interest is to find out whether

1) households in group X are made up of younger or older adults than households in group Y.

2) households in group X are made up of younger or older children than households in group Y.

I need to produce an average and standard error for both group X and group Y, and a p-value for the test of difference, for (1) and (2), if that sounds correct?

Also, would a model-based analysis using the individual-level data be more appropriate?

- PROC MIXED/GENMOD with household as the random effect or as the subject of the REPEATED statement, and the membership-by-group interaction term as a covariate in the model?

- PROC SURVEYREG with household as the cluster, membership as the domain, and group as the covariate?

Thank you.
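For reference, the two model-based options listed above might look roughly like this. It is only a sketch: it assumes a data set `have` with the variables `household`, `membership`, `age` and `group` as in the table in the first post, and that household IDs are unique across groups. The `ddfm=kr` option is an illustrative choice, not something settled in this thread.

```sas
/* Option 1: linear mixed model with a random intercept per household */
proc mixed data=have;
  class household membership group;
  model age = membership group membership*group / solution ddfm=kr;
  random intercept / subject=household;   /* accounts for clustering in households */
  lsmeans membership*group / diff cl;     /* X vs Y within adults and within children */
run;

/* Option 2: design-based regression with households as clusters */
proc surveyreg data=have;
  cluster household;                      /* members of a household are not independent */
  class group;
  domain membership;                      /* separate analyses for adults and children */
  model age = group / solution;
run;
```

Both fits use the individual-level rows directly, so no household aggregation step is needed beforehand.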


10-06-2016 05:47 AM - edited 10-06-2016 05:49 AM

Age is lifetime data, which does not follow a normal distribution, so you cannot apply ANOVA to it. I would use a gamma distribution with a log link function instead, since age is not censored.

Check this example in the GENMOD documentation:

Example 44.3: Gamma Distribution Applied to Life Data

```
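/* Simulate 200 households with 200 members each */
data have;
  call streaminit(12345678);                 /* fixed seed for reproducibility */
  do household=1 to 200;
    do id=1 to 200;
      age=ceil(80*rand('uniform'));          /* integer ages from 1 to 80 */
      member=ifc(age lt 18,'Child','Adult');
      group=ifc(rand('bern',0.5)=0,'X','Y'); /* note: group is drawn per member here */
      output;
    end;
  end;
run;

/* Gamma regression with a log link; household enters as a continuous covariate */
proc genmod data=have;
  class member group;
  model age=member group household member*group
        / dist=gamma link=log type3;
  lsmeans member*group / ilink exp diff cl;  /* pairwise differences, also exponentiated */
  effectplot interaction(x=group sliceby=member);
run;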
```

OUTPUT:

```
Differences of member*group Least Squares Means

member  group  _member  _group  Estimate  Standard Error  z Value  Pr > |z|  Alpha     Lower    Upper  Exponentiated  Exponentiated Lower  Exponentiated Upper
Adult   X      Adult    Y       0.003689        0.005204     0.71    0.4784   0.05  -0.00651  0.01389         1.0037               0.9935               1.0140
Adult   X      Child    X         1.6941        0.008011   211.47    <.0001   0.05    1.6784   1.7098         5.4419               5.3571               5.5280
Adult   X      Child    Y         1.6987        0.008052   210.97    <.0001   0.05    1.6830   1.7145         5.4670               5.3814               5.5540
Adult   Y      Child    X         1.6904        0.008010   211.05    <.0001   0.05    1.6747   1.7061         5.4218               5.3374               5.5076
Adult   Y      Child    Y         1.6950        0.008050   210.56    <.0001   0.05    1.6793   1.7108         5.4469               5.3616               5.5335
Child   X      Child    Y       0.004613         0.01009     0.46    0.6477   0.05  -0.01517  0.02440         1.0046               0.9849               1.0247
```


10-06-2016 12:40 PM

Hi @Ksharp.

Thank you for your reply.

I am not familiar with modelling with a gamma distribution and a log link.

However, I would have thought that household should be specified as the subject in the REPEATED statement instead, to account for the clustering.

And are the p-values for the test of difference also correct on the original scale of the response variable? As far as I understand, the test is carried out on the log scale of the response variable.

Thank you again.
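For readers following along, the REPEATED-statement variant described in this post could be sketched as below. It reuses the variable names from the simulated `have` data set in the previous reply; the exchangeable working correlation (`type=exch`) is an assumption chosen for illustration, not something recommended in this thread.

```sas
/* GEE version: household is the clustering unit, not a covariate */
proc genmod data=have;
  class household member group;            /* the subject effect must be a CLASS variable */
  model age = member group member*group
        / dist=gamma link=log type3;
  repeated subject=household / type=exch;  /* exchangeable working correlation */
  lsmeans member*group / ilink diff cl;
run;
```

Here `household` is dropped from the MODEL statement, so it no longer acts as a continuous predictor and only defines which observations are correlated.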


10-06-2016 10:45 PM

If it were repeated measures, then you should use NLMIXED; I will leave that to @Steave.

"are the p-values for the test of difference also correct on the original scale of the response variable?"

Yes. I used the ilink option; if you want to see the actual mean values, add the mean option to it.

`lsmeans member*group/ ilink exp diff cl mean ;`
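One way to see why the test carries over to the original scale: with a log link, a difference of least squares means estimates a log ratio of means, and a log ratio of zero is the same hypothesis as equal means. The symbols $\mu_X$ and $\mu_Y$ are notation introduced here for the two group means; the numbers come from the Adult X vs Adult Y row of the output posted earlier.

```latex
\begin{align*}
H_0:\ \log\mu_X - \log\mu_Y = 0
  \;\Longleftrightarrow\; \frac{\mu_X}{\mu_Y} = 1
  \;\Longleftrightarrow\; \mu_X = \mu_Y \\
\exp(0.003689) \approx 1.0037, \qquad
\bigl(\exp(-0.00651),\ \exp(0.01389)\bigr) \approx (0.9935,\ 1.0140)
\end{align*}
```

So the exponentiated estimate is a ratio of mean ages (about 1.004 for adults in X relative to Y), and its interval includes 1. A Wald test built directly on the difference of back-transformed means could give a slightly different p-value, since it is a different parameterization of the same hypothesis.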