BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Bautista
Calcite | Level 5

I have a Census data file of persons who occupy various houses. Sample data shown below.

Capture.PNG

In every house, there is always a person numbered "1" who is the primary respondent to the Census survey (known as the head of household). Sometimes, this person lives alone. Or with relatives, or nonrelatives, or a combination of both. I want to create a list of the HouseNumbers that correspond to houses in which the head of household (person 1) is living with at least 2 of his or her relatives. So using the above example, the output would be:

Capture2.PNG

House #128 would not be included, because person 1 ("self", also known as the respondent) is only related to one other person in house #128. Also, house #136 and house #149 would not be included because the respondent lives alone (no relatives or nonrelatives).

In my output table, I only want two variables: ID and House.

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

If you can't assume that the primary respondent in a house has the smallest ID then you need:

proc sql;

create table SelfWithTwoRelatives as

select b.id, b.house

from (

  select house

  from myData

  group by house

  having sum(RelationshipToRespondent = "Relative") >= 2) as a inner join

  myData as b on a.house=b.house

where b.RelationshipToRespondent="Self";

quit;

(Couldn't test against your data which can't be imported into SAS)

PG

PG

View solution in original post

5 REPLIES 5
PGStats
Opal | Level 21

If you can't assume that the primary respondent in a house has the smallest ID then you need:

proc sql;

create table SelfWithTwoRelatives as

select b.id, b.house

from (

  select house

  from myData

  group by house

  having sum(RelationshipToRespondent = "Relative") >= 2) as a inner join

  myData as b on a.house=b.house

where b.RelationshipToRespondent="Self";

quit;

(Couldn't test against your data which can't be imported into SAS)

PG

PG
stat_sas
Ammonite | Level 13

proc sql;

create table want as

select min(id) as id,a.house from have a

left join (select house,count(*) as freq from have

where relationshiptorespondent='Relative' group by house) b

on a.house=b.house

group by a.house

having freq>=2;

quit;

proc print data=want;

run;

PGStats
Opal | Level 21

If you can assume that MIN(id) is Self for every house then you don't need a join :

proc sql;

create table SelfWithTwoRelatives as

select min(id) as id, house

from myData

group by house

having sum(RelationshipToRespondent = "Relative") >= 2;

quit;

PG

PG
Ksharp
Super User

PG,

HAVING also can filter the obs ,not just the group .

proc sql;

create table SelfWithTwoRelatives as

select *

from myData

group by house

having sum(RelationshipToRespondent = "Relative") >= 2 and PersonNumber=1;

quit;

Xia Keshan

PGStats
Opal | Level 21

   

Great! This query can even be done without a join and without remerging (implicit join) :

proc sql;

create table SelfWithTwoRelatives as

select

    max(case when RelationshipToRespondent="Self" then id else . end) as id,

    house

from myData

group by house

having sum(RelationshipToRespondent = "Relative") >= 2;

quit;

PG

PG

Catch up on SAS Innovate 2026

Nearly 200 sessions are now available on demand in the Innovate Hub.

Watch Now →
What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 3448 views
  • 6 likes
  • 4 in conversation