Fluorite | Level 6

## How to count number of older siblings and the number of younger siblings?

Dataset description:

The local school district wants to survey all sixth grade students and their school-aged siblings. There are three different types of surveys: one for the sixth graders, one for their younger siblings, and one for their older siblings. The SAS data set called SCHOOLSURVEY contains data for all sixth graders in the three middle schools in the district (Rachael Carson, Green Valley, and Redwood Grove). The data set also includes data for all their siblings attending schools in the district, which can be linked back to the sixth grader by Family_ID.

Fam_id  Sdt_ID  Sch Grade DOB        DOB6th  Age_Diff
90021   103699      4th 07/24/2004  01/26/2002  2.49
90021   127945  RG  6th 01/26/2002  01/26/2002  0
90021   149229      2nd 10/28/2005  01/26/2002  3.75
90053   109831  RC  6th 08/27/2002  08/27/2002  0
90053   122779      5th 08/28/2003  08/27/2002  1
90053   124617      8th 05/07/2000  08/27/2002  -2.31
90097   145616      4th 06/06/2004  12/20/2001  2.46
90097   164264  RC  6th 12/20/2001  12/20/2001  0
90112   147688      7th 10/11/2000  02/23/2002  -1.37
90112   171989      9th 06/27/1999  02/23/2002  -2.66
90112   197925  RG  6th 02/23/2002  02/23/2002  0

Here, Fam_id  = Family_id

Std_id = Student_id

Sch = School Code

DOB = Date of birth for all the grade students

DOB6th = Date of birth of the 6th grade student for a given Family_id

Agg_Diff = Age difference in years between 6th grade student and his/her siblings for a given Family_id

I want to count the number of older siblings and the number of younger siblings of the sixth grader in the family, and add these variables to the data set. It would be great if anyone can help me with this problem.

Output should look like this :

Fam_id   Sdt_ID Sch  Grade   DOB       DOB6th  Age_Diff  Y_sib  O_sib
90021   103699      4th 07/24/2004  01/26/2002  2.49
90021   127945  RG  6th 01/26/2002  01/26/2002  0       2       0
90021   149229      2nd 10/28/2005  01/26/2002  3.75
90053   109831  RC  6th 08/27/2002  08/27/2002  0       1       1
90053   122779      5th 08/28/2003  08/27/2002  1
90053   124617      8th 05/07/2000  08/27/2002  -2.31
90097   145616      4th 06/06/2004  12/20/2001  2.46
90097   164264  RC  6th 12/20/2001  12/20/2001  0       1       0
90112   147688      7th 10/11/2000  02/23/2002  -1.37
90112   171989      9th 06/27/1999  02/23/2002  -2.66
90112   197925  RG  6th 02/23/2002  02/23/2002  0       0       2

where, Y_sib = number of younger siblings of the sixth grader

O_sib = number of older siblings of the sixth grader

5 REPLIES 5
Opal | Level 21

## Re: How to count number of older siblings and the number of younger siblings?

@Prat009

This sounds very much like an assignment where you probably should give it a go first and then only come back here if you get stuck and have detailed questions.

If you get stuck: Also provide the not yet working code which you've so far developed so we can understand where you are at and can take it from there.

Obsidian | Level 7

## Re: How to count number of older siblings and the number of younger siblings?

Try this out.

proc sql;
create table Output as
select *,
case
select put(count(*),best.)
from Input b /*Your input file*/
where b.Fam_id = a.Fam_id
else ""
end as Y_sib,
case
select put(count(*),best.)
from Input b /*Your input file*/
where b.Fam_id = a.Fam_id
else ""
end as O_sib
from Input a; /*Your input file*/
quit;
Obsidian | Level 7

## Re: How to count number of older siblings and the number of younger siblings?

Hi,

DATA TEMP;
INPUT Fam_id Sdt_ID Sch \$ Grade \$ DOB DOB6th Age_Diff;
INFORMAT DOB DOB6th mmddyy8.;
FORMAT DOB DOB6th mmddyy10.;
CARDS;
90021 103699 . 4th 07/24/2004 01/26/2002 2.49
90021 127945 RG 6th 01/26/2002 01/26/2002 0
90021 149229 . 2nd 10/28/2005 01/26/2002 3.75
90053 109831 RC 6th 08/27/2002 08/27/2002 0
90053 122779 . 5th 08/28/2003 08/27/2002 1
90053 124617 . 8th 05/07/2000 08/27/2002 -2.31
90097 145616 . 4th 06/06/2004 12/20/2001 2.46
90097 164264 RC 6th 12/20/2001 12/20/2001 0
90112 147688 . 7th 10/11/2000 02/23/2002 -1.37
90112 171989 . 9th 06/27/1999 02/23/2002 -2.66
90112 197925 RG 6th 02/23/2002 02/23/2002 0
;
RUN;

PROC SQL;
CREATE TABLE TEMP_CAL AS
Select
COALESCE(A.Fam_id,B.Fam_id) as Fam_id, /*SAS Functions are also applicable in PROC SQL queries including normal SQL functions*/

CASE WHEN A.Old_Student_Count=. THEN 0
ELSE A.Old_Student_Count
END as Old_Student_Count,

CASE WHEN B.Young_Student_Count=. THEN 0
ELSE B.Young_Student_Count
END as Young_Student_Count
FROM
(
Select
Fam_id,
Count(Distinct Sdt_ID) as Old_Student_Count
FROM
TEMP
WHERE
Age_Diff<0
GROUP BY Fam_id
) A

FULL JOIN

(
Select
Fam_id,
Count(Distinct Sdt_ID) as Young_Student_Count
FROM
TEMP
WHERE
Age_Diff>0
GROUP BY Fam_id
) B

ON A.Fam_id=B.Fam_id
;
QUIT;

DATA TEMP_FINAL;
IF _n_=1 THEN DO;
IF 0 THEN SET TEMP_CAL;

Declare Hash h1(dataset:'TEMP_CAL');
h1.definekey('Fam_id');
h1.definedata('Young_Student_Count','Old_Student_Count');
h1.definedone();
END;

SET TEMP;

h1.find(key:Fam_id);

IF Grade NE '6th' THEN DO;
Young_Student_Count=.;
Old_Student_Count=.;
END;

RUN;

PROC PRINT DATA=TEMP_FINAL;
TITLE 'Final Table';
RUN;
Fluorite | Level 6

## Re: How to count number of older siblings and the number of younger siblings?

Thanks a ton !

I had been trying to come up with a solution from past 4 days but I couldn't. Your code works just fine !

Fluorite | Level 6

## Re: How to count number of older siblings and the number of younger siblings?

How did you calculate the date of birth for only sixth grader ?