BookmarkSubscribeRSS Feed
burtsm
Calcite | Level 5

I'm creating a data set of mothers who have given birth at least 2 times during the study period, with summary statistics to be reported for births at time 1 and time 2.

I created a variable birth_num that indicates the chronological number of each birth for the same woman during the study period. For example:

MOTHER_ID    BIRTH_YR      birth_num

5                     2009              1

7                     2004              1

10                   2005              1

10                   2008              2

10                   2010              3

12                   2006              1

12                   2008              2

14                   2004              1

14                   2006              2

14                   2008              3

I'm finding that as I apply the exclusion criteria to get the sample (only keep records with valid values for first birth, infant health outcomes, prenatal care, mother's race/ethnicity and immigrant status, and women who have given birth at least 2 times), I'm losing either the birth record at time 1 or time 2 for some MOTHER_IDs. For example, if the record for MOTHER_ID=10 where birth_num=2 is excluded, I'm unable to include this record in the summary statistics for births at time 2. However, the record for MOTHER_ID=10 where birth_num=1 is retained and included in summary statistics for births at time 1.

Data after exclusion criteria applied:

MOTHER_ID    BIRTH_YR      birth_num

10                   2005              1

10                   2010              3

12                   2006              1

12                   2008              2 

14                   2004              1

14                   2006              2

14                   2008              3

How can I only retain birth records that meet selection criteria for births at time 1 and time 2?

These are the records I am interested in keeping for summary statistics:

MOTHER_ID    BIRTH_YR      birth_num

12                   2006              1

12                   2008              2 

14                   2004              1

14                   2006              2

Thank you,
Stephanie

3 REPLIES 3
slchen
Lapis Lazuli | Level 10

data have;

input ID    BIRTH_YR      birth_num;

cards;

10                   2005              1

10                   2010              3

12                   2006              1

12                   2008              2

14                   2004              1

14                   2006              2

14                   2008              3

;

data want;

   do until(last.id);

      set have;

   by id;

   if birth_num=2 and lag(birth_num)=1 then flag=1;

  end;

  do until(last.id);

    set have;

    by id;

    if flag=1 then do;

         n+1;

  if n<=2 then output;

     end;

  end;

  call missing(n);

  drop n flag;

run;

Steelers_In_DC
Barite | Level 11

You only have 2 IDs in your output, is that a mistake?  I don't know why you are dropping 10.  Here is a solution but it has three IDs:

data have;

infile cards dsd;

input MOTHER_ID BIRTH_YR birth_num;

cards;

5,2009,1

7,2004,1

10,2005,1

10,2008,2

10,2010,3

12,2006,1

12,2008,2

14,2004,1

14,2006,2

14,2008,3

;

run;

data want;

do until (last.mother_id);

set have;

by mother_id;

if birth_num = 2 then _flag = 1;

end;

do until (last.mother_id);

set have;

by mother_id;

if _flag = 1 then output;

end;

where birth_num < 3;

drop _:;

run;

jakarman
Barite | Level 11

You could use a doubled set reading the dataset as usual and opening the same dataset again using a pointer and renamed versions of the variables.

The effect it is you can read some data back and forward in the pointered dataset (direct access). The complexity to understand can be hurdle but when understanding it .....   It has a marvelous performance on big data.

Interested? SAS(R) 9.4 Statements: Reference, Third Edition

---->-- ja karman --<-----

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 797 views
  • 6 likes
  • 4 in conversation