Working with incomplete longitudinal data - data cleaning help

Contributor
Posts: 55

Hi everyone,

 

I’m working with data that has multiple visits: a baseline plus 3-, 6-, 9-, and 12-month follow-ups, five sessions in total. At each session, 7 events are asked about, and each question is a row in the database. So ideally, each case should have 37 rows. An ideal “clean case” should look like this:

ChildID  AsthmaEventDate_HW  AsthmaEventCode  AsthmaEventCount
1944     7-Mar-14            1                0
1944     7-Mar-14            2                3
1944     7-Mar-14            3                0
1944     7-Mar-14            4                0
1944     7-Mar-14            5                18
1944     7-Mar-14            6                0
1944     7-Mar-14            7                3
1944     11-Jul-14           1                0
1944     11-Jul-14           2                0
1944     11-Jul-14           4                0
1944     11-Jul-14           3                0
1944     11-Jul-14           5                0
1944     11-Jul-14           6                0
1944     11-Jul-14           7                0
1944     6-Nov-14            1                0
1944     6-Nov-14            2                0
1944     6-Nov-14            3                0
1944     6-Nov-14            4                0
1944     6-Nov-14            5                0
1944     6-Nov-14            6                0
1944     6-Nov-14            7                0
1944     3-Apr-15            1                1
1944     3-Apr-15            2                0
1944     3-Apr-15            3                1
1944     3-Apr-15            4                1
1944     3-Apr-15            5                30
1944     3-Apr-15            6                7
1944     3-Apr-15            7                1
1944     24-Dec-15           1                0
1944     24-Dec-15           2                2
1944     24-Dec-15           3                5
1944     24-Dec-15           4                0
1944     24-Dec-15           5                8
1944     24-Dec-15           6                8
1944     24-Dec-15           7                8

However, there are cases where the data is incomplete, with missing sessions and/or events (see below).

ChildID  AsthmaEventDate_HW  AsthmaEventCode  AsthmaEventCount
19       6-May-08            3                1
19       6-May-08            4                1
19       6-May-08            5                5
19       6-May-08            7                2
19       6-Jun-08            5                5
19       22-Dec-08           2                10
19       22-Dec-08           5                20

 

The first step I would like to take is to see how many cases have all 7 sessions, how many have 6, etc. I’m open to other suggestions as well.

 

Thanks!

Trusted Advisor
Posts: 1,586

Re: Working with incomplete longitudinal data - data cleaning help

1) Please note: 5 sessions times 7 events per session is 35 rows, not 37. Am I missing anything?

    Is the session expressed by AsthmaEventCode?

    ChildID 19 has AsthmaEventCode = 5 more than once. Is that possible? What does it mean? Is it an error?

 

2) You can count the frequencies of AsthmaEventCode per child with:

 

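    /* XXXX is a placeholder for your dataset name */
    proc freq data=XXXX;
      /* cross-tabulate each child against each event code */
      table childID * AsthmaEventCode;
    run;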

 

   Does the resulting report meet your needs?

 

 3) I need more information and a better understanding of the data to be more accurate.
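The counting step asked about in the question could be sketched roughly as follows. This is only a sketch under stated assumptions: the dataset name WORK.ASTHMA and the output names session_counts, duplicate_events, n_sessions, n_rows, and n_dup are hypothetical, AsthmaEventDate_HW is assumed to be stored as a SAS date, and each distinct date per child is assumed to mark one session, which the thread has not confirmed.

```sas
/* Assumption: data are in WORK.ASTHMA (hypothetical name) with the
   columns shown in the question; one distinct date = one session. */

/* One row per child: number of distinct session dates and total rows */
proc sql;
    create table session_counts as
    select ChildID,
           count(distinct AsthmaEventDate_HW) as n_sessions,
           count(*)                           as n_rows
    from work.asthma
    group by ChildID;
quit;

/* How many children have 5 sessions, how many have 4, and so on */
proc freq data=session_counts;
    tables n_sessions;
run;

/* Child/date/event-code combinations that appear more than once */
proc sql;
    create table duplicate_events as
    select ChildID, AsthmaEventDate_HW, AsthmaEventCode,
           count(*) as n_dup
    from work.asthma
    group by ChildID, AsthmaEventDate_HW, AsthmaEventCode
    having count(*) > 1;
quit;
```

With the sample rows for ChildID 19, the first query would report 3 sessions and 7 rows. If sessions are identified some other way than by date, the grouping would need to change accordingly.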

 

Shmuel

 

 
