BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
pronabesh
Fluorite | Level 6

Hello All,

I need to reorganize a data set such that observations within subjects can be reordered by date.

The data set looks like this:

Subject_ID  Follow_Up_Time  Interview_Date  Previous_Interview_Date  Adverse_Event_Report_Date

1  1  Feb10  Jan10  Nov10

1  2  Mar10  Feb10  .

1  3  Dec10  Nov10  .

2  1  Jun10  May10  May10

2  2  Jul10  Jun10  Jan11

2  3  Feb11  Jan11  .

The last variable in the data set is not correctly assigned to its corresponding time period (i.e, Adverse_Event_Report_Date should be within the period between Interview_Date and Previous_Interview_Date). For example, in the first case line, the Adverse_Event_Report_Date variable should be missing and should appear in the third row for Subject_ID = 1.

Please advice on the data steps required to restructure this data file.

Thank you in advance.

Sincerely,

Pronabesh

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

I, too, don't know if I correctly understand what you want, but the following is a data step method for doing something close to what Haikuo suggested.  It differs, slightly, as it can assign multiple dates for a subject:

data have;

  informat Interview_Date Previous_Interview_Date

    Adverse_Event_Report_Date date9.;

  format Interview_Date Previous_Interview_Date

    Adverse_Event_Report_Date date9.;

  input Subject_ID  Follow_Up_Time  Interview_Date

        Previous_Interview_Date  Adverse_Event_Report_Date;

  cards;

1  1  10Feb2013 10Jan2013 10Nov2013

1  2  10Mar2013 10Feb2013 .

1  3  10Dec2013 10Nov2013 .

2  1  10Jun2013 10May2013 10May2013

2  2  10Jul2013 10Jun2013 11Jan2013

2  3  11Feb2013 11Jan2013 .

;

data want (drop=_:);

  set have (rename=(Adverse_Event_Report_Date=_date));

  retain _holddate;

  format Adverse_Event_Report_Date date9.;

  by Subject_ID;

  if first.Subject_ID then call missing(_holddate);

  if not missing(_date) then _holddate=_date;

  if not missing(_holddate) and

    (Previous_Interview_Date le _holddate le Interview_Date)

     then Adverse_Event_Report_Date=_holddate;

run;

View solution in original post

11 REPLIES 11
Haikuo
Onyx | Level 15

It would be ideal that you have an sample output for your sample input data, so here is only my guess, if it is not meeting your requirement, then please provide your sample output:

data have;

input Subject_ID  Follow_Up_Time  (Interview_Date  Previous_Interview_Date  Adverse_Event_Report_Date) (:monyy5.);

format Interview_Date  Previous_Interview_Date  Adverse_Event_Report_Date monyy5.;

cards;

1  1  Feb10  Jan10  Nov10

1  2  Mar10  Feb10  .

1  3  Dec10  Nov10  .

2  1  Jun10  May10  May10

2  2  Jul10  Jun10  Jan11

2  3  Feb11  Jan11  .

;

proc sql;

  create table want as

    select Subject_ID,  Follow_Up_Time,  Interview_Date,  Previous_Interview_Date, 

             case when  Previous_Interview_Date <= Adverse_Event_Report_Date <= Interview_Date

                  then Adverse_Event_Report_Date

                  when not (Previous_Interview_Date <= Adverse_Event_Report_Date <= Interview_Date )

                       and Follow_Up_Time = max(Follow_Up_Time) then max(Adverse_Event_Report_Date)

                       else . end as Adverse_Event_Report_Date format=monyy5.

  from have

  group by Subject_ID

  order by Subject_ID,Follow_Up_Time

  ;

  quit;

Haikuo

pronabesh
Fluorite | Level 6

Thank You Haikuo. This is very helpful.

art297
Opal | Level 21

I, too, don't know if I correctly understand what you want, but the following is a data step method for doing something close to what Haikuo suggested.  It differs, slightly, as it can assign multiple dates for a subject:

data have;

  informat Interview_Date Previous_Interview_Date

    Adverse_Event_Report_Date date9.;

  format Interview_Date Previous_Interview_Date

    Adverse_Event_Report_Date date9.;

  input Subject_ID  Follow_Up_Time  Interview_Date

        Previous_Interview_Date  Adverse_Event_Report_Date;

  cards;

1  1  10Feb2013 10Jan2013 10Nov2013

1  2  10Mar2013 10Feb2013 .

1  3  10Dec2013 10Nov2013 .

2  1  10Jun2013 10May2013 10May2013

2  2  10Jul2013 10Jun2013 11Jan2013

2  3  11Feb2013 11Jan2013 .

;

data want (drop=_:);

  set have (rename=(Adverse_Event_Report_Date=_date));

  retain _holddate;

  format Adverse_Event_Report_Date date9.;

  by Subject_ID;

  if first.Subject_ID then call missing(_holddate);

  if not missing(_date) then _holddate=_date;

  if not missing(_holddate) and

    (Previous_Interview_Date le _holddate le Interview_Date)

     then Adverse_Event_Report_Date=_holddate;

run;

pronabesh
Fluorite | Level 6

Thank You Arthur. This is exactly what I was looking for.

pronabesh
Fluorite | Level 6

I thought I had the solution to this, but my problem is probably a little more complex. To demonstrate, I have listed one example from a single individual with multiple assessment points and described the variables below.

The purpose of this is to test the agreement between self_reported hospitalization (Hospitalization_date) and hospitalization from medical records (Admission_date). The assessment_date is the date of self reported assessment and lag_fu is the date of prior self reported assessment.

Now there are a couple of problems:

1. The admission_date variable is not properly merged to the corresponding period between the lag_fu and assessment_date

2. Each assessment date per person is listed thrice as there can be a max of three hospitalization records in that time period.

3. If there are multiple admission_date(s) at each assessment, all the dates need to be listed as shown in the output data set.

Your help is very much appreciated!

The input and output data set for one individual is:

Input:

Hospitalization_date

Record

Assessment_date

Admission_date

lag_fu

.

1

04NOV10

08DEC10

04OCT10

.

2

04NOV10

20JAN11

04OCT10

.

3

04NOV10

16DEC11

04OCT10

15DEC10

4

03JAN11

19DEC11

04NOV10

.

5

03JAN11

.

04NOV10

.

6

03JAN11

.

04NOV10

15JAN11

7

01MAR11

.

03JAN11

.

8

01MAR11

.

03JAN11

.

9

01MAR11

.

03JAN11

.

10

27APR11

.

01MAR11

.

11

27APR11

.

01MAR11

.

12

27APR11

.

01MAR11

.

13

29JUL11

.

27APR11

.

14

29JUL11

.

27APR11

.

15

29JUL11

.

27APR11

.

16

06OCT11

.

29JUL11

.

17

06OCT11

.

29JUL11

.

18

06OCT11

.

29JUL11

.

19

27MAR12

.

06OCT11

.

20

27MAR12

.

06OCT11

.

21

27MAR12

.

06OCT11

.

22

16MAY12

.

27MAR12

.

23

16MAY12

.

27MAR12

.

24

16MAY12

.

27MAR12

Output: 

Hospitalization_date

Record

Assessment_date

Admission_date

lag_fu

.

1

04NOV10

04OCT10

.

2

04NOV10

04OCT10

.

3

04NOV10

04OCT10

08DEC10

4

03JAN11

08DEC10

04NOV10

.

5

03JAN11

04NOV10

.

6

03JAN11

.

04NOV10

20JAN11

7

01MAR11

20JAN11.

03JAN11

.

8

01MAR11

.

03JAN11

.

9

01MAR11

.

03JAN11

.

10

27APR11

.

01MAR11

.

11

27APR11

.

01MAR11

.

12

27APR11

.

01MAR11

.

13

29JUL11

.

27APR11

.

14

29JUL11

.

27APR11

.

15

29JUL11

.

27APR11

.

16

06OCT11

.

29JUL11

.

17

06OCT11

.

29JUL11

.

18

06OCT11

.

29JUL11

.

19

27MAR12

16DEC11

06OCT11

.

20

27MAR12

19DEC11

06OCT11

.

21

27MAR12

.

06OCT11

.

22

16MAY12

.

27MAR12

.

23

16MAY12

.

27MAR12

.

24

16MAY12

.

27MAR12

art297
Opal | Level 21

Pronabesh: totally un-cool removing a correct rating because you realized that you had a different problem than the one that you originally posted. In the future, I'd really hope that you simply create a new discussion.

That said, while I may have misinterpreted what you are trying to do, the following appears to provide your posted desired output:

data want (drop=_:);

  array _dates(99) _temporary_;

  do until (last.patient);

    set have;

    by patient;

    if first.patient then do;

      call missing(of _dates(*));

      _counter=0;

    end;

    if not missing(Admission_date) then do;

      _counter+1;

      _dates(_counter)=Admission_date;

      call missing(Admission_date);

    end;

  end;

  do until (last.patient);

    set have (drop=Admission_date);

    by patient;

    if _counter gt 0 then do;

      do _i=1 to _counter;

        if (Assessment_date le _dates(_i) le lag_fu) or

           (lag_fu le _dates(_i) le Assessment_date) then do;

          Admission_date=_dates(_i);

          call missing (_dates(_i));

          _i=_counter;

        end;

      end;

    end;

    if not missing(Hospitalization_date) and not missing(Admission_date)

      then Hospitalization_date=Admission_date;

    else call missing(Hospitalization_date);

    output;

    call missing(Admission_date);

  end;

run;

pronabesh
Fluorite | Level 6

Hello Arthur,

Thank you for taking the time to help out on this. Sincere apologies for removing the correct rating. I am new to SAS communities and was under the false impression that a correct rating means that the discussion is closed. Will amend my mistake.

Once again your help and timely response is much appreciated!

art297
Opal | Level 21

Not a problem and I was pleased to see that you found my last post so quickly.  Yes, a correct classification indicates that a question has been answered, however, if one discovers that the problem is more complex than originally posted, a new question/discussion should be started.

pronabesh
Fluorite | Level 6

Thank you!

This also answers my question posted in the other thread about merging.

Best,

Pronabesh

art297
Opal | Level 21

Glad to hear that both of your posted question have been answered.  I am all too familiar with the anguish involved in learning the expected protocol of a new site and completely understand.  Hope you continue to post here.

pronabesh
Fluorite | Level 6

Thank You Arthur.

This code worked!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 11 replies
  • 1423 views
  • 3 likes
  • 3 in conversation