DATA Step, Macro, Functions and more

About DATA STEP, DATA reconstruction~

Reply
Occasional Contributor THS
Occasional Contributor
Posts: 5

About DATA STEP, DATA reconstruction~

Hi~ currently I have a data set as followed:

[DATA 1 ]

ID       Drug      Start_DAY         End_DAY             

A          Tx1       2013/01/01        2013/01/03                 

            Tx1       2013/01/04        2013/01/08                    

            Tx2       2013/01/09        2013/01/12

            Tx1       2013/01/14        2013/01/16

            Tx1       2013/01/22        2013/01/26   

B

C

D......

and I'm trying to reconstruct the data sets as followed :

[DATA2]

ID        Drug        Start                             End

A         Tx1         2013/01/01                    2013/01/08

           Tx2         2013/01/09                    2013/01/12

           Missing   2013/01/13                    2013/01/13

           Tx1         2013/01/14                    2013/01/16

           Missing   2013/01/17                    2013/01/21

           Tx1         2013/01/22                    2013/01/26

I've tried to use arrays for these reconstruction but it didn't work at all .....
Can any one give me a hand or suggestion??
Thanks ~ !!!

PROC Star
Posts: 7,356

Re: About DATA STEP, DATA reconstruction~

There is probably an easier way to do this, but the following will likely accomplish what you want:

data have;

  informat Start_DAY  End_DAY yymmdd10.;

  format Start_DAY  End_DAY yymmdd10.;

  input (ID Drug) ($) Start_DAY  End_DAY;

  cards;            

A   Tx1   2013/01/01  2013/01/03                

A   Tx1   2013/01/04  2013/01/08                   

A   Tx2   2013/01/09  2013/01/12

A   Tx1   2013/01/14  2013/01/16

A   Tx1   2013/01/22  2013/01/26  

B   Tx2   2013/01/09  2013/01/15

B   Tx1   2013/01/22  2013/01/26  

;

data want;

  set have;

  by id;

  array jan('01jan2013'd:'31jan2013'd) $ _temporary_;

  retain jan;

  if first.id then do _n_='01jan2013'd to '31jan2013'd;

    jan(_n_)="Missing";

  end;

  do _n_=Start_DAY to End_day;

    jan(_n_)=Drug;

  end;

  if last.id then do;

    do _n_='01jan2013'd to '31jan2013'd;

      if _n_ eq '01jan2013'd then do;

        Drug=jan(_n_);

        Start_Day=_n_;

      end;

      else do;

        if _n_ eq'31jan2013'd then do;

          End_day=_n_;

          if jan(_n_) ne jan(_n_-1) then do;

            Start_Day=_n_;

            Drug=jan(_n_);

          end;

          output;

        end;

        else if jan(_n_) ne jan(_n_-1) then do;

          End_day=_n_-1;

          output;

          Start_Day=_n_;

          Drug=jan(_n_);

        end;

      end;

    end;

  end;

run;

PROC Star
Posts: 1,227

Re: About DATA STEP, DATA reconstruction~

Here is a non-array approach, just looking for gaps and outputting the a record whenever a gap is found.

data want;
  set have;
  by id start_day;
  lag_end=lag(end_day);
  output;
  if first.id=0 then do;
    if start_day ne lag_end+1 then do;
      Drug="Missing";
      end_day=start_day-1;
      start_day=lag_end+1;
      output;
    end;
  end;
  drop lag_end;
run;

proc sort data=want;
  by id start_day;
run;
Occasional Contributor THS
Occasional Contributor
Posts: 5

Re: About DATA STEP, DATA reconstruction~

Thanks for the reply ;

but In addition to recognize the missing part,

the reconstructed date sets are more complicated cause it need to combine the date if it's the same drug with no missing day in different row.

such as

   Tx1       2013/01/01        2013/01/03                

   Tx1       2013/01/04        2013/01/08

would turned to be Tx1        2013/01/01             2013/01/08 ;

thanks a lot ~

Maybe I can try to do the missing part first and then try to figure out how to combine dates in different row ....

Occasional Contributor THS
Occasional Contributor
Posts: 5

Re: About DATA STEP, DATA reconstruction~

Hi ~~ thanks for the reply ;

But while I tried to put the code in SAS 9.3

it didn't work though ~

the log file would show

array jan('01jan2013'd.:'31jan2013'd.) $ _temporary_;

                 ------------

                 22

                 76

ERROR 22-322: Syntax error, expecting one of the following: a name, an integer constant, *.

ERROR 76-322: Syntax error, statement will be ignored.

did I type it wrong or did I miss something ?

thanks a lot ~

PROC Star
Posts: 7,356

Re: About DATA STEP, DATA reconstruction~

My mistake.  The code wasn't tested and, apparently, SAS won't accept the array bounds as written.  The following, though, did run successfully on 9.3:

data want;

  set have;

  by id;

  array jan(19359:19389) $ _temporary_;

  retain jan;

  if first.id then do _n_='01jan2013'd to '31jan2013'd;

    jan(_n_)="Missing";

  end;

  do _n_=Start_DAY to End_day;

    jan(_n_)=Drug;

  end;

  if last.id then do;

    do _n_='01jan2013'd to '31jan2013'd;

      if _n_ eq '01jan2013'd then do;

        Drug=jan(_n_);

        Start_Day=_n_;

      end;

      else do;

        if _n_ eq'31jan2013'd then do;

          End_day=_n_;

          if jan(_n_) ne jan(_n_-1) then do;

            Start_Day=_n_;

            Drug=jan(_n_);

          end;

          output;

        end;

        else if jan(_n_) ne jan(_n_-1) then do;

          End_day=_n_-1;

          output;

          Start_Day=_n_;

          Drug=jan(_n_);

        end;

      end;

    end;

  end;

run;

Respected Advisor
Posts: 3,777

Re: About DATA STEP, DATA reconstruction~

How about SYSEVALF and the BOUND functions.

data _null_;
  
array jan(%sysevalf('01jan2013'd):%sysevalf('31jan2013'd)) $ _temporary_;
  
do _n_=lbound(jan) to hbound(jan);
      put 'NOTE: ' _n_=date.;
     
end;
  
run;
PROC Star
Posts: 7,356

Re: About DATA STEP, DATA reconstruction~

Much better than my hard coding the date values!

Ask a Question
Discussion stats
  • 7 replies
  • 429 views
  • 1 like
  • 4 in conversation