DATA Step, Macro, Functions and more

insert rows based on output from subtracting two rows in the same column

Reply
Occasional Contributor
Posts: 12

insert rows based on output from subtracting two rows in the same column

proc format ;

value formula

             low - 15 =0

            16-40 =1

           41-70=2

         71-100=3

         101-140=4

         141-175=5

       176-225=6

       226-300=7

       301-high=8

;

       run;

data have;

input ID Email $ type $ edate mmddyy10. ;

format edate mmddyy10.;

datalines;

1 1 rdate 05/20/2015

1 2 rdate 05/21/2015

1 3 Sdate 10/05/2015

1 4 Sdate 11/09/2015

1 5 Sdate 12/11/2015

2 1 rdate 06/21/2015

2 2 rdate 07/05/2015

2 3 rdate 07/09/2015

2 4 Sdate 06/21/2016

2 5 Sdate 07/05/2016

2 6 Sdate 07/09/2016

;

run;

Im trying to subtract the last rdate  ( which is email 2 in this case) from first sdate ( email 3 in this case) for example if the diff between rdate and sdate = 130 . ( 4 from the proc format ) this means a 4 rows will be inserted to the table . and will be evenly distributed 130/4+1.

below is the final desire output .

any ideas how to approach that ?

;

Hi

Below is an example of a data I have and what im trying to achieve.

data have;

input ID Email $ type $ edate mmddyy10. ;

format edate mmddyy10.;

datalines;

1 1 rdate 05/20/2015

1 2 rdate 05/21/2015

1 3 Formula 06/26/2015

1 4 Formula 08/02/2015

1 5 Formula 09/07/2015

1 6 Formula 10/13/2015

1 7 Sdate 11/18/2015

1 8 Sdate 11/30/2015

1 9 Sdate 12/11/2015

2 1 rdate 06/21/2015

2 2 rdate 07/05/2015

2 3 rdate 07/31/2015

2 3 Formula 08/21/2015

2 3 Formula 09/11/2015

2 4 Sdate 10/01/2016

2 5 Sdate 11/10/2015

2 6 Sdate 01/09/2016

;

run;

Contributor
Posts: 53

Re: insert rows based on output from subtracting two rows in the same column

I am no SAS expert and i'm sure the real experts on this sight can make this better.  I would love some suggestions but this is what i came up with.  There is currently no logic to increment your email numbers up but i wasn't sure that was necessary

proc format ;

value formula

    low - 15 =0

       16-40 =1

        41-70=2

       71-100=3

      101-140=4

      141-175=5

      176-225=6

      226-300=7

      301-high=8

;

run;

data have;

input ID Email $ type $ edate mmddyy10. ;

format edate mmddyy10.;

datalines;

1 1 rdate 05/20/2015

1 2 rdate 05/21/2015

1 3 Sdate 10/05/2015

1 4 Sdate 11/09/2015

1 5 Sdate 12/11/2015

2 1 rdate 06/21/2015

2 2 rdate 07/05/2015

2 3 rdate 07/09/2015

2 4 Sdate 06/21/2016

2 5 Sdate 07/05/2016

2 6 Sdate 07/09/2016

;

run;

/*Sort have dataset*/

proc sort data=have;

    by id email type edate;

run;

data want;

    /*merge have dataset to itself with on incremented forward on record*/

    merge have

          have(firstobs=2 rename=(id=_nxt_id email=_nxt_email

                                type=_nxt_type edate=_nxt_edate));

  

    /*test if current and next record are same id and expected types*/

    if id=_nxt_id and type = 'rdate' and _nxt_type = 'Sdate' then do;

        output; /*ouput rdate observations*/

        /*calculate the interval between the rdate and sdate records*/

        /*then get formula count from the format*/

        _interval = intck('days', edate, _nxt_edate);

        _form_cnt = put(_interval, formula.);

      

        type = 'formula';

        /*loop through and create formula observations based on above calc's*/

        do i=1 to _form_cnt;

            edate = intnx('days',edate, _interval/(_form_cnt+1));

            output; /*output formula observations*/

        end;

    end;

    else output; /*output the Sdate observations*/

    drop _: i; /*drop temp variables and loop variable*/

run;

Regular Contributor
Posts: 180

Re: insert rows based on output from subtracting two rows in the same column

You can recreate the email sequence with the following additional Data step:

data want(rename=seq=email);

  set want(drop=email);

  by ID;

  retain seq;

  if first.id then seq=1;

     else seq=seq+1;

run;

CTorres

Valued Guide
Posts: 858

Re: insert rows based on output from subtracting two rows in the same column

Is it only this instance, or is it every rdate/sdate combination?

Valued Guide
Posts: 858

Re: insert rows based on output from subtracting two rows in the same column

I think this is what you are looking for but I"m not sure:

data want;

set have;

by type notsorted;

format l_edate mmddyy10.;

l_edate = lag(edate);

if first.type and type = 'Sdate' then do;

diff_date = edate - l_edate;

end;

run;

Ask a Question
Discussion stats
  • 4 replies
  • 337 views
  • 0 likes
  • 4 in conversation