Desktop productivity for business analysts and programmers

row difference for dates and other columns while assigning zero in the first row of distinct records

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 87
Accepted Solution

row difference for dates and other columns while assigning zero in the first row of distinct records

[ Edited ]

I have date like the one below

 

MachineDateMiles
HP000110DEC071885462.00000
HP000112DEC071885469.00000
HP000130APR091885473.00000
HP000115JUL091886577.00000
HP000116JUL091887667.00000
HP000117JUL091889181.00000
HP000117JUL091888825.00000
HP000118JUL091889621.00000
HP000120JUL091891363.00000
HP000121JUL091892398.00000
HP000122JUL091893513.00000
HP000215DEC071985462.00000
HP000217DEC071985462.00000
HP000228APR091985462.00000
HP000210JUL091986577.00000
HP000217JUL091987667.00000
HP000220JUL091989181.00000
HP000222JUL091988825.00000
HP000228JUL091989621.00000
HP000231JUL091991363.00000
HP000221AUG091992398.00000
HP000224AUG091993513.00000
HP000310NOV072085462.00000
HP000318NOV072085462.00000
HP000330FEB092085462.00000
HP000315JUN092086577.00000
HP000316JUL092087667.00000
HP000317AUG092089181.00000
HP000310SEP092088825.00000
HP000318OCT092089621.00000
HP000330OCT092091363.00000
HP000321NOV092092398.00000
HP000322DEC092093513.00000

 

 

What I want to do is to calculate difference between dates and miles something like the sown in the table below:

 

 1. first row will have DaysDiff values as zero because that is first record of data for each machine

 2. similarly MilesDiff is zero in the first row, but that it is a difference between first row and second row data.

 

MachineDateMilesDaysDiffMilesDiff
HP000110DEC071885462.0000000
HP000112DEC071885469.0000029
HP000130APR091885473.00000 4
HP000115JUL091886577.00000  
HP000116JUL091887667.00000  
HP000117JUL091889181.00000  
HP000117JUL091888825.00000  
HP000118JUL091889621.00000  
HP000120JUL091891363.00000  
HP000121JUL091892398.00000  
HP000122JUL091893513.00000  
HP000215DEC071985462.0000000
HP000217DEC071985465.000002 3
HP000228APR091985468.00000  3
HP000210JUL091986577.00000  
HP000217JUL091987667.00000  
HP000220JUL091989181.00000  
HP000222JUL091988825.00000  
HP000228JUL091989621.00000  
HP000231JUL091991363.00000  
HP000221AUG091992398.00000  
HP000224AUG091993513.00000  
HP000310NOV072085462.0000000
HP000318NOV072085465.000008 3
HP000330FEB092085469.00000  6
HP000315JUN092086577.00000  
HP000316JUL092087667.00000  
HP000317AUG092089181.00000  
HP000310SEP092088825.00000  
HP000318OCT092089621.00000  
HP000330OCT092091363.00000  
HP000321NOV092092398.00000  
HP000322DEC092093513.00000  

 

I applied lag but it did not work and also for each row when Machine is changing, I am not able to put zero. I was trying something like this. I dont know how to check each machine type and then run the rest calculations.

 

data want;
set have;
by MachCode;
   DateDiff = lag( Date );
   MilesDiff = dif( MileageValue );
run;

 

Please help I am new to SAS and learning it.


Accepted Solutions
Solution
‎11-08-2016 06:02 AM
Super User
Posts: 1,227

Re: row difference for dates and other columns while assigning zero in the first row of distinct rec

[ Edited ]

For HP0001, 2rd line: (03OCT14) - (02OCT14) = 1 = one day difference is right.

                                   2900334 - 2899148 = 1186   - is correct

    why is in RED

 

For HP0002 - It seems that LAG function act in a different way than I toght,

maybe it is because I don't use LAG on firt.MachCode row.

 

Therefore I will use RETAIN instead LAG, to save data for the next observation:

 

data want;

 set have;

    by machCode;   /* assumed data is sorted by machine date */

          RETAIN lag_date lag_miles;

          DROP   lag_date lag_miles; 

           if first.machCode

              then do;                                  /* 1st row of a machine */

                      DaysDiff=0;

                      MilesDiff=0;

 

              end; 

              else do;

                  daysdiff = date - lag_date;            /* calculating the difference */

                  milesdiff = miles - lag_miles;

          end;

          lag_date = date;

          lag_miles = miles;

run;

View solution in original post


All Replies
Super User
Posts: 1,227

Re: row difference for dates and other columns while assigning zero in the first row of distinct rec

Use LAG function to get a variable value from previous row.

According to your post you want to do it per machine:

 

data want;

 set have;

    by machine;   /* assumed data is sorted by machine date */

           if first.machine then do; DaysDiff=0; MilesDiff=0; end;

           else do;

                  daysdiff = date - lag(date);

                  milesdiff = miles - lag(miles);

          end;

run;

 

Frequent Contributor
Posts: 87

Re: row difference for dates and other columns while assigning zero in the first row of distinct rec

[ Edited ]

Hello Shmuel,

 

Thanks for your efforts. Unfortunately, there is an error. In the second row, it is putting a dot (.) value, i think for NULL. and from third row it is calculating. I tried with _N_ = 1 as well, but whenever Machine is changing, both methods fail. Smiley Sad''

 

This is what your code is doing, when Machine changes form one to another.. It is not taking last recort for lag but second last row of the HP001 (i.e.  instead t - (t-1) it is calculating t - (t-2) ):

 

MachineDateMilesDateDiffMilesDiff
HP000101OCT142898108.0000011059
HP000102OCT142899148.0000011040
HP000103OCT142900334.0000011186
HP000217JAN08926384.0000000
HP000218JAN08926384.00000-2450-1973950
HP000228APR091237332.00000466310948
HP000229APR091238599.0000011267

 

Regards,

Manu

Super User
Posts: 1,227

Re: row difference for dates and other columns while assigning zero in the first row of distinct rec

[ Edited ]

Are dates converted to SAS dates using informat date7. ?

Is MilesDiff on second row is missing value too ?

Frequent Contributor
Posts: 87

Re: row difference for dates and other columns while assigning zero in the first row of distinct rec

I changed datepart(DateTimeStamp) to Date using date. formate.

MilesDiff is 0 in first row but in second row it is a dot . for HP0001 but for next machine HP0002 it is something t - (t-2), i explained that earlier.

 

 

Super User
Posts: 1,227

Re: row difference for dates and other columns while assigning zero in the first row of distinct rec

[ Edited ]

If next lines is your code ?

    data want;
      set have;
      by MachCode;
           DateDiff = lag( Date );                    /* this does not calculate the difference */
          MilesDiff = dif( MileageValue );                   /* even not this */
run;

 

than you have not used code as I post:

  

data want;

 set have;

    by machCode;   /* assumed data is sorted by machine date */

           if first.machCode then do; DaysDiff=0; MilesDiff=0; end;  /* 1st row of a machine */

           else do;

                  daysdiff = date - lag(date);            /* calculating the difference */

                  milesdiff = miles - lag(miles);

          end;

run;

Frequent Contributor
Posts: 87

Re: row difference for dates and other columns while assigning zero in the first row of distinct rec

[ Edited ]

I am sorry confusing you.

 

I used your code, excatly

 

data want;

 set have;

    by machCode;   /* assumed data is sorted by machine date */

           if first.machCode then do; DaysDiff=0; MilesDiff=0; end;  /* 1st row of a machine */

           else do;

                  daysdiff = date - lag(date);            /* calculating the difference */

                  milesdiff = miles - lag(miles);

          end;

run;

 

 

And what I got result is as:

 

 

MachineDateMilesDateDiffMilesDiff
HPUS000110DEC071885462.0000000
HPUS000112DEC071885462.00000..
HPUS000130APR091885462.000005050
HPUS000115JUL091886577.00000761115
HPUS000116JUL091887667.0000011090
HPUS000117JUL091889181.0000011514
HPUS000117JUL091888825.000000-356

.

.

.

.

.

MachineDateMilesDateDiffMilesDiff
HP000101OCT142898108.0000011059
HP000102OCT142899148.0000011040
HP000103OCT142900334.0000011186
HP000217JAN08926384.0000000
HP000218JAN08926384.00000-2450-1973950
HP000228APR091237332.00000466310948
HP000229APR091238599.0000011267

 

when machine changed, lag is taken as t - (t-2) and not as t - (t-1)

 

NOTE: Please ignore the data in the first three columns as they are different from the data posted in the original quesiton. I am concentrating only on results.

 

Thanks very much.

Esteemed Advisor
Posts: 6,641

Re: row difference for dates and other columns while assigning zero in the first row of distinct rec

For the lag() function to always have the value of the preceding observation, it must not be called conditionally, as it only puts data into its FIFO chain when it is called!

data want;
set have;
by machCode;   /* assumed data is sorted by machine date */
daysdiff = date - lag(date);            /* calculating the difference */
milesdiff = miles - lag(miles);
if first.machCode   /* 1st row of a machine */
then do;
  DaysDiff=0;
  MilesDiff=0;
end;
run;

If you don't want the NOTE about missing values created in the first iteration, use 2 additional variables to hold the pure value of the lag() calls temporarily and drop them.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Solution
‎11-08-2016 06:02 AM
Super User
Posts: 1,227

Re: row difference for dates and other columns while assigning zero in the first row of distinct rec

[ Edited ]

For HP0001, 2rd line: (03OCT14) - (02OCT14) = 1 = one day difference is right.

                                   2900334 - 2899148 = 1186   - is correct

    why is in RED

 

For HP0002 - It seems that LAG function act in a different way than I toght,

maybe it is because I don't use LAG on firt.MachCode row.

 

Therefore I will use RETAIN instead LAG, to save data for the next observation:

 

data want;

 set have;

    by machCode;   /* assumed data is sorted by machine date */

          RETAIN lag_date lag_miles;

          DROP   lag_date lag_miles; 

           if first.machCode

              then do;                                  /* 1st row of a machine */

                      DaysDiff=0;

                      MilesDiff=0;

 

              end; 

              else do;

                  daysdiff = date - lag_date;            /* calculating the difference */

                  milesdiff = miles - lag_miles;

          end;

          lag_date = date;

          lag_miles = miles;

run;

Super User
Posts: 1,227

Re: row difference for dates and other columns while assigning zero in the first row of distinct rec

About LAG meaning see next blog in this forum:

 

https://communities.sas.com/t5/Base-SAS-Programming/Lag-Logic/m-p/309971

Frequent Contributor
Posts: 87

Re: row difference for dates and other columns while assigning zero in the first row of distinct rec

[ Edited ]

Just a quick query... in the last column are the absolute values. I want to sum them up group by date. e.g. on 01Aug09 total will be 1707+733
similarly on 03AUG09 the total = 2261+487+974
and so forth. how to compute TOTAL?

For absolute values, we can tweak the code like this
DaysDiff = abs(Date - lag_date); /* calculating the difference */
MilesDiff = abs(miles - lag_miles);


HPUS0001 01AUG09 1903209.00000 1 1707 1707
HPUS0001 01AUG09 1902476.00000 0 -733 733
HPUS0001 03AUG09 1904737.00000 2 2261 2261
HPUS0001 03AUG09 1905224.00000 0 487 487
HPUS0001 03AUG09 1904250.00000 0 -974 974
HPUS0001 06AUG09 1906078.00000 3 1828 1828
HPUS0001 08AUG09 1907846.00000 2 1768 1768
HPUS0001 08AUG09 1906766.00000 0 -1080 1080
HPUS0001 10AUG09 1909027.00000 2 2261 2261
HPUS0001 10AUG09 1908540.00000 0 -487 487

Super User
Posts: 1,227

Re: row difference for dates and other columns while assigning zero in the first row of distinct rec

Adiing  TOTALs colomns, using same code with some changes and addon:

 

data tmp1 (drop=tot_days tot_miles)

        tmp2 (keep= machCode tot_days tot_miles)

    ;

 set have;

    by machCode;   /* assumed data is sorted by machine date */

          RETAIN lag_date lag_miles tot  tot_days tot_miles;

          DROP   lag_date lag_miles; 

           if first.machCode

              then do;                                  /* 1st row of a machine */

                      DaysDiff=0;

                      MilesDiff=0;

                      tot_days=0;

                      tot_miles=0;

              end; 

              else do;

                  daysdiff = date - lag_date;            /* calculating the difference */

                  milesdiff = miles - lag_miles;

                  tot_days+daysdiff;

                  tot_miles+abs(milesdiff);

          end;

          lag_date = date;

          lag_miles = miles;

          output tmp1;

         if last machCode then output tmp2;

run;

 

data want;

  merge tmp1 tmp2;

       by machCode;

run;

Frequent Contributor
Posts: 87

Re: row difference for dates and other columns while assigning zero in the first row of distinct rec

Thank you so much. It worked. Smiley Happy

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 269 views
  • 3 likes
  • 3 in conversation