Hi all,
First of all I must thankful to everyone there who kept helping me to learn SAS fast. Almost a month ago, I was zero with SAS but now, at least I am feeling confident and have grown my love towards it. 🙂
Now, to my problem...
Suppose I have the following dataset:
data have;
infile cards truncover expandtabs;
input MC $ ET $ Date :Date9. Time :time8. Dia KMs;
format Date date9. Time time8.;
cards;
US0001 Lath 12JAN15 7:00 890.8 2900334.00000
US000328 Lath 13JUL15 0:00 890.2 2232830.00000
US000328 Lath 02JAN16 0:00 885.9 2310299.00000
US000328 Lath 13JUN16 0:00 881.4 2450616.00000
US000329 Lath 12JAN15 0:00 885.7 2235626.00000
US000329 Lath 30MAY15 0:00 880.6 2373070.00000
US000329 Lath 14JAN16 0:00 872.6 2558987.00000
US000329 Lath 05AUG16 0:00 863.1 2740207.00000
US000331 Lath 13JUN15 0:00 860.7 2689105.00000
US000331 Lath 12JUN16 0:00 860.7 2730534.00000
US000334 Lath 02JUL15 0:00 859.6 3034370.00000
US000334 Lath 24JAN16 0:00 853.2 3230211.00000
US000334 Lath 27AUG16 0:00 845.9 3411050.00000
US000335 Lath 17MAR15 0:00 847.3 2183273.00000
US000335 Lath 15JAN16 0:00 847.3 2258321.00000
US000335 Lath 05AUG16 0:00 841.9 2438931.00000
US000336 Lath 16MAR15 0:00 853.6 2365989.00000
US000336 Lath 02MAY15 0:00 850.6 2405547.00000
US000336 Lath 28JAN16 0:00 846.3 2522017.00000
US000336 Lath 14MAR16 0:00 839.2 2554822.00000
;
run;
What I want to do, is that for each MC above:
1. Calculate Dia difference in such a way that if there is only one row, keep the dia otherwise DiaDiff = Dia in current row - Dia in previous row (please Note: for practice, I used Case When Else option here, but you still can be better than me, I am sure)
2. If DiaDiff = Dia in that row, Dia_Flag = 0, otherwise =1
3. Difference between the Days in two rows. If first row, DaysDiff = 0, otherwise DaysDiff = Date in current row - Date in previous row
4. Similary for KMs Difference. If first row, KM_Diff = 0, otherwise KM_Diff = KMs in current row - KMs in previous row
And result would look like something below:
| MC | ET | Date | Time | Dia | KMs | DiaDiff | Dia_Flag | DaysDiff | KM_Diff |
| US0001 | Lath | 12Jan2015 | 7:00:00 | 890,8 | 2900334 | 890,8 | 0 | 0 | 0 |
| US000328 | Lath | 13Jul2015 | 0:00:00 | 890,2 | 2232830 | 890,2 | 0 | 0 | 0 |
| US000328 | Lath | 02Jan2016 | 0:00:00 | 885,9 | 2310299 | 4,3 | 1 | 173 | 77469 |
| US000328 | Lath | 13Jun2016 | 0:00:00 | 881,4 | 2450616 | 4,5 | 1 | 163 | 140317 |
| US000329 | Lath | 12Jan2015 | 0:00:00 | 885,7 | 2235626 | 885,7 | 0 | 0 | 0 |
| US000329 | Lath | 30May2015 | 0:00:00 | 880,6 | 2373070 | 5,1 | 1 | 138 | 137444 |
| US000329 | Lath | 14Jan2016 | 0:00:00 | 872,6 | 2558987 | 8 | 1 | 229 | 185917 |
| US000329 | Lath | 05Aug2016 | 0:00:00 | 863,1 | 2740207 | 9,5 | 1 | 204 | 181220 |
| US000331 | Lath | 13Jun2015 | 0:00:00 | 860,7 | 2689105 | 860,7 | 0 | 0 | 0 |
| US000331 | Lath | 12Jun2016 | 0:00:00 | 860,7 | 2730534 | 0 | 0 | 365 | 41429 |
| US000334 | Lath | 02Jul2015 | 0:00:00 | 859,6 | 3034370 | 1,1 | 1 | 0 | 0 |
| US000334 | Lath | 24Jan2016 | 0:00:00 | 853,2 | 3230211 | 6,4 | 1 | 206 | 195841 |
| US000334 | Lath | 27Aug2016 | 0:00:00 | 845,9 | 3411050 | 7,3 | 1 | 216 | 180839 |
| US000335 | Lath | 17Mar2015 | 0:00:00 | 847,3 | 2183273 | 847,3 | 0 | 0 | 0 |
| US000335 | Lath | 15Jan2016 | 0:00:00 | 847,3 | 2258321 | 0 | 0 | 304 | 75048 |
| US000335 | Lath | 05Aug2016 | 0:00:00 | 841,9 | 2438931 | 5,4 | 1 | 203 | 180610 |
| US000336 | Lath | 16Mar2015 | 0:00:00 | 853,6 | 2365989 | 853,6 | 0 | 0 | 0 |
| US000336 | Lath | 02May2015 | 0:00:00 | 850,6 | 2405547 | 3 | 1 | 47 | 39558 |
| US000336 | Lath | 28Jan2016 | 0:00:00 | 846,3 | 2522017 | 4,3 | 1 | 271 | 116470 |
| US000336 | Lath | 14Mar2016 | 0:00:00 | 839,2 | 2554822 | 7,1 | 1 | 46 | 32805 |
Currently I am learning loops and hence this case.
Thanks.
Use the lag() function and first. condition to reset for every first observation in a group
data want;
set have;
by MC;
diadiff = dia - lag(dia);
daysdiff = date - lag(date);
km_diff = kms - lag(kms);
if first.MC
then do;
diadiff = dia;
daysdiff = 0;
km_diff = 0;
end;
run;
If you want your difference values to always be positive, use the abs() function.
Hi,
Whilst I don't have time to do all that for you, some tips to get you started:
Use a datastep with a by group.
Use lag
E.g.
data want;
set have;
by mc;
retain diadiff dia_flag daysdiff km_diff;
if first.mc then do;
diadiff=dia;
dia=0;
...;
end;
else do;
dia_flag=1;
daysdiff=date-lag(date);
...;
end;
run;
Note, I will repeat, its not a good idea to call variables type names e.g. date or time.
Use the lag() function and first. condition to reset for every first observation in a group
data want;
set have;
by MC;
diadiff = dia - lag(dia);
daysdiff = date - lag(date);
km_diff = kms - lag(kms);
if first.MC
then do;
diadiff = dia;
daysdiff = 0;
km_diff = 0;
end;
run;
If you want your difference values to always be positive, use the abs() function.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.