Hi all,
First of all I must thankful to everyone there who kept helping me to learn SAS fast. Almost a month ago, I was zero with SAS but now, at least I am feeling confident and have grown my love towards it. 🙂
Now, to my problem...
Suppose I have the following dataset:
data have;
infile cards truncover expandtabs;
input MC $ ET $ Date :Date9. Time :time8. Dia KMs;
format Date date9. Time time8.;
cards;
US0001 Lath 12JAN15 7:00 890.8 2900334.00000
US000328 Lath 13JUL15 0:00 890.2 2232830.00000
US000328 Lath 02JAN16 0:00 885.9 2310299.00000
US000328 Lath 13JUN16 0:00 881.4 2450616.00000
US000329 Lath 12JAN15 0:00 885.7 2235626.00000
US000329 Lath 30MAY15 0:00 880.6 2373070.00000
US000329 Lath 14JAN16 0:00 872.6 2558987.00000
US000329 Lath 05AUG16 0:00 863.1 2740207.00000
US000331 Lath 13JUN15 0:00 860.7 2689105.00000
US000331 Lath 12JUN16 0:00 860.7 2730534.00000
US000334 Lath 02JUL15 0:00 859.6 3034370.00000
US000334 Lath 24JAN16 0:00 853.2 3230211.00000
US000334 Lath 27AUG16 0:00 845.9 3411050.00000
US000335 Lath 17MAR15 0:00 847.3 2183273.00000
US000335 Lath 15JAN16 0:00 847.3 2258321.00000
US000335 Lath 05AUG16 0:00 841.9 2438931.00000
US000336 Lath 16MAR15 0:00 853.6 2365989.00000
US000336 Lath 02MAY15 0:00 850.6 2405547.00000
US000336 Lath 28JAN16 0:00 846.3 2522017.00000
US000336 Lath 14MAR16 0:00 839.2 2554822.00000
;
run;
What I want to do, is that for each MC above:
1. Calculate Dia difference in such a way that if there is only one row, keep the dia otherwise DiaDiff = Dia in current row - Dia in previous row (please Note: for practice, I used Case When Else option here, but you still can be better than me, I am sure)
2. If DiaDiff = Dia in that row, Dia_Flag = 0, otherwise =1
3. Difference between the Days in two rows. If first row, DaysDiff = 0, otherwise DaysDiff = Date in current row - Date in previous row
4. Similary for KMs Difference. If first row, KM_Diff = 0, otherwise KM_Diff = KMs in current row - KMs in previous row
And result would look like something below:
MC | ET | Date | Time | Dia | KMs | DiaDiff | Dia_Flag | DaysDiff | KM_Diff |
US0001 | Lath | 12Jan2015 | 7:00:00 | 890,8 | 2900334 | 890,8 | 0 | 0 | 0 |
US000328 | Lath | 13Jul2015 | 0:00:00 | 890,2 | 2232830 | 890,2 | 0 | 0 | 0 |
US000328 | Lath | 02Jan2016 | 0:00:00 | 885,9 | 2310299 | 4,3 | 1 | 173 | 77469 |
US000328 | Lath | 13Jun2016 | 0:00:00 | 881,4 | 2450616 | 4,5 | 1 | 163 | 140317 |
US000329 | Lath | 12Jan2015 | 0:00:00 | 885,7 | 2235626 | 885,7 | 0 | 0 | 0 |
US000329 | Lath | 30May2015 | 0:00:00 | 880,6 | 2373070 | 5,1 | 1 | 138 | 137444 |
US000329 | Lath | 14Jan2016 | 0:00:00 | 872,6 | 2558987 | 8 | 1 | 229 | 185917 |
US000329 | Lath | 05Aug2016 | 0:00:00 | 863,1 | 2740207 | 9,5 | 1 | 204 | 181220 |
US000331 | Lath | 13Jun2015 | 0:00:00 | 860,7 | 2689105 | 860,7 | 0 | 0 | 0 |
US000331 | Lath | 12Jun2016 | 0:00:00 | 860,7 | 2730534 | 0 | 0 | 365 | 41429 |
US000334 | Lath | 02Jul2015 | 0:00:00 | 859,6 | 3034370 | 1,1 | 1 | 0 | 0 |
US000334 | Lath | 24Jan2016 | 0:00:00 | 853,2 | 3230211 | 6,4 | 1 | 206 | 195841 |
US000334 | Lath | 27Aug2016 | 0:00:00 | 845,9 | 3411050 | 7,3 | 1 | 216 | 180839 |
US000335 | Lath | 17Mar2015 | 0:00:00 | 847,3 | 2183273 | 847,3 | 0 | 0 | 0 |
US000335 | Lath | 15Jan2016 | 0:00:00 | 847,3 | 2258321 | 0 | 0 | 304 | 75048 |
US000335 | Lath | 05Aug2016 | 0:00:00 | 841,9 | 2438931 | 5,4 | 1 | 203 | 180610 |
US000336 | Lath | 16Mar2015 | 0:00:00 | 853,6 | 2365989 | 853,6 | 0 | 0 | 0 |
US000336 | Lath | 02May2015 | 0:00:00 | 850,6 | 2405547 | 3 | 1 | 47 | 39558 |
US000336 | Lath | 28Jan2016 | 0:00:00 | 846,3 | 2522017 | 4,3 | 1 | 271 | 116470 |
US000336 | Lath | 14Mar2016 | 0:00:00 | 839,2 | 2554822 | 7,1 | 1 | 46 | 32805 |
Currently I am learning loops and hence this case.
Thanks.
Use the lag() function and first. condition to reset for every first observation in a group
data want;
set have;
by MC;
diadiff = dia - lag(dia);
daysdiff = date - lag(date);
km_diff = kms - lag(kms);
if first.MC
then do;
diadiff = dia;
daysdiff = 0;
km_diff = 0;
end;
run;
If you want your difference values to always be positive, use the abs() function.
Hi,
Whilst I don't have time to do all that for you, some tips to get you started:
Use a datastep with a by group.
Use lag
E.g.
data want; set have; by mc; retain diadiff dia_flag daysdiff km_diff; if first.mc then do; diadiff=dia; dia=0; ...; end; else do; dia_flag=1; daysdiff=date-lag(date); ...; end; run;
Note, I will repeat, its not a good idea to call variables type names e.g. date or time.
Use the lag() function and first. condition to reset for every first observation in a group
data want;
set have;
by MC;
diadiff = dia - lag(dia);
daysdiff = date - lag(date);
km_diff = kms - lag(kms);
if first.MC
then do;
diadiff = dia;
daysdiff = 0;
km_diff = 0;
end;
run;
If you want your difference values to always be positive, use the abs() function.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.