I am doing a time series analysis for ticket created over time and would like to monitor consecutive concurrent decrease month over month for our clients. Below is my original data set:
Open Date |
| |
N of Clients | Sum of Tickets | |
01NOV14:00:00:00 | 1070 | 3530 |
01DEC14:00:00:00 | 1448 | 8486 |
01JAN15:00:00:00 | 1821 | 23580 |
01FEB15:00:00:00 | 1796 | 20648 |
01MAR15:00:00:00 | 1756 | 22872 |
01APR15:00:00:00 | 1789 | 21601 |
01MAY15:00:00:00 | 1746 | 20059 |
01JUN15:00:00:00 | 1757 | 12946 |
01JUL15:00:00:00 | 1780 | 22874 |
01AUG15:00:00:00 | 1799 | 24045 |
01SEP15:00:00:00 | 1855 | 27344 |
01OCT15:00:00:00 | 1777 | 23124 |
01NOV15:00:00:00 | 1682 | 17861 |
01DEC15:00:00:00 | 1868 | 19191 |
If there is a one month decrease in sum of tickets, then clients will receive a score of 0.9. If there are a two-month decrease in sum of tickets, then clients will receive a score of 0.7 and so on until the score is 0.1. When there is a one month increase in sum of tickets, the score should return to the prior level. The new data should look like this:
Open Date |
|
| |
N of Clients | Sum of Tickets | Consecutive | |
01NOV14:00:00:00 | 1070 | 3530 |
|
01DEC14:00:00:00 | 1448 | 8486 | 1 |
01JAN15:00:00:00 | 1821 | 23580 | 1 |
01FEB15:00:00:00 | 1796 | 20648 | 0.9 |
01MAR15:00:00:00 | 1756 | 22872 | 1 |
01APR15:00:00:00 | 1789 | 21601 | 0.9 |
01MAY15:00:00:00 | 1746 | 20059 | 0.7 |
01JUN15:00:00:00 | 1757 | 12946 | 0.5 |
01JUL15:00:00:00 | 1780 | 22874 | 0.7 |
01AUG15:00:00:00 | 1799 | 24045 | 0.9 |
01SEP15:00:00:00 | 1855 | 27344 | 1 |
01OCT15:00:00:00 | 1777 | 23124 | 0.9 |
01NOV15:00:00:00 | 1682 | 17861 | 0.7 |
01DEC15:00:00:00 | 1868 | 19191 | 0.9 |
I have tried different step by using do or LAG but can not code the consecutive score correct. My score always returns to 1 instead of the prior level when there is an increase. Can somebody help me on this programming? Thanks a lot!
I don't get the same values as you, but I also don't quite see your logic with your third record.
Here's what I have, perhaps you can modify it to your logic?
data have;
informat date anydtdtm. sum total score 8.;
format date datetime21.;
input date sum total score @@;
cards;
01NOV14:00:00:00
1070
3530
0
01DEC14:00:00:00
1448
8486
1
01JAN15:00:00:00
1821
23580
1
01FEB15:00:00:00
1796
20648
0.9
01MAR15:00:00:00
1756
22872
1
01APR15:00:00:00
1789
21601
0.9
01MAY15:00:00:00
1746
20059
0.7
01JUN15:00:00:00
1757
12946
0.5
01JUL15:00:00:00
1780
22874
0.7
01AUG15:00:00:00
1799
24045
0.9
01SEP15:00:00:00
1855
27344
1
01OCT15:00:00:00
1777
23124
0.9
01NOV15:00:00:00
1682
17861
0.7
01DEC15:00:00:00
1868
19191
0.9
;
run;
proc format;
value order
1 = 1
2 = 0.9
3 = 0.7
4 = 0.5
5 = 0.3
6 = 0.1
;
run;
data want;
set have;
retain order 1;
lag_sum=lag(sum);
if _n_ ne 1 then do;
if sum>lag_sum then order=max(order-1, 1);
else order+1;
end;
value=put(order, order.);
run;
Lag + retain are what you should be using. Can you post the code you tried?
Hi Reeza,
Thank you for your input. Yes, forgot to mention that I also used retain. Below please forgive my ugly code:
retain consecutive 1;
do consecutive= 1;
if LAG(Sum_of_Tickets)>0;
do consecutive= 0.9;
if LAG(Sum_of_Tickets)<0;
do consecutive= 0.7;
if LAG(Sum_of_Tickets)<0;
do consecutive= 0.5;
if LAG(Sum_of_Tickets)<0;
do consecutive= 0.3;
if LAG(Sum_of_Tickets)<0;
do consecutive= 0.1;
if LAG(Sum_of_Tickets)<0;
end;
end;
end;
end;
end;
end;
run;
I don't get the same values as you, but I also don't quite see your logic with your third record.
Here's what I have, perhaps you can modify it to your logic?
data have;
informat date anydtdtm. sum total score 8.;
format date datetime21.;
input date sum total score @@;
cards;
01NOV14:00:00:00
1070
3530
0
01DEC14:00:00:00
1448
8486
1
01JAN15:00:00:00
1821
23580
1
01FEB15:00:00:00
1796
20648
0.9
01MAR15:00:00:00
1756
22872
1
01APR15:00:00:00
1789
21601
0.9
01MAY15:00:00:00
1746
20059
0.7
01JUN15:00:00:00
1757
12946
0.5
01JUL15:00:00:00
1780
22874
0.7
01AUG15:00:00:00
1799
24045
0.9
01SEP15:00:00:00
1855
27344
1
01OCT15:00:00:00
1777
23124
0.9
01NOV15:00:00:00
1682
17861
0.7
01DEC15:00:00:00
1868
19191
0.9
;
run;
proc format;
value order
1 = 1
2 = 0.9
3 = 0.7
4 = 0.5
5 = 0.3
6 = 0.1
;
run;
data want;
set have;
retain order 1;
lag_sum=lag(sum);
if _n_ ne 1 then do;
if sum>lag_sum then order=max(order-1, 1);
else order+1;
end;
value=put(order, order.);
run;
Hi Reeza,
Thank you for your logic. Learnt a lot from it!
I love this question. data have; input date : $20. N Sum; cards; 01NOV14:00:00:00 1070 3530 01DEC14:00:00:00 1448 8486 01JAN15:00:00:00 1821 23580 01FEB15:00:00:00 1796 20648 01MAR15:00:00:00 1756 22872 01APR15:00:00:00 1789 21601 01MAY15:00:00:00 1746 20059 01JUN15:00:00:00 1757 12946 01JUL15:00:00:00 1780 22874 01AUG15:00:00:00 1799 24045 01SEP15:00:00:00 1855 27344 01OCT15:00:00:00 1777 23124 01NOV15:00:00:00 1682 17861 01DEC15:00:00:00 1868 19191 ; run; data temp; set have; sign=sign(dif(sum)); run; data want; set temp; array x{6} _temporary_ (0.1 0.3 0.5 0.7 0.9 1); retain index 6; if _n_=1 then score=.; else do; index+sign; if index gt 6 then index=6; else if index lt 1 then index=1; score=x{index}; end; drop sign index; run;
Thank you so much for your input. That works out very well!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.