I am doing a time series analysis for ticket created over time and would like to monitor consecutive concurrent decrease month over month for our clients. Below is my original data set:
Open Date |
| |
N of Clients | Sum of Tickets | |
01NOV14:00:00:00 | 1070 | 3530 |
01DEC14:00:00:00 | 1448 | 8486 |
01JAN15:00:00:00 | 1821 | 23580 |
01FEB15:00:00:00 | 1796 | 20648 |
01MAR15:00:00:00 | 1756 | 22872 |
01APR15:00:00:00 | 1789 | 21601 |
01MAY15:00:00:00 | 1746 | 20059 |
01JUN15:00:00:00 | 1757 | 12946 |
01JUL15:00:00:00 | 1780 | 22874 |
01AUG15:00:00:00 | 1799 | 24045 |
01SEP15:00:00:00 | 1855 | 27344 |
01OCT15:00:00:00 | 1777 | 23124 |
01NOV15:00:00:00 | 1682 | 17861 |
01DEC15:00:00:00 | 1868 | 19191 |
If there is a one month decrease in sum of tickets, then clients will receive a score of 0.9. If there are a two-month decrease in sum of tickets, then clients will receive a score of 0.7 and so on until the score is 0.1. When there is a one month increase in sum of tickets, the score should return to the prior level. The new data should look like this:
Open Date |
|
| |
N of Clients | Sum of Tickets | Consecutive | |
01NOV14:00:00:00 | 1070 | 3530 |
|
01DEC14:00:00:00 | 1448 | 8486 | 1 |
01JAN15:00:00:00 | 1821 | 23580 | 1 |
01FEB15:00:00:00 | 1796 | 20648 | 0.9 |
01MAR15:00:00:00 | 1756 | 22872 | 1 |
01APR15:00:00:00 | 1789 | 21601 | 0.9 |
01MAY15:00:00:00 | 1746 | 20059 | 0.7 |
01JUN15:00:00:00 | 1757 | 12946 | 0.5 |
01JUL15:00:00:00 | 1780 | 22874 | 0.7 |
01AUG15:00:00:00 | 1799 | 24045 | 0.9 |
01SEP15:00:00:00 | 1855 | 27344 | 1 |
01OCT15:00:00:00 | 1777 | 23124 | 0.9 |
01NOV15:00:00:00 | 1682 | 17861 | 0.7 |
01DEC15:00:00:00 | 1868 | 19191 | 0.9 |
I have tried different step by using do or LAG but can not code the consecutive score correct. My score always returns to 1 instead of the prior level when there is an increase. Can somebody help me on this programming? Thanks a lot!
I don't get the same values as you, but I also don't quite see your logic with your third record.
Here's what I have, perhaps you can modify it to your logic?
data have;
informat date anydtdtm. sum total score 8.;
format date datetime21.;
input date sum total score @@;
cards;
01NOV14:00:00:00
1070
3530
0
01DEC14:00:00:00
1448
8486
1
01JAN15:00:00:00
1821
23580
1
01FEB15:00:00:00
1796
20648
0.9
01MAR15:00:00:00
1756
22872
1
01APR15:00:00:00
1789
21601
0.9
01MAY15:00:00:00
1746
20059
0.7
01JUN15:00:00:00
1757
12946
0.5
01JUL15:00:00:00
1780
22874
0.7
01AUG15:00:00:00
1799
24045
0.9
01SEP15:00:00:00
1855
27344
1
01OCT15:00:00:00
1777
23124
0.9
01NOV15:00:00:00
1682
17861
0.7
01DEC15:00:00:00
1868
19191
0.9
;
run;
proc format;
value order
1 = 1
2 = 0.9
3 = 0.7
4 = 0.5
5 = 0.3
6 = 0.1
;
run;
data want;
set have;
retain order 1;
lag_sum=lag(sum);
if _n_ ne 1 then do;
if sum>lag_sum then order=max(order-1, 1);
else order+1;
end;
value=put(order, order.);
run;
Lag + retain are what you should be using. Can you post the code you tried?
Hi Reeza,
Thank you for your input. Yes, forgot to mention that I also used retain. Below please forgive my ugly code:
retain consecutive 1;
do consecutive= 1;
if LAG(Sum_of_Tickets)>0;
do consecutive= 0.9;
if LAG(Sum_of_Tickets)<0;
do consecutive= 0.7;
if LAG(Sum_of_Tickets)<0;
do consecutive= 0.5;
if LAG(Sum_of_Tickets)<0;
do consecutive= 0.3;
if LAG(Sum_of_Tickets)<0;
do consecutive= 0.1;
if LAG(Sum_of_Tickets)<0;
end;
end;
end;
end;
end;
end;
run;
I don't get the same values as you, but I also don't quite see your logic with your third record.
Here's what I have, perhaps you can modify it to your logic?
data have;
informat date anydtdtm. sum total score 8.;
format date datetime21.;
input date sum total score @@;
cards;
01NOV14:00:00:00
1070
3530
0
01DEC14:00:00:00
1448
8486
1
01JAN15:00:00:00
1821
23580
1
01FEB15:00:00:00
1796
20648
0.9
01MAR15:00:00:00
1756
22872
1
01APR15:00:00:00
1789
21601
0.9
01MAY15:00:00:00
1746
20059
0.7
01JUN15:00:00:00
1757
12946
0.5
01JUL15:00:00:00
1780
22874
0.7
01AUG15:00:00:00
1799
24045
0.9
01SEP15:00:00:00
1855
27344
1
01OCT15:00:00:00
1777
23124
0.9
01NOV15:00:00:00
1682
17861
0.7
01DEC15:00:00:00
1868
19191
0.9
;
run;
proc format;
value order
1 = 1
2 = 0.9
3 = 0.7
4 = 0.5
5 = 0.3
6 = 0.1
;
run;
data want;
set have;
retain order 1;
lag_sum=lag(sum);
if _n_ ne 1 then do;
if sum>lag_sum then order=max(order-1, 1);
else order+1;
end;
value=put(order, order.);
run;
Hi Reeza,
Thank you for your logic. Learnt a lot from it!
I love this question. data have; input date : $20. N Sum; cards; 01NOV14:00:00:00 1070 3530 01DEC14:00:00:00 1448 8486 01JAN15:00:00:00 1821 23580 01FEB15:00:00:00 1796 20648 01MAR15:00:00:00 1756 22872 01APR15:00:00:00 1789 21601 01MAY15:00:00:00 1746 20059 01JUN15:00:00:00 1757 12946 01JUL15:00:00:00 1780 22874 01AUG15:00:00:00 1799 24045 01SEP15:00:00:00 1855 27344 01OCT15:00:00:00 1777 23124 01NOV15:00:00:00 1682 17861 01DEC15:00:00:00 1868 19191 ; run; data temp; set have; sign=sign(dif(sum)); run; data want; set temp; array x{6} _temporary_ (0.1 0.3 0.5 0.7 0.9 1); retain index 6; if _n_=1 then score=.; else do; index+sign; if index gt 6 then index=6; else if index lt 1 then index=1; score=x{index}; end; drop sign index; run;
Thank you so much for your input. That works out very well!
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.