Hi Everyone. I'm relatively new to SAS Programming and would like suggestions on how to solve the following problem:
Lets say I have 6 months record of the total sales made by several salespeople.
I will like to know the highest number of consecutive months of sales increase for each of them, during the 6 month period.
Please see sample data below
Id Jan Feb Mar Apr May Jun
A 2 4 5 8 4 3
B 0 3 2 5 6 6
C 7 6 9 2 4 1
From the above sample date, the desired result is as follows:
For row A, output will be equal to 3 ( since the longest consecutive period of increase is from Jan to Apr i.e sales increased for 3 consecutive months.
For row B, output will be equal to 2 ( since the longest consecutive period of increase is from Mar to May i.e sales increased for 2 consecutive months)
For row C, output will be equal to 0 ( since the sales did not increase in any consecutive months)
I look forward to your suggestions.
data have ;
input id :$1. jan feb mar apr may jun ;
cards ;
A 2 4 5 8 4 3
B 0 3 2 5 6 6
C 7 6 9 2 4 1
;
data want ;
set have ;
array x{*} jan--jun ;
do i = 1 to dim (x)-1;
n=0;
do j = i+1 to dim (x) ;
if x{j}>x{j-1} then n+1;
else leave;
end;
want=max(want,n);
end;
drop i j n;
run;
Here is an attempt to achieve this.
Hope this help!
data have;
input Id $ Jan Feb Mar Apr May Jun;
datalines;
A 2 4 5 8 4 3
B 0 3 2 5 6 6
C 7 6 9 2 4 1
;
run;
proc transpose data=have out=have_tr (rename=(col1=Sales)) name=Month;
var _numeric_;
by Id;
run;
data have_tr2;
set have_tr;
by Id Month notsorted;
_lag = lag(Sales);
if first.Id then flag = 0;
else if sales > _lag then flag + 1;
else flag = 0;
run;
proc sql;
create table want as
select Id,
case when max(flag) > 1 then max(flag)
when max(flag) <= 1 then 0
end as Month_increase label='Consecutive month increase'
from have_tr2
group by Id;
quit;
Yes it definitely helps.
As pointed out by another user, output of row C is misidentified; desired output should be 1, not 0.
But I understand your solution perfectly. I will only need to exclude the case statement.
Thank you!
First, your case C is misindentified with respect to the output, as in this row there's 1 time increase from Feb (6) to Mar (9), so for this row the output should be 1. Other than that, the task readily yields to simple array processing:
data have ;
input id :$1. jan feb mar apr may jun ;
cards ;
A 2 4 5 8 4 3
B 0 3 2 5 6 6
C 7 6 9 2 4 1
;
data want (drop = _:) ;
set have ;
array mm [*] jan--jun ;
do _i = 2 to dim (mm) ;
if mm[_i] > mm[_i-1] then _q = sum (_q, 1) ;
else do ;
output = _q max output ;
call missing (_q) ;
end ;
end ;
run ;
Kind regards
Paul D.
Yes, you are right! Case C is misidentified with respect to the output. Output should be 1.
Your solution works perfectly, and is very neat.
I definitely need to study more on arrays. Perhaps you could suggest some resources.
Thanks you!
Hi Paul,
Upon further review, I realised that your solution doesn't fully solve my problem.
It doesn't return desired output for some scenarios (Please see new input below)
jan feb mar apr may jun
A 2 3 0 5 6 7
B 5 4 3 2 4 5
C 2 2 2 4 5 6
After running the programme, the outputs for A, B & C are 1, 'null' & 'null' respectively.
These are incorrect.
Output for A should be 3 (i.e. mar to jun)
Output for B should be 2 (i.e. apr to jun)
Output for C should be 3 (i.e. mar to jun)
data have ;
input id :$1. jan feb mar apr may jun ;
cards ;
A 2 4 5 8 4 3
B 0 3 2 5 6 6
C 7 6 9 2 4 1
;
data want ;
set have ;
array x{*} jan--jun ;
do i = 1 to dim (x)-1;
n=0;
do j = i+1 to dim (x) ;
if x{j}>x{j-1} then n+1;
else leave;
end;
want=max(want,n);
end;
drop i j n;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.