BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
damilaresamuel
Fluorite | Level 6

Hi Everyone. I'm relatively new to SAS Programming and would like suggestions on how to solve the following problem:

 

Lets say I have 6 months record of the total sales made by several salespeople.

I will like to know the highest number of consecutive months of sales increase for each of them, during the 6 month period.

 

Please see sample data below

 

Id      Jan   Feb    Mar    Apr   May  Jun 

A        2        4        5       8       4       3

B       0         3        2      5        6       6

C       7         6       9       2        4       1

 

From the above sample date, the desired result is as follows:

 

For row A, output will be equal to 3 ( since the longest consecutive period of increase is from Jan to Apr  i.e sales increased for 3 consecutive months.

 

For row B, output will be equal to 2 ( since the longest consecutive period of increase is from Mar to May i.e sales increased for 2 consecutive months)

 

For row C, output will be equal to 0 ( since the sales did not increase in any consecutive months)

 

 

I look forward to your suggestions.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have ;                                                                                                                             
  input id :$1. jan feb mar apr may jun ;                                                                                               
  cards ;                                                                                                                               
A  2  4  5  8  4  3                                                                                                                     
B  0  3  2  5  6  6                                                                                                                     
C  7  6  9  2  4  1                                                                                                                     
;                                                                                                                                       
                                                                                                                                        
data want  ;                                                                                                                 
  set have ;                                                                                                                            
  array x{*} jan--jun ;   
  do i = 1 to dim (x)-1;
     n=0;
     do j = i+1 to dim (x) ;
        if x{j}>x{j-1} then n+1;
         else leave;
     end;
     want=max(want,n);
  end;
drop i j n;
run;

View solution in original post

6 REPLIES 6
ed_sas_member
Meteorite | Level 14

Hi @damilaresamuel 

Here is an attempt to achieve this.

 

Hope this help!

data have;
	input Id $ Jan Feb Mar Apr May Jun;
	datalines;
A        2        4        5       8       4       3
B       0         3        2      5        6       6
C       7         6       9       2        4       1
;
run;

proc transpose data=have out=have_tr (rename=(col1=Sales)) name=Month;
	var _numeric_;
	by Id;
run;

data have_tr2;
	set have_tr;
	by Id Month notsorted;
	_lag = lag(Sales);
	if first.Id then flag = 0;
	else if sales > _lag then flag + 1;
	else flag = 0;
run;

proc sql;
	create table want as 
	select Id,
		   case when max(flag) > 1 then max(flag)
		        when max(flag) <= 1 then 0
		        end as Month_increase label='Consecutive month increase'
	from have_tr2
	group by Id;
quit;
damilaresamuel
Fluorite | Level 6

Yes it definitely helps.

 

As pointed out by another user, output of row C is misidentified; desired output should be 1, not 0.

But I understand your solution perfectly. I will only need to exclude the case statement.

 

Thank you!

hashman
Ammonite | Level 13

@damilaresamuel:

 

First, your case C is misindentified with respect to the output, as in this row there's 1 time increase from Feb (6) to Mar (9), so for this row the output should be 1. Other than that, the task readily yields to simple array processing:

data have ;                                                                                                                             
  input id :$1. jan feb mar apr may jun ;                                                                                               
  cards ;                                                                                                                               
A  2  4  5  8  4  3                                                                                                                     
B  0  3  2  5  6  6                                                                                                                     
C  7  6  9  2  4  1                                                                                                                     
;                                                                                                                                       
                                                                                                                                        
data want (drop = _:) ;                                                                                                                 
  set have ;                                                                                                                            
  array mm [*] jan--jun ;                                                                                                               
  do _i = 2 to dim (mm) ;                                                                                                               
    if mm[_i] > mm[_i-1] then _q = sum (_q, 1) ;                                                                                        
    else do ;                                                                                                                           
      output = _q max output ;                                                                                                          
      call missing (_q) ;                                                                                                               
    end ;                                                                                                                               
  end ;                                                                                                                                 
run ;           

Kind regards

Paul D. 

damilaresamuel
Fluorite | Level 6

Yes, you are right!  Case C is misidentified with respect to the output. Output should be 1.

Your solution works perfectly, and is very neat.

 

I definitely need to study more on arrays. Perhaps you could suggest some resources.

 

Thanks you! 

damilaresamuel
Fluorite | Level 6

Hi Paul,

 

Upon further review, I realised that your solution doesn't fully solve my problem.
It doesn't return desired output for some scenarios (Please see new input below)

 

      jan  feb  mar  apr  may  jun

A    2     3      0      5      6      7
B    5     4      3      2     4      5
C    2     2     2       4     5      6 

After running the programme, the outputs for A, B & C are 1, 'null' & 'null' respectively.
These are incorrect.

 

Output for A should be 3 (i.e. mar to jun) 

Output for B should be 2 (i.e. apr to jun)

Output for C should be 3 (i.e. mar to jun)

Ksharp
Super User
data have ;                                                                                                                             
  input id :$1. jan feb mar apr may jun ;                                                                                               
  cards ;                                                                                                                               
A  2  4  5  8  4  3                                                                                                                     
B  0  3  2  5  6  6                                                                                                                     
C  7  6  9  2  4  1                                                                                                                     
;                                                                                                                                       
                                                                                                                                        
data want  ;                                                                                                                 
  set have ;                                                                                                                            
  array x{*} jan--jun ;   
  do i = 1 to dim (x)-1;
     n=0;
     do j = i+1 to dim (x) ;
        if x{j}>x{j-1} then n+1;
         else leave;
     end;
     want=max(want,n);
  end;
drop i j n;
run;
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1841 views
  • 4 likes
  • 4 in conversation