Subtracting Variables between two Rows based on other similar variables

Reply
Regular Learner
Posts: 1

Subtracting Variables between two Rows based on other similar variables

Hello.  I am working with customer data that looks like this:

 

Order_id Customer_Idorder_dateProduct_idQuantity
1c11/23/2014P110
2c11/23/2016P14
3c22/5/2015P26
4c26/6/2015P22

 

I would like to find customers whose quantity is decreasing over time based on product_ID and Customer_ID.  My final table would like this: 

 

CustomerProduct_IDN orderN-1 OrderDifference
C1P11046
C2P2624

 

My original idea is to use Proc SQL but I am having trouble figuring out how to the combine the rows based on customer and procuct id.   Can anyone provide some help?

PROC Star
Posts: 653

Re: Subtracting Variables between two Rows based on other similar variables

A data step solution could be

 

data have;
input Order_id$ Customer_Id$ order_date:mmddyy10. Product_id$ Quantity;
format order_date mmddyy10.;
datalines; 
1 c1 1/23/2014 P1 10 
2 c1 1/23/2016 P1 4 
3 c2 2/5/2015 P2 6 
4 c2 6/6/2015 P2 2 
;

proc sort data = have;
	by Customer_Id Product_id;
run;

data want;
	format Customer_Id Product_id lag_N_Order N_Order Difference;
	set have;
	by Customer_Id Product_id;

	lag_N_Order = lag1(Quantity);
	N_Order = Quantity;
	Difference = lag_N_Order - N_Order;

	keep Customer_Id Product_id N_Order lag_N_Order Difference;
	if last.Customer_Id & last.Product_id & Difference>0;
run;
Super User
Posts: 7,433

Re: Subtracting Variables between two Rows based on other similar variables

First, sort by customer, product_id and order_date.

Then, in a data step, use

data want;
set have;
by customer product_id;
n_order = quantity;
n_order_prev = lag(quantity);
difference = n_order_prev - n_order;
if not first.product_id then output;
drop quantity;
run;

The lag() function retrieves the previous value of a certain variable, it works as a FIFO chain.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
PROC Star
Posts: 7,433

Re: Subtracting Variables between two Rows based on other similar variables

Same approach (i.e., data step), but I don't think you need to create extra variables:

data decreases;
  set have;
  by customer_id Product_id;
  if quantity lt lag(quantity) and first.Product_id eq 0 then output;
run;

Art, CEO, AnalystFinder.com

Ask a Question
Discussion stats
  • 3 replies
  • 158 views
  • 1 like
  • 4 in conversation