05-22-2017 09:55 AM
Hello. I am working with customer data that looks like this:
I would like to find customers whose quantity is decreasing over time based on product_ID and Customer_ID. My final table would like this:
|Customer||Product_ID||N order||N-1 Order||Difference|
My original idea is to use Proc SQL but I am having trouble figuring out how to the combine the rows based on customer and procuct id. Can anyone provide some help?
05-22-2017 10:05 AM
A data step solution could be
data have; input Order_id$ Customer_Id$ order_date:mmddyy10. Product_id$ Quantity; format order_date mmddyy10.; datalines; 1 c1 1/23/2014 P1 10 2 c1 1/23/2016 P1 4 3 c2 2/5/2015 P2 6 4 c2 6/6/2015 P2 2 ; proc sort data = have; by Customer_Id Product_id; run; data want; format Customer_Id Product_id lag_N_Order N_Order Difference; set have; by Customer_Id Product_id; lag_N_Order = lag1(Quantity); N_Order = Quantity; Difference = lag_N_Order - N_Order; keep Customer_Id Product_id N_Order lag_N_Order Difference; if last.Customer_Id & last.Product_id & Difference>0; run;
05-22-2017 10:06 AM
First, sort by customer, product_id and order_date.
Then, in a data step, use
data want; set have; by customer product_id; n_order = quantity; n_order_prev = lag(quantity); difference = n_order_prev - n_order; if not first.product_id then output; drop quantity; run;
The lag() function retrieves the previous value of a certain variable, it works as a FIFO chain.
05-22-2017 10:26 AM
Same approach (i.e., data step), but I don't think you need to create extra variables:
data decreases; set have; by customer_id Product_id; if quantity lt lag(quantity) and first.Product_id eq 0 then output; run;
Art, CEO, AnalystFinder.com