Re: How to get product count which are new to customers?

ihtishamsultan · Posted 09-21-2018 09:10 PM

Hello,

I am a newbie trying to learn and had a query.

I am trying to identify new products being ordered by customers. I want to check if a customer has placed a order(say in 2016 and 2017) ,did he/she order the same product in 2014-2015 or is it something new to him/her. If it is new then I want to have the count of new ordered products(how many products new to the customer were ordered), the count of customers ordering new product and sales man id associated with order.

Every item purchased has a unique ID Prod_id, Customer is represented by Customer_id, date of order is Order_date and sales man is represented by sales_id.

Any help will be highly appreciated.

mkeintz · Posted 09-21-2018 09:15 PM

Help us help you. Show what the incoming data looks like. Show what you want the result data to look like.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

ihtishamsultan · Posted 09-21-2018 09:27 PM

incoming data is sales data for a store from 2014-2017.

Cust_id, order_date and sales id are character variables. Prod_id is numeric variable.

I want the results in a tabular form and to only include products that were new to customer(not ordered in the previous two years)

for example

Cust_id prod_id sales_id order_date

A345 74 S231 09/09/2017

I am sorry, I am new to all this, is this what was expected ??

VDD · Posted 09-21-2018 09:39 PM

you could do this using Proc SQL and a union statement for customer Id and product.

if you would like a sample please post sample data that represents your case with more than one sample. While we can create a lot of sample data using your example is best because the solutions will fit your case study.

ihtishamsultan · Posted 09-21-2018 10:27 PM

Salesman_name	Prod_id	Order_date	Cust_name	Sales_id	Cust_id	Paymode_ID
john	233	3/27/2017	Bill.N	S1922	J231	1
john	235	3/27/2017	Bill.N	S1922	J231	1
john	244	3/27/2017	Bill.N	S1922	J231	1
john	245	3/27/2017	Bill.N	S1922	J231	1
Bail	252	3/27/2017	Susan	S1750	S19	1
Bail	257	3/27/2017	Susan	S1750	S19	1
Chuck	262	3/27/2017	Areeb	S2011	A116	1
Chuck	266	3/27/2017	Areeb	S2011	A116	1
Chuck	419	3/27/2017	Areeb	S2011	A116	1
Chuck	420	3/27/2017	Areeb	S2011	A116	1
phoung	428	3/28/2017	Michelle	S1872	M103	3
Chuck	431	3/28/2017	Liz	S2011	L103	2
Chuck	434	3/28/2017	Liz	S2011	L103	2
Christy	443	3/28/2017	Laxmi	S1881	L72	2
Christy	446	3/28/2017	Laxmi	S1881	L72	2
john	455	3/29/2017	James	S1922	J24	3
john	457	3/29/2017	James	S1922	J24	3
john	460	3/29/2017	James	S1922	J24	3
john	596	3/29/2017	James	S1922	J24	3
phoung	603	3/29/2017	Rana	S1872	R189	1

Couldn't copy the whole things, the data set has 53 variables and around 300,000 rows for 4 years.

VDD · Posted 09-21-2018 11:14 PM

You did not supply data that would meet the selection criteria you have listed.

I created extra records to confirm this works. But you will need to create the order_year our of the order_date for the logic to work.

proc sql;
	create table want as
	select a.*, b.order_date as prior_order_date
	from have2 as a, have2 as b
	where a.salesman_name = b.salesman_name 
		  and a.prod_id = b.prod_id 
		  and b.order_year in (2014,2015) 
		  and a.order_year in (2016,2017);
quit;

Note this in not using a union but works for your case study.

ihtishamsultan · Posted 09-22-2018 07:05 PM

Thanks,

but when I run the code it says work.have created with 0 rows and 8 columns

please I am also attaching a sample data set if you can have a look

Reeza · Posted 09-22-2018 09:30 PM

Post the code and log from what you tried as well.

ihtishamsultan · Posted 09-22-2018 09:37 PM

95 proc sql;

96 create table want as

97 select a.*, b.order_date as prior_order_date

98 from work.a as a, work.a as b

99 where a.Cust_id = b.Cust_id

100 and a.Prod_id = b.Prod_id

101 and b.order_date in (2014,2015)

102 and a.order_date in (2016,2017);

NOTE: Table WORK.WANT created, with 0 rows and 8 columns.

103 quit;

NOTE: PROCEDURE SQL used (Total process time):

real time 0.00 seconds

cpu time 0.00 seconds

mkeintz · Posted 09-21-2018 11:34 PM

If the data are sorted by date, or by date within cust_id then this works:

data want;
  set have;
  if _n_=1 then do;
    declare hash h ();
      h.definekey('cust_id','prod_id');
      h.definedone();
  end;
  set have;
  if date<'01jan2016'd then h.add();
  if date>='01jan2016'd and h.find()^=0;
run;

At the first incoming observation, this program instantiates a hash object (a "lookup table" in this case) keyed on cust_id/prod_id. The object only is populated (one record at a time) by instances in which date < '01jan2016'.

Then when later dates are encountered, the h.find method looks to see if the incoming cust_id/prod_id is already in the object (indicating a pre-2016 purchase). If it is not, the find method returns a non-zero. These are the cases kept for output.

I repeat, this depends on records being sorted by date, or by date within cust_id.

If the data is not sorted, then either it has to be sorted and submitted to the code above, or more complicated coding could be written for a single data step.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

ihtishamsultan · Posted 09-22-2018 07:26 PM

When I try it, it gives me the observations which took place after 2016. Irrespective of it being repeated in the previous year.

I am trying to exclude products ordered by a customers more than once.

Say for example

in the attached excel sheet there are 10 products(highlighted in yellow) new to the customer(not ordered in their previous encounters). I want those rows to be shown in the results

VDD · Posted 09-23-2018 10:29 AM

sort you data by customer Id and purchase date, prior to doing the proc sql process;

then those sql selections will work.

ihtishamsultan · Posted 09-23-2018 11:14 AM

Still no luck, table created with 0 observations and 7 rows

311 proc sort data=work.a;

312 by order_date Cust_id;run;

NOTE: There were 108 observations read from the data set WORK.A.

NOTE: The data set WORK.A has 108 observations and 7 variables.

NOTE: PROCEDURE SORT used (Total process time):

real time 0.00 seconds

cpu time 0.00 seconds

313 proc sql;

314 create table S as

315 select a.*, b.order_date as prior_order_date

316 from work.a as a, work.a as b

317 where a.Cust_id = b.Cust_id

318 and a.prod_id = b.prod_id

319 and b.order_date in (2014,2015)

320 and a.order_date in (2016,2017);

NOTE: Table WORK.S created, with 0 rows and 8 columns.

321 quit;

NOTE: PROCEDURE SQL used (Total process time):

real time 0.00 seconds

cpu time 0.00 seconds

VDD · Posted 09-23-2018 11:38 AM

you need to obtain the year part from your date field. your dates are in the format of mmddyy, where you need to test the year not the month a day parts of the date.

ihtishamsultan · Posted 09-23-2018 12:22 PM

Converted dates into year4. format, still no luck. I am sorry, I am naive at this doing it for the first time.

379 data year;

380 set work.a;

381 format order_date year4.;

382 run;

NOTE: There were 108 observations read from the data set WORK.A.

NOTE: The data set WORK.YEAR has 108 observations and 7 variables.

NOTE: DATA statement used (Total process time):

real time 0.00 seconds

cpu time 0.01 seconds

383 proc sort data=work.year;

384 by order_date cust_id;run;

NOTE: There were 108 observations read from the data set WORK.YEAR.

NOTE: The data set WORK.YEAR has 108 observations and 7 variables.

NOTE: PROCEDURE SORT used (Total process time):

real time 0.00 seconds

cpu time 0.03 seconds

385 proc sql;

386 create table japan as

387 select a.*, b.order_date as prior_order_date

388 from work.year as a, work.year as b

389 where a.Cust_id = b.Cust_id

390 and a.Prod_id = b.Prod_id

391 and b.order_date in (2015)

392 and a.order_date in (2017);

NOTE: Table WORK.JAPAN created, with 0 rows and 8 columns.

393 quit;

NOTE: PROCEDURE SQL used (Total process time):

real time 0.00 seconds

cpu time 0.00 seconds

Registration is open

SAS Training: Just a Click Away