BookmarkSubscribeRSS Feed
ihtishamsultan
Obsidian | Level 7

Hello,

 

I am a newbie trying to learn and had a query.

 

I am trying to identify new products being ordered by  customers.  I want to check if a customer has placed a order(say in 2016 and 2017) ,did he/she order the same product in 2014-2015 or is it something new to him/her.  If it is new then I want to have the count of new ordered products(how many products new to the customer were ordered), the count of customers ordering new product and sales man id associated with order.

 

Every item purchased has a unique ID Prod_id, Customer is represented by Customer_id, date of order is Order_date and sales man is represented by sales_id.

 

Any help will be highly appreciated.

16 REPLIES 16
mkeintz
PROC Star

Help us help you.  Show what the incoming data looks like. Show what you want the result data  to look like.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ihtishamsultan
Obsidian | Level 7

 

incoming data is sales data for a store from 2014-2017.

 

Cust_id, order_date and sales id are character variables. Prod_id is numeric variable.

 

I want the results in a tabular form and to only include products that were new to customer(not ordered in the previous two years)

 

for example

 

Cust_id            prod_id                sales_id       order_date

 

A345                   74                      S231             09/09/2017

 

 

I am sorry, I am new to all this, is this what was expected ??

VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

you could do this using Proc SQL and a union statement for customer Id and product.

if you would like a sample please post sample data that represents your case with more than one sample.  While we can create a lot of sample data using your example is best because the solutions will fit your case study.

 

 

ihtishamsultan
Obsidian | Level 7
Salesman_nameProd_idOrder_dateCust_nameSales_idCust_idPaymode_ID
john2333/27/2017Bill.NS1922J2311
john2353/27/2017Bill.NS1922J2311
john2443/27/2017Bill.NS1922J2311
john2453/27/2017Bill.NS1922J2311
Bail2523/27/2017SusanS1750S191
Bail2573/27/2017SusanS1750S191
Chuck2623/27/2017AreebS2011A1161
Chuck2663/27/2017AreebS2011A1161
Chuck4193/27/2017AreebS2011A1161
Chuck4203/27/2017AreebS2011A1161
phoung4283/28/2017MichelleS1872M1033
Chuck4313/28/2017LizS2011L1032
Chuck4343/28/2017LizS2011L1032
Christy4433/28/2017LaxmiS1881L722
Christy4463/28/2017LaxmiS1881L722
john4553/29/2017JamesS1922J243
john4573/29/2017JamesS1922J243
john4603/29/2017JamesS1922J243
john5963/29/2017JamesS1922J243
phoung6033/29/2017RanaS1872R1891

 

Couldn't copy the whole things, the data set has 53 variables and around 300,000 rows for 4 years.

 

 

VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

You did not supply data that would meet the selection criteria you have listed.

I created extra records to confirm this works.  But you will need to create the order_year our of the order_date for the logic to work.

proc sql;
	create table want as
	select a.*, b.order_date as prior_order_date
	from have2 as a, have2 as b
	where a.salesman_name = b.salesman_name 
		  and a.prod_id = b.prod_id 
		  and b.order_year in (2014,2015) 
		  and a.order_year in (2016,2017);
quit;

Note this in not using a union but works for your case study.

ihtishamsultan
Obsidian | Level 7

Thanks,

 

but when I run the code it says work.have created with 0 rows and 8 columns

 please I am also attaching a sample data set if you can have a look

Reeza
Super User

Post the code and log from what you tried as well. 

 

ihtishamsultan
Obsidian | Level 7

95 proc sql;

96 create table want as

97 select a.*, b.order_date as prior_order_date

98 from work.a as a, work.a as b

99 where a.Cust_id = b.Cust_id

100 and a.Prod_id = b.Prod_id

101 and b.order_date in (2014,2015)

102 and a.order_date in (2016,2017);

NOTE: Table WORK.WANT created, with 0 rows and 8 columns.

103 quit;

NOTE: PROCEDURE SQL used (Total process time):

real time 0.00 seconds

cpu time 0.00 seconds

mkeintz
PROC Star

If the data are sorted by date,  or by date within cust_id then this works:

 

data want;
  set have;
  if _n_=1 then do;
    declare hash h ();
      h.definekey('cust_id','prod_id');
      h.definedone();
  end;
  set have;
  if date<'01jan2016'd then h.add();
  if date>='01jan2016'd and h.find()^=0;
run;

At the first incoming observation, this program instantiates a hash object (a "lookup table" in this case) keyed on cust_id/prod_id.  The object only is populated (one record at a time) by instances in which date < '01jan2016'.

 

 

Then when later dates are encountered, the h.find method looks to see if the incoming cust_id/prod_id is already in the object (indicating a pre-2016 purchase).  If it is not, the find method returns a non-zero.  These are the cases kept for output.

 

I repeat, this depends on records being sorted by date, or by date within cust_id.

 

If the data is not sorted, then either it has to be sorted and submitted to the code above, or more complicated coding could be written for a single data step.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ihtishamsultan
Obsidian | Level 7

When I try it, it gives me the observations which took place after 2016. Irrespective of it being repeated in the previous year.

 

I am trying to exclude products ordered by a customers more than once.

 

Say for example

in the attached excel sheet there are 10 products(highlighted in yellow) new to the customer(not ordered in their previous encounters). I want  those rows to be shown in the results

 

 

VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

sort you data by customer Id and purchase date, prior to doing the proc sql process;

then those sql selections will work.

 

ihtishamsultan
Obsidian | Level 7
Still no luck, table created with 0 observations and 7 rows


311 proc sort data=work.a;

312 by order_date Cust_id;run;

NOTE: There were 108 observations read from the data set WORK.A.

NOTE: The data set WORK.A has 108 observations and 7 variables.

NOTE: PROCEDURE SORT used (Total process time):

real time 0.00 seconds

cpu time 0.00 seconds

313 proc sql;

314 create table S as

315 select a.*, b.order_date as prior_order_date

316 from work.a as a, work.a as b

317 where a.Cust_id = b.Cust_id

318 and a.prod_id = b.prod_id

319 and b.order_date in (2014,2015)

320 and a.order_date in (2016,2017);

NOTE: Table WORK.S created, with 0 rows and 8 columns.

321 quit;

NOTE: PROCEDURE SQL used (Total process time):

real time 0.00 seconds

cpu time 0.00 seconds

VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

you need to obtain the year part from your date field.  your dates are in the format of mmddyy, where you need to test the year not the month a day parts of the date.

 

 

ihtishamsultan
Obsidian | Level 7

 

 

 

 

Converted dates into year4. format, still no luck. I am sorry, I am naive at this doing it for the first time. 

 

379 data year;

380 set work.a;

381 format order_date year4.;

382 run;

NOTE: There were 108 observations read from the data set WORK.A.

NOTE: The data set WORK.YEAR has 108 observations and 7 variables.

NOTE: DATA statement used (Total process time):

real time 0.00 seconds

cpu time 0.01 seconds

 

383 proc sort data=work.year;

384 by order_date cust_id;run;

NOTE: There were 108 observations read from the data set WORK.YEAR.

NOTE: The data set WORK.YEAR has 108 observations and 7 variables.

NOTE: PROCEDURE SORT used (Total process time):

real time 0.00 seconds

cpu time 0.03 seconds

 

385 proc sql;

386 create table japan as

387 select a.*, b.order_date as prior_order_date

388 from work.year as a, work.year as b

389 where a.Cust_id = b.Cust_id

390 and a.Prod_id = b.Prod_id

391 and b.order_date in (2015)

392 and a.order_date in (2017);

NOTE: Table WORK.JAPAN created, with 0 rows and 8 columns.

393 quit;

NOTE: PROCEDURE SQL used (Total process time):

real time 0.00 seconds

cpu time 0.00 seconds

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 16 replies
  • 4185 views
  • 0 likes
  • 4 in conversation