Help using Base SAS procedures

Proc SQL Query for 1st, 2nd and last purchase dates

Reply
Frequent Contributor
Posts: 101

Proc SQL Query for 1st, 2nd and last purchase dates

Hi all,

 

I have a table with customer transaction history and I am trying to write a query which brings back these variables:

 

  • Customer’s First Purchase date
  • Customer’s Second Purchase date
  • Customer’s Last Purchase Date

So, I start with a table (transaction) which looks like this:

 

Customer_ID transaction_date
124 6/14/2017
656 6/14/2017
124 4/17/2017
656 7/23/2016
656 4/17/2016
124 3/15/2016
124 12/20/2014
656 4/16/2014
333 8/5/2013
454 11/5/2012
656 1/20/2011
454 10/22/2010
233 4/30/2010

 

And I am looking for a result which looks like this:

 

Customer_ID First_purchase_Date Second_purchase_Date last_purchase_date
124 12/20/2014 3/15/2016 6/14/2017
333 8/5/2013   8/5/2013
656 1/20/2011 4/16/2014 6/14/2017
233 4/30/2010   4/30/2010
454 10/22/2010 11/5/2012 11/5/2012

 

I have to query the data from an Oracle Environment. I was able to figure out how to write a query which pulls in the first purchase date (see below ) but I’m having trouble figuring how to incorporate/add/pull the second purchase date and the last purchase date variables.    

 

proc sql;
connect to oracle;
create table   want   as
select
customer_id,
datepart(First_purchase_Date) as First_purchase_Date format=date9.
from connection to oracle (
select
t.customer_id,
t.transaction_date AS First_purchase_Date
from transaction t
  inner join  
(select   
customer_id,
min(transaction_date) as First_purchase_Date
from sas_transaction 
where   sales_credit in (1) 
and    business_unit in (3)         
group by customer_id
having max(case when sales_credit in (1) 
and transaction_date >= '01JAN2010'
and transaction_date  <= '20JUN2017'   then 1 else 0 end)=1 ) f     
on   (t.customer_id = f.customer_id 
and   t.transaction_date = f.First_purchase_Date)
where t.business_unit in (3)
and t.sales_credit in (1)  
group by t.First_purchase_Date, t.transaction_date  );
disconnect from oracle;
quit;

 

Any assistance will be greatly aperciated. 

 

Thanks! 

 

 

Trusted Advisor
Posts: 1,929

Re: Proc SQL Query for 1st, 2nd and last purchase dates

Posted in reply to RobertNYC

It has to be done via PROC SQL???

 

This can be done via a very simple data step after the data is extracted from Oracle.

Frequent Contributor
Posts: 101

Re: Proc SQL Query for 1st, 2nd and last purchase dates

Posted in reply to PaigeMiller

Hi PaigeMiller, 

 

It’s preferable for it to be done in the SQL statement but if you have a way to do it using a data step I’m totally game.

 

Thanks!

Trusted Advisor
Posts: 1,929

Re: Proc SQL Query for 1st, 2nd and last purchase dates

[ Edited ]
Posted in reply to RobertNYC

My experience with extracting data from relational databases like Oracle is that you are better off doing math like this (and also means, sums, counts, standard deviations, etc.) outside of SQL. Better off? Yes, in terms of speed, in terms of code simplicity, as well as probably for other reasons.

 

UNTESTED CODE (assumes data is sorted by customer and date)

data first second last;
    set oracle_extract;
    by customer date;
    if first.customer then purchase_count=0;
    purchase_count+1;
    if purchase_count=1 then output first;
    else if purchase_count=2 then output second;
    if last.customer then output last;
run;

Once you have the three output data sets, FIRST, SECOND and LAST, it's easy to merge them to produce the final data set you described above.

 

PROC Star
Posts: 325

Re: Proc SQL Query for 1st, 2nd and last purchase dates

[ Edited ]
Posted in reply to RobertNYC

It is much better off doing this in datastep and code by @PaigeMiller is much cleaner and I think will be efficient too. But you may able to get the answer by sql with below code. Sorry I just saw that you are running this query in an explicit pass through and sql query I gave will not in explicit pass through as it has sas compoenents in it. Below query should work in explicit pass through.

 

data abc;
input transaction_id transaction_date:mmddyy10.;
format transaction_date date9.;
datalines;
124 6/14/2017
656 6/14/2017
124 4/17/2017
656 7/23/2016
656 4/17/2016
124 3/15/2016
124 12/20/2014
656 4/16/2014
333 8/5/2013
454 11/5/2012
656 1/20/2011
454 10/22/2010
233 4/30/2010
;

proc sql;
select distinct bde.transaction_id,
       first_transaction_date ,
       second_transaction_date ,
     last_transaction_date 
 from 
(select a.transaction_id, first_transaction_date, last_transaction_date from 
(select transaction_id, transaction_date as first_transaction_date from abc
group by transaction_id
having transaction_date = min(transaction_date))a
inner join
(select transaction_id, transaction_date as last_transaction_date from abc
group by transaction_id
having transaction_date = max(transaction_date))b
on a.transaction_id =b.transaction_id)bde
left join
(select a.transaction_id, min(a.transaction_date) as second_transaction_date from
(select transaction_id, transaction_date  from
abc)a
inner join
(select transaction_id, min(transaction_date)as min_trasaction_date  from abc b
group by transaction_id)b
on a.transaction_id = b.transaction_id
and a.transaction_date > min_trasaction_date)cvcv
on bde.transaction_id =cvcv.transaction_id;

 

 

Ask a Question
Discussion stats
  • 4 replies
  • 283 views
  • 1 like
  • 3 in conversation