Hi ,
This is the dataset I have . I have one date of which a book was purchased. Each customer has a unique Id (shown as customer_id below) . So for each customer have a purchase date. I want to separate the purchase date into three separate group.
This is what I want. For each customer Id, the first date is the same as the purchase_date, so everyone has a purchase date. And their subsequent dates after the purchase is recording in second date and third if they have any. So for example custome_id T002 made purchases twice, one on 09/05/2019 which will be the first date and the second purchase was on 05/07/2021 that's the second date . Since this particular customer doesn't make a third purchase, third_date is recorded as NA.
1. Sort the data by Customer ID and Purchase Date
proc sort data=have;
by Customer_ID purchase_date;
run;
2. Add a number to indicate the purchase order
data purchaseOrder;
set have;
by Customer_ID purchase_date;
if first.customer_id then order=1;
else order+1;
run;
3. Transpose it to a wide format
proc transpose data=purchaseOrder out=purchaseWide prefix=Purchase;
by CustomerID;
id order;
var purchase_date;
run;
Your columns will be called Purchase1, Purchase2, Purchase3 instead of first_date, second_date, etc.
@nab102 wrote:
Hi ,
This is the dataset I have . I have one date of which a book was purchased. Each customer has a unique Id (shown as customer_id below) . So for each customer have a purchase date. I want to separate the purchase date into three separate group.
This is what I want. For each customer Id, the first date is the same as the purchase_date, so everyone has a purchase date. And their subsequent dates after the purchase is recording in second date and third if they have any. So for example custome_id T002 made purchases twice, one on 09/05/2019 which will be the first date and the second purchase was on 05/07/2021 that's the second date . Since this particular customer doesn't make a third purchase, third_date is recorded as NA.
How is it that customer T007 has no 2nd date but does have a 3rd date? That seems illogical.
1. Sort the data by Customer ID and Purchase Date
proc sort data=have;
by Customer_ID purchase_date;
run;
2. Add a number to indicate the purchase order
data purchaseOrder;
set have;
by Customer_ID purchase_date;
if first.customer_id then order=1;
else order+1;
run;
3. Transpose it to a wide format
proc transpose data=purchaseOrder out=purchaseWide prefix=Purchase;
by CustomerID;
id order;
var purchase_date;
run;
Your columns will be called Purchase1, Purchase2, Purchase3 instead of first_date, second_date, etc.
@nab102 wrote:
Hi ,
This is the dataset I have . I have one date of which a book was purchased. Each customer has a unique Id (shown as customer_id below) . So for each customer have a purchase date. I want to separate the purchase date into three separate group.
This is what I want. For each customer Id, the first date is the same as the purchase_date, so everyone has a purchase date. And their subsequent dates after the purchase is recording in second date and third if they have any. So for example custome_id T002 made purchases twice, one on 09/05/2019 which will be the first date and the second purchase was on 05/07/2021 that's the second date . Since this particular customer doesn't make a third purchase, third_date is recorded as NA.
You really don't need to add the ORDER variable.
PROC TRANSPOSE will number the variables sequentially automatically.
proc transpose data=have out=purchaseWide prefix=Purchase;
by CustomerID;
var purchase_date;
run;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.