I have a dataset that looks like the one below. The date column is sorted. I’m trying to create a table which will summarize
• If the “first” purchase in 2005 is product ‘2’ and then
• How many units of products 7 & 9 were bought by the person (ID) over the following years
I tried using Proc Summary. The problem is that there are multiple cases in which the person bought products 2 along with something else on their first purchase in 2005 (hence the multiple units & cost entries in certain rows). I have put them with commas to make a distinction. However, my dataset has them in the same box and separated by a space.
ID date product units cost($) trip
1 1/28/2005 2 1 20 1
1 2/3/2005 99 1 24 2
1 2/4/2007 9 1 24 3
1 4/13/2008 8 1 30 6
1 7/27/2008 8 1 30 7
1 11/9/2009 8 1 30 8
2 7/31/2005 7 1 25 1
2 3/20/2006 9 1 24 2
2 1/12/2007 9 1 14 3
2 2/6/2008 9 1 40 4
2 10/16/2009 9 1 19 5
3 3/24/2005 2,9 1,1 40 1
3 10/18/2006 4 3 45 2
3 5/6/2007 2 1 39 3
3 5/7/2007 99 1 39 4
3 4/17/2005 2,9 3,2 75,120 1
3 4/23/2005 7, 9 1,1 25,40 2