[Sorry for double post, not sure if best here or in Data Step forum]
Hello, I have a data set with two sorts of “attributes” (probably not using term in strict database sense).
I have a perminant attribute of the unique identifier (for example gender). I have multiple records per identifier which describe a transaction (like an item purchased).
“ID” – unique identifier
“Sex” – gender of ID person
“Product” – what was purchased (beer, diapers, pencils)
Purch_dt – date of purchase
1234 F beer Nov 1, 2008
1234 F diapers Nov 24, 2008
5678 M beer Nov 17, 2008
5678 M pencils Nov 30, 2008
I want to
- count the number of people who purchased both beer and diapers
- count the number of people who purchased beer and no diapers
- determine the number of women who purchased beer and diapers
- for those who purchased both, determine the time lag between the purchase of beer and diapers
I have not written successful code but had two different thoughts (but am willing to do anything that will work).
I tried proc SQL “group by ID” but my instinct in English would to say … by ID where Product = beer and Product = diapers but I do not believe this is correct.
I tried a do loop by ID and flag if Product = beer and if Product = diapers then do, by unique IDs count flags but this was cumbersome and while the code did something I am not sure it was correct.