Data Input;
input Acct_num Tran_Amt:dollar6. Tran_DT:mmddyy10. Item_ID:$3.;
format tran_Dt yymmdd10.;
datalines;
111 $100 6/1/2014 AAA
111 $200 6/2/2014 AAA
111 $500 6/4/2014 ABC
111 $500 6/10/2014 AAA
111 $100 6/11/2014 CCC
222 $200 6/4/2014 AAA
222 $500 6/4/2014 AAA
222 $600 6/4/2014 AAA
222 $600 6/11/2014 ABC
222 $600 6/19/2014 ABC
222 $600 6/28/2014 CCC
222 $600 6/29/2014 AAA
;
run;
Proc sort data = input;
by acct_num Tran_Dt Item_ID;
run;
data output(drop =PrevItemID PrevTranDt);
set input;
by acct_num Tran_dt Item_ID;
PrevItemID=lag1(Item_ID);
PrevTranDt = lag1(Tran_DT);
if first.acct_num=1 and first.Item_ID then
FirstTimeItemID=1;
else if PrevItemID = Item_ID AND PrevTranDt = Tran_Dt then
FirstTimeItemID = 1;
else if PrevItemID = Item_ID AND PrevTranDt ne Tran_Dt then
FirstTimeItemID = 0;
else if PrevItemID ne Item_ID then
FirstTimeItemID = 1;
;
run;
Infrequent poster and a bit of a novice, so all help is appreciated! I need to determine if for a given acct_num, if that's the first time that acct_num has purchased an item. What I have works... almost. It fails when an acct_num purchases an item, purchases a different item, then purchases the original item again. If items are purchased on the same day, and it's the first time purchased, they should all be marked as FirstTimeItemID = 1.
I knew this code was "close", but had this limitation, but now it's become an issue. I tried, unsuccessfully, to use an array to store items previously purchased by acct_id, then trying to create a new array at each change in acct_num. I'm not sure if I implemented wrong or if it's possible. I've never used an array so I was flying blind with internet searches.
Sample data and current code is below as well as the results and a column with a "want" to show what I'd like the results to be.
Thanks in advance.
How big is your data?
Do you want a flag for every person, item?
Your sort needs to be by account number, item and then date.
Proc sort data = input;
by acct_num Item_ID tran_dt;
run;
data want;
set input;
by acct_num item_id;
first_flag=0;
if first.item_id then first_flag=1;
run;
Maybe this will work for you?
Data Input;
input Acct_num Tran_Amt:dollar6. Tran_DT:mmddyy10. Item_ID:$3.;
format tran_Dt yymmdd10.;
datalines;
111 $100 6/1/2014 AAA
111 $200 6/2/2014 AAA
111 $500 6/4/2014 ABC
111 $500 6/10/2014 AAA
111 $100 6/11/2014 CCC
222 $200 6/4/2014 AAA
222 $500 6/4/2014 AAA
222 $600 6/4/2014 AAA
222 $600 6/11/2014 ABC
222 $600 6/19/2014 ABC
222 $600 6/28/2014 CCC
222 $600 6/29/2014 AAA
;
run;
proc sql;
CREATE TABLE want AS
SELECT *, (tran_dt = min(tran_dt)) AS first_time_flag
FROM input
GROUP BY acct_num, item_id
ORDER BY acct_num, tran_dt;
quit;
With that many records, I'd go with Reeza's solution, which will do what you want more efficiently.
How big is your data?
Do you want a flag for every person, item?
Your sort needs to be by account number, item and then date.
Proc sort data = input;
by acct_num Item_ID tran_dt;
run;
data want;
set input;
by acct_num item_id;
first_flag=0;
if first.item_id then first_flag=1;
run;
Thank you! That worked, it seemed really unintuitive...Intuitively I thought tran_dt would need sorted before item_id. I really appreciate the help!
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.