Data Input;
input Acct_num Tran_Amt:dollar6. Tran_DT:mmddyy10. Item_ID:$3.;
format tran_Dt yymmdd10.;
datalines;
111 $100 6/1/2014 AAA
111 $200 6/2/2014 AAA
111 $500 6/4/2014 ABC
111 $500 6/10/2014 AAA
111 $100 6/11/2014 CCC
222 $200 6/4/2014 AAA
222 $500 6/4/2014 AAA
222 $600 6/4/2014 AAA
222 $600 6/11/2014 ABC
222 $600 6/19/2014 ABC
222 $600 6/28/2014 CCC
222 $600 6/29/2014 AAA
;
run;
Proc sort data = input;
by acct_num Tran_Dt Item_ID;
run;
data output(drop =PrevItemID PrevTranDt);
set input;
by acct_num Tran_dt Item_ID;
PrevItemID=lag1(Item_ID);
PrevTranDt = lag1(Tran_DT);
if first.acct_num=1 and first.Item_ID then
FirstTimeItemID=1;
else if PrevItemID = Item_ID AND PrevTranDt = Tran_Dt then
FirstTimeItemID = 1;
else if PrevItemID = Item_ID AND PrevTranDt ne Tran_Dt then
FirstTimeItemID = 0;
else if PrevItemID ne Item_ID then
FirstTimeItemID = 1;
;
run;
Infrequent poster and a bit of a novice, so all help is appreciated! I need to determine if for a given acct_num, if that's the first time that acct_num has purchased an item. What I have works... almost. It fails when an acct_num purchases an item, purchases a different item, then purchases the original item again. If items are purchased on the same day, and it's the first time purchased, they should all be marked as FirstTimeItemID = 1.
I knew this code was "close", but had this limitation, but now it's become an issue. I tried, unsuccessfully, to use an array to store items previously purchased by acct_id, then trying to create a new array at each change in acct_num. I'm not sure if I implemented wrong or if it's possible. I've never used an array so I was flying blind with internet searches.
Sample data and current code is below as well as the results and a column with a "want" to show what I'd like the results to be.
Thanks in advance.
How big is your data?
Do you want a flag for every person, item?
Your sort needs to be by account number, item and then date.
Proc sort data = input;
by acct_num Item_ID tran_dt;
run;
data want;
set input;
by acct_num item_id;
first_flag=0;
if first.item_id then first_flag=1;
run;
Maybe this will work for you?
Data Input;
input Acct_num Tran_Amt:dollar6. Tran_DT:mmddyy10. Item_ID:$3.;
format tran_Dt yymmdd10.;
datalines;
111 $100 6/1/2014 AAA
111 $200 6/2/2014 AAA
111 $500 6/4/2014 ABC
111 $500 6/10/2014 AAA
111 $100 6/11/2014 CCC
222 $200 6/4/2014 AAA
222 $500 6/4/2014 AAA
222 $600 6/4/2014 AAA
222 $600 6/11/2014 ABC
222 $600 6/19/2014 ABC
222 $600 6/28/2014 CCC
222 $600 6/29/2014 AAA
;
run;
proc sql;
CREATE TABLE want AS
SELECT *, (tran_dt = min(tran_dt)) AS first_time_flag
FROM input
GROUP BY acct_num, item_id
ORDER BY acct_num, tran_dt;
quit;
With that many records, I'd go with Reeza's solution, which will do what you want more efficiently.
How big is your data?
Do you want a flag for every person, item?
Your sort needs to be by account number, item and then date.
Proc sort data = input;
by acct_num Item_ID tran_dt;
run;
data want;
set input;
by acct_num item_id;
first_flag=0;
if first.item_id then first_flag=1;
run;
Thank you! That worked, it seemed really unintuitive...Intuitively I thought tran_dt would need sorted before item_id. I really appreciate the help!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.