DATA Step, Macro, Functions and more

How to identify the first record which satisfy certain criteria?

Reply
Occasional Learner
Posts: 1

How to identify the first record which satisfy certain criteria?

[ Edited ]

Hi all, 

 

I'm not a very advanced SAS user and only use pretty basic coding. But now I need to use do loop to meet certain criteria. Specifically, I have a list of customers which are identified by their cus_id. Each customer has their purchase history during a month, with purchase dates and purchase amounts. Now I want to identify the purchase date on which each customer has reached or gone beyond $100 purchase amount. 

 

Data is as below: 

 

cust_id     purch_dt     purch_amt

1               1/1               100

1               1/3                30

1                1/29             200

2                1/5               20

2                1/15             65

2                 1/30            190

2                 1/31             10

...

 

For cust_id=1, I want to pick out 1/1, as  that's the date when cust_id=1 made total purchase of 100 and thus meets the criteria of "reach or beyond 100". For cust_id=2, I want to pick out 1/30 for that's the date cust_id=2 made total purchase of 275 and thus beyond 100...

 

After researching online, I found that I probably would need to use "if...then...else...then...end" and "do until" statements. I've composed a piece of code in SAS, but it has been running for 2 hours so there must be something wrong. 

 

data x; 

set y; 

by cust_id purch_dt; 

retain cust_id purch_dt purch_amt; 

if fist.cust_id then do; 

cnt=1; 

total_purch_amt=purch_amt; 

end; 

else do (until total_purch_amt >=100); 

cnt=cnt+1; 

total_purch_amt=purch_amt+total_purch_amt; 

end; 

output; 

run; 

 

Can anyone please take a look at my code and kindly let me know where I could possibly go wrong? 

 

Thank you so much! Really appreciate that! 

 

PROC Star
Posts: 1,570

Re: How to identify the first record which satisfy certain criteria?

[ Edited ]
Posted in reply to CrystalSmile
data have;
input cust_id     purch_dt $    purch_amt ;
cards;
1               1/1               100
1               1/3                30
1                1/29             200
2                1/5               20
2                1/15             65
2                 1/30            190
2                 1/31             10
;

data want;
set have;
by cust_id;
if first.cust_id then do;_t=purch_amt;__t=0; end;
else _t+purch_amt;
if _t>=100 then __t+1;
if __t=1 then output;
drop _:;
run;

You wrote -"

 

For cust_id=1, I want to pick out 1/29, as  that's the date when cust_id=1 made total purchase of 330 and thus beyond 100" 

130 on 1/3 is beyond 100 fyi

 

In response to your edit, i edited 

if _t>=100 then __t+1;
PROC Star
Posts: 503

Re: How to identify the first record which satisfy certain criteria?

Posted in reply to CrystalSmile

one way to do in sql is 

 

data have;
input cust_id     purch_dt $    purch_amt ;
cards;
1               1/1               100
1               1/3                30
1               1/9                130
1                1/29             200
2                1/5               20
2                1/15             65
2                1/17             165
2                 1/30            190
2                 1/31             10
;

proc sql;
select * from have
where purch_amt ge 100
group by cust_id
having purch_amt = min(purch_amt);
PROC Star
Posts: 1,570

Re: How to identify the first record which satisfy certain criteria?

@kiranv_ Your Sql approach will require a tweak besides the fact it is not a one pass solution on account of having to remerge and then filer groups. Well, let's suppose the OP is ok with the remerge aka extra pass as SQL seems syntactically easier and neat with no need for a presort.

Consider the following scenario where there are ties in purch_amt 

1 1/1 100
1 1/2 100 /*ties*/

Your code will return both the records, of course you can add a very simple fix like:

proc sql;
create table want as
select * from have
where purch_amt ge 100
group by cust_id
having purch_amt = min(purch_amt) and purch_dt=min(purch_dt);
quit;

The idea is that OP wants to get the earliest record of a transaction made amount to 100 $. Just an intuition.  Thank you!

 

 

Esteemed Advisor
Posts: 5,479

Re: How to identify the first record which satisfy certain criteria?

Posted in reply to CrystalSmile

Assuming your data is sorted by cust_id and purchase date:

 

data have;
input cust_id     purch_dt $    purch_amt ;
cards;
1               1/1               100
1               1/3                30
1                1/29             200
2                1/5               20
2                1/15             65
2                 1/30            190
2                 1/31             10
;

data want;
do until (last.cust_id);
    set have; by cust_id;
    if total < 100 then do;
        total = sum(total, purch_amt);
        if total >= 100 then output;
        end;
    end;
run;

proc print data=want noobs; run;
PG
Ask a Question
Discussion stats
  • 4 replies
  • 134 views
  • 0 likes
  • 4 in conversation