BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ramgem
Calcite | Level 5

I'm launching a marketing campaign in which I grant every client that reached a transaction sum of 100$ on his credit card a free gift card,

The problem is that I need to choose the first 100 clients that reached this threshold, my data table include:

- client id

- transaction date

- transaction amount

Thank you in advance

1 ACCEPTED SOLUTION

Accepted Solutions
mohamed_zaki
Barite | Level 11

data client;

      input ID @3 Date date7. @12 Amount;

   datalines;

1  13may14 85

2  17oct14 120

3  03feb14 160

4  28feb14 80

5  10nov14 77

6  25apr14 1

7  03jun14 140

8  14jan14 900

9  24oct14 1245

10 27aug14 78

1  1may14 85

2  7oct14 120

3  13feb14 160

4  8feb14 5

5  1nov14 976

6  5apr14 88

7  23jun14 140

8  24jan14 900

9  14oct14 1245

10 2oct14 78

;

proc sort data=client;

by id date;

run;

data test;

set client;

retain sum 0;

retain f 0;

by id;

sum= sum + amount;

if first.id then do; sum=amount; f=0; end;

if (sum>100 and f = 0) then do; f=1;output; end;

else f= 0;

run;

proc sort data=test;

by date;

run;

data wanted (keep= ID Date Sum);

set test (obs=5);

run;

View solution in original post

15 REPLIES 15
mohamed_zaki
Barite | Level 11

1- Sort your data

2- Use OBS option with Where condition in Proc Print

Example:

data client;

      input ID @3 Date date7. @12 Amount;

   datalines;

1  13may14 85

2  17oct14 120

3  03feb14 160

4  28feb14 80

5  10nov14 976

6  25apr14 88

7  03jun14 140

8  14jan14 900

9  24oct14 1245

10 27aug14 78

;

proc sort data=client;

by Date;

run;

proc print data=Client (obs= 5);

where amount > 100;

run;

Ramgem
Calcite | Level 5

Thank you but maybe my question was not very clear, the data is as follows:

Client id           date                   amount

10001         20141215                 50

10002         20141215                 30

10003         20141215                 90

10001         20141216                 70

10002         20141216                 80

10003         20141216                 5

10003         20141217                 100

So if I have to choose 2 between these 3 clients I would be choosing 10001 and 10002, now consider this situation but with 100000 clients

mohamed_zaki
Barite | Level 11

data client;

      input ID @3 Date date7. @12 Amount;

   datalines;

1  13may14 85

2  17oct14 120

3  03feb14 160

4  28feb14 80

5  10nov14 77

6  25apr14 1

7  03jun14 140

8  14jan14 900

9  24oct14 1245

10 27aug14 78

1  1may14 85

2  7oct14 120

3  13feb14 160

4  8feb14 5

5  1nov14 976

6  5apr14 88

7  23jun14 140

8  24jan14 900

9  14oct14 1245

10 2oct14 78

;

proc sort data=client;

by id date;

run;

data test;

set client;

retain sum 0;

retain f 0;

by id;

sum= sum + amount;

if first.id then do; sum=amount; f=0; end;

if (sum>100 and f = 0) then do; f=1;output; end;

else f= 0;

run;

proc sort data=test;

by date;

run;

data wanted (keep= ID Date Sum);

set test (obs=5);

run;

Ramgem
Calcite | Level 5

Thank you

Ksharp
Super User

If I understood what you mean .

data client;

input Client_id           date     : yymmdd10.              amount ;

format date yymmddn8.;

cards;

10001         20141215                 50

10002         20141215                 30

10003         20141215                 90

10001         20141216                 70

10002         20141216                 80

10003         20141216                 5

10003         20141217                 100

;

proc sort data=client;

by Client_id           date ;

run;

%let n=2;

data want;

set client;

by Client_id;

retain found 0 ;

if first.Client_id then do;sum=0; found=0;end;

sum+amount;

if sum ge 100 and not found then do;found=1;n+1;output;end;

if n eq &n then stop;

run;

Xia Keshan

Message was edited by: xia keshan

Ramgem
Calcite | Level 5

Hello xia keshan and thank you for your answer, but what I need in my output table is 100 unique client id's this method displayed the same id many times with different dates.

Ksharp
Super User

Sorry my bad. Code updated .

Haikuo
Onyx | Level 15

This may be a Proc SQL alternative that does not require to presort the data. However, it takes toll on the efficiency for other reasons.

data client;

     input ID @3 Date date7. @12 Amount;

     datalines;

1 13may14 85

2 17oct14 120

3 03feb14 160

4 28feb14 80

5 10nov14 77

6 25apr14 1

7 03jun14 140

8 14jan14 900

9 24oct14 1245

10 27aug14 78

1 1may14 85

2 7oct14 120

3 13feb14 160

4 8feb14 5

5 1nov14 976

6 5apr14 88

7 23jun14 140

8 24jan14 900

9 14oct14 1245

10 2oct14 78

;

proc sql outobs=3;

     create table want as

           select id,

                case

                     when (select sum(amount) from client where id=a.id and date <= a.date)>=100 then date

                     else .

                end

           as _date format=date9.

                from client a

                     group by id

                           having _date=min(_date) and not missing (_date)

                                order by _date

     ;

quit;

stat_sas
Ammonite | Level 13

Although, Hai.kuo has suggested a sql solution. My try using proc sql:

proc sql outobs=2;

create table want as

select client_id from client

group by client_id

having date=max(date) and sum(amount)>=100

order by date;

quit;

Haikuo
Onyx | Level 15

Hmm. You don't have a chronological filtering mechanism (order by date is for the final output, doesn't count) in your code, so you  are not providing what OP asked for. 

stat_sas
Ammonite | Level 13

Hi Haikuo,

I think chronological filtering mechanism is required only for the final output after having maximum date with aggregated transaction amount >=100 for each client_id to generate the desired output.

Regards,

Haikuo
Onyx | Level 15

This maybe just due to different opinions or different takes on OP's question as :

"I'm launching a marketing campaign in which I grant every client that reached a transaction sum of 100$ on his credit card a free gift card,The problem is that I need to choose the first 100 clients that reached this threshold"

From what I understand, OP's request has 2 layers (OP please correct me if I am mistaken):

1) a client start aggregating transaction amounts until the date when $100 is reached, and this date is marked, because this would be the earliest date this particular client is eligible for the gift card award.

2).By the end of the promotion (or whatever cutoff date), the top 100 clients will be selected based on when they became eligible. "choose the first 100 clients that reached this threshold"

Let 's take a look at some beefed up mocked data:

data client;

      input ID @3 Date date7. @12 Amount;

datalines;

data client;

     input ID @3 Date date7. @12 Amount;

     datalines;

1 13may14 85

2 17oct14 120

3 03feb14 160

4 28feb14 80

5 10nov14 77

6 25apr14 1

7 03jun14 140

8 14jan14 900

9 24oct14 1245

10 27aug14 78

1 1may14 85

2 7oct14 120

3 13feb14 160

4 8feb14 5

5 1nov14 976

6 5apr14 88

7 23jun14 140

8 24jan14 900

9 14oct14 1245

10 2oct14 78

1 2may14 85

2 2oct14 120

3 23feb14 160

4 2feb14 5

5 2nov14 976

6 2apr14 88

7 3jun14 140

8 4jan14 900

9 24oct14 1245

10 22oct14 78

;

proc sql outobs=3;

create table want_stat as

select id, date format=date9. from client

group by id

having date=max(date) and sum(amount)>=100

order by date;

quit;

title "Stat's output";

Proc print;run;

proc sql outobs=3;

     create table want_haikuo as

           select id,

                case

                     when (select sum(amount) from client where id=a.id and date <= a.date)>=100 then date

                     else .

                end

           as _date format=date9.

                from client a

                     group by id

                           having _date=min(_date) and not missing (_date)

                                order by _date

     ;

quit;

title "Haikuo's output";

proc print;run;

Stat's output
ObsIDDate
1323-Feb-01
2924-Oct-01
3924-Oct-01
Haikuo's output
ObsID_date
1814-Jan-01
233-Feb-01
373-Jun-01

The dups in your output can be taken care of easily, but the date you are using concerns me. For instance ID=3, I used the date when ID=3 reaches $100, while you used the max date for all the dates available for ID=3. If you have worked in a Marketing department of a bank, you would have known this is unfair, doing so will put eligible customers way down the queue just because they have been recently active.

was using the same Date as I understand, however, he chooses winners by their positions (client ID in this case) in the data, which is also unfair IMHO.

Just my 2 cents,

Regards,

Haikuo

stat_sas
Ammonite | Level 13

Thanks Haikuo - Helpful to understand the concept.

Regards,

Haikuo
Onyx | Level 15

BTW, Congratulations for becoming a Master! Smiley Happy

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 15 replies
  • 2143 views
  • 5 likes
  • 5 in conversation