Selecting a block of similar records

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 9
Accepted Solution

Selecting a block of similar records

Hello:

 

I have a dataset containing customer #, debit card # and associated transactions.  Problem is a customer may have more than one debit card in use and I need to select the debit card with the greatest number of transactions. My example:

 

 

Before data:

 

Cust#              Debit card#                Transaction Amount

123                   456                               $7

123                    456                                6

123                    456                                 1

123                    789                                2

123                    789                                4

123                     789                               5

123                     789                               3

123                      222                              1

123                      222                              9

 

 

After data should look like this:

 

Cust#              Debit card#                Transaction Amount

123                    789                                2

123                    789                                4

123                     789                               5

123                     789                               3

 

 

Hence the debit card with the greatest number of transactions should be outputted to a new data set.  Problem is how???!!!

 

Many thanks for your assistance!


Accepted Solutions
Solution
‎12-30-2015 03:19 PM
Respected Advisor
Posts: 3,124

Re: Selecting a block of similar records

To this point, SQL is still a viable solution, however other approaches, such as data step may be more efficient. Since we are already far into the SQL, here it is:

proc sql;
	create table new as
		select * from (select * from 
		(select *, count(*) as ct from old
			group by cust, debitcard
			)
			group by cust
				having ct=max(ct))
					group by cust
						having debitcard=max(debitcard)
	;
quit;

Don't worry, it should work even debitcard is char, as char is stored in binary anyway, so they do have a hiearchy order.

 

View solution in original post


All Replies
Respected Advisor
Posts: 3,124

Re: Selecting a block of similar records

I see this to be easier if using Proc SQL:

 

proc sql;
	create table new as
		select * from 
		(select *, count(*) as ct from old
			group by cust, debitcard
			)
			group by cust
				having ct=max(ct)
	;
quit;
Occasional Contributor
Posts: 9

Re: Selecting a block of similar records

INGENIOUS!  Thank you Haikuo.

Occasional Contributor
Posts: 9

Re: Selecting a block of similar records

[ Edited ]

Sorry; but just discovered another variation in my data.  In the case where a customer has multiple debit cards but only 1 transaction on each card the above code will select these as well.  Hence, all  of customer 666 gets included too which I do not want in final output.  I would only need any one record from customer 666.  Apologies for not clarifying earlier.

 

before:

 

Cust#              Debit card#                Transaction Amount

123                   456                               $7

123                    456                                6

123                    456                                 1

123                    789                                2

123                    789                                4

123                     789                               5

123                     789                               3

123                      222                              1

123                      222                              9

666                     111                              3

666                      112                             4

666                      113                             5

 

 

After:

Obs cust debitcard transaction ct
1 123 789 3 4
2 123 789 5 4
3 123 789 4 4
4 123 789 2 4
5 666 111 3 1
6 666 112 4 1
7 666 113 5 1
Respected Advisor
Posts: 3,124

Re: Selecting a block of similar records

Well, the plot always thickens. Consider this:

 

666                     111                              3

666                      112                             4

666                      113                             5

And this:

 

666                     111                              3
666                     111                              3

666                      112                             4
666                      112                             4

666                      113                             5
666                      113                             5

And this:

 

666                     111                              3
666                     111                              3
666                     111                              3
666                     111                              3

666                      112                             4
666                      112                             4
666                      112                             4
666                      112                             4

666                      113                             5
666                      113                             5

What do you want from above scenarios? This will be more a business decision than a coding requirement. A good business decision will be something like:

 

1. Consistent

2. Make sense

3. Cover as much ground as posible

4. Easy to convert to computer language.

Occasional Contributor
Posts: 9

Re: Selecting a block of similar records

Yes, you are correct as it is more of a business decision.  In my particular case the final output must contain 1 customer to 1 debit card (but choose the debit card # with the most transactions when multiple transactions occur).  In case of a tie choose any one. 

 

In your examples - the first and second blocks would be a matter of selecting any one of the records since the customer has used each card once or twice.  In the last example, card 111 and 112 were utilized four times; hence, choose either 111 or 112. 

 

Many thanks again!

Solution
‎12-30-2015 03:19 PM
Respected Advisor
Posts: 3,124

Re: Selecting a block of similar records

To this point, SQL is still a viable solution, however other approaches, such as data step may be more efficient. Since we are already far into the SQL, here it is:

proc sql;
	create table new as
		select * from (select * from 
		(select *, count(*) as ct from old
			group by cust, debitcard
			)
			group by cust
				having ct=max(ct))
					group by cust
						having debitcard=max(debitcard)
	;
quit;

Don't worry, it should work even debitcard is char, as char is stored in binary anyway, so they do have a hiearchy order.

 

Occasional Contributor
Posts: 9

Re: Selecting a block of similar records

Worked like a dream!  Thank you Haikuo once again.

Frequent Contributor
Posts: 89

Re: Selecting a block of similar records

Thanks
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 440 views
  • 1 like
  • 3 in conversation