BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
David_Billa
Rhodochrosite | Level 12

I've the data as follows. I want to identify the customer details who made only CASH as payment method. Here if I use where clause as 'Cash' in payment_method then I get Rohan also in my result and it is not correct.

 

ID CUST_NAME PAYMENT_METHOD PAYNT_DT
6 Geetha Cash 03-02-2021
8 Geetha Cash 04-02-2021
2 Ram Online 03-01-2021
1 Rohan Cash 01-01-2021
3 Rohan Cash 02-02-2021
5 Vijay Online 03-02-2021
7 Vijay Cash 03-02-2021

 

Desired Result:

 

ID CUST_NAME PAYMENT_METHOD PAYNT_DT
6 Geetha Cash 03-02-2021
8 Geetha Cash 04-02-2021
1 Rohan Cash 01-01-2021
3 Rohan Cash 02-02-2021

 

Any help?

1 ACCEPTED SOLUTION

Accepted Solutions
11 REPLIES 11
Kurt_Bremser
Super User
data want,
merge
  have
  have (
    in=no
    keep=name payment_method
    rename=(payment_method=_pay)
    where=(_pay ne 'Cash')
  )
;
if not no;
drop _pay;
run,

Untested, no usable data supplied.

ChrisNZ
Tourmaline | Level 20

@Kurt_Bremser forgot the BY statement., and to account for possible duplicates.

proc sort data=HAVE out=NOCASH(keep=ID) nodupkey
  by ID;
  where PAYMENT_METHOD ne 'Cash';
run;
data WANT,
  merge HAVE NOCASH;
  by ID;
  if not NOCASH;
run,

 

 

 

Kurt_Bremser
Super User

Yep, I forgot the BY. But you do not need to dedup, because a) the lookup is always a subset of the whole, and b) as soon as any observation is found there, the whole customer is discarded anyway, so it can't cause additional observations. It will cause a NOTE about multiple occurences (leaving a comment about that in the code will be nice), but that's it.

David_Billa
Rhodochrosite | Level 12

@ChrisNZ I'm getting an error message saying that 'By variables are not properly sorted on dataset WORK.HAVE' although it was sorted in your code.

ChrisNZ
Tourmaline | Level 20

One of the tables is sorted in my code. You need both sorted (or indexed).

But you only need one (sorted) table: Kurt's code should work once the input table is sorted (or indexed), and after you add the BY statement.

Ksharp
Super User
data have;
infile cards expandtabs truncover;
input ID CUST_NAME $ PAYMENT_METHOD $ PAYNT_DT : $20.;
cards;
6 Geetha Cash 03-02-2021
8 Geetha Cash 04-02-2021
2 Ram Online 03-01-2021
1 Rohan Cash 01-01-2021
3 Rohan Cash 02-02-2021
5 Vijay Online 03-02-2021
7 Vijay Cash 03-02-2021
;

proc sql;
create table want as
select * from have
 group by CUST_NAME
  having count(*)=sum(PAYMENT_METHOD='Cash');
quit;
David_Billa
Rhodochrosite | Level 12

@Ksharp Thanks, your code is producing the desired results. Could you please explain this step? How you're summing the character field - 'PAYMENT_METHOD'?

 

having count(*)=sum(PAYMENT_METHOD='Cash')
Kurt_Bremser
Super User

It does not sum the variable (which would not be possible with a character value), but the result of the comparison, which is Boolean. Boolean in SAS is numeric, with values 0 (false) or 1 (true).

David_Billa
Rhodochrosite | Level 12

@Kurt_Bremser May I know how this comparison will be true? 

 

having count(*)=sum(PAYMENT_METHOD='Cash')
Kurt_Bremser
Super User

In every observation where payment_method equals 'Cash', a 1 is counted, otherwise a zero. SUM is a SQL Summary Function that works over all observations (in a group, if GROUP BY is used).

ChrisNZ
Tourmaline | Level 20
having count(*)=sum(PAYMENT_METHOD='Cash')

This means: keep groups where number of records = number of records with PAYMENT_METHOD='Cash'

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 2467 views
  • 12 likes
  • 4 in conversation