I've the data as follows. I want to identify the customer details who made only CASH as payment method. Here if I use where clause as 'Cash' in payment_method then I get Rohan also in my result and it is not correct.
ID | CUST_NAME | PAYMENT_METHOD | PAYNT_DT |
6 | Geetha | Cash | 03-02-2021 |
8 | Geetha | Cash | 04-02-2021 |
2 | Ram | Online | 03-01-2021 |
1 | Rohan | Cash | 01-01-2021 |
3 | Rohan | Cash | 02-02-2021 |
5 | Vijay | Online | 03-02-2021 |
7 | Vijay | Cash | 03-02-2021 |
Desired Result:
ID | CUST_NAME | PAYMENT_METHOD | PAYNT_DT |
6 | Geetha | Cash | 03-02-2021 |
8 | Geetha | Cash | 04-02-2021 |
1 | Rohan | Cash | 01-01-2021 |
3 | Rohan | Cash | 02-02-2021 |
Any help?
data want,
merge
have
have (
in=no
keep=name payment_method
rename=(payment_method=_pay)
where=(_pay ne 'Cash')
)
;
if not no;
drop _pay;
run,
Untested, no usable data supplied.
data want,
merge
have
have (
in=no
keep=name payment_method
rename=(payment_method=_pay)
where=(_pay ne 'Cash')
)
;
if not no;
drop _pay;
run,
Untested, no usable data supplied.
@Kurt_Bremser forgot the BY statement., and to account for possible duplicates.
proc sort data=HAVE out=NOCASH(keep=ID) nodupkey
by ID;
where PAYMENT_METHOD ne 'Cash';
run;
data WANT,
merge HAVE NOCASH;
by ID;
if not NOCASH;
run,
Yep, I forgot the BY. But you do not need to dedup, because a) the lookup is always a subset of the whole, and b) as soon as any observation is found there, the whole customer is discarded anyway, so it can't cause additional observations. It will cause a NOTE about multiple occurences (leaving a comment about that in the code will be nice), but that's it.
@ChrisNZ I'm getting an error message saying that 'By variables are not properly sorted on dataset WORK.HAVE' although it was sorted in your code.
One of the tables is sorted in my code. You need both sorted (or indexed).
But you only need one (sorted) table: Kurt's code should work once the input table is sorted (or indexed), and after you add the BY statement.
data have; infile cards expandtabs truncover; input ID CUST_NAME $ PAYMENT_METHOD $ PAYNT_DT : $20.; cards; 6 Geetha Cash 03-02-2021 8 Geetha Cash 04-02-2021 2 Ram Online 03-01-2021 1 Rohan Cash 01-01-2021 3 Rohan Cash 02-02-2021 5 Vijay Online 03-02-2021 7 Vijay Cash 03-02-2021 ; proc sql; create table want as select * from have group by CUST_NAME having count(*)=sum(PAYMENT_METHOD='Cash'); quit;
@Ksharp Thanks, your code is producing the desired results. Could you please explain this step? How you're summing the character field - 'PAYMENT_METHOD'?
having count(*)=sum(PAYMENT_METHOD='Cash')
It does not sum the variable (which would not be possible with a character value), but the result of the comparison, which is Boolean. Boolean in SAS is numeric, with values 0 (false) or 1 (true).
@Kurt_Bremser May I know how this comparison will be true?
having count(*)=sum(PAYMENT_METHOD='Cash')
In every observation where payment_method equals 'Cash', a 1 is counted, otherwise a zero. SUM is a SQL Summary Function that works over all observations (in a group, if GROUP BY is used).
having count(*)=sum(PAYMENT_METHOD='Cash')
This means: keep groups where number of records = number of records with PAYMENT_METHOD='Cash'
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.