Hello
I have been using SAS for years but I am a basic programmer and generally keep it simple but I want to get better. I have this situation that I can use the following code but I would like it to be smarter. In this scenario the max n is 3 but it could be as many as 50 and I don't want to write out the code. So if anyone out there can educate/enlighten me I would be very grateful
DATA checking;
INPUT CUSTOMER_ACCOUNT_ID n sum_accounts ;
CARDS;
260 1 10436965
260 2 10436965
9629 1 9629
9629 2 88532203
15191 1 15191
15191 2 15191
15191 3 15191
17712 1 17712
17712 2 43811252
17712 3 17712
;
run;
proc transpose data = checking out = x_checking (drop = _name_) prefix = count_;
by customer_account_id ;
id n;
var sum_accounts;
run;
data x_checking_1;
set x_checking;
format match $10.;
if count_3 eq . then
do;
if count_1 = count_2 then match = 'yes';
else match = 'no';
end;
else
do;
if count_1 = count_2 = count_3 then match = 'yes';
else if count_1 = count_2 or count_1 = count_3 or count_2 = count_3 then match = 'partial';
else match = 'no';
end;
run;
Cheers
One way to go:
proc sql;
create table want as
select
CUSTOMER_ACCOUNT_ID,
count(*) as n_rows_per_CA,
count(distinct sum_accounts) as n_dif_vals_per_CA,
case
when calculated n_dif_vals_per_CA = 1 then 'yes'
when calculated n_rows_per_CA = calculated n_dif_vals_per_CA then 'no'
else 'partial'
end as match length=7
from checking
group by CUSTOMER_ACCOUNT_ID
;
quit;
One way to go:
proc sql;
create table want as
select
CUSTOMER_ACCOUNT_ID,
count(*) as n_rows_per_CA,
count(distinct sum_accounts) as n_dif_vals_per_CA,
case
when calculated n_dif_vals_per_CA = 1 then 'yes'
when calculated n_rows_per_CA = calculated n_dif_vals_per_CA then 'no'
else 'partial'
end as match length=7
from checking
group by CUSTOMER_ACCOUNT_ID
;
quit;
My take on this...
DATA checking;
INPUT ID n sum_a ;
CARDS;
260 1 10436965
260 2 10436965
9629 1 9629
9629 2 88532203
15191 1 15191
15191 2 15191
15191 3 15191
17712 1 17712
17712 2 43811252
17712 3 17712
;
proc sql;
create table matches as
select ID,
case
when max(count) >= 2 and max(count) = sum(count) then "yes"
when max(count) >= 2 then "partial"
else "no"
end as match
from (
select ID, sum_a, count(sum_a) as count
from checking
group by ID, sum_a)
group by ID;
select * from matches;
quit;
A possibility:
proc sort data=checking;
by customer_account sum_accounts;
run;
data checking;
length match $ 10;
match='no';
do until (last.customer_account);
set checking;
by customer_account sum_accounts;
if first.customer_account then first_amount = sum_accounts;
if first.sum_accounts=0 then match='partial';
if last.customer_account and sum_accounts=first_amount then match='yes';
end;
do until (last.customer_account);
set checking;
by customer_account;
output;
end;
drop first_amount;
run;
Then run the same PROC TRANSPOSE, adding in MATCH.
Here is a data step solution:
proc sort data=checking out=_ck; by CUSTOMER_ACCOUNT_id sum_accounts; run; data want; do _n_=1 by 1 until (last.CUSTOMER_ACCOUNT_ID); set _ck; by CUSTOMER_ACCOUNT_ID sum_accounts; _ct+first.sum_accounts; end; length match $ 7; if _ct = _n_ and _ct>1 then match='No'; else if _ct < _n_ and _ct >1 then match ='Partial'; else if _ct = 1 and _n_ >1 then match = 'Yes'; else match ='N/A'; _ct=.; keep CUSTOMER_ACCOUNT_ID match; run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.