Hello
I have been using SAS for years but I am a basic programmer and generally keep it simple but I want to get better. I have this situation that I can use the following code but I would like it to be smarter. In this scenario the max n is 3 but it could be as many as 50 and I don't want to write out the code. So if anyone out there can educate/enlighten me I would be very grateful
DATA checking;
INPUT CUSTOMER_ACCOUNT_ID n sum_accounts ;
CARDS;
260 1 10436965
260 2 10436965
9629 1 9629
9629 2 88532203
15191 1 15191
15191 2 15191
15191 3 15191
17712 1 17712
17712 2 43811252
17712 3 17712
;
run;
proc transpose data = checking out = x_checking (drop = _name_) prefix = count_;
by customer_account_id ;
id n;
var sum_accounts;
run;
data x_checking_1;
set x_checking;
format match $10.;
if count_3 eq . then
do;
if count_1 = count_2 then match = 'yes';
else match = 'no';
end;
else
do;
if count_1 = count_2 = count_3 then match = 'yes';
else if count_1 = count_2 or count_1 = count_3 or count_2 = count_3 then match = 'partial';
else match = 'no';
end;
run;
Cheers
One way to go:
proc sql;
create table want as
select
CUSTOMER_ACCOUNT_ID,
count(*) as n_rows_per_CA,
count(distinct sum_accounts) as n_dif_vals_per_CA,
case
when calculated n_dif_vals_per_CA = 1 then 'yes'
when calculated n_rows_per_CA = calculated n_dif_vals_per_CA then 'no'
else 'partial'
end as match length=7
from checking
group by CUSTOMER_ACCOUNT_ID
;
quit;
One way to go:
proc sql;
create table want as
select
CUSTOMER_ACCOUNT_ID,
count(*) as n_rows_per_CA,
count(distinct sum_accounts) as n_dif_vals_per_CA,
case
when calculated n_dif_vals_per_CA = 1 then 'yes'
when calculated n_rows_per_CA = calculated n_dif_vals_per_CA then 'no'
else 'partial'
end as match length=7
from checking
group by CUSTOMER_ACCOUNT_ID
;
quit;
My take on this...
DATA checking;
INPUT ID n sum_a ;
CARDS;
260 1 10436965
260 2 10436965
9629 1 9629
9629 2 88532203
15191 1 15191
15191 2 15191
15191 3 15191
17712 1 17712
17712 2 43811252
17712 3 17712
;
proc sql;
create table matches as
select ID,
case
when max(count) >= 2 and max(count) = sum(count) then "yes"
when max(count) >= 2 then "partial"
else "no"
end as match
from (
select ID, sum_a, count(sum_a) as count
from checking
group by ID, sum_a)
group by ID;
select * from matches;
quit;
A possibility:
proc sort data=checking;
by customer_account sum_accounts;
run;
data checking;
length match $ 10;
match='no';
do until (last.customer_account);
set checking;
by customer_account sum_accounts;
if first.customer_account then first_amount = sum_accounts;
if first.sum_accounts=0 then match='partial';
if last.customer_account and sum_accounts=first_amount then match='yes';
end;
do until (last.customer_account);
set checking;
by customer_account;
output;
end;
drop first_amount;
run;
Then run the same PROC TRANSPOSE, adding in MATCH.
Here is a data step solution:
proc sort data=checking out=_ck; by CUSTOMER_ACCOUNT_id sum_accounts; run; data want; do _n_=1 by 1 until (last.CUSTOMER_ACCOUNT_ID); set _ck; by CUSTOMER_ACCOUNT_ID sum_accounts; _ct+first.sum_accounts; end; length match $ 7; if _ct = _n_ and _ct>1 then match='No'; else if _ct < _n_ and _ct >1 then match ='Partial'; else if _ct = 1 and _n_ >1 then match = 'Yes'; else match ='N/A'; _ct=.; keep CUSTOMER_ACCOUNT_ID match; run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.