Hi Experts,
Lets say i have this table:
customer | a1 | a2 | a3 | a4 | a5 | a6 |
qqq | 3000 | 3000 | 3200 | 3050 | 3500 | 3100 |
qqq | 3000 | 3100 | 3050 | 3050 | 3050 | 3100 |
I want to create new column , an indicator that return 1 if :
1. a1=a2=a3=a4=a5 or the difference between (a1 to a6 )is no more then 5% difference from the minimum number.
in this example , in row numer 1,2 the min number is 3000 ,soo the diff is +-150
2. else 0
customer | a1 | a2 | a3 | a4 | a5 | a6 | indicator |
qqq | 3000 | 3000 | 3200 | 3050 | 3500 | 3100 | 0 |
qqq | 3000 | 3100 | 3050 | 3050 | 3050 | 3100 | 1 |
qqq | 4000 | 4000 | 4000 | 4000 | 4000 | 4000 | 1 |
Any ideas?
Best Regards
Hi @jony123
You can try this:
-> you can adapt the allowed percentage of difference from the minimum value in the %let statement
-> in the array, all variable beginning with an 'a' are referenced. You can specify a1-a6 if it makes more sense.
Best,
%let alpha = .05;
data want;
set have;
array _a(*) a:;
flag=1;
do i=1 to dim(_a);
if min(of _a(*))*(1-&alpha) > _a(i) or min(of _a(*))*(1+&alpha) < _a(i)then flag=0;
end;
drop i;
run;
So it seems you have multiple observations per customer. Is there an additional column that identifies the observations?
Hi, Yes
customer | a1 | a2 | a3 | a4 | a5 | a6 | indicator | document |
qqq | 3000 | 3000 | 3200 | 3050 | 3500 | 3100 | 0 | 6 |
qqq | 3000 | 3100 | 3050 | 3050 | 3050 | 3100 | 1 | 7 |
qqq | 4000 | 4000 | 4000 | 4000 | 4000 | 4000 | 1 | 8 |
I add it to the example
Thanks for you help
May be I'm missing to understand part of the problem - or could it be as simple as below.
data want;
set have;
array a_vars {*} a1-a6;
array minmax {2} 8 _temporary_;
minmax[1]=min(of a_vars[*]);
minmax[2]=max(of a_vars[*]);
indicator= ( abs((minmax[2]-minmax[1])/minmax[1])>0.05 );
run;
I asked because this allows transposing to the better long format, where such issues can be coded without regard for the number of variables, and without needing dynamic code if the number of variables is not fixed:
data have;
infile datalines dlm='09'x;
input customer $ a1 a2 a3 a4 a5 a6 document;
datalines;
qqq 3000 3000 3200 3050 3500 3100 6
qqq 3000 3100 3050 3050 3050 3100 7
qqq 4000 4000 4000 4000 4000 4000 8
;
proc transpose data=have out=long (rename=(col1=value _name_=a));
by customer document;
var a:;
run;
proc sql;
create table want as
select
customer,
document,
case
when max(value) le min(value) * 1.05
/* this also covers equality of all values */
then 1
else 0
end as indicator
from long
group by customer, document
;
quit;
Hi @jony123
You can try this:
-> you can adapt the allowed percentage of difference from the minimum value in the %let statement
-> in the array, all variable beginning with an 'a' are referenced. You can specify a1-a6 if it makes more sense.
Best,
%let alpha = .05;
data want;
set have;
array _a(*) a:;
flag=1;
do i=1 to dim(_a);
if min(of _a(*))*(1-&alpha) > _a(i) or min(of _a(*))*(1+&alpha) < _a(i)then flag=0;
end;
drop i;
run;
Thanks works perfect
data ps;
input customer a1 a2 a3 a4 a5 a6;
datalines;
qqq 3000 3000 3200 3050 3500 3100
qqq 3000 3100 3050 3050 3050 3100
;
proc sql;
select
CASE WHEN a1 = a2 = a3 = a4 = a5 =a6 THEN '1' ELSE '0'
END AS indicator
from ps;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.