In a given territory, I have entity, offices that roll up to the entity, and customer IDs for each office (as in table below). Offices in a given entity can share one or more customers, and I want to be able to cycle through different combinations of all offices in a given territory (whether they belong to the same entity or not) and determine a unique count of customers. For example, if I wanted to get the optimal combination of 60% of the offices listed below (3 offices), where optimal is defined as maximizing the number of unique customers, I would have the counts for the following combinations:
A, B, C = 6 unique IDs
A, B, D = 5 unique IDs
A, D, E = 5 unique IDs
...
I need to be able to select a percentage or counts of offices as the territory level and cycle through. Any help is greatly appreciated!!
Territory Entity Office Customer
Atlanta J Inc A 1
Atlanta J Inc A 2
Atlanta J Inc A 3
Atlanta J Inc B 2
Atlanta J Inc B 4
Atlanta J Inc C 2
Atlanta J Inc C 3
Atlanta J Inc C 5
Atlanta J Inc C 8
Atlanta H Inc D 9
Atlanta H Inc E 10
You just need the combination which has three members, right ?
data have;
input Territory $ Entity & $20. Office $ Customer;
cards;
Atlanta J Inc A 1
Atlanta J Inc A 2
Atlanta J Inc A 3
Atlanta J Inc B 2
Atlanta J Inc B 4
Atlanta J Inc C 2
Atlanta J Inc C 3
Atlanta J Inc C 5
Atlanta J Inc C 8
Atlanta H Inc D 9
Atlanta H Inc E 10
;
run;
data want;
if _n_ eq 1 then do;
if 0 then set have;
declare hash ha(dataset:'have',multidata:'y');
ha.definekey('Territory','Office');
ha.definedata('Customer');
ha.definedone();
length list $ 200 key 8;
declare hash c();
c.definekey('key');
c.definedone();
end;
set have;
by Territory;
length list $ 200;
array x{9999} $ 40 _temporary_;
if first.Territory then do;
n=0;call missing(of x{*});c.clear();
end;
if Office not in x then do;n+1;x{n}=Office;end;
if last.Territory then do;
do i=1 to n;
do j=i+1 to n;
do k=j+1 to n;
c.clear();
Office=x{i};
rc=ha.find();
do while(rc=0);
key=Customer;
rx=c.add();
rc=ha.find_next();
end;
Office=x{j};
rc=ha.find();
do while(rc=0);
key=Customer;
rx=c.add();
rc=ha.find_next();
end;
Office=x{k};
rc=ha.find();
do while(rc=0);
key=Customer;
rx=c.add();
rc=ha.find_next();
end;
list=catx(',',x{i},x{j},x{k});count=c.num_items; output;
end;
end;
end;
end;
keep list count;
run;
You just need the combination which has three members, right ?
data have;
input Territory $ Entity & $20. Office $ Customer;
cards;
Atlanta J Inc A 1
Atlanta J Inc A 2
Atlanta J Inc A 3
Atlanta J Inc B 2
Atlanta J Inc B 4
Atlanta J Inc C 2
Atlanta J Inc C 3
Atlanta J Inc C 5
Atlanta J Inc C 8
Atlanta H Inc D 9
Atlanta H Inc E 10
;
run;
data want;
if _n_ eq 1 then do;
if 0 then set have;
declare hash ha(dataset:'have',multidata:'y');
ha.definekey('Territory','Office');
ha.definedata('Customer');
ha.definedone();
length list $ 200 key 8;
declare hash c();
c.definekey('key');
c.definedone();
end;
set have;
by Territory;
length list $ 200;
array x{9999} $ 40 _temporary_;
if first.Territory then do;
n=0;call missing(of x{*});c.clear();
end;
if Office not in x then do;n+1;x{n}=Office;end;
if last.Territory then do;
do i=1 to n;
do j=i+1 to n;
do k=j+1 to n;
c.clear();
Office=x{i};
rc=ha.find();
do while(rc=0);
key=Customer;
rx=c.add();
rc=ha.find_next();
end;
Office=x{j};
rc=ha.find();
do while(rc=0);
key=Customer;
rx=c.add();
rc=ha.find_next();
end;
Office=x{k};
rc=ha.find();
do while(rc=0);
key=Customer;
rx=c.add();
rc=ha.find_next();
end;
list=catx(',',x{i},x{j},x{k});count=c.num_items; output;
end;
end;
end;
end;
keep list count;
run;
This is awesome! Thanks!! Quick question. Is there an easy way to make the number of combinations a variable? Right now, to go from 3 to 4 combinations I have to modify a few parts in the code. Is there a way code it such that I only have to put in the number 3 or 4?
" Is there an easy way to make the number of combinations a variable? "
No, there is not a easy way to go .If you want get the combination of 4 , that is easy to change .
But If you want define a 3 or 4 like a macro variable, I am afraid that I have make a macro .
And If you want all the combination 1 ,2,3,4,........ , I can also make it happen via function GRAYCODE() .
Xia Keshan
I am able to change the combination to 4, 5, ... etc. The problem is I may need to run combinations up to 50 or 200 and it takes a while for me to change the code to do that. That is why I was hoping to get a macro to do it. Any help is greatly appreciated.
Abdou
OK. Let me think about it . Give me some time .
OK. I think I get it all figure out.
data have;
input Territory $ Entity $ Office $ Customer;
cards;
Atlanta JInc A 1
Atlanta JInc A 2
Atlanta JInc A 3
Atlanta JInc B 2
Atlanta JInc B 4
Atlanta JInc C 2
Atlanta JInc C 3
Atlanta JInc C 5
Atlanta JInc C 8
Atlanta HInc D 9
Atlanta HInc E 10
BAtlant JInc A 1
BAtlant JInc A 2
BAtlant JInc A 3
BAtlant JInc B 2
BAtlant JInc B 4
BAtlant JInc C 2
BAtlant JInc C 3
BAtlant JInc C 5
BAtlant JInc C 8
BAtlant HInc D 9
;
run;
proc sql;
select max(n) into : n
from (select count(distinct Office ) as n from have group by Territory );
quit;
%let k=3;
data want;
if _n_ eq 1 then do;
if 0 then set have;
declare hash ha(dataset:'have',multidata:'y');
ha.definekey('Territory','Office');
ha.definedata('Customer');
ha.definedone();
length list $ 2000 key 8;
declare hash c();
c.definekey('key');
c.definedone();
end;
set have;
by Territory;
array x{&n} $ 40 _temporary_;
array y{&n};
if first.Territory then do;
n=0;call missing(of x{*});
end;
if Office not in x then do;n+1;x{n}=Office;end;
if last.Territory then do;
k=-1;
do ii=1 to 2**dim(y);
c.clear();call missing(list);count=0;
rx=graycode(k, of y{*});
if k=&k then do;
do i=1 to n;
if y{i}=1 then count+1;
end;
if count=&k then do;
do i=1 to n;
if y{i}=1 then do;
list=catx(',',list,x{i});
Office=x{i};
rc=ha.find();
do while(rc=0);
key=Customer;
c.replace();
rc=ha.find_next();
end;
end;
end;
unique_IDs=c.num_items; output;
end;
end;
end;
end;
keep Territory list unique_IDs;
run;
Xia Keshan
This appears to work!!! Thanks!!
It is not suitable for the combination which have many members.
If n=10 , You will get 2**10=1024 combination , that is really horrible .
I suggest to make a macro to wrap every kind of combination Like :
%if &k=2 %then %do;
........
%end;
%else %if &k=3 %then %do;
..................
etc .
Xia Keshan
How do you suggest I do it? I have 1,600 offices and want to be able to pick the most optimal 750, or 800, or 850.
When I use the new code you sent me, I get an error saying too many arguments for graycode.
Thanks.
That couldn't be. You gotta know How big number of combination it would generate . You can
NOTuse such algorithm( enumerate all the combinations).
You need some sophisticated algorithm like SIMPLEX(or cutting branch) method .That is beyond my knowledge . and you need SAS/OR for such large scale problem .
Suggest you post it at Mathematical Optimization and Operations Research , maybe some SAS/OR expert could help you.
Xia Keshan
Message was edited by: xia keshan
HaHa, I think I can solve it by Genetic Algorithm. I love GA a hell of a lot.
Change n=10, if you want get the combination which have 10 members, and I pick up the highest count of unique ids for the combination.
NOTE: This code can only process one Territory one time .
data have; input Territory $ Entity $ Office $ Customer; cards; Atlanta JInc A 1 Atlanta JInc A 2 Atlanta JInc A 3 Atlanta JInc A 21 Atlanta JInc B 2 Atlanta JInc B 4 Atlanta JInc C 2 Atlanta JInc C 3 Atlanta JInc C 5 Atlanta JInc C 8 Atlanta HInc D 9 Atlanta HInc D 10 Atlanta HInc E 11 Atlanta HInc E 14 Atlanta HInc E 12 Atlanta HInc E 16 Atlanta HInc E 17 Atlanta HInc E 18 ; run; proc iml; use have; read all var {Office Customer}; close have; n=4; /*--Change it(the number of combination's members).*/ nrow=nrow(Office ); encoding=j(2,nrow,1); encoding[1,]=0; start function(x) global(Office,Customer,n); count=ncol(unique(Customer[loc(x)])); if ncol(unique(Office[loc(x)]))^=n then count=0; return (count); finish; id=gasetup(2,nrow,1234); call gasetobj(id,1,"function"); call gasetcro(id,0.95,2); call gasetmut(id,0.05,3); call gasetsel(id,100,1,1); call gainit(id,10000,encoding); niter = 100; summary = j(niter,2); mattrib summary [c = {"Max Count", "Avg Count"} l=""]; do i = 1 to niter; call garegen(id); call gagetval(value, id); summary[i,1] = value[1]; summary[i,2] = value[:]; end; call gagetmem(mem, value, id, 1); Memebers=unique(Office[loc(mem)]); print "Best Members:" Memebers[l=""], "Max Count: " value[l = ""] ; call gaend(id); quit;
Xia Keshan
消息编辑者为:xia keshan
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.