Without knowing details of what you might consider a "match", here is a very generic way.
The first two datasets just build some dummy data with a name and value. The Proc sql compares the sales values (assumes that is what you want to match on, not much else in this data) and selects matches when the difference is 300 or less. Depending on the number of matches you might want you would adjust the value of that comparison parameter up or down.
If you only want the compare company to "match" one in the treatment set it will be up to you to remove the duplicates which would not be very difficult (left as an exercise for the interested reader).
If you multiple values to match on describing the space could be a simple as adding AND for the different variables.
Refinements but I'm too lazy to try to code today would be to identify the minimum absolute difference and only select the compare companies with that value.
I am not making ANY claim to efficiency in the code. Simple and brute force. Careful consideration of the differences at first step may help.
data treat;
input name $ sales;
datalines;
abc 10000
cdf 20000
ged 30000
;
run;
data compare;
length compname $ 8;
do a='S','T','U','V','X','Y','Z';
do b='S','T','U','V','X','Y','Z';
do c='S','T','U','V','X','Y','Z';
do d='S','T','U','V','X','Y','Z';
compname=catt(a,b,c,d);
compsales = round(100000*rand('uniform'),100);
output;
end;
end;
end;
end;
drop a b c;
run;
proc sql;
create table compgroup as
select a.*,b.*
from treat as a, compare as b
where abs(a.sales-b.compsales) < 300;
quit;
A real rough other approach would be to combine the two data sets with a variable that contains which set each record comes from.
Sort on the comparison values.
Extract the treatment records and the one or two following the treatment record (very easy) and possibly one or two preceding ( a litle more work).
How do you want to compare them? From your question only, it could be on the first letter of their name .
Without knowing details of what you might consider a "match", here is a very generic way.
The first two datasets just build some dummy data with a name and value. The Proc sql compares the sales values (assumes that is what you want to match on, not much else in this data) and selects matches when the difference is 300 or less. Depending on the number of matches you might want you would adjust the value of that comparison parameter up or down.
If you only want the compare company to "match" one in the treatment set it will be up to you to remove the duplicates which would not be very difficult (left as an exercise for the interested reader).
If you multiple values to match on describing the space could be a simple as adding AND for the different variables.
Refinements but I'm too lazy to try to code today would be to identify the minimum absolute difference and only select the compare companies with that value.
I am not making ANY claim to efficiency in the code. Simple and brute force. Careful consideration of the differences at first step may help.
data treat;
input name $ sales;
datalines;
abc 10000
cdf 20000
ged 30000
;
run;
data compare;
length compname $ 8;
do a='S','T','U','V','X','Y','Z';
do b='S','T','U','V','X','Y','Z';
do c='S','T','U','V','X','Y','Z';
do d='S','T','U','V','X','Y','Z';
compname=catt(a,b,c,d);
compsales = round(100000*rand('uniform'),100);
output;
end;
end;
end;
end;
drop a b c;
run;
proc sql;
create table compgroup as
select a.*,b.*
from treat as a, compare as b
where abs(a.sales-b.compsales) < 300;
quit;
A real rough other approach would be to combine the two data sets with a variable that contains which set each record comes from.
Sort on the comparison values.
Extract the treatment records and the one or two following the treatment record (very easy) and possibly one or two preceding ( a litle more work).
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.