DATA Step, Macro, Functions and more

Finding closest matches

Reply
N/A
Posts: 0

Finding closest matches

I have a dataset with some firms identified as "targets." I would like to select firms that are within +/- 30% of two parameters (size and ROA) of the targets and somehow flag these firms (other targets can be included in the flagged firms). Ultimately I would like to take the average size of these flagged firms relative to the respective target. Does this make sense? I do not know where to start!

Thanks for any direction you can provide.
Super User
Posts: 3,102

Re: Finding closest matches

I suggest you consider flagging each of your targets for size and ROA separately. I find using boolean (0 or 1) flags are very useful:

target_size = 123;
target_ROA = 456;
flag_size = (abs(size - target_size)/target_size LE 0.3);
flag_ROA = (abs(ROA - target_ROA)/target_ROA LE 0.3);

This logic creates flags that will contain 1 if they are within 30%, 0 if they are not.
N/A
Posts: 0

Re: Finding closest matches

Thank you for your response! The only issue I see with that solution is that it is only good for one target at a time - I have about 100 targets within the dataset (and targets can be in the list of matches for other targets). Is there a way to loop that logic so that I do not have to separately identify the matches? Thanks!!
Trusted Advisor
Posts: 2,113

Re: Finding closest matches

EC,

There is a broad literature on this. Search, for example, for "nearest match" in the forums and you will find
http://support.sas.com/forums/thread.jspa?messageID=39419駻
It has links to lots of techniques to use.

Doc Muhlbaier
Duke
Super User
Posts: 3,102

Re: Finding closest matches

Perhaps I should have been more explicit:

data targets;
set xxx;
target_size = 123;
target_ROA = 456;
flag_size = (abs(size - target_size)/target_size LE 0.3);
flag_ROA = (abs(ROA - target_ROA)/target_ROA LE 0.3);
run;

This program will flag as many "targets" as there are rows in your input data - hundreds or more. Each of your rows will have two flags, one for size and one for ROA. To select rows meeting your size target:

Add: where flag_size = 1;
Ask a Question
Discussion stats
  • 4 replies
  • 160 views
  • 0 likes
  • 3 in conversation