turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- how to matching without replacement

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-18-2015 10:35 PM

I have 2 datasets test firms and control firms. I need to match these two datasets based on same fiscal year and same sic code and the minimum difference in their size

Here is my SAS code:

PROC SQL;

create table match as

select a.* , b.* , abs(a.size - b.size) as diff

from control a, test b

where a. fyear = b. fyear and a. sic = b. sic

group by b.fyear ,b.sic, b.cik

having diff=min(diff);

PROC print data=match;

run;

The problem is that my SAS code works like matching with replacement. I need matching without replacement. So once SAS program finds a matched firm, it cannot use the same firm to be matched another test firm. (in the same fiscal year & same sic code)

Any comments or help would be appreciated. Thank you

Best,

Steve

Accepted Solutions

Solution

06-20-2015
10:28 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to SteveLee

06-20-2015 10:28 AM

You can't do it via SQL. use Hash Table .

data test;

set MERGEDATA_5(keep=companies fyr sic1 sizes);

run;

data control;

set MERGEDATA_6(keep=companies fyr sic1 sizes

rename=(companies=_companies fyr=fyr sic1=sic1 sizes =_sizes));

k+1;

run;

proc sort data=test; by fyr sic1 sizes;run;

data want;

if _n_ eq 1 then do;

if 0 then set control;

declare hash h(dataset:'control',multidata:'y');

h.definekey('fyr','sic1');

h.definedata('_companies','_sizes','k');

h.definedone();

end;

call missing(of _all_);

set test;

n=0;

min=9999999;

rc=h.find();

do while(rc=0);

abs=abs(sizes-_sizes);

if abs lt min then do;n=k;min=abs;end;

rc=h.find_next();

end;

rc=h.find();

do while(rc=0);

if n=k then do;h.removedup();leave;end;

rc=h.find_next();

end;

drop rc n min abs k;

run;

Xia Keshan

Message was edited by: xia keshan Fix a problem

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to SteveLee

06-19-2015 09:53 AM

You missed a keyword CALCULATED .

PROC SQL;

create table match as

select a.* , b.* , abs(a.size - b.size) as diff

from control a, test b

where a. fyear = b. fyear and a. sic = b. sic

group by b.fyear ,b.sic, b.cik

having **calculated** diff=min(**calculated** diff);

run;

and list all the variable you need in SELECT, not like ; a.*,b.* . You will got ERROR info if a and b have the same variable name.

Xia Keshan

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

06-20-2015 02:49 AM

I really appreciate your answer. Unfortunately, your suggestion did not work for all the cases. I think it's still working as 'matching with replacement' I mentioned above. Could you give me another way to solve this problem?

Best,

Steve

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

06-20-2015 04:15 AM

Matching with replacement / Matching without replacement

Test firm Matching firm

sic year

A 22 2009 Size=$42billion C 22 2009 Size=41billion / C 22 2009 Size=41billion

B 22 2009 Size=$40billion C 22 2009 Size=41billion / D 22 2009 Size=41.5 billion

To create a matched sample, I need to find a matched firm for 'A' and another for 'B'. If I'm matching on specific year (2009), I need to find with closest to 50billion for 'A' from the same sic code in 2009. Again the same for 'B'

I give an specific example of 'Matching with replacement' and 'Matching without replacement'. My desire output is 'Matching without replacement'

This means that once SAS program finds a matched firm, it cannot use the same firm to be matched another test firm. (in the same fiscal year & same sic code).

PROC SQL;

create table matching1 as

select a.*, b.*, abs(a.size - b.sizes) as diff

from mergedata_6 a, mergedata_5 b

where a. fyear = b. fyr and a. sic = b. sic1

group by b.fyr ,b.sic1, b.ciks

having calculated diff=min(calculated diff);

PROC print data=matching1;

run;

However, my SAS code just works as 'Matching with replacement'. Please give some help for solving this problem.

Thank you

Best,

Steve

Solution

06-20-2015
10:28 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to SteveLee

06-20-2015 10:28 AM

You can't do it via SQL. use Hash Table .

data test;

set MERGEDATA_5(keep=companies fyr sic1 sizes);

run;

data control;

set MERGEDATA_6(keep=companies fyr sic1 sizes

rename=(companies=_companies fyr=fyr sic1=sic1 sizes =_sizes));

k+1;

run;

proc sort data=test; by fyr sic1 sizes;run;

data want;

if _n_ eq 1 then do;

if 0 then set control;

declare hash h(dataset:'control',multidata:'y');

h.definekey('fyr','sic1');

h.definedata('_companies','_sizes','k');

h.definedone();

end;

call missing(of _all_);

set test;

n=0;

min=9999999;

rc=h.find();

do while(rc=0);

abs=abs(sizes-_sizes);

if abs lt min then do;n=k;min=abs;end;

rc=h.find_next();

end;

rc=h.find();

do while(rc=0);

if n=k then do;h.removedup();leave;end;

rc=h.find_next();

end;

drop rc n min abs k;

run;

Xia Keshan

Message was edited by: xia keshan Fix a problem