## Distance calculation

Regular Contributor
Posts: 162

# Distance calculation

Dear all,

I have a SAS file with the followign variables:

Company identification code: CIK

Auditor Key: AUDITOR_FKEY

Audit fees: AUDITOR_FEES

FYEAR: Financial Year

ZIPCODE: ZIP1

CBSA: Core Based Statistical Area

SIC2: SIC code 2-digit

I would liKE to calculate the absolute difference between the current auditor's market share in client's industry (note: client = CIK, industry =SIC2) and the market share of his closest (other) auditor in the market (Same CBSA, SIC2 and FYEAR).

Market share =proportion of audit fees an audit firm generates in a 2-digit SIC industry relative to the total audit fees generated by all audit firms in an CBSA and in a given year.

I have programs in sql, however, it consumes too much memory and time. I would like to seek for your helps to get some ideas in SAS HASH.

You help is much appreciated.

Thank you.

Regards,

mspak

Super Contributor
Posts: 490

## Re: Distance calculation

It will be easier and faster if you post your sql code. So we can suggest the equivalent in data step or hash.

Regular Contributor
Posts: 162

## Re: Distance calculation

[ Edited ]

The following is the possible steps:

*step 1: calculate the total fee of each auditor in the market;

proc sql;
create table sum1 as
select *,sum(audit_fees) as fee_sum
from cluster.forum
group by auditor_fkey, fyear,sic2,cbsa;
quit;

*step 2: calculate the total fees charged in the market;
proc sql;
create table sum2 as
select *,sum(audit_fees) as market_fee
from fee1
group by fyear,sic2,cbsa;
quit;

*step 3: calculate the market share of each auditor in the market;

data sum3;
set sum2;
fee_shr = fee_sum/market_fee;
run;

*step 4: calculate the closest competitors in the market
(market size is depend on CBSA, SIC2 and fyear);

proc sql;
create table geog.dist as
select a.*, min(zipcitydistance(a.zip1,b.zip1)) as min_dist
from sum3 a , sum3 b
where a.fyear=b.fyear
group by fyear, sic2, cbsa
having not missing(calculated min_dist);
quit;

*step5: only output the data with closest competitors in the market;

Regular Contributor
Posts: 162

## Re: Distance calculation

Hi again,

I need the following SQL program to be included in hash:

proc sql;
create table dist as
select a.*, min(zipcitydistance(a.zip1,b.zip1)) as min_dist
from sum3 a , sum3 b
group by fyear, sic2, cbsa
having not missing(calculated min_dist);
quit;

Thank you in advance for any solution.

Regards,

mspak

Discussion stats
• 3 replies
• 361 views
• 0 likes
• 2 in conversation