DATA Step, Macro, Functions and more

Distance calculation

Reply
Regular Contributor
Posts: 162

Distance calculation

Dear all,

 

I have a SAS file with the followign variables:

Company identification code: CIK

Auditor Key: AUDITOR_FKEY

Audit fees: AUDITOR_FEES

FYEAR: Financial Year

ZIPCODE: ZIP1

CBSA: Core Based Statistical Area

SIC2: SIC code 2-digit

 

I would liKE to calculate the absolute difference between the current auditor's market share in client's industry (note: client = CIK, industry =SIC2) and the market share of his closest (other) auditor in the market (Same CBSA, SIC2 and FYEAR). 

 

Market share =proportion of audit fees an audit firm generates in a 2-digit SIC industry relative to the total audit fees generated by all audit firms in an CBSA and in a given year.

 

I have programs in sql, however, it consumes too much memory and time. I would like to seek for your helps to get some ideas in SAS HASH. 

 

You help is much appreciated.

 

Thank you.

 

Regards,

mspak

Attachment
Super Contributor
Posts: 490

Re: Distance calculation

It will be easier and faster if you post your sql code. So we can suggest the equivalent in data step or hash.

Regular Contributor
Posts: 162

Re: Distance calculation

[ Edited ]
Posted in reply to mohamed_zaki

The following is the possible steps:

 

*step 1: calculate the total fee of each auditor in the market;

proc sql;
create table sum1 as
select *,sum(audit_fees) as fee_sum
  from cluster.forum
   group by auditor_fkey, fyear,sic2,cbsa;
quit;

*step 2: calculate the total fees charged in the market;
proc sql;
create table sum2 as
select *,sum(audit_fees) as market_fee
  from fee1
   group by fyear,sic2,cbsa;
quit;

*step 3: calculate the market share of each auditor in the market;

data sum3;
set sum2;
fee_shr = fee_sum/market_fee;
run;

*step 4: calculate the closest competitors in the market
(market size is depend on CBSA, SIC2 and fyear);

proc sql;
create table geog.dist as
select a.*, min(zipcitydistance(a.zip1,b.zip1)) as min_dist
from sum3 a , sum3 b
where a.fyear=b.fyear
group by fyear, sic2, cbsa
having not missing(calculated min_dist);
quit;

*step5: only output the data with closest competitors in the market;

Regular Contributor
Posts: 162

Re: Distance calculation

Hi again,

 

I need the following SQL program to be included in hash:

 

proc sql;
create table dist as
select a.*, min(zipcitydistance(a.zip1,b.zip1)) as min_dist
from sum3 a , sum3 b
group by fyear, sic2, cbsa
having not missing(calculated min_dist);
quit;

 

Thank you in advance for any solution.

 

Regards,

mspak

Ask a Question
Discussion stats
  • 3 replies
  • 284 views
  • 0 likes
  • 2 in conversation