BookmarkSubscribeRSS Feed
mspak
Quartz | Level 8

Dear all,

 

I have a SAS file with the followign variables:

Company identification code: CIK

Auditor Key: AUDITOR_FKEY

Audit fees: AUDITOR_FEES

FYEAR: Financial Year

ZIPCODE: ZIP1

CBSA: Core Based Statistical Area

SIC2: SIC code 2-digit

 

I would liKE to calculate the absolute difference between the current auditor's market share in client's industry (note: client = CIK, industry =SIC2) and the market share of his closest (other) auditor in the market (Same CBSA, SIC2 and FYEAR). 

 

Market share =proportion of audit fees an audit firm generates in a 2-digit SIC industry relative to the total audit fees generated by all audit firms in an CBSA and in a given year.

 

I have programs in sql, however, it consumes too much memory and time. I would like to seek for your helps to get some ideas in SAS HASH. 

 

You help is much appreciated.

 

Thank you.

 

Regards,

mspak

3 REPLIES 3
mohamed_zaki
Barite | Level 11

It will be easier and faster if you post your sql code. So we can suggest the equivalent in data step or hash.

mspak
Quartz | Level 8

The following is the possible steps:

 

*step 1: calculate the total fee of each auditor in the market;

proc sql;
create table sum1 as
select *,sum(audit_fees) as fee_sum
  from cluster.forum
   group by auditor_fkey, fyear,sic2,cbsa;
quit;

*step 2: calculate the total fees charged in the market;
proc sql;
create table sum2 as
select *,sum(audit_fees) as market_fee
  from fee1
   group by fyear,sic2,cbsa;
quit;

*step 3: calculate the market share of each auditor in the market;

data sum3;
set sum2;
fee_shr = fee_sum/market_fee;
run;

*step 4: calculate the closest competitors in the market
(market size is depend on CBSA, SIC2 and fyear);

proc sql;
create table geog.dist as
select a.*, min(zipcitydistance(a.zip1,b.zip1)) as min_dist
from sum3 a , sum3 b
where a.fyear=b.fyear
group by fyear, sic2, cbsa
having not missing(calculated min_dist);
quit;

*step5: only output the data with closest competitors in the market;

mspak
Quartz | Level 8

Hi again,

 

I need the following SQL program to be included in hash:

 

proc sql;
create table dist as
select a.*, min(zipcitydistance(a.zip1,b.zip1)) as min_dist
from sum3 a , sum3 b
group by fyear, sic2, cbsa
having not missing(calculated min_dist);
quit;

 

Thank you in advance for any solution.

 

Regards,

mspak

SAS Innovate 2025: Register Today!

 

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1407 views
  • 0 likes
  • 2 in conversation