BookmarkSubscribeRSS Feed
mspak
Quartz | Level 8

Dear all,

 

I have a SAS file with the followign variables:

Company identification code: CIK

Auditor Key: AUDITOR_FKEY

Audit fees: AUDITOR_FEES

FYEAR: Financial Year

ZIPCODE: ZIP1

CBSA: Core Based Statistical Area

SIC2: SIC code 2-digit

 

I would liKE to calculate the absolute difference between the current auditor's market share in client's industry (note: client = CIK, industry =SIC2) and the market share of his closest (other) auditor in the market (Same CBSA, SIC2 and FYEAR). 

 

Market share =proportion of audit fees an audit firm generates in a 2-digit SIC industry relative to the total audit fees generated by all audit firms in an CBSA and in a given year.

 

I have programs in sql, however, it consumes too much memory and time. I would like to seek for your helps to get some ideas in SAS HASH. 

 

You help is much appreciated.

 

Thank you.

 

Regards,

mspak

3 REPLIES 3
mohamed_zaki
Barite | Level 11

It will be easier and faster if you post your sql code. So we can suggest the equivalent in data step or hash.

mspak
Quartz | Level 8

The following is the possible steps:

 

*step 1: calculate the total fee of each auditor in the market;

proc sql;
create table sum1 as
select *,sum(audit_fees) as fee_sum
  from cluster.forum
   group by auditor_fkey, fyear,sic2,cbsa;
quit;

*step 2: calculate the total fees charged in the market;
proc sql;
create table sum2 as
select *,sum(audit_fees) as market_fee
  from fee1
   group by fyear,sic2,cbsa;
quit;

*step 3: calculate the market share of each auditor in the market;

data sum3;
set sum2;
fee_shr = fee_sum/market_fee;
run;

*step 4: calculate the closest competitors in the market
(market size is depend on CBSA, SIC2 and fyear);

proc sql;
create table geog.dist as
select a.*, min(zipcitydistance(a.zip1,b.zip1)) as min_dist
from sum3 a , sum3 b
where a.fyear=b.fyear
group by fyear, sic2, cbsa
having not missing(calculated min_dist);
quit;

*step5: only output the data with closest competitors in the market;

mspak
Quartz | Level 8

Hi again,

 

I need the following SQL program to be included in hash:

 

proc sql;
create table dist as
select a.*, min(zipcitydistance(a.zip1,b.zip1)) as min_dist
from sum3 a , sum3 b
group by fyear, sic2, cbsa
having not missing(calculated min_dist);
quit;

 

Thank you in advance for any solution.

 

Regards,

mspak

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1346 views
  • 0 likes
  • 2 in conversation