BookmarkSubscribeRSS Feed
mspak
Quartz | Level 8

Dear all,

 

I have a SAS file with the followign variables:

Company identification code: CIK

Auditor Key: AUDITOR_FKEY

Audit fees: AUDITOR_FEES

FYEAR: Financial Year

ZIPCODE: ZIP1

CBSA: Core Based Statistical Area

SIC2: SIC code 2-digit

 

I would liKE to calculate the absolute difference between the current auditor's market share in client's industry (note: client = CIK, industry =SIC2) and the market share of his closest (other) auditor in the market (Same CBSA, SIC2 and FYEAR). 

 

Market share =proportion of audit fees an audit firm generates in a 2-digit SIC industry relative to the total audit fees generated by all audit firms in an CBSA and in a given year.

 

I have programs in sql, however, it consumes too much memory and time. I would like to seek for your helps to get some ideas in SAS HASH. 

 

You help is much appreciated.

 

Thank you.

 

Regards,

mspak

3 REPLIES 3
mohamed_zaki
Barite | Level 11

It will be easier and faster if you post your sql code. So we can suggest the equivalent in data step or hash.

mspak
Quartz | Level 8

The following is the possible steps:

 

*step 1: calculate the total fee of each auditor in the market;

proc sql;
create table sum1 as
select *,sum(audit_fees) as fee_sum
  from cluster.forum
   group by auditor_fkey, fyear,sic2,cbsa;
quit;

*step 2: calculate the total fees charged in the market;
proc sql;
create table sum2 as
select *,sum(audit_fees) as market_fee
  from fee1
   group by fyear,sic2,cbsa;
quit;

*step 3: calculate the market share of each auditor in the market;

data sum3;
set sum2;
fee_shr = fee_sum/market_fee;
run;

*step 4: calculate the closest competitors in the market
(market size is depend on CBSA, SIC2 and fyear);

proc sql;
create table geog.dist as
select a.*, min(zipcitydistance(a.zip1,b.zip1)) as min_dist
from sum3 a , sum3 b
where a.fyear=b.fyear
group by fyear, sic2, cbsa
having not missing(calculated min_dist);
quit;

*step5: only output the data with closest competitors in the market;

mspak
Quartz | Level 8

Hi again,

 

I need the following SQL program to be included in hash:

 

proc sql;
create table dist as
select a.*, min(zipcitydistance(a.zip1,b.zip1)) as min_dist
from sum3 a , sum3 b
group by fyear, sic2, cbsa
having not missing(calculated min_dist);
quit;

 

Thank you in advance for any solution.

 

Regards,

mspak

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 1095 views
  • 0 likes
  • 2 in conversation