Dear all,
I have a SAS file with the followign variables:
Company identification code: CIK
Auditor Key: AUDITOR_FKEY
Audit fees: AUDITOR_FEES
FYEAR: Financial Year
ZIPCODE: ZIP1
CBSA: Core Based Statistical Area
SIC2: SIC code 2-digit
I would liKE to calculate the absolute difference between the current auditor's market share in client's industry (note: client = CIK, industry =SIC2) and the market share of his closest (other) auditor in the market (Same CBSA, SIC2 and FYEAR).
Market share =proportion of audit fees an audit firm generates in a 2-digit SIC industry relative to the total audit fees generated by all audit firms in an CBSA and in a given year.
I have programs in sql, however, it consumes too much memory and time. I would like to seek for your helps to get some ideas in SAS HASH.
You help is much appreciated.
Thank you.
Regards,
mspak
It will be easier and faster if you post your sql code. So we can suggest the equivalent in data step or hash.
The following is the possible steps:
*step 1: calculate the total fee of each auditor in the market;
proc sql;
create table sum1 as
select *,sum(audit_fees) as fee_sum
from cluster.forum
group by auditor_fkey, fyear,sic2,cbsa;
quit;
*step 2: calculate the total fees charged in the market;
proc sql;
create table sum2 as
select *,sum(audit_fees) as market_fee
from fee1
group by fyear,sic2,cbsa;
quit;
*step 3: calculate the market share of each auditor in the market;
data sum3;
set sum2;
fee_shr = fee_sum/market_fee;
run;
*step 4: calculate the closest competitors in the market
(market size is depend on CBSA, SIC2 and fyear);
proc sql;
create table geog.dist as
select a.*, min(zipcitydistance(a.zip1,b.zip1)) as min_dist
from sum3 a , sum3 b
where a.fyear=b.fyear
group by fyear, sic2, cbsa
having not missing(calculated min_dist);
quit;
*step5: only output the data with closest competitors in the market;
Hi again,
I need the following SQL program to be included in hash:
proc sql;
create table dist as
select a.*, min(zipcitydistance(a.zip1,b.zip1)) as min_dist
from sum3 a , sum3 b
group by fyear, sic2, cbsa
having not missing(calculated min_dist);
quit;
Thank you in advance for any solution.
Regards,
mspak
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.