BookmarkSubscribeRSS Feed
mspak
Quartz | Level 8

Good days to all,

I have a set of data (as per attached), with the following variables:

gvkey = US company identification code

fyear = financial year

Auditor = auditor identification code

sic2 = 2-digit industry classification code

CBSA = City

NAT_Mkshr = National market share, based on SIC2 and fyear. 

City_Mkshr = City market share, based on CBSA, SIC2 and fyear.

I would like to calculate Auditor Industry Specialist, based on the following definitions:

  • National auditor specialist: An auditor is defined as a national industry specialist if it has the largest annual (ie fyear) market share in an industry, based on the two-digit SIC category, and if its annual market share is at least 10 percentage points greater than its closest competitor in a national  audit market.
  • City auditor specialist: An auditor is defined as a city industry specialist if it has the largest annual (ie fyear) market share in an industry, based on the two-digit SIC category for a given city (ie. CBSA), and if its annual market share is at least 10 percentage points greater than its closest competitor in a city audit market.

I learned from my search that there are SAS function, namely LARGEST and COMPARE. I have no idea on whether these functions can be applied in my case.

Hope to get any simple code to my problem.

Thanks in advance for any help.

Regards,

mspak

8 REPLIES 8
esjackso
Quartz | Level 8

See if this can get you started. Proc rank will rank the observations by fyear, au, sic2. The rank variable is in mk_nat.

proc sort data=test out=test2; by fyear sic2 au; run;

proc rank data=test2 out=test3;

  by fyear sic2 au;

  var nat_mkshr;

  ranks rnk_nat;

run;

As far as the rest there are some other rules you will have to work through like market share ties. But if you keep everyone with a rank < 3 those should be all the people you need to look at and run through a data step to flag the specialists.

Hope this helps!

EJ

esjackso
Quartz | Level 8

By the way the functions you mentioned I dont think will help since they would work across variables and not across observations.

EJ

mspak
Quartz | Level 8

Hi EJ,

Thanks for your helps.

I try to run the following code:

proc sort data=final.test  out=test1 nodupkey; by fyear sic2 au; run;

proc sort data=test1 out=test2; by sic2 fyear ; run;

proc rank data=test2  descending out=test3;

  by sic2 fyear ;

  var nat_mkshr;

  ranks rnk_nat;

run;

proc sort data=test3 out=test4;

by sic2 fyear descending rnk_nat; run;

data test5; set test4;

by sic2 fyear descending rnk_nat;

natshr_dif = dif(nat_mkshr); run;

proc sort data=test5 out=test6; by sic2 fyear cbsa; run;

proc rank data=test6  descending out=test7;

  by sic2 fyear cbsa;

  var City_Mkshr;

  ranks rnk_city;

run;

proc sort data=test7 out=test8;

by sic2 fyear cbsa descending rnk_city; run;

data test9; set test8;

by sic2 fyear cbsa descending rnk_nat;

cityshr_dif = dif(City_Mkshr); run;

I found everything is within my expectation up to the output of test8.

I found the following error whenever I try to output "test9". I found that the obs in test9 has drop significantly compared to as in test8.

ERROR: BY variables are not properly sorted on data set WORK.TEST8.

Thank you.

Regards,

mspak

Do you know the reason?

Jagadishkatam
Amethyst | Level 16

Hi mspak,

in proc sort for test8 you have used the by variables as sic2 fyear cbsa descending rnk_city;

However while creating the dataset test9, the by variable order is sic2 fyear cbsa descending rnk_nat;

Here, please replace rnk_nat with rnk_city and you will not get any error.

Thanks,

Jagadish

Thanks,
Jag
esjackso
Quartz | Level 8

Good catch. I noticed the same thing.

Also I always forget about the dif function so it was good to see that was what you are using.

EJ

mspak
Quartz | Level 8

Hi Jagadish and EJ,

There was a typo error on the variable (rnk_city instead of rnk_nat). I checked the code for a few times, but I could not realise the mistake Smiley Sad.

Thank you helping me Smiley Happy.

Regards,

mspak

esjackso
Quartz | Level 8

No problem ... glad it helped!

EJ

mspak
Quartz | Level 8

Hi all,

I modified the codes as follows to generate my desired output:

proc sort data=final.test  out=test1 nodupkey; by fyear sic2 au; run;

proc sort data=test1 out=test2; by sic2 fyear ; run;

proc rank data=test2  descending out=test3;

  by sic2 fyear ;

  var nat_mkshr;

  ranks rnk_nat;

run;

proc sort data=test3 out=test4;

by sic2 fyear descending rnk_nat; run;

data test5; set test4;

by sic2 fyear descending rnk_nat;

natshr_dif = dif(nat_mkshr);

IF FIRST.FYEAR THEN DO;

natshr_dif = .;

END;

run;

proc sort data=test5 out=test6; by sic2 fyear cbsa; run;

proc rank data=test6  descending out=test7;

  by sic2 fyear cbsa;

  var City_Mkshr;

  ranks rnk_city;

run;

proc sort data=test7 out=test8;

by sic2 fyear cbsa descending rnk_city; run;

data test9; set test8 (drop=gvkey);

by sic2 fyear cbsa descending rnk_city;

cityshr_dif = dif(City_Mkshr);

IF FIRST.CBSA THEN DO;

cityshr_dif = .;

END;

run;

data test10; set test9;

if rnk_nat = 1 and natshr_dif>=0.1 then nat_AIP1 = 1; else nat_AIP1 = 0;

if rnk_city ^= 1 OR . <cityshr_dif <0.1  then city_AIP1 = 0; else city_AIP1 = 1;

run;

Thank you.

Regads,

mspak

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1430 views
  • 7 likes
  • 3 in conversation