Good days to all,
I have a set of data (as per attached), with the following variables:
gvkey = US company identification code
fyear = financial year
Auditor = auditor identification code
sic2 = 2-digit industry classification code
CBSA = City
NAT_Mkshr = National market share, based on SIC2 and fyear.
City_Mkshr = City market share, based on CBSA, SIC2 and fyear.
I would like to calculate Auditor Industry Specialist, based on the following definitions:
I learned from my search that there are SAS function, namely LARGEST and COMPARE. I have no idea on whether these functions can be applied in my case.
Hope to get any simple code to my problem.
Thanks in advance for any help.
Regards,
mspak
See if this can get you started. Proc rank will rank the observations by fyear, au, sic2. The rank variable is in mk_nat.
proc sort data=test out=test2; by fyear sic2 au; run;
proc rank data=test2 out=test3;
by fyear sic2 au;
var nat_mkshr;
ranks rnk_nat;
run;
As far as the rest there are some other rules you will have to work through like market share ties. But if you keep everyone with a rank < 3 those should be all the people you need to look at and run through a data step to flag the specialists.
Hope this helps!
EJ
By the way the functions you mentioned I dont think will help since they would work across variables and not across observations.
EJ
Hi EJ,
Thanks for your helps.
I try to run the following code:
proc sort data=final.test out=test1 nodupkey; by fyear sic2 au; run;
proc sort data=test1 out=test2; by sic2 fyear ; run;
proc rank data=test2 descending out=test3;
by sic2 fyear ;
var nat_mkshr;
ranks rnk_nat;
run;
proc sort data=test3 out=test4;
by sic2 fyear descending rnk_nat; run;
data test5; set test4;
by sic2 fyear descending rnk_nat;
natshr_dif = dif(nat_mkshr); run;
proc sort data=test5 out=test6; by sic2 fyear cbsa; run;
proc rank data=test6 descending out=test7;
by sic2 fyear cbsa;
var City_Mkshr;
ranks rnk_city;
run;
proc sort data=test7 out=test8;
by sic2 fyear cbsa descending rnk_city; run;
data test9; set test8;
by sic2 fyear cbsa descending rnk_nat;
cityshr_dif = dif(City_Mkshr); run;
I found everything is within my expectation up to the output of test8.
I found the following error whenever I try to output "test9". I found that the obs in test9 has drop significantly compared to as in test8.
ERROR: BY variables are not properly sorted on data set WORK.TEST8.
Thank you.
Regards,
mspak
Do you know the reason?
Hi mspak,
in proc sort for test8 you have used the by variables as sic2 fyear cbsa descending rnk_city;
However while creating the dataset test9, the by variable order is sic2 fyear cbsa descending rnk_nat;
Here, please replace rnk_nat with rnk_city and you will not get any error.
Thanks,
Jagadish
Good catch. I noticed the same thing.
Also I always forget about the dif function so it was good to see that was what you are using.
EJ
Hi Jagadish and EJ,
There was a typo error on the variable (rnk_city instead of rnk_nat). I checked the code for a few times, but I could not realise the mistake .
Thank you helping me .
Regards,
mspak
No problem ... glad it helped!
EJ
Hi all,
I modified the codes as follows to generate my desired output:
proc sort data=final.test out=test1 nodupkey; by fyear sic2 au; run;
proc sort data=test1 out=test2; by sic2 fyear ; run;
proc rank data=test2 descending out=test3;
by sic2 fyear ;
var nat_mkshr;
ranks rnk_nat;
run;
proc sort data=test3 out=test4;
by sic2 fyear descending rnk_nat; run;
data test5; set test4;
by sic2 fyear descending rnk_nat;
natshr_dif = dif(nat_mkshr);
IF FIRST.FYEAR THEN DO;
natshr_dif = .;
END;
run;
proc sort data=test5 out=test6; by sic2 fyear cbsa; run;
proc rank data=test6 descending out=test7;
by sic2 fyear cbsa;
var City_Mkshr;
ranks rnk_city;
run;
proc sort data=test7 out=test8;
by sic2 fyear cbsa descending rnk_city; run;
data test9; set test8 (drop=gvkey);
by sic2 fyear cbsa descending rnk_city;
cityshr_dif = dif(City_Mkshr);
IF FIRST.CBSA THEN DO;
cityshr_dif = .;
END;
run;
data test10; set test9;
if rnk_nat = 1 and natshr_dif>=0.1 then nat_AIP1 = 1; else nat_AIP1 = 0;
if rnk_city ^= 1 OR . <cityshr_dif <0.1 then city_AIP1 = 0; else city_AIP1 = 1;
run;
Thank you.
Regads,
mspak
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.