Dear Madam/Sir,
I have firm identifier (gvkey), year (cyear) and auditor key (auditor_fkey) and like to define common auditor as a dummy variable. In other words, gvkey 1161 and 1380 share the common auditor in year 2003 and these firm-years are coded as 1. Any coding idea will be greatly appreciated.
2003 | 1161 | 2 |
2004 | 1161 | 2 |
2005 | 1161 | 2 |
2006 | 1161 | 2 |
2002 | 1380 | 2 |
2003 | 1380 | 2 |
2004 | 1380 | 2 |
2005 | 1380 | 2 |
2006 | 1380 | 2 |
2007 | 1380 | 2 |
2008 | 1380 | 2 |
2009 | 1380 | 2 |
2010 | 1380 | 2 |
2011 | 1380 | 2 |
2012 | 1380 | 2 |
2013 | 1380 | 2 |
2000 | 1913 | 1 |
2001 | 1913 | 1 |
2002 | 1913 | 1 |
2003 | 1913 | 1 |
Thanks
Joon1
I think you should post the desired output to clarify your question better.
Assuming I understood what you mean.
data have;
input cyear gvkey AUDITOR_FKEY;
datalines;
2003 1161 2
2004 1161 2
2005 1161 2
2006 1161 2
2002 1380 2
2003 1380 2
2004 1380 2
2005 1380 2
2006 1380 2
2007 1380 2
2008 1380 2
2009 1380 2
2010 1380 2
2011 1380 2
2012 1380 2
2013 1380 2
2000 1913 1
2001 1913 1
2002 1913 1
2003 1913 1
;
run;
proc sql;
create table want as
select *,count(distinct gvkey) ne 1 as flag
from have
group by AUDITOR_FKEY,cyear
order by gvkey,cyear;
quit;
I don't really understand what you're asking for. Does below return what you're after?
data have;
input cyear gvkey AUDITOR_FKEY;
datalines;
2003 1161 2
2004 1161 2
2005 1161 2
2006 1161 2
2002 1380 2
2003 1380 2
2004 1380 2
2005 1380 2
2006 1380 2
2007 1380 2
2008 1380 2
2009 1380 2
2010 1380 2
2011 1380 2
2012 1380 2
2013 1380 2
2000 1913 1
2001 1913 1
2002 1913 1
2003 1913 1
;
run;
proc sql;
select
l.cyear
,l.gvkey
,l.AUDITOR_FKEY
,r.n_gvkeys
,case
when r.n_gvkeys >1 then 1
else 0
end as shared_auditor_flg
from have l
left join
(
select
cyear
,AUDITOR_FKEY
,count(*) as n_gvkeys
from have
group by
cyear,AUDITOR_FKEY
) r
on l.cyear=r.cyear and l.auditor_fkey=r.auditor_fkey
order by l.gvkey, l.cyear
;
quit;
I think you should post the desired output to clarify your question better.
Assuming I understood what you mean.
data have;
input cyear gvkey AUDITOR_FKEY;
datalines;
2003 1161 2
2004 1161 2
2005 1161 2
2006 1161 2
2002 1380 2
2003 1380 2
2004 1380 2
2005 1380 2
2006 1380 2
2007 1380 2
2008 1380 2
2009 1380 2
2010 1380 2
2011 1380 2
2012 1380 2
2013 1380 2
2000 1913 1
2001 1913 1
2002 1913 1
2003 1913 1
;
run;
proc sql;
create table want as
select *,count(distinct gvkey) ne 1 as flag
from have
group by AUDITOR_FKEY,cyear
order by gvkey,cyear;
quit;
Thanks for your kind reply, Ksharp. My desired output is the last column (dummy variable coded 1 for firms (gvkey) with the same auditor (auditor_fkey) in the same year (cyear), otherwise 0). Your suggested code does not work. Your help will be highly appreicated.
2003 1161 2 1
2004 1161 2 1
2005 1161 2 1
2006 1161 2 1
2002 1380 2 0
2003 1380 2 1
2004 1380 2 1
2005 1380 2 1
2006 1380 2 1
2007 1380 2 0
2008 1380 2 0
2009 1380 2 0
2010 1380 2 0
2011 1380 2 0
2012 1380 2 0
2013 1380 2 0
2000 1913 1 0
2001 1913 1 0
2002 1913 1 0
2003 1913 1 0
output
Then tell me where is different between mine and yours ? Here is my output.
I am sorry, Ksharp. Your code assigned all 1s for 'flag'. Your help will be highly appreciated.
proc sql;
create table cc2 as
select *,count(distinct gvkey) ne 1 as flag
from cc1
group by AUDITOR_FKEY,cyear
order by gvkey,cyear;
quit;
2003 | 1161 | 2 | 1 |
2004 | 1161 | 2 | 1 |
2005 | 1161 | 2 | 1 |
2006 | 1161 | 2 | 1 |
2002 | 1380 | 2 | 1 |
2003 | 1380 | 2 | 1 |
2004 | 1380 | 2 | 1 |
2005 | 1380 | 2 | 1 |
2006 | 1380 | 2 | 1 |
2007 | 1380 | 2 | 1 |
2008 | 1380 | 2 | 1 |
2009 | 1380 | 2 | 1 |
2010 | 1380 | 2 | 1 |
2011 | 1380 | 2 | 1 |
2012 | 1380 | 2 | 1 |
2013 | 1380 | 2 | 1 |
2000 | 1913 | 1 | 1 |
2001 | 1913 | 1 | 1 |
2002 | 1913 | 1 | 1 |
2003 | 1913 | 1 | 1 |
Since I run the example data set and get the same result as @Ksharp
AUDITOR_ cyear gvkey FKEY flag 2003 1161 2 1 2004 1161 2 1 2005 1161 2 1 2006 1161 2 1 2002 1380 2 0 2003 1380 2 1 2004 1380 2 1 2005 1380 2 1 2006 1380 2 1 2007 1380 2 0 2008 1380 2 0 2009 1380 2 0 2010 1380 2 0 2011 1380 2 0 2012 1380 2 0 2013 1380 2 0 2000 1913 1 0 2001 1913 1 0 2002 1913 1 0 2003 1913 1 0
Then one suspects something different about YOUR CC1 data set that we cannot distinguish because you are not providing example data as a working data step.
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.
Thanks for your help, Balladw. I was not clear on my question. Ksharp's code works.
Thanks, Balladw and Ksharp. The working file is just 394 observations, so I display my working dataset (Actually, I don't understand the link you sent). Flags are assigned 1 using your suggested code. Your help will be greatly appreciated.
cyear | gvkey | AUDITOR_FKEY |
2003 | 1161 | 2 |
2004 | 1161 | 2 |
2005 | 1161 | 2 |
2006 | 1161 | 2 |
2002 | 1380 | 2 |
2003 | 1380 | 2 |
2004 | 1380 | 2 |
2005 | 1380 | 2 |
2006 | 1380 | 2 |
2007 | 1380 | 2 |
2008 | 1380 | 2 |
2009 | 1380 | 2 |
2010 | 1380 | 2 |
2011 | 1380 | 2 |
2012 | 1380 | 2 |
2013 | 1380 | 2 |
2000 | 1913 | 1 |
2001 | 1913 | 1 |
2002 | 1913 | 1 |
2003 | 1913 | 1 |
2001 | 2230 | 1 |
2002 | 2230 | 2 |
2003 | 2230 | 2 |
2004 | 2230 | 2 |
2005 | 2230 | 2 |
2001 | 2593 | 1 |
2002 | 2593 | 1 |
2000 | 2812 | 5 |
2001 | 2812 | 5 |
2002 | 2812 | 3 |
2012 | 3226 | 3 |
2013 | 3226 | 3 |
2014 | 3226 | 3 |
2015 | 3226 | 3 |
2000 | 3278 | 3 |
2002 | 3278 | 3 |
2004 | 3278 | 3 |
2007 | 3278 | 3 |
2008 | 3278 | 3 |
2009 | 3278 | 3 |
2010 | 3278 | 3 |
2011 | 3278 | 3 |
2012 | 3278 | 3 |
2000 | 3439 | 5 |
2001 | 3439 | 5 |
2002 | 3439 | 2 |
2003 | 3439 | 2 |
2004 | 3439 | 2 |
2001 | 3813 | 2 |
2002 | 3813 | 2 |
2003 | 3813 | 2 |
2001 | 3964 | 3 |
2002 | 3964 | 3 |
2003 | 3964 | 3 |
2001 | 4611 | 4 |
2002 | 4611 | 4 |
2003 | 4611 | 4 |
2000 | 5046 | 5 |
2001 | 5046 | 5 |
2002 | 5046 | 4 |
2003 | 5046 | 4 |
2004 | 5046 | 4 |
2005 | 5046 | 4 |
2006 | 5046 | 4 |
2007 | 5046 | 4 |
2008 | 5046 | 4 |
2009 | 5046 | 4 |
2010 | 5046 | 4 |
2011 | 5046 | 4 |
2000 | 5439 | 5 |
2001 | 5439 | 5 |
2002 | 5439 | 4 |
2003 | 5439 | 4 |
2004 | 5439 | 4 |
2002 | 5597 | 4 |
2003 | 5597 | 4 |
2004 | 5597 | 4 |
2005 | 5597 | 4 |
2006 | 5597 | 4 |
2007 | 5597 | 4 |
2004 | 5723 | 2 |
2005 | 5723 | 2 |
2006 | 5723 | 2 |
2007 | 5723 | 2 |
2008 | 5723 | 2 |
2009 | 5723 | 2 |
2010 | 5723 | 2 |
2011 | 5723 | 2 |
2012 | 5723 | 2 |
2003 | 5791 | 3 |
2004 | 5791 | 3 |
2005 | 5791 | 3 |
2006 | 5791 | 3 |
2007 | 5791 | 3 |
2008 | 5791 | 3 |
2009 | 5791 | 3 |
2010 | 5791 | 3 |
2011 | 5791 | 3 |
2012 | 5791 | 3 |
2013 | 5791 | 3 |
2014 | 5791 | 3 |
2015 | 5791 | 3 |
2004 | 5860 | 3 |
2005 | 5860 | 3 |
2006 | 5860 | 3 |
2007 | 5860 | 3 |
2008 | 5860 | 3 |
2009 | 5860 | 3 |
2010 | 5878 | 3 |
2011 | 5878 | 3 |
2012 | 5878 | 3 |
2013 | 5878 | 3 |
2014 | 5878 | 3 |
2015 | 5878 | 3 |
2016 | 5878 | 3 |
2002 | 6821 | 3 |
2003 | 6821 | 3 |
2004 | 6821 | 3 |
2005 | 6821 | 3 |
2006 | 6821 | 3 |
2007 | 6821 | 3 |
2011 | 6845 | 1 |
2001 | 7127 | 5 |
2002 | 7127 | 5 |
2003 | 7127 | 3 |
2008 | 7163 | 2 |
2009 | 7163 | 2 |
2010 | 7163 | 2 |
2011 | 7163 | 2 |
2012 | 7163 | 2 |
2013 | 7163 | 2 |
2003 | 7585 | 4 |
2004 | 7585 | 4 |
2005 | 7585 | 4 |
2006 | 7585 | 4 |
2001 | 7922 | 3 |
2002 | 7922 | 3 |
2003 | 7922 | 3 |
2001 | 8358 | 1 |
2002 | 8358 | 1 |
2003 | 8358 | 1 |
2004 | 8358 | 1 |
2005 | 8358 | 1 |
2006 | 8358 | 1 |
2007 | 8358 | 1 |
2007 | 9372 | 2 |
2008 | 9372 | 2 |
2009 | 9372 | 2 |
2010 | 9372 | 2 |
2011 | 9372 | 2 |
2012 | 9372 | 2 |
2013 | 9372 | 2 |
2014 | 9372 | 2 |
2015 | 9372 | 2 |
2000 | 9667 | 2 |
2011 | 9882 | 2 |
2012 | 9882 | 2 |
2013 | 9882 | 2 |
2014 | 9882 | 2 |
2015 | 9882 | 2 |
2000 | 10386 | 346 |
2001 | 10386 | 346 |
2002 | 10386 | 346 |
2003 | 10386 | 1 |
2004 | 10386 | 1 |
2005 | 10386 | 1 |
2006 | 10386 | 1 |
2007 | 10386 | 6 |
2008 | 10386 | 6 |
2004 | 10581 | 2 |
2005 | 10581 | 2 |
2006 | 10581 | 2 |
2007 | 10581 | 2 |
2008 | 10581 | 2 |
2009 | 10581 | 2 |
2010 | 10581 | 2 |
2011 | 10581 | 2 |
2001 | 10639 | 2 |
2002 | 10639 | 2 |
2003 | 10639 | 2 |
2004 | 10639 | 2 |
2005 | 10639 | 2 |
2008 | 10793 | 2 |
2009 | 10793 | 2 |
2011 | 10801 | 1 |
2012 | 10801 | 1 |
2013 | 10801 | 1 |
2014 | 10801 | 1 |
2015 | 11264 | 3 |
2000 | 11300 | 1 |
2001 | 11300 | 1 |
2002 | 11300 | 1 |
2003 | 11300 | 1 |
2004 | 11300 | 1 |
2005 | 11300 | 1 |
2006 | 11300 | 1 |
2001 | 12122 | 3 |
2002 | 12122 | 3 |
2003 | 12122 | 3 |
2006 | 12141 | 3 |
2007 | 12141 | 3 |
2008 | 12141 | 3 |
2009 | 12141 | 3 |
2010 | 12141 | 3 |
2011 | 12141 | 3 |
2000 | 12215 | 1 |
2001 | 12215 | 1 |
2002 | 12215 | 1 |
2003 | 12215 | 1 |
2004 | 12215 | 1 |
2005 | 12215 | 1 |
2000 | 12757 | 1 |
2001 | 12757 | 1 |
2002 | 12757 | 1 |
2003 | 12757 | 1 |
2004 | 12757 | 1 |
2005 | 12757 | 1 |
2006 | 12757 | 1 |
2008 | 12785 | 2 |
2009 | 12785 | 2 |
2000 | 13407 | 2 |
2001 | 13407 | 2 |
2013 | 14304 | 2 |
2014 | 14304 | 4 |
2000 | 14412 | 3 |
2005 | 15247 | 4 |
2006 | 15247 | 4 |
2007 | 15247 | 4 |
2013 | 16101 | 2 |
2014 | 16101 | 2 |
2015 | 16101 | 2 |
2016 | 16101 | 2 |
2017 | 16101 | 2 |
2008 | 21186 | 2 |
2009 | 21186 | 2 |
2010 | 21186 | 2 |
2011 | 21186 | 2 |
2003 | 23767 | 1 |
2004 | 23767 | 1 |
2005 | 23767 | 1 |
2005 | 23978 | 1 |
2006 | 23978 | 1 |
2007 | 23978 | 1 |
2008 | 23978 | 1 |
2009 | 23978 | 1 |
2010 | 23978 | 1 |
2011 | 23978 | 1 |
2012 | 23978 | 1 |
2013 | 23978 | 1 |
2014 | 23978 | 1 |
2001 | 24783 | 3 |
2002 | 24783 | 3 |
2003 | 24783 | 3 |
2004 | 24783 | 1 |
2005 | 24783 | 1 |
2006 | 24783 | 1 |
2007 | 24783 | 1 |
2008 | 24783 | 1 |
2009 | 24783 | 1 |
2013 | 25110 | 3 |
2014 | 25110 | 3 |
2015 | 25110 | 3 |
2016 | 25110 | 3 |
2000 | 25119 | 4 |
2001 | 25119 | 4 |
2002 | 25119 | 4 |
2003 | 25119 | 4 |
2004 | 25119 | 4 |
2005 | 25119 | 4 |
2006 | 25119 | 4 |
2007 | 25119 | 4 |
2008 | 25119 | 4 |
2009 | 25119 | 4 |
2010 | 25119 | 4 |
2007 | 25279 | 2 |
2008 | 25279 | 2 |
2009 | 25279 | 2 |
2010 | 25279 | 2 |
2011 | 25279 | 2 |
2012 | 25279 | 2 |
2013 | 25279 | 2 |
2014 | 25279 | 2 |
2015 | 25279 | 2 |
2011 | 25880 | 2 |
2012 | 25880 | 2 |
2013 | 25880 | 2 |
2014 | 25880 | 2 |
2015 | 25880 | 2 |
2003 | 28742 | 3 |
2004 | 28742 | 3 |
2005 | 28742 | 3 |
2006 | 28742 | 3 |
2007 | 28742 | 3 |
2008 | 28742 | 3 |
2009 | 28742 | 1 |
2010 | 28742 | 1 |
2011 | 28742 | 1 |
2012 | 28742 | 1 |
2013 | 28742 | 1 |
2014 | 28742 | 1 |
2015 | 28742 | 1 |
2016 | 28742 | 1 |
2004 | 28930 | 2 |
2005 | 28930 | 2 |
2006 | 28930 | 2 |
2007 | 28930 | 2 |
2008 | 28930 | 2 |
2009 | 28930 | 2 |
2010 | 28930 | 2 |
2005 | 29968 | 3 |
2006 | 29968 | 3 |
2007 | 29968 | 3 |
2008 | 29968 | 3 |
2009 | 29968 | 3 |
2010 | 29968 | 3 |
2011 | 29968 | 3 |
2012 | 30032 | 2 |
2013 | 30032 | 2 |
2001 | 31488 | 2 |
2002 | 31488 | 2 |
2003 | 31488 | 2 |
2004 | 31488 | 2 |
2005 | 31488 | 2 |
2001 | 61155 | 5 |
2005 | 61155 | 2 |
2006 | 61155 | 2 |
2002 | 63051 | 5 |
2003 | 63051 | 2 |
2003 | 117768 | 4 |
2004 | 117768 | 4 |
2005 | 117768 | 4 |
2006 | 117768 | 1 |
2002 | 126554 | 1 |
2003 | 126554 | 1 |
2004 | 126554 | 1 |
2005 | 126554 | 1 |
2006 | 126554 | 1 |
2007 | 126554 | 1 |
2008 | 126554 | 1 |
2009 | 126554 | 1 |
2010 | 126554 | 1 |
2011 | 126554 | 1 |
2012 | 126554 | 1 |
2004 | 128978 | 2 |
2005 | 128978 | 2 |
2006 | 128978 | 2 |
2007 | 128978 | 2 |
2008 | 128978 | 2 |
2009 | 128978 | 2 |
2010 | 128978 | 2 |
2008 | 141720 | 2 |
2009 | 141720 | 2 |
2010 | 141720 | 2 |
2011 | 141720 | 2 |
2012 | 141720 | 2 |
2013 | 141720 | 6 |
2014 | 141720 | 6 |
2001 | 144559 | 1 |
2002 | 144559 | 1 |
2003 | 144559 | 1 |
2004 | 144559 | 1 |
2005 | 144559 | 1 |
2007 | 147988 | 2 |
2008 | 147988 | 2 |
2009 | 147988 | 2 |
2004 | 157354 | 1 |
2005 | 157354 | 1 |
2006 | 157354 | 1 |
2007 | 157354 | 1 |
2008 | 157354 | 1 |
2009 | 157354 | 1 |
2010 | 157354 | 1 |
2011 | 157354 | 1 |
2012 | 157354 | 1 |
2013 | 157354 | 1 |
2007 | 176899 | 1 |
2008 | 176899 | 1 |
2010 | 176899 | 1 |
2011 | 176899 | 1 |
2012 | 176899 | 1 |
2007 | 177925 | 2 |
2008 | 177925 | 2 |
2009 | 177925 | 2 |
2010 | 177925 | 2 |
2011 | 177925 | 2 |
2012 | 177925 | 2 |
2013 | 177925 | 2 |
2014 | 177925 | 2 |
2015 | 177925 | 2 |
2016 | 177925 | 2 |
2009 | 183366 | 3 |
2010 | 183366 | 3 |
2011 | 183366 | 3 |
2012 | 183366 | 3 |
Most of those should be flagged based on your explanation of the logic.
Of the 394 observations there are only 13 where the auditor only audited one firm in that year.
AUDITOR_ Obs cyear gvkey FKEY n_firms flag 9 2000 25119 4 1 0 14 2000 10386 346 1 0 39 2001 10386 346 1 0 68 2002 10386 346 1 0 217 2007 10386 6 1 0 245 2008 10386 6 1 0 321 2011 5046 4 1 0 361 2013 141720 6 1 0 375 2014 14304 4 1 0 376 2014 141720 6 1 0 377 2015 28742 1 1 0 389 2016 28742 1 1 0 394 2017 16101 2 1 0
PS: Notice how much easier it is to share plain text instead of tables? Not only can it be copied into code directly it display using much less screen space on this forum.
Thanks for your help, Tom. Your code works. I was not clear about my question. Have a great day.
Perhaps it would be easier to understand if you used normal SAS code instead of SQL logic?
Sort the data by YEAR and AUDITOR (which way you nest them does not really matter) and FIRM. Then in a data step you can use two DO loops to read through the year X auditor groups of records twice. Once to count how many distinct firms are listed. And the second time to re-read the data so the detailed observations can be written out.
proc sort data=have;
by cyear AUDITOR_FKEY gvkey ;
run;
data want;
do _n_=1 by 1 until(last.AUDITOR_FKEY);
set have;
by cyear AUDITOR_FKEY gvkey;
n_firms = sum(n_firms,first.gvkey);
end;
flag = n_firms > 1 ;
do _n_=1 to _n_ ;
set have;
output;
end;
run;
AUDITOR_ Obs cyear gvkey FKEY n_firms flag 1 2000 1913 1 1 0 2 2001 1913 1 1 0 3 2002 1913 1 1 0 4 2002 1380 2 1 0 5 2003 1913 1 1 0 6 2003 1161 2 2 1 7 2003 1380 2 2 1 8 2004 1161 2 2 1 9 2004 1380 2 2 1 10 2005 1161 2 2 1 11 2005 1380 2 2 1 12 2006 1161 2 2 1 13 2006 1380 2 2 1 14 2007 1380 2 1 0 15 2008 1380 2 1 0 16 2009 1380 2 1 0 17 2010 1380 2 1 0 18 2011 1380 2 1 0 19 2012 1380 2 1 0 20 2013 1380 2 1 0
@joon1 wrote:
I am sorry, Tom. Your code also produces almost 1 for 'flag'. I just posted
the whole dataset in another posting. Thanks.
Show exactly what observations have the wrong FLAG setting. And explain why. Your new data has only 13/394 observations where the auditor only audited one firm in that year. If that is NOT the situation you are trying to detect then please explain (with concrete examples) what you do want to flag.
Thanks a lot, Ksharp. Actually, your code works. I sincerely apologize for your confusion.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.