BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Agent1592
Pyrite | Level 9

 

Dear SAS Community. I am trying to identify if CEO is also a chair on the board. For this purpose I need to create a dummy equal to 1 if the CEO is also a chair. The text is contained in the TITLEANN. 

Here is the sample code I use. First I clean the TITLEANN of all the unnecessary characters. Then I would like to use findw with multiple conditions but I am able to use only one. How can I identify the word chairman regardless of whether it is small or large caps and all of its variations like "chmn" and so on. Is it with prxmatch or findw function?

Can I have a list of all the variations of chairman and use them with this function?

 

 

data ceo_anncomp1;
set ceo_anncomp;
TITLEANN1 = compbl(translate(TITLEANN, " ", ";.,:()-/&"));
if findw(TITLEANN1,"chmn CEO")>0 then ceo_chair=1;
else ceo_chair=0;
run;

 

 

 

 

data test1.TEST;
  infile datalines dsd truncover;
  input id:$6. YEAR:32. TITLEANN:$250.;
  label id="Company ID Number" YEAR="Fiscal Year" TITLEANN="Annual Title";
datalines4;
008596,1992,chmn. & pres.
010884,1993,chmn. & CEO
014881,1993,chmn. & CEO
002812,1993,chmn. & CEO
008596,1993,chmn. & pres.
012349,1994,chmn. & CEO
010884,1994,chmn. & CEO
014881,1994,chmn. & CEO
002812,1994,chmn. & CEO
013468,1994,CEO
008596,1994,chmn. & pres.
012349,1995,chmn. & CEO
010884,1995,chmn. & CEO
014881,1995,chmn. & CEO
002812,1995,chmn. & CEO
013468,1995,chmn. & CEO
008596,1995,chmn. & CEO
012349,1996,v-p & acting CEO
010884,1996,chmn. & CEO
014881,1996,chmn. & CEO
028524,1996,chmn. & CEO
002812,1996,chmn. & CEO
013468,1996,chmn. & CEO
008596,1996,pres. & CEO
010884,1997,chmn. & CEO
014881,1997,chmn. & CEO
028524,1997,CEO
002812,1997,chmn. & CEO
013468,1997,CEO
008596,1997,"chmn, pres. & CEO"
010884,1998,chmn. & CEO
028524,1998,chmn. & CEO
002812,1998,chmn. & CEO
008596,1998,"chmn, pres. & CEO"
010884,1999,chmn. & CEO
028524,1999,chmn. & CEO
002812,1999,chmn. & CEO
010884,2000,chairman & CEO
028524,2000,CEO
002812,2000,chairman & CEO
140760,2000,president & CEO
010884,2001,chairman & CEO
028524,2001,vice chairman & CEO
002812,2001,chairman & CEO
140760,2001,president & CEO
010884,2002,"chairman, president & CEO"
028524,2002,CEO
002812,2002,chairman & CEO
140760,2002,president & CEO
010884,2003,"chairman, president & CEO"
028524,2003,president & CEO
140760,2003,president & CEO
010884,2004,"chairman, president & CEO"
028524,2004,president & CEO
002812,2004,chairman & CEO
140760,2004,"chairman, president & CEO"
010884,2005,"chairman, president & CEO"
028524,2005,president & CEO
002812,2005,chairman & CEO
140760,2005,"chairman, president & CEO"
010884,2006,"chairman, president & CEO"
028524,2006,president & CEO
002812,2006,chairman & CEO
140760,2006,"chairman, president & CEO"
010884,2007,"chairman, president & CEO"
002812,2007,president & CEO
140760,2007,"chairman, president & CEO"
010884,2008,"chairman, president & CEO"
002812,2008,president & CEO
030443,2008,
140760,2008,"chairman, president & CEO"
010884,2009,"Chairman, Chief Executive Officer and President"
002812,2009,"Chief Executive Officer, President, Director and Member of Corporate Compensation & Benefits Committee"
030443,2009,Chairman and Chief Executive Officer
140760,2009,"chairman, president & CEO"
010884,2010,"Chairman, Chief Executive Officer and President"
002812,2010,"Chief Executive Officer, President, Director and Member of Executive Committee"
030443,2010,Chairman and Chief Executive Officer
140760,2010,"Chairman, Chief Executive Officer, President and Chairman of Executive Committee"
010884,2011,"Chairman, Chief Executive Officer and President"
002812,2011,"Chief Executive Officer, President, Director and Member of Executive Committee"
030443,2011,Chairman and Chief Executive Officer
140760,2011,"Chairman, Chief Executive Officer, President and Chairman of Executive Committee"
010884,2012,Consultant
002812,2012,Director and Member of Executive Committee
030443,2012,Chairman and Chief Executive Officer
140760,2012,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
010884,2013,"Chief Executive Officer, President and Director"
002812,2013,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
030443,2013,Chairman of the Board and Chief Executive Officer
140760,2013,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
010884,2014,"Chief Executive Officer, President and Director"
002812,2014,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
030443,2014,Chairman of the Board and Chief Executive Officer
140760,2014,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
030443,2015,Chairman and Chief Executive Officer
140760,2015,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
002812,2016,Chairman and Chief Executive Officer
030443,2016,Chairman and Chief Executive Officer
140760,2016,"Executive Chairman, Chief Executive Officer and Chairman of Executive Committee"
;;;;

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data TEST;
  infile datalines dsd truncover;
  input id:$6. YEAR TITLEANN:$250.;
  label id="Company ID Number" YEAR="Fiscal Year" TITLEANN="Annual Title";
  
  if prxmatch('/\b(chmn|chairman)\b/i',TITLEANN) then is_chair=1;
  if prxmatch('/\b(CEO|Chief Executive Officer)\b/i',TITLEANN) then is_ceo=1;
  ceo_chair=ifn(is_chair and is_ceo ,1,0);
  
datalines4;
008596,1992,chmn. & pres.
010884,1993,chmn. & CEO
014881,1993,chmn. & CEO
002812,1993,chmn. & CEO
008596,1993,chmn. & pres.
012349,1994,chmn. & CEO
010884,1994,chmn. & CEO
014881,1994,chmn. & CEO
002812,1994,chmn. & CEO
013468,1994,CEO
008596,1994,chmn. & pres.
012349,1995,chmn. & CEO
010884,1995,chmn. & CEO
014881,1995,chmn. & CEO
002812,1995,chmn. & CEO
013468,1995,chmn. & CEO
008596,1995,chmn. & CEO
012349,1996,v-p & acting CEO
010884,1996,chmn. & CEO
014881,1996,chmn. & CEO
028524,1996,chmn. & CEO
002812,1996,chmn. & CEO
013468,1996,chmn. & CEO
008596,1996,pres. & CEO
010884,1997,chmn. & CEO
014881,1997,chmn. & CEO
028524,1997,CEO
002812,1997,chmn. & CEO
013468,1997,CEO
008596,1997,"chmn, pres. & CEO"
010884,1998,chmn. & CEO
028524,1998,chmn. & CEO
002812,1998,chmn. & CEO
008596,1998,"chmn, pres. & CEO"
010884,1999,chmn. & CEO
028524,1999,chmn. & CEO
002812,1999,chmn. & CEO
010884,2000,chairman & CEO
028524,2000,CEO
002812,2000,chairman & CEO
140760,2000,president & CEO
010884,2001,chairman & CEO
028524,2001,vice chairman & CEO
002812,2001,chairman & CEO
140760,2001,president & CEO
010884,2002,"chairman, president & CEO"
028524,2002,CEO
002812,2002,chairman & CEO
140760,2002,president & CEO
010884,2003,"chairman, president & CEO"
028524,2003,president & CEO
140760,2003,president & CEO
010884,2004,"chairman, president & CEO"
028524,2004,president & CEO
002812,2004,chairman & CEO
140760,2004,"chairman, president & CEO"
010884,2005,"chairman, president & CEO"
028524,2005,president & CEO
002812,2005,chairman & CEO
140760,2005,"chairman, president & CEO"
010884,2006,"chairman, president & CEO"
028524,2006,president & CEO
002812,2006,chairman & CEO
140760,2006,"chairman, president & CEO"
010884,2007,"chairman, president & CEO"
002812,2007,president & CEO
140760,2007,"chairman, president & CEO"
010884,2008,"chairman, president & CEO"
002812,2008,president & CEO
030443,2008,
140760,2008,"chairman, president & CEO"
010884,2009,"Chairman, Chief Executive Officer and President"
002812,2009,"Chief Executive Officer, President, Director and Member of Corporate Compensation & Benefits Committee"
030443,2009,Chairman and Chief Executive Officer
140760,2009,"chairman, president & CEO"
010884,2010,"Chairman, Chief Executive Officer and President"
002812,2010,"Chief Executive Officer, President, Director and Member of Executive Committee"
030443,2010,Chairman and Chief Executive Officer
140760,2010,"Chairman, Chief Executive Officer, President and Chairman of Executive Committee"
010884,2011,"Chairman, Chief Executive Officer and President"
002812,2011,"Chief Executive Officer, President, Director and Member of Executive Committee"
030443,2011,Chairman and Chief Executive Officer
140760,2011,"Chairman, Chief Executive Officer, President and Chairman of Executive Committee"
010884,2012,Consultant
002812,2012,Director and Member of Executive Committee
030443,2012,Chairman and Chief Executive Officer
140760,2012,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
010884,2013,"Chief Executive Officer, President and Director"
002812,2013,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
030443,2013,Chairman of the Board and Chief Executive Officer
140760,2013,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
010884,2014,"Chief Executive Officer, President and Director"
002812,2014,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
030443,2014,Chairman of the Board and Chief Executive Officer
140760,2014,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
030443,2015,Chairman and Chief Executive Officer
140760,2015,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
002812,2016,Chairman and Chief Executive Officer
030443,2016,Chairman and Chief Executive Officer
140760,2016,"Executive Chairman, Chief Executive Officer and Chairman of Executive Committee"
;;;;

View solution in original post

6 REPLIES 6
Ksharp
Super User
data TEST;
  infile datalines dsd truncover;
  input id:$6. YEAR TITLEANN:$250.;
  label id="Company ID Number" YEAR="Fiscal Year" TITLEANN="Annual Title";
  
  if prxmatch('/\b(chmn|chairman)\b/i',TITLEANN) then is_chair=1;
  if prxmatch('/\b(CEO|Chief Executive Officer)\b/i',TITLEANN) then is_ceo=1;
  ceo_chair=ifn(is_chair and is_ceo ,1,0);
  
datalines4;
008596,1992,chmn. & pres.
010884,1993,chmn. & CEO
014881,1993,chmn. & CEO
002812,1993,chmn. & CEO
008596,1993,chmn. & pres.
012349,1994,chmn. & CEO
010884,1994,chmn. & CEO
014881,1994,chmn. & CEO
002812,1994,chmn. & CEO
013468,1994,CEO
008596,1994,chmn. & pres.
012349,1995,chmn. & CEO
010884,1995,chmn. & CEO
014881,1995,chmn. & CEO
002812,1995,chmn. & CEO
013468,1995,chmn. & CEO
008596,1995,chmn. & CEO
012349,1996,v-p & acting CEO
010884,1996,chmn. & CEO
014881,1996,chmn. & CEO
028524,1996,chmn. & CEO
002812,1996,chmn. & CEO
013468,1996,chmn. & CEO
008596,1996,pres. & CEO
010884,1997,chmn. & CEO
014881,1997,chmn. & CEO
028524,1997,CEO
002812,1997,chmn. & CEO
013468,1997,CEO
008596,1997,"chmn, pres. & CEO"
010884,1998,chmn. & CEO
028524,1998,chmn. & CEO
002812,1998,chmn. & CEO
008596,1998,"chmn, pres. & CEO"
010884,1999,chmn. & CEO
028524,1999,chmn. & CEO
002812,1999,chmn. & CEO
010884,2000,chairman & CEO
028524,2000,CEO
002812,2000,chairman & CEO
140760,2000,president & CEO
010884,2001,chairman & CEO
028524,2001,vice chairman & CEO
002812,2001,chairman & CEO
140760,2001,president & CEO
010884,2002,"chairman, president & CEO"
028524,2002,CEO
002812,2002,chairman & CEO
140760,2002,president & CEO
010884,2003,"chairman, president & CEO"
028524,2003,president & CEO
140760,2003,president & CEO
010884,2004,"chairman, president & CEO"
028524,2004,president & CEO
002812,2004,chairman & CEO
140760,2004,"chairman, president & CEO"
010884,2005,"chairman, president & CEO"
028524,2005,president & CEO
002812,2005,chairman & CEO
140760,2005,"chairman, president & CEO"
010884,2006,"chairman, president & CEO"
028524,2006,president & CEO
002812,2006,chairman & CEO
140760,2006,"chairman, president & CEO"
010884,2007,"chairman, president & CEO"
002812,2007,president & CEO
140760,2007,"chairman, president & CEO"
010884,2008,"chairman, president & CEO"
002812,2008,president & CEO
030443,2008,
140760,2008,"chairman, president & CEO"
010884,2009,"Chairman, Chief Executive Officer and President"
002812,2009,"Chief Executive Officer, President, Director and Member of Corporate Compensation & Benefits Committee"
030443,2009,Chairman and Chief Executive Officer
140760,2009,"chairman, president & CEO"
010884,2010,"Chairman, Chief Executive Officer and President"
002812,2010,"Chief Executive Officer, President, Director and Member of Executive Committee"
030443,2010,Chairman and Chief Executive Officer
140760,2010,"Chairman, Chief Executive Officer, President and Chairman of Executive Committee"
010884,2011,"Chairman, Chief Executive Officer and President"
002812,2011,"Chief Executive Officer, President, Director and Member of Executive Committee"
030443,2011,Chairman and Chief Executive Officer
140760,2011,"Chairman, Chief Executive Officer, President and Chairman of Executive Committee"
010884,2012,Consultant
002812,2012,Director and Member of Executive Committee
030443,2012,Chairman and Chief Executive Officer
140760,2012,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
010884,2013,"Chief Executive Officer, President and Director"
002812,2013,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
030443,2013,Chairman of the Board and Chief Executive Officer
140760,2013,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
010884,2014,"Chief Executive Officer, President and Director"
002812,2014,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
030443,2014,Chairman of the Board and Chief Executive Officer
140760,2014,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
030443,2015,Chairman and Chief Executive Officer
140760,2015,"Chairman, Chief Executive Officer and Chairman of Executive Committee"
002812,2016,Chairman and Chief Executive Officer
030443,2016,Chairman and Chief Executive Officer
140760,2016,"Executive Chairman, Chief Executive Officer and Chairman of Executive Committee"
;;;;
Agent1592
Pyrite | Level 9

Thanks Ksharp this works partially. However what if I am trying to find the exact matching word. For Example now if we have "vice chairman" and "chairman" both will be flagged as is_chair=1. Is there a way to be more precise?

ChrisNZ
Tourmaline | Level 20

Try

if prxmatch('/\S\b(chmn|chairman)\b/i',TITLEANN) then is_chair=1;

or

if prxmatch('/[",](chmn|chairman)\b/i',TITLEANN) then is_chair=1;

 

ChrisNZ
Tourmaline | Level 20

To get both flags, Maybe:

if      prxmatch('/\b(vice\W?ch(mn|airman)\b/i',TITLEANN) then is_vicechair=1;
else if prxmatch('/\b(chmn|chairman)\b/      i',TITLEANN) then is_chair    =1;
 
 
 
Ksharp
Super User

That is a big challenge . You need list all the words .

i.e.

if prxmatch('/\b(Vice Chairman|Vice-Chairman)\b/i') then is_chair=0 ;

 

ErikLund_Jensen
Rhodochrosite | Level 12

Hi @Agent1592 

 

In cases like this, I would use another approach, the use of a lookup list instead of using algoritms to identify CEO's and Chairmen. It is a little more complicated and maybe overkill in your case, but I normally work with production code, where data quality is a major concern. The reason is that

  1. even with 100 records as in your exampe, it is difficult to determine if the algoritm covers all occurences, and with thousends of records it becomes impossible.
  2. if input changes, and other title variants occurs, I would like some sort of notification so the code could be changed accordingly.

So here we go. Input is copied from your post and not repeated here.

 

First part is to create a distinct list of titles. A data step is used to split the title string in individual titles with one opuput record per title.:

 

 

data test2 (drop=i); set test;
	length w $250 title $100;
	w = tranwrd(upcase(tranwrd(titleann,' and ',',')),'&',',');;
	do i = 1 to count(w,',')+1;
		title = left(upcase(scan(w,i,',')));
		output;
	end;
run;

proc sql;
	create table titles as 
		select distinct title
		from test2;
quit;
data _null_; set titles;
	put '0 ' title;
run;

 

 

the result is a list written to the log, 5 out of 19 lines shown:

 

0 ACTING CEO
0 BENEFITS COMMITTEE
0 CEO
0 CHAIRMAN
0 CHAIRMAN OF EXECUTIVE COMMITTEE

 

Then I copy the list from the LOG window to data step code in the Program Editor and change code values from "0" to "1" (CEO) or "2" (Chairman) where appropriate. The result is this code, which should wtite to a permanent data set in production code.

 

 

data titlevalue;
	infile cards truncover;
	input value title $100.;
	cards;
0
0 ACTING CEO
0 BENEFITS COMMITTEE
1 CEO
2 CHAIRMAN
2 CHAIRMAN OF EXECUTIVE COMMITTEE
2 CHAIRMAN OF THE BOARD
1 CHIEF EXECUTIVE OFFICER
2 CHMN
2 CHMN.
0 CONSULTANT
0 DIRECTOR
2 EXECUTIVE CHAIRMAN
0 MEMBER OF CORPORATE COMPENSATION
0 MEMBER OF EXECUTIVE COMMITTEE
0 PRES.
0 PRESIDENT
0 V-P
0 VICE CHAIRMAN
;
run;

 

 

Now we are ready to use the list as a lookup table. The following code reads the original input table and is a veriant of the code used above to split the titles, but here it doesn't split observations, byt identifies CEO and chairmen AND reports titles not accounted for. Note that I added an extra line to the input before running to show the handling of unknown titles:

 

 


data ceochair (keep=id year titleann ceo_chair);
	length w $250 title $100 value 8;

	if _N_ = 1 then do;
		declare hash h(dataset:'work.titlevalue');
		h.defineKey('title');
		h.defineData('value');
		h.defineDone();
		call missing(value);
	end;

	set test;
	w = tranwrd(upcase(tranwrd(titleann,' and ',',')),'&',',');;
	do i = 1 to count(w,',')+1;
		title = left(upcase(scan(w,i,',')));
	    rc=h.find();
		if value = 2 then ceo = 1;
		if value = 1 then chair = 1;
		if value = . then put 'INFO: Unknown title: ' id= year=  title=;
	end;
	if ceo = 1 and chair = 1 then ceo_chair = 1;
run;
	

 

 

The log shows the result:

 

 

warn.gif

 

 

 

 

 

 

Part of the result data set - actually the coding might be inaccurate, because I made errors in the the lookup table, but the point is that it follows the rules set in the table, not in an error-prone algoritm: 

 

 

ceores.gif

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 727 views
  • 2 likes
  • 4 in conversation