SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

creating dummy variable conditional on group characteristics

Accepted Solution Solved
Reply
Super Contributor
Posts: 413
Accepted Solution

creating dummy variable conditional on group characteristics

hi,

suppose that for each company-date combination I have a dummy variable called Duality and a description of a director:

company

date

duality

description

A

31/12/2010

1

chairman ahah

A

31/12/2009

1

chariman 57hf

A

31/12/2009

1

ceo kgk9g9g

B

20/10/2014

0

ceo gk9

C

13/2/2002

0

chairman chairmans

I would like to create a new dummy variable ceo_dummy such that:

whenever the word 'ceo' appears in the description ceo_dummy=1, BUT, if for any company-date duality=1 and there is no description entry with the word 'ceo', then whenever the word 'chairman' appears in the description ceo_dummy=1.

so the result shuold be the following:

company

date

duality

description

ceo_dummy

A

31/12/2010

1

chairman ahah

1

A

31/12/2009

1

chariman 57hf

0

A

31/12/2009

1

ceo kgk9g9g

1

B

20/10/2014

0

ceo gk9

1

C

13/2/2002

0

chairman chairmans

0

like this every company-date combination has only one ceo_dummy=1

thank you!


Accepted Solutions
Solution
‎06-10-2015 10:12 AM
Super User
Posts: 9,662

Re: creating dummy variable conditional on group characteristics

Code: Program

data company;
   input company:$1.  date:ddmmyy.  duality  description &$16.;
   format date ddmmyy10.;
   cards;
A 31/12/2010 1 chairman ahah 
A 31/12/2009 1 chairman 57hf 
A 31/12/2009 1 ceo kgk9g9g 
B 20/10/2014 0 /ceo/ gk9 
C 13/2/2002 0 chairman chairmans
;;;;
   run;
proc sql;
create table temp as
select *,(prxmatch('/\bceo\b/i',description ) ne 0) as ceo,
   (prxmatch('/\bchairman \b/i',description ) ne 0) as chairman,
   (sum(calculated ceo) ne 0) as hasCEO
  from company
   group by company,date;
create table want as
select company,date,duality,description,
   case when ceo=1 then 1
   when (duality=1 and hasCEO=0 and chairman=1 ) then 1 else 0 end as dummy_ceo
  from temp;
quit;

View solution in original post


All Replies
Valued Guide
Posts: 858

Re: creating dummy variable conditional on group characteristics

will it always be leading the description and lower case?  Here's a solution for what you have:

data have;

infile cards dsd;

informat date ddmmyy10.;

format date mmddyy10.;

length description $25.;

input company $ date duality description $;

cards;

A,31/12/2010,1,chairman ahah

A,31/12/2009,1,chariman 57hf

A,31/12/2009,1,ceo kgk9g9g

A,31/12/2009,1,kgk9g9g

;

run;

data want;

set have;

if (duality = 1 and upcase(scan(description,1,' ')) = 'CEO') or

   (duality = 1 and upcase(scan(description,1,' ')) = 'CHAIRMAN') then ceo_dummy = 1;

run;

Super Contributor
Posts: 413

Re: creating dummy variable conditional on group characteristics

hi Mark,

you code works fine with your data.

But in my case I might have the words ceo be in places like President/CEO ----> since there is no space the code doesn't make ceo_dummy=1.

Is there a way to correct for this?

Thank you!

Valued Guide
Posts: 858

Re: creating dummy variable conditional on group characteristics

Index() will give you the location of where the string you are searching for starts so it will find ceo anywhere but it is case sensitive.  This will work for what you have described but again, if there are many other combination of what you are looking for this will probably not be the best solution.

data have;

infile cards dsd;

informat date ddmmyy10.;

format date mmddyy10.;

length description $25.;

input company $ date duality description $;

cards;

A,31/12/2010,1,chairman ahah

A,31/12/2009,1,chariman 57hf

A,31/12/2009,1,ceo kgk9g9g

A,31/12/2009,1,President/CEO kgk9g9g

A,31/12/2009,1,Pres/CEO/dir kgk9g9g

A,31/12/2009,0,Pres/CEO/dir kgk9g9g

;

run;

data want;

set have;

if ((duality = 1) and ((index(description,'ceo') > 0) or (index(description,'CEO') > 0))) or

   ((duality = 1) and (upcase(scan(description,1,' '))) = 'CHAIRMAN') then ceo_dummy = 1;

run;

Respected Advisor
Posts: 3,777

Re: creating dummy variable conditional on group characteristics

This looks about right I think.

data company;
   input company:$1.  date:ddmmyy.  duality  description &$16.;
  
format date ddmmyy10.;
  
cards;
A  31/12/2010  1  chairman ahah 
A  31/12/2009  1  chariman 57hf 
A  31/12/2009  1  ceo kgk9g9g 
B  20/10/2014  0  /ceo/ gk9 
C  13/2/2002   0  chairman chairmans
;;;;
   run;
proc print;
  
run;
data new;
   hasceo=0;
  
do _n_ = 1 by 1 until(last.date);
      set company;
      by company date notsorted;
     
if duality and findw(description,'ceo',,'SPRIT') then hasceo=1;
     
end;
  
do _n_ = 1 to _n_;
      set company;
      select(duality);
         when(1) do;
           
if hasceo then ceo = findw(description,'ceo',,'SPRIT');
            else           ceo = findw(description,'chairman',,'SPRIT');
            end;
        
when(0) do;
            ceo = findw(description,
'ceo',,'SPRIT');
            end;
        
end;
      ceo = not not ceo;
     
output;
     
call missing(of ceo);
      end;
  
run;
proc print;
  
run;
   
6-9-2015 1-30-22 PM.png   
Super Contributor
Posts: 413

Re: creating dummy variable conditional on group characteristics

hi data_null,

the code works perfectly on your data but in my case (I forgot to mention and sorry for that) the word CEO can be in a string such as Pres/CEO/dir

and since there is no blank space the code doesn't give such an observation a ceo=1

Thnka you!

Respected Advisor
Posts: 3,777

Re: creating dummy variable conditional on group characteristics

Details. 

There are FINDW function options for that.

Solution
‎06-10-2015 10:12 AM
Super User
Posts: 9,662

Re: creating dummy variable conditional on group characteristics

Code: Program

data company;
   input company:$1.  date:ddmmyy.  duality  description &$16.;
   format date ddmmyy10.;
   cards;
A 31/12/2010 1 chairman ahah 
A 31/12/2009 1 chairman 57hf 
A 31/12/2009 1 ceo kgk9g9g 
B 20/10/2014 0 /ceo/ gk9 
C 13/2/2002 0 chairman chairmans
;;;;
   run;
proc sql;
create table temp as
select *,(prxmatch('/\bceo\b/i',description ) ne 0) as ceo,
   (prxmatch('/\bchairman \b/i',description ) ne 0) as chairman,
   (sum(calculated ceo) ne 0) as hasCEO
  from company
   group by company,date;
create table want as
select company,date,duality,description,
   case when ceo=1 then 1
   when (duality=1 and hasCEO=0 and chairman=1 ) then 1 else 0 end as dummy_ceo
  from temp;
quit;
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 561 views
  • 6 likes
  • 4 in conversation