Help using Base SAS procedures

Need help to calculate a variable

Reply
Regular Contributor
Posts: 162

Need help to calculate a variable

Good days to all,

I attached a dataset with company identification code (ticker), directors identification code (iddir), data year (year)  and directors' gender (dirgender).

I have to calculate a variable - the fraction of male directors on the board who sit on other boards on which there are female directors for a given year.

For example, let say company X have 14 directors in total in year 2010, 10 of them are males and 4 of them are females. 3 out of 10 male directors sit on the boards in other companies. 2 out of 3 male directors who sit on other boards with female directors (ie. one of 3 males directors who sit on other boards in other company without the presence of female directors).

Based on the case above:

The fraction of male directors on the board who sit on other boards on which there are female directors for Year 2010 = 2/10 or 0.2.

Hope to get any answer which relevant for the calculation.

Thank you.

mspak

Attachment
Frequent Contributor
Posts: 110

Need help to calculate a variable

what about the missing values for iddir?

Respected Advisor
Posts: 3,124

Need help to calculate a variable

Hi,

Your task can't be finished in one step (not by me at least). Following code has not been intensively tested:

/*purge records with missing data*/

data have;

set h.connected_male;

if cmiss(ticker,dirgender)=0;

if nmiss(iddir,year)=0;

run;

proc sql;

/*Mark the ticker-year group with female on board*/

create table want1 as

select distinct *, SUM(upcase(dirgender)='FEMALE')>0 AS _FEMALE FROM have

GROUP BY TICKER,YEAR

;

/*mark the iddir who holds multiple positions in different companies at the same period*/

create table want2 as

select distinct *, count(distinct ticker)>1 as _mult from want1

group by iddir, year;

/*count the positions per iddir-year */

create table want3 as

select distinct a.*,b._mult from want1 a, want2 b where

a.ticker=b.ticker

and a.year=b.year

and a.iddir=b.iddir

;

create table want4 as

select distinct *,sum(_female=1 and _mult=1) as _ct from want3

group by iddir, year;

quit;

/*Mark the iddir of interest*/

data want5;

set want4;

if ((_mult=0 and _ct>0) or (_mult=1 and _ct>1)) and upcase(dirgender)='MALE' then flag=1;

else flag=0;

run;

/*Calculated the percentage and sorted by descending percentage*/

proc sql;

create table want6 as

select ticker,year, sum(flag)/sum(upcase(dirgender)='MALE') as percent format=percent7.2 from want5

group by ticker, year

order by  percent desc, ticker,year;

quit;

Regards,

Haikuo

Frequent Contributor
Posts: 110

Need help to calculate a variable

@Haikuo

Did you consider about the duplicates?

I mean

gender     company    id           year

male       ENR          27719     2009

male       ENR         27719     2009

Respected Advisor
Posts: 3,124

Need help to calculate a variable

In my code, duplicates are removed by using 'distinct' option in SQL.

Regular Contributor
Posts: 162

Need help to calculate a variable

The duplicates and missing values are not be included.

Respected Advisor
Posts: 4,641

Need help to calculate a variable

mspak, assuming you can fix the missing iddir problem and that the abrupt jump in the number of companies in 2009 doesn't pose a problem, the following queries should proove useful:

proc sql;

/* Companies with females on the board */
create table femComp as
select distinct year, ticker from Connected_male where char(dirGender,1)="F";

/* Connecting directors : male directors sitting on the board of more than one
company with females on the board */
create table connDir as
select distinct C.year, C.iddir
from Connected_male as C inner join femComp as F
on C.year=F.year and C.ticker=F.ticker
where C.iddir is not missing and char(C.dirGender,1)="M"
group by C.year, C.iddir
having count(C.ticker) > 1;

/* Total count of connecting male directors per year */
create table connDirCount as
select year, count(iddir) as nConn from connDir group by year;

/* Total count of male directors per year*/
create table dirCount as
select year, count(distinct iddir) as nDir from Connected_male
where char(dirGender,1)="M" group by year;

/* Proportion of connecting directors to total male director count, every year */
select T.year, C.nConn label="Connecting male directors", T.nDir label="Total male directors",
coalesce(C.nConn, 0)/T.nDir as connProp format=percent10.1 label="Percent connecting"
from dirCount as T left join connDirCount as C on T.year=C.year;

PG

PG
Regular Contributor
Posts: 162

Need help to calculate a variable

Thanks PG,

I expect an output that percent of male directors which also sit in other boards with female directors for each firm-year.

I expect the following (for example):

Ticker   YEAR   PERCENT_CONNECTED MALE:

AAA      2003        CERTAIN PERCENTAGE

AAA      2004         CERTAIN PERCENTAGE

AAA      2005         CERTAIN PERCENTAGE

AAA      2006         CERTAIN PERCENTAGE

AAA      2007         CERTAIN PERCENTAGE

AAA      2008         CERTAIN PERCENTAGE

AAA      2009         CERTAIN PERCENTAGE

AAA      2010         CERTAIN PERCENTAGE

BBB      2003

.......and so on.

Thank you Smiley Happy

Respected Advisor
Posts: 4,641

Need help to calculate a variable

Then, it is even simpler:

proc sql;

/* Companies with females on the board */
create table femComp as
select distinct year, ticker from test.Connected_male where char(dirGender,1)="F";


/* Connecting directors : male directors sitting on the board of more than one
company with females on the board */
create table connDir as
select distinct C.year, C.iddir
from test.Connected_male as C inner join femComp as F
on C.year=F.year and C.ticker=F.ticker
where C.iddir is not missing and char(C.dirGender,1)="M"
group by C.year, C.iddir
having count(C.ticker) > 1;

/* Proportion of connecting directors in each board, every year */
Create table propConn as
select C.ticker, C.year,
count(D.iddir) / sum(char(C.dirGender,1)="M") as pctConn format=percent10.1 label="Percent connecting"
from test.Connected_male as C left join connDir as D on C.year=D.year and C.iddir=D.iddir
where ticker is not missing
group by C.ticker, C.year;

/* print the first 20 results */
proc print data=propConn(obs=20) label; run;

-------------------------------------------------------------------

                                                                    Percent

                   Obs    Ticker                 YEAR    connecting

                     1    A                      2001         0.0%
                     2    A                      2002          .
                     3    A                      2003         0.0%
                     4    A                      2004         0.0%
                     5    A                      2005         0.0%
                     6    A                      2006         0.0%
                     7    A                      2007         0.0%
                     8    A                      2008         0.0%
                     9    A                      2009        86.7%
                    10    A                      2010        88.9%
                    11    AA                     2001         0.0%
                    12    AA                     2002         0.0%
                    13    AA                     2003         0.0%
                    14    AA                     2004         0.0%
                    15    AA                     2005         0.0%
                    16    AA                     2006         0.0%
                    17    AA                     2007         0.0%
                    18    AA                     2008         0.0%
                    19    AA                     2009        81.0%
                    20    AA                     2010        82.6%

----------------------------------------------------------

PG

PG
Super User
Posts: 9,671

Need help to calculate a variable

How about:

libname x v9 'c:\';
%let company=A;
%let year=2009;
proc sort data=x.Connected_male(where=(ticker="&company" and year=&year and dirgender='Male')) out=have(keep=iddir) nodupkey;
  by iddir ;
run;
proc sort data=x.Connected_male out=dir(keep=iddir ticker rename=(iddir=_iddir) ) nodupkey;
  by iddir ticker ;
run;
proc sort data=x.Connected_male out=gender(keep=dirgender ticker rename=(ticker=_ticker)) nodupkey;
  by ticker dirgender ;
run;
data _null_;
 if _n_ eq 1 then do;
  if 0 then set dir ;
  declare hash dir(hashexp:10, dataset:'dir',multidata:'Y');
  declare hiter _dir('dir');
   dir.definekey('_iddir');
   dir.definedata('_iddir','ticker');
   dir.definedone();
  if 0 then set gender;
  declare hash gender(hashexp:10,dataset:'gender',multidata:'Y') ;
  declare hiter _gender('gender');
   gender.definekey('_ticker');
   gender.definedata('_ticker','dirgender');
   gender.definedone();
 end;
set have end=last;
 male+1; flag=0;
 do while(_dir.next()=0);
  if iddir eq _iddir and ticker ne "&company" then do;
     do while(_gender.next()=0);
      if ticker eq _ticker and dirgender eq: 'F' then flag=1;
     end;
                                            end;
 end;
 if flag=1 then yes+1;
if last then do;
               percent= yes/male;
               putlog NOTE: yes= male= percent= ;
             end;
run;


Ksharp

Regular Contributor
Posts: 162

Need help to calculate a variable

Thanks Ksharp,

If not mistaken, your program code is to generate a report, instead of a new data variable. I would like to claculate the percent of male connected (which defined as per the question) in each ticker (ie firm) for every year.

mspak

Super User
Posts: 9,671

Re: Need help to calculate a variable

OK. Easy.

libname x v9 'c:\';



%macro percent(company= ,year= );
proc sort data=x.Connected_male(where=(ticker="&company" and year=&year and dirgender='Male')) out=have(keep=iddir) nodupkey;
  by iddir ;
run;
proc sort data=x.Connected_male out=dir(keep=iddir ticker rename=(iddir=_iddir) ) nodupkey;
  by iddir ticker ;
run;
proc sort data=x.Connected_male out=gender(keep=dirgender ticker rename=(ticker=_ticker)) nodupkey;
  by ticker dirgender ;
run;
data temp(keep=year company percent);
length company $ 100;
 if _n_ eq 1 then do;
  if 0 then set dir ;
  declare hash dir(hashexp:10, dataset:'dir',multidata:'Y');
  declare hiter _dir('dir');
   dir.definekey('_iddir');
   dir.definedata('_iddir','ticker');
   dir.definedone();
  if 0 then set gender;
  declare hash gender(hashexp:10,dataset:'gender',multidata:'Y') ;
  declare hiter _gender('gender');
   gender.definekey('_ticker');
   gender.definedata('_ticker','dirgender');
   gender.definedone();
 end;
set have end=last;
 male+1; flag=0;
 do while(_dir.next()=0);
  if iddir eq _iddir and ticker ne "&company" then do;
     do while(_gender.next()=0);
      if ticker eq _ticker and dirgender eq: 'F' then flag=1;
     end;
                                            end;
 end;
 if flag=1 then yes+1;
if last then do;
               company="&company";
               year=&year;
               percent= yes/male;
               output ;
             end;
run;
proc append base=want data=temp ;run;
%mend percent;

proc sort data=x.Connected_male(keep=ticker year) out=index nodupkey;
  by ticker year ;
run;
data _null_;
 set index;
 call execute('%percent(company='||ticker||',year='||year||')');
run;


Ksharp

Regular Contributor
Posts: 162

Need help to calculate a variable

Thanks Ksharp,

I need time to understand the program. I know that you used the SAS Hash object function, but it is new to me. Will post an update.

Regards,

mspak

Ask a Question
Discussion stats
  • 12 replies
  • 188 views
  • 3 likes
  • 5 in conversation