A small sample of the data can be downloaded from my Google Drive. You can use the data to debug my code. This is how my data looks like. At each annual report date, there is a count of male directors (male_dirs) and a count of female directors (male_dirs). During a year, members may leave or join the board, denoted by male_dirs_ann and female_dirs_ann ("_ann" is for "announcement" because they are merged from the "announcement" table), where 1 indicates one joined, -1 one left. I wanted to update the numbers of male and female directors (male_dirs and female_dirs) when any left or joined the board, by generating two new variables, "total_male" and "total_female". On the annual report dates, I just copied over the male_dirs and female_dirs because there is no change on these date. For dates with changes, I took the number from the last date (male_dirs/female_dirs) and added/deducted the current change (male_dirs_ann/female_dirs_ann). Row 2: total_male = 7 + 1 = 8, which is correct. Row 6 is wrong: A male director left, and total_male should have been 6 (i.e., 7-1). Rows 9 and 17 are also wrong, among others. My code is as follows: data board_size_change;
set board_size_change;
by companyID date;
retain total_male total_female;
if first.companyID = 1 then do;
total_male = 0;
total_female = 0;
end;
*At annual report date;
if not missing(annualreportdate) then do;
total_male=male_dirs;
total_female=female_dirs;
output;
end;
*At announcement;
if missing(annualreportdate) then do;
total_male + male_dirs_ann;
total_female + female_dirs_ann;
output;
end;
run; What I missed or is there another way to approach this? Thanks a lot!
... View more