Hello everyone.
my question today is How can I calculate the number of women from dummy variable?
I will explain my situation
I have a variable its name gender ( which means 1 female 0 male) this variable contain all firms and many years , as well as there are some firms have more than one woman director on the board that appear as repetition observations .
so i want to calculate the number of women in each firm by each year and the results comes out as new variable.
please help me i need a code for doing so because it take too much time
thanks in advance
Please show us a representative portion of your data, following these instructions:
Hi @PaigeMiller i have run the code and I got this message, please check it out
%let dataSetName = gender;
2 *number of observations you want to keep;
3 %let obsKeep = 30;
4
5 %let source_path =
5 ! https://gist.githubusercontent.com/statgeek/bcc55940dd825a13b9c8ca40a904cba9/raw/865d2cf18f515
5 ! 0b8e887218dde0fc3951d0ff15b/data2datastep.sas;
6
7 filename reprex url "&source_path";
8 %include reprex;
169 filename reprex;
NOTE: Fileref REPREX has been deassigned.
170
171 option linesize=max;
172 %data2datastep(dsn=&dataSetName, obs=&obsKeep);
NOTE: No rows were selected.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.17 seconds
cpu time 0.03 seconds
DATA2DATASTEP DSN GENDER
DATA2DATASTEP FILE create_WORK_GENDER_data.sas
DATA2DATASTEP FMT YES
DATA2DATASTEP FMTLIST
DATA2DATASTEP INPUTLIST code:32. year:32. gender:32. Age:32.
DATA2DATASTEP LBL YES
DATA2DATASTEP LBLLIST code="code" year="year" gender="gender" Age="Age"
DATA2DATASTEP LIB WORK
DATA2DATASTEP LS 256
DATA2DATASTEP MEMLABEL
DATA2DATASTEP MSGTYPE NOTE
DATA2DATASTEP OBS 30
DATA2DATASTEP OUTLIB WORK
DATA2DATASTEP SQLEXITCODE 0
DATA2DATASTEP SQLOBS 0
DATA2DATASTEP SQLOOPS 17
DATA2DATASTEP SQLRC 0
DATA2DATASTEP SQLXOBS 0
DATA2DATASTEP VARLIST code year gender Age
data WORK.GENDER;
infile datalines dsd truncover;
input code:32. year:32. gender:32. Age:32.;
label code="code" year="year" gender="gender" Age="Age";
datalines;
1 2011 0 56
1 2011 0 53
1 2011 0 49
1 2011 1 52
1 2011 0 64
1 2011 0 62
1 2011 0 62
1 2011 0 50
1 2011 0 47
1 2011 0 58
1 2011 0 47
1 2011 0 60
1 2011 0 49
1 2011 0 57
1 2011 1 49
1 2011 0 62
1 2011 1 41
1 2011 0 63
1 2011 0 55
1 2011 1 55
1 2011 0 40
1 2011 0 37
1 2011 1 55
1 2011 0 36
1 2011 0 43
1 2011 0 54
1 2011 0 46
1 2011 0 45
1 2010 0 56
1 2010 1 48
Here, i extract 100 observations to be more clear.
I want to count the number of female directors in each firm and in each year to be like ( e.g., if the firm 1 has 3 female directors in 2011, in the raw data (that i have it ) appears as 3 observations in same fiscal year, but i need it as one observation in each year with number of female directors 3). i hope you are gonna help me out
data WORK.GENDER;
infile datalines dsd truncover;
input code:32. year:32. gender:32. Age:32.;
label code="code" year="year" gender="gender" Age="Age";
datalines;
1 2011 0 56
1 2011 0 53
1 2011 0 49
1 2011 1 52
1 2011 0 64
1 2011 0 62
1 2011 0 62
1 2011 0 50
1 2011 0 47
1 2011 0 58
1 2011 0 47
1 2011 0 60
1 2011 0 49
1 2011 0 57
1 2011 1 49
1 2011 0 62
1 2011 1 41
1 2011 0 63
1 2011 0 55
1 2011 1 55
1 2011 0 40
1 2011 0 37
1 2011 1 55
1 2011 0 36
1 2011 0 43
1 2011 0 54
1 2011 0 46
1 2011 0 45
1 2010 0 56
1 2010 1 48
1 2010 0 61
1 2010 1 40
1 2010 0 35
1 2010 0 42
1 2010 0 53
1 2010 0 44
1 2010 0 62
1 2010 0 54
1 2010 1 54
1 2010 0 39
1 2010 0 36
1 2010 1 54
1 2010 0 55
1 2010 0 52
1 2010 1 51
1 2010 0 63
1 2010 0 61
1 2010 0 61
1 2010 0 49
1 2010 0 46
1 2010 0 48
1 2010 0 53
1 2010 0 48
1 2009 0 64
1 2009 0 56
1 2009 0 56
1 2009 0 61
1 2009 0 49
1 2009 0 61
1 2009 0 66
1 2009 0 54
1 2009 0 45
1 2009 0 46
1 2009 1 47
1 2009 0 54
1 2009 0 46
1 2009 0 43
1 2009 0 67
1 2009 0 39
1 2009 0 49
1 2009 0 47
1 2009 0 54
1 2009 0 39
1 2009 1 57
1 2009 0 52
1 2009 0 54
1 2009 0 51
1 2008 0 66
1 2008 0 38
1 2008 0 48
1 2008 0 46
1 2008 0 53
1 2008 0 51
1 2008 0 38
1 2008 1 56
1 2008 0 63
1 2008 0 55
1 2008 0 50
1 2008 0 60
1 2008 0 48
1 2008 0 60
1 2008 0 65
1 2008 0 53
1 2008 0 44
1 2008 0 45
1 2008 1 46
1 2008 0 53
1 2008 0 45
1 2008 0 42
1 2008 0 53
;;;;
Consider this for count and proportion of females for each age.
data indicator;
set sashelp.class;
f = sex eq 'F';
run;
proc summary data=indicator nway missing;
class age;
var f;
output out=females(drop=_:) sum=count mean=prop n=n;
run;
proc print;
run;
I will try use it for calculate the number of female in my data
i will let you know if it does not work
thanks
@Radwan wrote:
It is not what i want
because this code count the number of female director in total. I need to count the number of female directors in each firm not in whole sample
My example was posted before you posted data and is an example of a technique. If you haven't use PROC SUMMARY before I suggest you consult the documentation.
I think it works
i did little changes then i got reasonable result
please have a look on the code
proc summary data=gender nway missing;
class gender;
var gender;
by code year;
output out=femaledirectors(drop=_:) sum=count mean=prop n=n;
run;
@Radwan wrote:
I think it works
i did little changes then i got reasonable result
please have a look on the code
proc summary data=gender nway missing; class gender; var gender; by code year; output out=femaledirectors(drop=_:) sum=count mean=prop n=n; run;
Or place all of the grouping variables on Class statement. No sorting would be needed.
proc summary data=gender nway missing; class gender code year; var gender; output out=femaledirectors(drop=_:) sum=count mean=prop n=n; run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.