Hello all,

I'm using this macro syntax which works very well:

%macro QWR(var=,sheet=);

DATA MD_QWR_GWG;
    SET MD_pooledyrsF;
    KEEP md_compyr sex lnwage realwages real_wklyearn md_stdwght_Pool &var;
RUN;

PROC SORT DATA = MD_QWR_GWG;
    by sex md_compyr &var;
run;

PROC MEANS DATA = MD_QWR_GWG NOPRINT;
    VAR lnwage realwages real_wklyearn;
    BY sex md_compyr &var;
    WEIGHT md_stdwght_Pool;
    OUTPUT OUT = wages MEAN(lnwage realwages real_wklyearn) = lnwage hourly_earn weekly_earn;
RUN;

DATA men women;
    SET wages;
    IF sex = 1 THEN output men;
    ELSE IF sex = 2 then output women;
RUN;

DATA MEN2;
    SET men;
    lnwage_M = lnwage;
    hrly_M = hourly_earn;
    wkly_M = weekly_earn;
    Nobs_M = _FREQ_;
    KEEP md_compyr lnwage_M hrly_M wkly_M Nobs_M &var;
RUN;

DATA WOMEN2;
    SET women;
    lnwage_F = lnwage;
    hrly_W = hourly_earn;
    wkly_W = weekly_earn;
    Nobs_W = _FREQ_;
    KEEP md_compyr lnwage_F hrly_W wkly_W Nobs_W &var;
RUN;

DATA GWG_QWR_All;
    MERGE MEN2 WOMEN2;
    by md_compyr;
    LOG_GAP = lnwage_M - lnwage_F;
    GWR_hrly = hrly_W / hrly_M;   /* gender ratio */
    GWR_wkly = wkly_W / wkly_M;
    GWG_hrly = 1 - GWR_hrly;      /* gender gaps */
    GWG_wkly = 1 - GWR_wkly;
RUN;

PROC PRINT DATA = GWG_QWR_ALL;
    TITLE ' Gender wage ratio, by demographic and job related characteristics, 2007-8 and 2021-2022 - QWR_All';
RUN;

/* saves as an excel spreadsheet */
PROC EXPORT DATA= GWG_QWR_All
    OUTFILE="my_data\QWR\All.xlsx"
    DBMS=EXCELCS REPLACE;
    SHEET = "&sheet.";
RUN;

%mend QWR;

options symbolgen;
%QWR(var=AGEGROUP5yr, sheet=AGEGROUP5yr);
%QWR(var=md_educlev, sheet=md_educlev);
%QWR(var=dv_regionalt, sheet=dv_regionalt);
%QWR(var=dv_ROQUE, sheet=dv_ROQUE);
%QWR(var=dv_ontR, sheet=dv_ontR);

Then, I add another variable (Group4) to the BY statement.
/** By 4-group population **/
%macro QWR_gr(var=,sheet=);

DATA MD_QWR_GWG;
    SET MD_pooledyrsF;
    KEEP md_compyr sex GROUP4 lnwage realwages real_wklyearn md_stdwght_Pool &var;
RUN;

PROC SORT DATA = MD_QWR_GWG;
    by sex md_compyr GROUP4 &var;
run;

PROC MEANS DATA = MD_QWR_GWG NOPRINT;
    VAR lnwage realwages real_wklyearn;
    BY sex md_compyr GROUP4 &var;
    WEIGHT md_stdwght_Pool;
    OUTPUT OUT = wages MEAN(lnwage realwages real_wklyearn) = lnwage hourly_earn weekly_earn;
RUN;

DATA men women;
    SET wages;
    IF sex = 1 THEN output men;
    ELSE IF sex = 2 then output women;
RUN;

DATA MEN2;
    SET men;
    WHERE Group4=1;
    lnwage_M = lnwage;
    hrly_M = hourly_earn;
    wkly_M = weekly_earn;
    Nobs_M = _FREQ_;
    KEEP md_compyr GROUP4 lnwage_M hrly_M wkly_M Nobs_M &var;
RUN;

DATA WOMEN2;
    SET women;
    lnwage_F = lnwage;
    hrly_W = hourly_earn;
    wkly_W = weekly_earn;
    Nobs_W = _FREQ_;
    KEEP md_compyr GROUP4 lnwage_F hrly_W wkly_W Nobs_W &var;
RUN;

DATA GWG_QWR_group;
    MERGE MEN2 WOMEN2;
    by md_compyr;
    LOG_GAP = lnwage_M - lnwage_F;
    GWR_hrly = hrly_W / hrly_M;   /* gender ratio */
    GWR_wkly = wkly_W / wkly_M;
    GWG_hrly = 1 - GWR_hrly;      /* gender gaps */
    GWG_wkly = 1 - GWR_wkly;
RUN;

PROC PRINT DATA = GWG_QWR_group;
    TITLE ' Gender wage ratio, by demographic and job related characteristics, 2007-8 and 2021-2022 - QWR_group';
RUN;

/* saves as an excel spreadsheet */
PROC EXPORT DATA= GWG_QWR_group
    OUTFILE="my_data\QWR\group.xlsx"
    DBMS=EXCELCS REPLACE;
    SHEET = "&sheet.";
RUN;

%mend QWR_gr;

options symbolgen;
%QWR_gr(var=AGEGROUP5yr, sheet=AGEGROUP5yr);
%QWR_gr(var=md_educlev, sheet=md_educlev);
%QWR_gr(var=dv_regionalt, sheet=dv_regionalt);
%QWR_gr(var=dv_ROQUE, sheet=dv_ROQUE);
%QWR_gr(var=dv_ontR, sheet=dv_ontR);

Here, the data looks like this:

OBS  md_compyr  Group4  AGEGROUP5yr  lnwage_M  hrly_M  .............
  1  2007       1       1            4.91      28.22
  2  2007       1       2            2.17      17.11
  3  2007       1       3            3.45      29.89
  4  2007       1       4            3.55      28.63
  5  2007       1       5            3.12      28.03
  6  2007       1       6            2.98      20.13
  7  2007       1       7            3.55      28.63
  8  2007       2       1            3.55      28.63
  9  2007       2       2            3.55      28.63
 10  2007       2       3            3.55      28.63
 11  2007       2       4            3.55      28.63
 12  2007       2       5            3.55      28.63
 13  2007       2       6            3.55      28.63
 14  2007       2       7            3.55      28.63
 15  2008       1       1            3.91      30.22
 16  2008       1       2            3.50      30.43
 17  2008       1       3            3.19      29.80
 18  2008       1       4            2.89      27.63
 19  2008       1       5            3.22      28.13
 20  2008       1       6            2.88      20.10
 21  2008       1       7            3.25      28.13
 22  2008       2       1            3.25      28.13
 23  2008       2       2            3.25      28.13
 24  2008       2       3            3.25      28.13
 25  2008       2       4            3.25      28.13
 26  2008       2       5            3.25      28.13
 27  2008       2       6            3.25      28.13
 28  2008       2       7            3.25      28.13

Within each year, the age-group estimates repeat after the first block of 7 categories: every row for Group4 = 2 shows the same lnwage_M and hrly_M values. The same happens with the other variables. Could somebody tell me why? Thanks.

Note: It has nothing to do with the macro. I tested it.
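
One possible explanation, offered as a sketch rather than a confirmed diagnosis: in QWR_gr the final MERGE uses BY md_compyr only, while MEN2 is restricted to Group4=1 and WOMEN2 is not. Within one year, MEN2 then has fewer rows than WOMEN2. In a match-merge, SAS pairs rows by position inside each BY group, and once the shorter data set runs out, its last values are retained until the BY group changes. The toy data below (data set names, the agegrp variable, and all values are hypothetical) reproduces that pattern and compares it with a merge on the full key.

```sas
/* Hypothetical toy data: men has only Group4 = 1, women has Group4 = 1 and 2 */
data men_toy;
    input md_compyr Group4 agegrp hrly_M;
    datalines;
2007 1 1 28.22
2007 1 2 17.11
;
run;

data women_toy;
    input md_compyr Group4 agegrp hrly_W;
    datalines;
2007 1 1 25.00
2007 1 2 16.00
2007 2 1 24.00
2007 2 2 15.00
;
run;

/* Merge by year only: within 2007, men has 2 rows and women has 4.
   Rows are paired by position; after men runs out, its last
   hrly_M value is retained for the remaining women rows. */
data by_year_only;
    merge men_toy women_toy;
    by md_compyr;
run;

/* Merge by the full key: only rows with the same year, Group4 and
   agegrp are paired; women rows without a men match get a missing hrly_M. */
data by_full_key;
    merge men_toy women_toy;
    by md_compyr Group4 agegrp;
run;

proc print data=by_year_only; run;
proc print data=by_full_key; run;
```

If this is what is happening, the SAS log for the BY md_compyr merge should contain a note that the MERGE statement has more than one data set with repeats of BY values. Merging by md_compyr GROUP4 &var (with both inputs sorted that way), and applying the same Group4 restriction to both MEN2 and WOMEN2 or to neither, would be the corresponding change to try in the macro.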