BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
yo1
Obsidian | Level 7 yo1
Obsidian | Level 7

Friends,

 

Currently I am working with the Current Population Survey table’s replicate weights. These are used to come up with accurate variances and standard errors for each observation(row).     I am working with a data table that has groups of education level (PEEDUCA) and gender (PESEX).

 

I have summed using PROC MEANS to get the totals for all the 160 weights for each grouping by education and gender variables but now I need to aggregate all 160 weight variables together.  

 

I received the sample code to do this but it only shows how to get the total statistical measures of variance, standard errors, etc. for the whole table and I need to get these measured, mentioned previously, by the education and gender group.

 

I made several attempts to modify the data step DATA3 ‘ s array statement but I did not get any success.

 

Anyone help?

 

Here is the data file attached.

 

--------------------------------------------------------------------------

 

 

 

 

PROC IMPORT OUT= sas2 DATAFILE= "M:\Directory\Datafile.xlsx"

            DBMS=xlsx REPLACE;

          GETNAMES=YES;

RUN;

 

proc sort data= sas2;

by pesex peeduca; run;

 

 

/* issue is below  */

 

data data3;

set data2 end=eof;

 

if _n_=1 then sdiffsq = 0;

array repwts{161} est rw1-rw160;

by pesex peeduca;

 

do I = 2 to 161 ;

   sdiffsq = sdiffsq + (repwts{i} - repwts{1})***2;

end;

if eof then do;

var = (4/160) * sdiffsq;

   se = (var)**.5;

  cv = se/est;

 

  output;

end;

 

run;

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

Thanks for setting up a DATA step with sample data.  This program does what I think you want:

 

data sas;
input PESEX PEEDUCA est rw1 rw2 rw3 rw4 rw5;
datalines;
-1 -1 -49864 22874350.85 22659945.03 22903624.44 22767723.5 22892043.8
1 -1 9.34243E+11 7570002.72 7667491.2 7431655.81 7612309.3 7486121.77
1 31 12564146115 161933.92 167913.27 164331.15 170058.35 172511.73
1 32 29761571175 408365 419732.95 385364.74 414872.6 381754.14
1 33 62411106394 957588.79 969276.45 932760.04 976158.82 953027.19
1 34 1.04873E+11 1654197.28 1624400.43 1644643.08 1662470.12 1711297.02
1 35 1.23266E+11 1749037.05 1735108.1 1695043.87 1760309.9 1748697.13
1 36 1.44251E+11 2031777.33 2020457.88 2022012 2019879.09 2015074
1 37 1.50277E+11 2239705.77 2189660.6 2241284.35 2208840.2 2240355.83
1 38 55961473747 921086.34 892762.39 921477.7 897491.49 883803.37
1 39 1.02711E+12 20030575.08 19976286.72 19798599.3 20011243.15 20077031.04
1 40 6.00395E+11 11207288.37 11025635.7 11215725.14 11171560.08 11266788.19
1 41 1.24048E+11 2763522.16 2714987.14 2784518.41 2752007.09 2737672.08
1 42 1.16804E+11 2464708.19 2444951.98 2475915.5 2436125.98 2453272.57
1 43 5.68552E+11 11689392.43 11528258.03 11681518.29 11632424.83 11904497.23
1 44 1.96911E+11 4121142.13 4051989.25 4104225.14 4079413.35 4199083.46
1 45 61289762858 1351959.9 1303424.53 1350133.73 1397382.07 1343791.07
1 46 50422043835 1117781.09 1111152.04 1082617.42 1070579.19 1126276.84
2 -1 8.9259E+11 7179549 7329547.62 7224111.69 7118959.08 7165556.47
2 31 12404504260 171695.04 172210.01 170707.38 161384.89 176091.97
2 32 26762348751 425065.6 419739.62 390356.52 424182.92 403285.9
2 33 57288878784 955652.76 946880.29 926261.09 982999.19 945591.93
2 34 1.00864E+11 1511106.45 1510590.14 1473016.32 1496720.3 1519945.39
2 35 1.15301E+11 1701413.59 1729956.86 1677755.2 1671733.26 1760323.21
2 36 1.42934E+11 2067426.21 2068764.82 2093418.24 2084657.61 2110334.71
2 37 1.48372E+11 2332700.92 2311200.01 2387173.58 2330574.18 2323323.35
2 38 53328810768 897703.57 905293.56 930043.76 902796.97 894034.63
2 39 1.09284E+12 20979620.83 20842564.78 20969066.55 20909747.62 21065961.91
2 40 6.82496E+11 13383992.97 13288555.94 13292152.65 13405753.71 13614158.29
2 41 1.50574E+11 3204941.8 3174911.37 3230007.46 3208547.63 3276353.74
2 42 1.62257E+11 3521456.84 3508097.83 3486327.92 3496688.73 3474977.1
2 43 6.21475E+11 12485087.91 12321917.27 12384763.34 12447795.71 12692836.6
2 44 2.2571E+11 4612029.8 4498069.7 4560896.58 4576150.31 4643944.89
2 45 33655882083 737648.65 707857.14 745628.74 737061.2 745185.62
2 46 25628450446 557320.43 551752.51 560922.02 580039.81 595418.47
;
data want;
  set sas;
  var=var(of rw:);
  se=stderr(of rw:);
  cv=cv(of rw:);
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

6 REPLIES 6
VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

should this set data2 end=eof;

be this set sas2 end=eof;

in your datastep

 

yo1
Obsidian | Level 7 yo1
Obsidian | Level 7

Yes.  That is correct.  Sorry about that.  It should be sas2

PaigeMiller
Diamond | Level 26

Can you show us part of your data using these instructions: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

Some of us won't open .xlsx (or other Microsoft Office) files as they are a security risk.


Can you also show us the desired output for a small portion of the data? You talk about totals for all the 160 weights for each grouping by education and gender variables, but yet your code computes number across the columns in a single row, rather than down the columns.

--
Paige Miller
mkeintz
PROC Star

Your excel table has 35 rows, and 160 columns (rw1-rw160) and columns for PESEX, PEEDUCA, and EST.

 

You saw you want the "variances and standard errors for each observation".  If so, can't you just use the VAR, STDERR, and CV functions, as below? (untested by me since you haven't provided a DATA step with sample values),

 

data want;
  set have;
  var=var(of rw:);
  se=stderr(of rw:);
  cv=cv(of rw:);
run;

The "of rw:" says to use off the variables whose names starts with "rw".

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
yo1
Obsidian | Level 7 yo1
Obsidian | Level 7


Friends,

 

Sorry about the lack of data.  Here is a short sample of the main file.  It only uses 5 weight variables instead of the 160.  To fix the code in the DATA3 data step just change the value 161 to 6 .  and 160 to 5 for the var calculation.

 

 

 

data sas;

input PESEX PEEDUCA est rw1 rw2 rw3 rw4 rw5;

datalines;
-1 -1 -49864 22874350.85 22659945.03 22903624.44 22767723.5 22892043.8
1 -1 9.34243E+11 7570002.72 7667491.2 7431655.81 7612309.3 7486121.77
1 31 12564146115 161933.92 167913.27 164331.15 170058.35 172511.73
1 32 29761571175 408365 419732.95 385364.74 414872.6 381754.14
1 33 62411106394 957588.79 969276.45 932760.04 976158.82 953027.19
1 34 1.04873E+11 1654197.28 1624400.43 1644643.08 1662470.12 1711297.02
1 35 1.23266E+11 1749037.05 1735108.1 1695043.87 1760309.9 1748697.13
1 36 1.44251E+11 2031777.33 2020457.88 2022012 2019879.09 2015074
1 37 1.50277E+11 2239705.77 2189660.6 2241284.35 2208840.2 2240355.83
1 38 55961473747 921086.34 892762.39 921477.7 897491.49 883803.37
1 39 1.02711E+12 20030575.08 19976286.72 19798599.3 20011243.15 20077031.04
1 40 6.00395E+11 11207288.37 11025635.7 11215725.14 11171560.08 11266788.19
1 41 1.24048E+11 2763522.16 2714987.14 2784518.41 2752007.09 2737672.08
1 42 1.16804E+11 2464708.19 2444951.98 2475915.5 2436125.98 2453272.57
1 43 5.68552E+11 11689392.43 11528258.03 11681518.29 11632424.83 11904497.23
1 44 1.96911E+11 4121142.13 4051989.25 4104225.14 4079413.35 4199083.46
1 45 61289762858 1351959.9 1303424.53 1350133.73 1397382.07 1343791.07
1 46 50422043835 1117781.09 1111152.04 1082617.42 1070579.19 1126276.84
2 -1 8.9259E+11 7179549 7329547.62 7224111.69 7118959.08 7165556.47
2 31 12404504260 171695.04 172210.01 170707.38 161384.89 176091.97
2 32 26762348751 425065.6 419739.62 390356.52 424182.92 403285.9
2 33 57288878784 955652.76 946880.29 926261.09 982999.19 945591.93
2 34 1.00864E+11 1511106.45 1510590.14 1473016.32 1496720.3 1519945.39
2 35 1.15301E+11 1701413.59 1729956.86 1677755.2 1671733.26 1760323.21
2 36 1.42934E+11 2067426.21 2068764.82 2093418.24 2084657.61 2110334.71
2 37 1.48372E+11 2332700.92 2311200.01 2387173.58 2330574.18 2323323.35
2 38 53328810768 897703.57 905293.56 930043.76 902796.97 894034.63
2 39 1.09284E+12 20979620.83 20842564.78 20969066.55 20909747.62 21065961.91
2 40 6.82496E+11 13383992.97 13288555.94 13292152.65 13405753.71 13614158.29
2 41 1.50574E+11 3204941.8 3174911.37 3230007.46 3208547.63 3276353.74
2 42 1.62257E+11 3521456.84 3508097.83 3486327.92 3496688.73 3474977.1
2 43 6.21475E+11 12485087.91 12321917.27 12384763.34 12447795.71 12692836.6
2 44 2.2571E+11 4612029.8 4498069.7 4560896.58 4576150.31 4643944.89
2 45 33655882083 737648.65 707857.14 745628.74 737061.2 745185.62
2 46 25628450446 557320.43 551752.51 560922.02 580039.81 595418.47
;

 

mkeintz
PROC Star

Thanks for setting up a DATA step with sample data.  This program does what I think you want:

 

data sas;
input PESEX PEEDUCA est rw1 rw2 rw3 rw4 rw5;
datalines;
-1 -1 -49864 22874350.85 22659945.03 22903624.44 22767723.5 22892043.8
1 -1 9.34243E+11 7570002.72 7667491.2 7431655.81 7612309.3 7486121.77
1 31 12564146115 161933.92 167913.27 164331.15 170058.35 172511.73
1 32 29761571175 408365 419732.95 385364.74 414872.6 381754.14
1 33 62411106394 957588.79 969276.45 932760.04 976158.82 953027.19
1 34 1.04873E+11 1654197.28 1624400.43 1644643.08 1662470.12 1711297.02
1 35 1.23266E+11 1749037.05 1735108.1 1695043.87 1760309.9 1748697.13
1 36 1.44251E+11 2031777.33 2020457.88 2022012 2019879.09 2015074
1 37 1.50277E+11 2239705.77 2189660.6 2241284.35 2208840.2 2240355.83
1 38 55961473747 921086.34 892762.39 921477.7 897491.49 883803.37
1 39 1.02711E+12 20030575.08 19976286.72 19798599.3 20011243.15 20077031.04
1 40 6.00395E+11 11207288.37 11025635.7 11215725.14 11171560.08 11266788.19
1 41 1.24048E+11 2763522.16 2714987.14 2784518.41 2752007.09 2737672.08
1 42 1.16804E+11 2464708.19 2444951.98 2475915.5 2436125.98 2453272.57
1 43 5.68552E+11 11689392.43 11528258.03 11681518.29 11632424.83 11904497.23
1 44 1.96911E+11 4121142.13 4051989.25 4104225.14 4079413.35 4199083.46
1 45 61289762858 1351959.9 1303424.53 1350133.73 1397382.07 1343791.07
1 46 50422043835 1117781.09 1111152.04 1082617.42 1070579.19 1126276.84
2 -1 8.9259E+11 7179549 7329547.62 7224111.69 7118959.08 7165556.47
2 31 12404504260 171695.04 172210.01 170707.38 161384.89 176091.97
2 32 26762348751 425065.6 419739.62 390356.52 424182.92 403285.9
2 33 57288878784 955652.76 946880.29 926261.09 982999.19 945591.93
2 34 1.00864E+11 1511106.45 1510590.14 1473016.32 1496720.3 1519945.39
2 35 1.15301E+11 1701413.59 1729956.86 1677755.2 1671733.26 1760323.21
2 36 1.42934E+11 2067426.21 2068764.82 2093418.24 2084657.61 2110334.71
2 37 1.48372E+11 2332700.92 2311200.01 2387173.58 2330574.18 2323323.35
2 38 53328810768 897703.57 905293.56 930043.76 902796.97 894034.63
2 39 1.09284E+12 20979620.83 20842564.78 20969066.55 20909747.62 21065961.91
2 40 6.82496E+11 13383992.97 13288555.94 13292152.65 13405753.71 13614158.29
2 41 1.50574E+11 3204941.8 3174911.37 3230007.46 3208547.63 3276353.74
2 42 1.62257E+11 3521456.84 3508097.83 3486327.92 3496688.73 3474977.1
2 43 6.21475E+11 12485087.91 12321917.27 12384763.34 12447795.71 12692836.6
2 44 2.2571E+11 4612029.8 4498069.7 4560896.58 4576150.31 4643944.89
2 45 33655882083 737648.65 707857.14 745628.74 737061.2 745185.62
2 46 25628450446 557320.43 551752.51 560922.02 580039.81 595418.47
;
data want;
  set sas;
  var=var(of rw:);
  se=stderr(of rw:);
  cv=cv(of rw:);
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1102 views
  • 1 like
  • 4 in conversation