I am attempting to write a program using PROC MEANS that outputs the median, 25th percentile, 75th percentile, and sum for different combinations of values for 6 variables: cat, ser, fun, caf, lud, wer.
In order to do so I've written the following program, which seems to work fairly well:
PROC MEANS NOPRINT data=work.DATA_collapsed;
CLASS cat ser fun caf lud wer;
VAR SUMPay;
BY SEX;
OUTPUT OUT=work.PricesForCombos median=med p25=Q1 p75=Q3 sum=SUMPay;
RUN;
PROC PRINT DATA=work.PricesForCombos;
RUN;
Unfortunately, my output shows combinations in which some of my variables have missing values (see output at bottom). However, none of my variables actually have missing values (the only values are 0 and 1). I confirmed this with the following PROC FREQ:
PROC FREQ DATA=work.DATA_collapsed;
TABLES cat ser fun caf lud wer /missprint;
RUN;
Is there any simple explanation for this? What am I missing?
Output (sorry that the header isn't aligned but hopefully you get the idea)
Obs SEX cat ser fun caf lud wer _TYPE_ _FREQ_ med Q1 Q3 SUMPay
875 2 1 1 0 . 1 . 58 1258 697.25 463.500 956.79 982322.25
876 2 1 1 1 . 0 . 58 454 712.33 453.600 983.00 359492.67
877 2 1 1 1 . 1 . 58 2000 844.70 600.525 1113.07 1850175.81
878 2 1 0 0 . 0 0 59 570 251.17 131.720 355.94 154409.88
879 2 1 0 0 . 0 1 59 11 162.65 116.800 360.00 3923.75
880 2 1 0 0 . 1 0 59 157 395.12 217.000 587.20 74664.93
881 2 1 0 0 . 1 1 59 9 439.00 209.820 903.50 6667.98
882 2 1 0 1 . 0 0 59 91 460.87 302.780 617.39 47802.26
883 2 1 0 1 . 0 1 59 2 217.87 216.240 219.50 435.74
884 2 1 0 1 . 1 0 59 154 495.40 254.000 630.34 81060.69
885 2 1 0 1 . 1 1 59 6 606.79 508.040 751.00 4950.81
886 2 1 1 0 . 0 0 59 353 500.90 311.430 742.65 202470.20
887 2 1 1 0 . 0 1 59 33 800.00 427.000 1041.40 25969.93
888 2 1 1 0 . 1 0 59 1110 677.76 456.250 917.99 837920.36
889 2 1 1 0 . 1 1 59 148 872.72 529.555 1276.86 144401.89
890 2 1 1 1 . 0 0 59 444 714.59 463.880 985.80 352264.78
891 2 1 1 1 . 0 1 59 10 559.26 277.280 917.96 7227.90
892 2 1 1 1 . 1 0 59 1937 843.15 600.230 1101.56 1760746.16
893 2 1 1 1 . 1 1 59 63 900.79 602.120 1533.20 89429.65
894 2 1 0 0 0 . . 60 677 257.00 133.590 376.53 195024.39
895 2 1 0 0 1 . . 60 70 457.98 282.500 725.23 44642.15
896 2 1 0 1 0 . . 60 175 460.00 267.070 609.88 86035.06
897 2 1 0 1 1 . . 60 78 523.75 325.730 682.50 48214.44
Accepted Solutions
The _TYPE_ variable tells you which of the CLASS variables have been used. For example 58 is '111010'B so the first, second, third and fifth class variables were used to form the combinations. The fourth and sixth class variables were ignored, hence the missing values for CAF and WER.
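To make the bit pattern visible, one option is to print _TYPE_ with the BINARY6. format, so each digit lines up with one CLASS variable in the order cat, ser, fun, caf, lud, wer. This is only a sketch against the data sets named in the question; the output data set work.PricesDecoded and the variable TypeBits are hypothetical names introduced for illustration.

```sas
/* Sketch: show _TYPE_ as a 6-digit bit pattern, one digit per CLASS      */
/* variable in the order cat ser fun caf lud wer (1 = used, 0 = ignored). */
/* PricesDecoded and TypeBits are hypothetical names.                     */
DATA work.PricesDecoded;
  SET work.PricesForCombos;
  LENGTH TypeBits $6;
  TypeBits = PUT(_TYPE_, BINARY6.);
RUN;

PROC PRINT DATA=work.PricesDecoded;
  VAR SEX cat ser fun caf lud wer _TYPE_ TypeBits _FREQ_;
RUN;
```

With this, a row with _TYPE_ = 58 should show TypeBits = 111010, and the columns under the 0 digits (caf and wer) are the ones displayed as ".".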
If you do NOT want it to generate the summaries for all combinations of the class variables then add the NWAY option to the PROC MEANS statement.
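As a sketch, here is the program from the question with NWAY added and nothing else changed. It assumes, as the original program does, that work.DATA_collapsed is already sorted by SEX so that the BY statement works.

```sas
/* Sketch: NWAY keeps only the rows where all six CLASS variables are used. */
/* With six class variables those rows have _TYPE_ = 63 ('111111'B).        */
PROC MEANS NOPRINT NWAY DATA=work.DATA_collapsed;
  CLASS cat ser fun caf lud wer;
  VAR SUMPay;
  BY SEX;
  OUTPUT OUT=work.PricesForCombos median=med p25=Q1 p75=Q3 sum=SUMPay;
RUN;

PROC PRINT DATA=work.PricesForCombos;
RUN;
```

If the full output data set already exists, filtering it afterwards should give the same rows, for example by adding WHERE _TYPE_ = 63; to the PROC PRINT step.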
I see. That makes sense. I really appreciate your help.
Just to confirm, when I'm seeing "." in my output, it's not actually referring to those values being missing, but rather that that variable is not a component of the displayed combination. Correct?
In your case, yes, since you did not include the MISSING option on the PROC MEANS statement. Without that option any observation that had a missing value for any of the class variables would have been excluded from the processing.
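For comparison, a sketch of the same program with the MISSING option added. The output data set name work.PricesWithMissing is a hypothetical name, and the data are again assumed to be sorted by SEX.

```sas
/* Sketch: MISSING treats a missing CLASS value as a valid level instead of */
/* dropping the observation. PricesWithMissing is a hypothetical name.      */
PROC MEANS NOPRINT MISSING DATA=work.DATA_collapsed;
  CLASS cat ser fun caf lud wer;
  VAR SUMPay;
  BY SEX;
  OUTPUT OUT=work.PricesWithMissing median=med p25=Q1 p75=Q3 sum=SUMPay;
RUN;
```

With MISSING in effect, a "." in a class column can mean either a real missing value or a variable that was left out of the combination, so _TYPE_ is what tells the two apart: if the bit for that variable is 1, the "." is a genuine missing level.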