Hello all, long time SAS user, first post! My apologies if this has been answered, however I have searched and searched and haven't been able to find the answer for this very specific problem. Note: I have found solutions for related issues, but a simple/elegant solution for this specific one has eluded me. I do have solutions (which I include in the code below), but I keep thinking there has to be some GCHART or AXIS option that can get me what I need without having to go way around as I do in the solutions I provide. Here is the issue: I have a numeric field that is continuous (or at least quasi-continuous; say integer with 200+ distinct values). I want to graph VBARs of this field, but want to bin the values into ranges, so I run a PROC FORMAT, then within the PROC GCHART I run a format statement to bin the values (I realize I haven't actually changed the values in the numeric field, only the way they are displayed). I want to show multiple graphs using a by variable. Not all by variable cuts have data in each bin, so there are missing midpoints on some of the graphs. I want to force the graphs to include all bins that are stipulated in the PROC FORMAT statement. All my solutions seem to me to be roundabout/workarounds. So to be clear, my question is: Are there options within GCHART/AXIS/etc... that I can use to force a formatted range midpoint onto the graph, given that the underlying field is continuous numeric, without having do do a lot of "buildup" workaround as I do in my solutions? Below I provide an example using SASHELP.CARS. Thanks! /* create format for Horsepower binning */
proc format;
value HPBins
low -< 100 = '<100'
100 -< 200 = '<200'
200 -< 300 = '<300'
300 -< 400 = '<400'
400 - high = '400+';
run;
/* The first graph includes all makes, and has data in all Horsepower bins. There is no
need to fix anything because all midpoints appear on the graph */
title1 "Model Count by Horsepower Bin";
title2 "Nothing Missing: all midpoints are on graph without requiring a fix";
title3 "All Makes";
proc gchart data=SASHELP.CARS;
vbar Horsepower / discrete type=freq outside=freq;
format Horsepower HPBins.;
run;quit;
title3 ;
/* The next graph, which is by make (which I restrict to just 'Acura' for simplicity)
only has data in the [200,300) Horsepower bin, so not all bins appear on the graph */
title1 "Model Count by Horsepower Bin";
title2 "No Fix: not all bins showing";
proc gchart data=SASHELP.CARS (where=(Make eq 'Acura'));
vbar Horsepower / discrete type=freq outside=freq;
format Horsepower HPBins.;
by Make;
run;quit;
/* Solution 1: create character field for bins, then force character midpoints
onto graph via fully enumerating the bins/ranges in an axis order=() statement */
data CARS1;
set SASHELP.CARS;
format HPBins $4.;
HPBins=put(Horsepower,HPBins.);
run;
axis1 order=('<100' '<200' '<300' '<400' '400+');
title1 "Model Count by Horsepower Bin";
title2 "Solution 1: convert bins to character and force character midpoints onto graph";
proc gchart data=CARS1 (where=(Make eq 'Acura'));
vbar HPBins / type=freq outside=freq maxis=axis1;
by Make;
run;quit;
/* Solution 2: adjust each Horsepower value so that each bin can only have one
possible value, then force those midpoints onto the graph via a midpoints= statement */
data CARS2;
set SASHELP.CARS;
HPRedux=min(floor(Horsepower/100)*100,400);
run;
title1 "Model Count by Horsepower Bin";
title2 "Solution 2: adjust Horsepower to a single numeric value per bin and force numeric midpoints onto graph";
proc gchart data=CARS2 (where=(Make eq 'Acura'));
vbar HPRedux / type=freq outside=freq midpoints=(0 to 400 by 100);
format HPRedux HPBins.;
by Make;
run;quit;
/* Solution 3: if using sumvar for weighting, can cycle through all possibilities
to add records with 0 weight. In this case I use MSRP as the weight variable. */
proc sort data=SASHELP.CARS (keep=Make) out=MAKE nodupkey;
by Make;
run;
data CARS3;
set SASHELP.CARS (in=a) MAKE (in=b);
if a=1 then output;
if b=1 then do;
do Horsepower = 0 to 400 by 100;
MSRP=0;
output;
end;
end;
run;
proc sort data=CARS3;
by make;
run;
proc format; *force vbars with zero weight to show blank label (just my personal preference);
value MSRPzero
0 = ' '
other = [dollar12.];
run;
title1 "SUM MSRP by Horsepower Bin";
title2 "Solution 3: add rows so each make/HPBin possibility is represented in the data";
proc gchart data=CARS3 (where=(Make eq 'Acura'));
vbar Horsepower / discrete sumvar=msrp outside=sum;
format Horsepower HPBins. MSRP MSRPzero.;
by Make;
run;quit;
... View more