BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
subpar_actuary
Fluorite | Level 6

Hello all, long time SAS user, first post!

 

My apologies if this has been answered, however I have searched and searched and haven't been able to find the answer for this very specific problem. Note: I have found solutions for related issues, but a simple/elegant solution for this specific one has eluded me. I do have solutions (which I include in the code below), but I keep thinking there has to be some GCHART or AXIS option that can get me what I need without having to go way around as I do in the solutions I provide. 

 

Here is the issue:

I have a numeric field that is continuous (or at least quasi-continuous; say integer with 200+ distinct values). I want to graph VBARs of this field, but want to bin the values into ranges, so I run a PROC FORMAT, then within the PROC GCHART I run a format statement to bin the values (I realize I haven't actually changed the values in the numeric field, only the way they are displayed). I want to show multiple graphs using a by variable. Not all by variable cuts have data in each bin, so there are missing midpoints on some of the graphs. I want to force the graphs to include all bins that are stipulated in the PROC FORMAT statement. All my solutions seem to me to be roundabout/workarounds.

 

So to be clear, my question is: Are there options within GCHART/AXIS/etc... that I can use to force a formatted range midpoint onto the graph, given that the underlying field is continuous numeric, without having do do a lot of "buildup" workaround as I do in my solutions?

 

Below I provide an example using SASHELP.CARS.

 

Thanks!

 

 

/* create format for Horsepower binning */
proc format;
	value HPBins
		low -< 100  = '<100'
		100 -< 200  = '<200'
		200 -< 300  = '<300'
		300 -< 400  = '<400'
		400 -  high = '400+';
run;

/* The first graph includes all makes, and has data in all Horsepower bins. There is no
	need to fix anything because all midpoints appear on the graph */
title1 "Model Count by Horsepower Bin";
title2 "Nothing Missing: all midpoints are on graph without requiring a fix";
title3 "All Makes";
proc gchart data=SASHELP.CARS;
	vbar Horsepower / discrete type=freq outside=freq;
	format Horsepower HPBins.;
	run;quit;
title3 ;

/*	The next graph, which is by make (which I restrict to just 'Acura' for simplicity)
	only has data in the [200,300) Horsepower bin, so not all bins appear on the graph */
title1 "Model Count by Horsepower Bin";
title2 "No Fix: not all bins showing";
proc gchart data=SASHELP.CARS (where=(Make eq 'Acura'));
	vbar Horsepower / discrete type=freq outside=freq;
	format Horsepower HPBins.;
	by Make;
	run;quit;

/* Solution 1: create character field for bins, then force character midpoints
	onto graph via fully enumerating the bins/ranges in an axis order=() statement */
data CARS1;
	set SASHELP.CARS;
	format HPBins $4.;
	HPBins=put(Horsepower,HPBins.);
	run;

axis1 order=('<100' '<200' '<300' '<400' '400+');

title1 "Model Count by Horsepower Bin";
title2 "Solution 1: convert bins to character and force character midpoints onto graph";
proc gchart data=CARS1 (where=(Make eq 'Acura'));
	vbar HPBins / type=freq outside=freq maxis=axis1;
	by Make;
	run;quit;

/* Solution 2: adjust each Horsepower value so that each bin can only have one
	possible value, then force those midpoints onto the graph via a midpoints= statement */
data CARS2;
	set SASHELP.CARS;
	HPRedux=min(floor(Horsepower/100)*100,400);
	run;

title1 "Model Count by Horsepower Bin";
title2 "Solution 2: adjust Horsepower to a single numeric value per bin and force numeric midpoints onto graph";
proc gchart data=CARS2 (where=(Make eq 'Acura'));
	vbar HPRedux / type=freq outside=freq midpoints=(0 to 400 by 100);
	format HPRedux HPBins.;
	by Make;
	run;quit;

/* Solution 3: if using sumvar for weighting, can cycle through all possibilities
	to add records with 0 weight. In this case I use MSRP as the weight variable. */
proc sort data=SASHELP.CARS (keep=Make) out=MAKE nodupkey;
	by Make;
	run;

data CARS3;
	set SASHELP.CARS (in=a) MAKE (in=b);
	if a=1 then output;
	if b=1 then do;
		do Horsepower = 0 to 400 by 100;
			MSRP=0;
			output;
		end;
	end;
	run;

proc sort data=CARS3;
	by make;
	run;

proc format; *force vbars with zero weight to show blank label (just my personal preference);
	value MSRPzero
		0	   = ' ' 
		other = [dollar12.];
run;

title1 "SUM MSRP by Horsepower Bin";
title2 "Solution 3: add rows so each make/HPBin possibility is represented in the data";
proc gchart data=CARS3 (where=(Make eq 'Acura'));
	vbar Horsepower / discrete sumvar=msrp outside=sum;
	format Horsepower HPBins. MSRP MSRPzero.;
	by Make;
	run;quit;

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
subpar_actuary
Fluorite | Level 6

Hello Reeza. Thank you for your reply.

 

I like the idea of switching over to SGPLOT if that is better due to more options.

 

I noticed your SGPLOT code makes use of the "Solution 1" I provided (first creating a field with character bins). I played around with SGPLOT and came up with the following code that worked, which produces the desired graph directly from SGHELP.CARS without doing any other work (other than creating the HPBins format):

 

 

proc format;
	value HPBins
		low -< 100  = '<100'
		100 -< 200  = '<200'
		200 -< 300  = '<300'
		300 -< 400  = '<400'
		400 -  high = '400+';
run;

proc sgplot data=SASHELP.CARS (where=(Make='Acura'));
format Horsepower HPBins.;
vbar Horsepower / stat=freq ;
xaxis values = ('<100' '<200' '<300' '<400' '400+');
run;quit;

 

 

I think the crux of the issue is that with a numeric field that is binned via formatting (thus retaining numeric quality), GCHART only allows numeric values in the axis statement, while SGPLOT allows character values which represent those format strings (in this case in called within the xaxis statement).

View solution in original post

2 REPLIES 2
Reeza
Super User

SGPLOT doesn't display the same behaviour. SG procedures are newer but more developed and offer more options. I don't believe there's significant development work happening within SAS/GRAPH anymore. 

proc sgplot data=cars1;
by make;
vbar hpbins / stat=freq ;
xaxis values = ('<100' '<200' '<300' '<400' '400+');
run;quit;
subpar_actuary
Fluorite | Level 6

Hello Reeza. Thank you for your reply.

 

I like the idea of switching over to SGPLOT if that is better due to more options.

 

I noticed your SGPLOT code makes use of the "Solution 1" I provided (first creating a field with character bins). I played around with SGPLOT and came up with the following code that worked, which produces the desired graph directly from SGHELP.CARS without doing any other work (other than creating the HPBins format):

 

 

proc format;
	value HPBins
		low -< 100  = '<100'
		100 -< 200  = '<200'
		200 -< 300  = '<300'
		300 -< 400  = '<400'
		400 -  high = '400+';
run;

proc sgplot data=SASHELP.CARS (where=(Make='Acura'));
format Horsepower HPBins.;
vbar Horsepower / stat=freq ;
xaxis values = ('<100' '<200' '<300' '<400' '400+');
run;quit;

 

 

I think the crux of the issue is that with a numeric field that is binned via formatting (thus retaining numeric quality), GCHART only allows numeric values in the axis statement, while SGPLOT allows character values which represent those format strings (in this case in called within the xaxis statement).

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 434 views
  • 0 likes
  • 2 in conversation