Data visualization with SAS programming

Producing one-to-one line in statistical graphics

Occasional Contributor
Posts: 16

Hi:

I need to display the one-to-one line in output from the statistical graphics procs SGSCATTER (matrix), SGPANEL (reg), and SGPLOT (scatter). Is there any way to do this easily?

Thanks.

Bill.
Occasional Contributor
Posts: 16

Re: Producing one-to-one line in statistical graphics

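Just to clarify, by a one-to-one line in a scatter plot I mean the straight line y = x, drawn from the smallest value found in either plotted variable to the largest value found in either plotted variable. For example,

overall minimum of the two variables = 2,
overall maximum of the two variables = 10,
one-to-one line = (2,2) to (10,10).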

Ideally, the horizontal and vertical axes will have identical ranges and the plot area will be square, resulting in a line at 45 degrees. I guess I could determine the min and max in a PROC MEANS and merge these values with the data set, but since the axis range is determined by the SG procedure, a line drawn by this method is not guaranteed to span the entire graph. I’m wondering if this can be accomplished using the Graph Template Language.
SAS Employee
Posts: 967

Re: Producing one-to-one line in statistical graphics

Since there were no replies for how to do it with the new sg procs, here's how to do it using an annotated line in gplot. (Note that this particular code assumes the X and Y axes have exactly the same range - if they do not, then you could use xsys and ysys='2', and then hard-code some data points)...

/* some fake/random data to plot */
data foo;
  do x=0 to 1 by .01;
    y=ranuni(123);
    output;
  end;
run;

/* annotate line from bottom/left to top/right */
data anno_diagonal;
  xsys='1'; ysys='1';
  function='move'; x=0; y=0; output;
  function='draw'; x=100; y=100; color='red'; output;
run;

axis1 order=(0 to 1 by .2) minor=none offset=(0,0);

symbol1 value=dot interpol=none;

proc gplot data=foo anno=anno_diagonal;
  plot y*x=1 / vaxis=axis1 haxis=axis1;
run;
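
The variation mentioned above for axes with different ranges (xsys and ysys='2' with hard-coded data points) might look roughly like the sketch below. The data set foo2, the annotate data set anno_one_to_one, and the endpoints (0,0) and (1,1) are made up for illustration. The endpoints have to lie inside both axis ranges, and because the axes are scaled differently the line will no longer sit at 45 degrees on the page.

```sas
/* Hypothetical data: x spans 0-2 while y spans 0-1, so the axes differ */
data foo2;
  do x=0 to 2 by .02;
    y=ranuni(123);
    output;
  end;
run;

/* xsys/ysys='2' means the annotate coordinates are data values, */
/* so the endpoints are hard-coded in the units of each axis     */
data anno_one_to_one;
  length function color $8;
  xsys='2'; ysys='2';
  function='move'; x=0; y=0; output;
  function='draw'; x=1; y=1; color='red'; output;
run;

axis1 order=(0 to 2 by .5) minor=none offset=(0,0);  /* horizontal axis */
axis2 order=(0 to 1 by .2) minor=none offset=(0,0);  /* vertical axis   */

symbol1 value=dot interpol=none;

proc gplot data=foo2 anno=anno_one_to_one;
  plot y*x=1 / haxis=axis1 vaxis=axis2;
run;
quit;
```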
SAS Super FREQ
Posts: 1,081

Re: Producing one-to-one line in statistical graphics

You want the line to span the data and preferably keep the 45 degree slope. Since you are open to using GTL, that is the best way for an SGPLOT (single-cell) type graph.

Run the SGPLOT procedure and extract the GTL code using the option TMPLOUT='filename'. This will give you the necessary GTL code to create the same graph. Then, add a LINEPARM within the LAYOUT OVERLAY. See the Reference Doc for the LINEPARM statement. You need to provide a point and slope. These values can come from columns in your data set. The line created will span the full data.

Also, replace the LAYOUT OVERLAY with a LAYOUT OVERLAYEQUATED. This will allow you to have equated axes, and retain your 45 degree angle for same data on X & Y. Please also see the Reference Doc for the syntax.
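
For a single-cell graph, the combination of LAYOUT OVERLAYEQUATED and LINEPARM described above could be sketched as follows. This is written by hand rather than extracted with TMPLOUT, and the data set demo, the variables x1 and x2, and the template name one_to_one are all hypothetical. The point (0,0) and slope 1 are hard-coded here, although they could also come from columns in the data.

```sas
/* Hypothetical data: two roughly comparable measurements */
data demo;
  do i=1 to 50;
    x1=10*ranuni(123);
    x2=x1+rannor(123);
    output;
  end;
run;

proc template;
  define statgraph one_to_one;
    begingraph;
      /* equated axes keep the slope-1 line at 45 degrees */
      layout overlayequated / equatetype=square;
        scatterplot x=x1 y=x2;
        /* line through (0,0) with slope 1, drawn across the plot area */
        lineparm x=0 y=0 slope=1 / lineattrs=(pattern=dot color=gray);
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=demo template=one_to_one;
run;
```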

You can add the LINEPARM in the same way to get the SGPANEL (Classification Panel) type graph. Add it in the LAYOUT PROTOTYPE statement. However, you cannot create a DATALATTICE of equated plots. And, you can only use this with the basic plot statements like Scatter, Series, Bar, etc.
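
A minimal sketch of the panel version, assuming a LAYOUT DATAPANEL with the LINEPARM placed inside LAYOUT PROTOTYPE as described above. The data set demo_panel, the class variable grp, and the template name one_to_one_panel are invented for the example, and the axes here are not equated.

```sas
/* Hypothetical data: two groups, two roughly comparable measurements */
data demo_panel;
  do grp='A','B';
    do i=1 to 30;
      x1=10*ranuni(123);
      x2=x1+rannor(123);
      output;
    end;
  end;
run;

proc template;
  define statgraph one_to_one_panel;
    begingraph;
      layout datapanel classvars=(grp) / columns=2;
        /* the prototype is repeated once per value of grp */
        layout prototype;
          scatterplot x=x1 y=x2;
          lineparm x=0 y=0 slope=1 / lineattrs=(pattern=dot color=gray);
        endlayout;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=demo_panel template=one_to_one_panel;
run;
```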

To do the same for ScatterPlotMatrix, you will need to unroll it yourself and use a LAYOUT LATTICE, and populate each cell. Here you can add a LINEPARM and use a LAYOUT OVERLAYEQUATED.
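
Unrolling a matrix by hand might look something like the sketch below, which fills only two cells to keep it short. The data set demo_matrix, the variables a, b, and c, and the template name one_to_one_lattice are assumptions made for the example. A full matrix would need one LAYOUT OVERLAYEQUATED block per variable pair.

```sas
/* Hypothetical data: three roughly comparable measurements */
data demo_matrix;
  do i=1 to 50;
    a=10*ranuni(123);
    b=a+rannor(123);
    c=a+2*rannor(123);
    output;
  end;
run;

proc template;
  define statgraph one_to_one_lattice;
    begingraph;
      layout lattice / columns=2 rows=1;
        /* cell 1: b versus a, with its own one-to-one line */
        layout overlayequated / equatetype=square;
          scatterplot x=a y=b;
          lineparm x=0 y=0 slope=1 / lineattrs=(pattern=dot color=gray);
        endlayout;
        /* cell 2: c versus a, with its own one-to-one line */
        layout overlayequated / equatetype=square;
          scatterplot x=a y=c;
          lineparm x=0 y=0 slope=1 / lineattrs=(pattern=dot color=gray);
        endlayout;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=demo_matrix template=one_to_one_lattice;
run;
```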
Occasional Contributor
Posts: 16

Re: Producing one-to-one line in statistical graphics

Hi:
I had problems getting the Graph Template Language to work properly, so I created new variables for the one-to-one line coordinates and plotted them by adding a "series" statement to the SGPLOT and SGPANEL procs. It works perfectly. The program is shown below. For now I've given up trying to get one-to-one lines on the PROC SGSCATTER / Matrix output.
------------------------------
ODS LISTING CLOSE;
GOPTIONS RESET=ALL ftext='Helvetica' ;
OPTIONS orientation=landscape papersize=letter nonumber nodate;
ODS graphics on/reset
border = off
height = 7.5in
width = 9in
;
ODS PDF file='c:\temp\pm_intercomparison_hourly_data.pdf' DPI=300 style=analysis;
TITLE1 'CAPMoN PM Intercomparison Study--Hourly' j=r h=.8 "&SYSDATE9";
TITLE2 ;
* Calculate the min/max of all observations (ignoring the BY variable) for each of the two plotting variables;
PROC MEANS DATA=alldata NOPRINT;
  VAR egbbam1 egbbam2;
  OUTPUT OUT=min_max(KEEP=min_var1 min_var2 max_var1 max_var2) min=min_var1 min_var2 max=max_var1 max_var2;
RUN;
* Create new variables to store the MIN and MAX values of the two variables.
Insert the MIN pairing on the first observation of the BY group, and the MAX pairing in the last observation of the BY group.;
DATA alldata_with_1_to_1;
  SET alldata;
  BY yr_month;
  RETAIN one 1 _found 0;
  IF FIRST.yr_month THEN DO;
    SET min_max POINT=one;
    one_to_one_x = MIN(min_var1,min_var2);
    one_to_one_y = one_to_one_x;
  END;
  IF LAST.yr_month THEN DO;
    SET min_max POINT=one;
    one_to_one_x = MAX(max_var1,max_var2);
    one_to_one_y = one_to_one_x;
  END;
  * Store the value of the first non-missing BY group variable in the macro variable first_by_variable_value;
  IF yr_month NE '' AND NOT _found THEN DO;
    CALL SYMPUTX('first_by_variable_value',yr_month);
    _found = 1;
  END;
  LABEL one_to_one_y = '1-1 Line';
  DROP min_var1 min_var2 max_var1 max_var2 _found;
RUN;
* For SGPLOT output where all BY groups are on one plot, include the one-to-one pairings only for the first BY group.;
DATA alldata_for_sgplot;
  SET alldata_with_1_to_1;
  IF yr_month NE "&first_by_variable_value" THEN DO;
    one_to_one_x = .;
    one_to_one_y = .;
  END;
RUN;
PROC SGPLOT DATA=alldata_for_sgplot;
  series x=one_to_one_x y=one_to_one_y / lineattrs=(pattern=dot thickness=2px color=gray) ;
  REG x=egbbam1 Y=egbbam2 / CLM GROUP=yr_month;
RUN;
QUIT;
PROC SGPANEL DATA=alldata_with_1_to_1;
  PANELBY yr_month / columns=4 rows=4 uniscale=all ;
  reg x=egbbam1 Y=egbbam2 ;
  series x=one_to_one_x y=one_to_one_y / lineattrs=(pattern=dot thickness=2px color=gray) ;
RUN;
QUIT;
ODS PDF CLOSE;
ODS GRAPHICS OFF;