bmhelm
Fluorite | Level 6

Dear Community:

I am using SAS 9.4. I am working on a project where I estimate multiple pairwise tetrachoric/polychoric correlations between dichotomous variables. I then take those correlations and make a heatmap. However, the correlations are not very strong, and/or there are more strongly negative correlations than positive ones. What results is a heatmap whose color range scale is centered at -0.25 (i.e., the white or "neutral/null" color is centered at -0.25 instead of, ideally, at 0). This results in a heatmap that provides a "warmer" (pink) color indicating correlation=0. The result is a figure that is harder to interpret. 

Here is an example of what this is looking like (I changed my variables' names for this post):

bmhelm_0-1682598051690.png

 

You can see that the red/pink colors here are a little misleading, since they actually correspond to values close to 0 (no correlation).

So my GOAL would be: change the color scale so that it is symmetric over [-0.5,0.5], with 0 centered and corresponding to the null/neutral color. Another approach could be to keep the asymmetric color range shown above ([-1.0,0.5]) but change the color values so that white indicates 0 correlation.

 

I think I would need to use or alter some kind of RANGEATTR option, but I have little experience with this, and I was not able to find a solution in the syntax/code examples I read.

 

Here is my full code:

 


*NOTE (!!!): for Sex=Female, the color scale is centered on -0.25, so I will want to figure out
how to recenter this;

*SEX=Female;
Title 'Correlation Matrix';
Title2 'Using PROC FREQ/plcorr Method';
Title3 'Sex = Female';
PROC FREQ Data=work.inpt_data;
Where Sex=0;
Tables (var1 var2 var3 var4 var5 var6 var7 var8)*
(A B C D E F G H I J K L M N O P Q R S T U V W X Y Z AA) / plcorr;
Ods output measures=mycorr_sexF_1 (where=(statistic="Tetrachoric Correlation"
or statistic="Polychoric Correlation")
keep = statistic table value ASE);
RUN;
*NOTE: correlation stats could not be calculated for some of the alphabet variables;

*The ods output statement makes an output dataset called "mycorr_" that has the
statistics from the Measures output and I specify that I only want the values for
either the polychoric or tetrachoric correlation and only want to keep certain
variables (i.e., statistic, table, and value). But there is an additional 2 steps needed
to make this into a matrix (See next)*;
Title 'View the Tetrachoric Correlation Output';
PROC PRINT Data=work.mycorr_sexF_1;
RUN;


DATA mycorr_sexF_2; *new dataset name working from the mycorr data made in the previous
output statement*;
Set mycorr_sexF_1;
Group = floor((_n_ - 1)/8); *the divisor has to be 8, since I have 8 numbered variables (var1-var8) overall
that will serve to define what the groups are in this specific output*;
x = scan(table, 2, " *");
y = scan(table, 3, " *");
Keep group value table x y;
RUN;
*want to see how the new dataset looks*;
Title '';
PROC PRINT Data=work.mycorr_sexF_2;
RUN;
*nice, this makes a correlation matrix, but it is not in an easily-readable
order, so I can transform the data using
PROC TRANSPOSE... See next*;
Title '';
PROC TRANSPOSE Data=work.mycorr_sexF_2 out=mymatrix_sexF_2(drop = _name_ group) ;
Id x;
By group;
Var value;
RUN;
*this creates an output dataset that is formatted to look like a matrix,
and this is called "mymatrix"... will see what it looks like*;
Title '';
PROC PRINT Data=work.mymatrix_sexF_2;
RUN;

*this is fine, though I would have to color the cells myself in an output file. Likely better to try to get the heatmap to work;


*PROC SGPLOT has the heatmap or heatmapparm statement to attempt a heatmap: will try this next;
*See: https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/grstatproc/p0y94pnvxskvkmn19u7sx2j043lq.ht...;

Ods graphics on;
Ods html sge=on;
Ods trace on;
Title '4/27/23: HeatMap: ';
Title2 'Sex = Female';
PROC SGPLOT Data=work.mycorr_sexF_2;
Heatmapparm x=x y=y colorresponse=value / outline nomissingcolor; *the colorresponse= option tells SAS that the third variable is the numeric value (correlation stat);
Text x=x y=y text=value; *(***this adds the correlation values to the cells, but it looks messy without rescaling it in ods html file);
RUN;
Ods graphics off;
Ods trace off;

 

What I would like to see happen is something like the following, which was generated from a different project with slightly different range of values:

Sorry this looks really messy because I was altering it so much -- but you get the idea: the neutral (no correlation) is centered on 0, and this makes this figure much easier to interpret compared to the one above:

bmhelm_1-1682598917649.png

 

Thank you for any and all help!

 

 

1 ACCEPTED SOLUTION

bmhelm
Fluorite | Level 6

Thanks ballardw,

That is very helpful. I was able to get onto the right track I think, though I need to alter the min max values to get a better-looking figure (in terms of the color transitions). My SGPLOT code differed slightly, mainly in terms of where the colorresponse= and rattrid= options are located.

 

*Create custom RATTRMAP;

data myrattrmap;
retain id "myID";
length min $ 6 max $ 6 color altcolor colormodel1 colormodel2 colormodel3 $ 15;
input min $ max $ color $ altcolor $ colormodel1 $ colormodel2 $ colormodel3 $ ;
datalines;
-0.99 -0.29 . . blue lightblue verylightblue
-0.289 0.02 white white . . .
0.02 0.50 . . verylightred lightred red
;
run;

 


PROC SGPLOT Data=work.mycorr_sexF_2 rattrmap=myrattrmap;
Heatmapparm x=x y=y colorresponse=value / rattrid=myID outline nomissingcolor; *the colorresponse= option tells SAS that the third variable is the numeric value (correlation stat);
Text x=x y=y text=value; 
RUN;

 

Doing this, I was able to generate the following:

bmhelm_0-1682706508089.png

It does not look great. I think I can change the min and max values and also determine if I need more colormodel statements? I will read up on how to define these colormodels at the following: https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/grstatproc/n1fxl54aizx1tjn1nfder5bsqqco.htm & https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/grstatproc/p0edl20cvxxmm9n1i9ht3n21eict.htm
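
One way to adjust the min and max values so that white lands exactly on 0 is to use two adjoining ramps that meet at 0. The following is only a sketch under assumptions: the data set names work.corr_demo and work.rattr_sym, the ID symID, and the six sample values are hypothetical stand-ins for the real correlation data, and the [-0.5, 0.5] bounds come from the goal stated in the question.

```sas
/* Hypothetical sample data standing in for work.mycorr_sexF_2 */
data work.corr_demo;
  length x y $ 8;
  input x $ y $ value;
  datalines;
var1 A -0.45
var1 B -0.10
var1 C  0.02
var2 A  0.00
var2 B  0.30
var2 C -0.25
;
run;

/* Two adjoining color ramps that both end in white at 0:       */
/* blue -> white over [-0.5, 0] and white -> red over [0, 0.5]  */
data work.rattr_sym;
  retain id "symID";
  length min max $ 5 colormodel1 colormodel2 $ 15;
  input min $ max $ colormodel1 $ colormodel2 $;
  datalines;
-0.5 0   blue  white
0    0.5 white red
;
run;

proc sgplot data=work.corr_demo rattrmap=work.rattr_sym;
  heatmapparm x=x y=y colorresponse=value / rattrid=symID outline;
  text x=x y=y text=value;
run;
```

Values outside [-0.5, 0.5] are not covered by either range in this sketch, so the bounds would need to be widened (for example to -1 and 1) if the real correlations go beyond them. The range attribute map documentation linked above may also describe keyword values for MIN and MAX (such as _NEGMAXABS_ and _MAXABS_) that make the range symmetric automatically; that is worth verifying there before relying on it.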

 


2 REPLIES
ballardw
Super User

Here is an example of a RATTRMAP similar to what I think you are asking for. Replace the MIN and MAX values in myrattrmap with the value ranges you want. The 13-14 row below is a "middle" range with a fixed color; the other two rows are color ramps. Depending on other options, the range map may not extend to the border of your graphic. This example uses the SASHELP.CLASS data set, which you should have.

 

data myrattrmap;
retain id "myID";
length min $ 5 max $ 5 color  altcolor colormodel1 colormodel2  colormodel3 $ 15;
input min $ max $ color $ altcolor $ colormodel1 $ colormodel2 $ colormodel3 $  ;
datalines;
11  13    .       .     blue    lightblue verylightblue
13  14   white   white  .       .         .
14  16 .       .        verylightred   lightred      red
;
run;


proc sgplot data=sashelp.class rattrmap=myrattrmap;
  heatmap x=height y=weight / 
    colorresponse=age rattrid=myID;
run;

Suggestion for future similar questions: Provide the PLOT data in the form of data step code. No reason to clutter your question with lots of transformations unless asked. You can provide a reduced data set with fewer categories/observations just enough to demonstrate the result with your code. OR use one of the SAS supplied datasets in the SASHELP library.
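
As a rough illustration of that suggestion, a reduced plot data set can be posted as a short DATA step. This is a hypothetical sketch: the data set name work.plotdata and the four values are invented for illustration and are not the poster's actual correlations; the x and y values only mimic the renamed variables in the question.

```sas
/* Hypothetical reduced plot data; values are invented for illustration */
data work.plotdata;
  length x y $ 8;
  input x $ y $ value;
  datalines;
var1 A -0.90
var1 B -0.25
var2 A  0.00
var2 B  0.40
;
run;

/* Same plot statements as in the question, run on the reduced data */
proc sgplot data=work.plotdata;
  heatmapparm x=x y=y colorresponse=value / outline;
  text x=x y=y text=value;
run;
```

With values running from -0.90 to 0.40, the default color ramp should place its middle color near the midpoint of that range rather than at 0, which is the behavior described in the question.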

