☑ This topic is solved.
palolix
Lapis Lazuli | Level 10

Dear SAS Community,

 

I am looking for a way to visualize the association between different categorical variables (nominal and ordinal). Since a heatmap will probably not work for categorical variables, I think a correspondence analysis would be a good option. Is it OK to include nominal variables (with more than two levels) and ordinal variables together in a correspondence analysis?

 

I would greatly appreciate your feedback!


11 REPLIES
Ksharp
Super User

Yes, you could, since PROC CORRESP just decomposes the chi-square statistic of the contingency table, regardless of whether the categorical variables are nominal or ordinal.

 

proc corresp data=sashelp.heart all chi2p;
tables bp_status,weight_status;
run;

[Image: Ksharp_0-1768029689455.png]
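
If more than two categorical variables are of interest, one possible extension is multiple correspondence analysis through the MCA option of PROC CORRESP. The sketch below is not from the replies in this thread; it reuses sashelp.heart and adds Smoking_Status as a third variable purely for illustration. With MCA, the TABLES statement lists the variables without a comma.

```sas
/* Sketch: multiple correspondence analysis of three categorical variables. */
/* The choice of Smoking_Status as the third variable is an assumption.     */
proc corresp data=sashelp.heart mca observed;
   tables bp_status weight_status smoking_status;
run;
```

As in the two-way case, the category levels are treated as unordered labels, so an ordinal variable can be listed alongside nominal ones.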

 

 

And I also think you could use heatmap to visualize it. Here is an example:

https://blogs.sas.com/content/graphicallyspeaking/2017/06/26/advanced-ods-graphics-range-attribute-m...

palolix
Lapis Lazuli | Level 10

That's great, thank you so much Ksharp!!

 

That heatmap looks great, and I really want to try it with my data, but I am having problems reading in my data in the first DATA step. One of my variables is 'Ethnicity' (7 levels) and the other is 'overall1', a score that goes from 1 to 9 (1 2 3 4 5 6 7 8 9). I put the scores into Col as shown below, but it didn't work. I highlighted the "do" and "Count" statements because I'm not sure what to put in them. Count should hold the numbers that follow each ethnicity level, which are the frequencies of each score for that level of ethnicity. I hope that makes sense. I would greatly appreciate your help.

 

data one(drop=f: i);
input Row $ 1-7 f1-f9;
array f[9];
do i = 0 to 5;
Col = 1 2 3 4 5 6 7 8 9;
Count = i;
output;
end;
datalines;
Asian 0 0 0 1 4 3 6 4 0
African American 0 1 7 16 8 37 57 44 8
Caucasian 0 1 4 13 6 24 51 38 9
Hispanic 0 0 0 0 0 0 0 0 0
Native American 5 1 12 24 24 82 108 90 12
Middle Eastern 0 1 2 10 15 24 31 27 6
Pacific Islander 0 0 1 0 0 3 0 0 0
;

sbxkoenk
SAS Super FREQ

No time at this moment to work it out for you ... but maybe the author of that blog can help you out quicker.

@WarrenKuhfeld  , can you?

Advanced ODS Graphics: Range Attribute Maps

sbxkoenk
SAS Super FREQ

See also ...

Create mosaic plots in SAS by using PROC FREQ

palolix
Lapis Lazuli | Level 10

Thank you very much for the link to that article Koen!

Tom
Super User

Not sure what that data step is trying to do. But those datalines (with an extra space inserted after the Ethnicity value so the lines can be parsed properly) are trivial to read in.

data have;
  input ethnicity &:$16. @;
  do overall=1 to 9;
    input count @;
    output;
  end;
datalines;
Asian  0 0 0 1 4 3 6 4 0
African American  0 1 7 16 8 37 57 44 8
Caucasian  0 1 4 13 6 24 51 38 9
Hispanic  0 0 0 0 0 0 0 0 0
Native American  5 1 12 24 24 82 108 90 12
Middle Eastern  0 1 2 10 15 24 31 27 6
Pacific Islander  0 0 1 0 0 3 0 0 0
;

proc corresp data=have all chi2p;
 tables ethnicity,overall;
 weight count;
run;

[Image: Screenshot 2026-01-12 at 8.54.11 PM.png]
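
As a quick check that the counts were read as intended, the same data can be cross-tabulated with PROC FREQ before running the correspondence analysis. This is a sketch that assumes the `have` data set from the step above already exists; the CHISQ and PLOTS=MOSAICPLOT options are additions for illustration, not part of the original reply.

```sas
/* Sketch: cross-tabulate the weighted counts and draw a mosaic plot. */
/* Assumes the data set HAVE created by the DATA step above.          */
ods graphics on;
proc freq data=have;
   weight count;
   tables ethnicity*overall / chisq norow nocol nopercent plots=mosaicplot;
run;
```

Without the ZERO option on the WEIGHT statement, observations with a count of 0 are excluded, so the Hispanic row (all zeros) will not appear in the table. Because many cells are small, PROC FREQ may warn that the chi-square test is not reliable.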

palolix
Lapis Lazuli | Level 10

That was also very helpful, thank you Tom!

Ksharp
Super User

OK. Here you go.

 

/* Read one ethnicity per line and write one observation per score (1-9). */
data pop;
   infile cards truncover;
   input Row & $40. @;   /* & lets Row contain single blanks */
   do col = '1','2','3','4','5','6','7','8','9';
      input count @;
      output;
   end;
datalines;
Asian              0 0 0 1 4 3 6 4 0
African American   0 1 7 16 8 37 57 44 8
Caucasian          0 1 4 13 6 24 51 38 9
Hispanic           0 0 0 0 0 0 0 0 0
Native American    5 1 12 24 24 82 108 90 12
Middle Eastern     0 1 2 10 15 24 31 27 6
Pacific Islander   0 0 1 0 0 3 0 0 0
;

proc freq data=pop; /* Row and column marginals */
   weight count / zero;   /* ZERO keeps levels whose counts are all 0 */
   tables row / noprint out=f2(drop=percent rename=(count=RowF row=mRow));
   tables col / noprint out=f3(drop=percent rename=(count=ColF col=mCol));
run;

/* Combine the cell counts and both sets of marginals side by side. */
data all1;
   if 0 then merge pop f2 f3;   /* define the variables without reading data */
   low = 0;
   x0  = 35; * 35 is the x axis value of margin freq;
   if _n_ = 1 then do;
      row = ' ';   col = ' ';  mrow = ' '; mcol = ' ';
      count = 0;   rowf = 0;   colf = 0;   output;
      count = 358; rowf = 358; colf = 358; output; * 358 is the max count in margin freq;
   end;
   merge pop f2 f3;
   output;
run;

ods graphics on / height=4.9in width=6.2in;
title;
proc sgplot data=all1 noautolegend noborder;
   /* Heat map of the cell counts, with each count printed in its cell */
   heatmapparm y=row x=col colorresponse=count / colormodel=(white cx6767bb cxbb67bb cxdd2255);
   text        y=row x=col text=count / textattrs=(size=8) strip contributeoffsets=none;
   /* Row marginals as horizontal bars on the X2 axis */
   highlow y=mrow low=low high=rowf / x2axis type=bar barwidth=0.95 nooutline
           colormodel=(white cx6767bb cxbb67bb cxdd2255) colorresponse=rowf;
   text    y=mrow x=x0 text=rowf / x2axis textattrs=(size=8) strip contributeoffsets=none;
   /* Column marginals as vertical bars on the Y2 axis */
   highlow x=mcol low=low high=colf / y2axis type=bar barwidth=0.95 nooutline
           colormodel=(white cx6767bb cxbb67bb cxdd2255) colorresponse=colf;
   text    x=mcol y=x0 text=colf / y2axis textattrs=(size=8) strip contributeoffsets=none;
   xaxis   display=(nolabel noticks noline) offsetmax=.3;
   yaxis   display=(nolabel noticks noline) offsetmin=.3 reverse;
   x2axis  display=none offsetmin=.75 offsetmax=.03;
   y2axis  display=none offsetmin=.75 offsetmax=.03;
run;

[Image: Ksharp_0-1768270686692.png]
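
The values 35 and 358 in the data all1 step are specific to this table. When adapting the code to other data, one option is to compute the largest marginal frequency instead of typing it in. The sketch below assumes the f2 and f3 data sets produced by the PROC FREQ step above; the macro variable name maxmargin and the 10% label position are assumptions introduced here, not part of the original answer.

```sas
/* Sketch: derive the largest marginal frequency from f2 and f3. */
proc sql noprint;
   select max(m) into :maxmargin trimmed
   from (select RowF as m from f2
         union all
         select ColF as m from f3);
quit;
%put NOTE: largest marginal frequency is &maxmargin;

/* Same step as in the answer, with the hard-coded numbers replaced. */
data all1;
   if 0 then merge pop f2 f3;
   low = 0;
   x0  = 0.1 * &maxmargin;   /* assumption: place labels at about 10% of the range */
   if _n_ = 1 then do;
      row = ' ';   col = ' ';  mrow = ' '; mcol = ' ';
      count = 0;   rowf = 0;   colf = 0;   output;
      count = &maxmargin; rowf = &maxmargin; colf = &maxmargin; output;
   end;
   merge pop f2 f3;
   output;
run;
```

For the table in this thread the largest margin is the Native American row total, which is the 358 used in the answer, so the plot should look essentially the same.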

 

palolix
Lapis Lazuli | Level 10

Fantastic, thank you so much Ksharp!! Just one more question, if you don't mind:

What does the number 40 stand for in the INPUT statement for Row?

input Row & $40. @;

 

Thanks

Ksharp
Super User
40 is the length of the variable ROW.
Just make it large enough to hold the longest ROW value.
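
A small illustration of that INPUT statement may help. The sketch below uses a hypothetical data set named demo with two illustrative lines: the $40. informat gives Row a length of 40, and the & modifier lets the value contain single blanks, so at least two blanks are needed before the next field.

```sas
/* Sketch: how "& $40." reads a value that contains embedded blanks. */
data demo;
   infile datalines truncover;
   input Row & $40. n;   /* Row ends at two consecutive blanks */
datalines;
Native American   358
Asian   18
;

/* PROC CONTENTS should list Row as a character variable of length 40. */
proc contents data=demo;
run;
```

If only one blank separated the name from the number, the number would be read as part of Row, which is why the datalines need the extra space after the ethnicity value.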
palolix
Lapis Lazuli | Level 10

Oh ok, thank you so much, you were so helpful!
