Data visualization with SAS programming

Is there a way to overlay score plot on loading plot in proc princomp?

Contributor
Posts: 25

I am using SAS 9.3.

 

SAS Code: 

proc princomp data=plants plots=all;
id species;
var X1-X244;
run;

 

In the output, I get score plots as "Plots of component scores" and loading plots as "Plots of component pattern". 

 

I want to overlay the 2 by 1 score plot on the 2 by 1 loading plot to better visualize my results.

 

Also, in loading plots I would like to color my variables differently. 

For example: X1-X75 as green, X76-X150 as blue, and X151-X244 as red.

 

Can anyone provide me the code to perform these functions?

 

Thanks.


Accepted Solutions
Solution
‎11-03-2017 08:30 PM
SAS Super FREQ
Posts: 496

Re: Is there a way to overlay score plot on loading plot in proc princomp?

Since prinqual typically iterates, it tries to get rid of potentially problematic variables.  You can specify a really small singularity criterion as in this example.

data x;
   do i = 1 to 10;
      x1 = normal(7);
      x2 = normal(7);
      x3 = normal(7);
      x4 = normal(7);
      x5 = normal(7) * 1e-10;
      output;
   end;
run;

proc prinqual data=x std;
   var ide(x:);
run;

proc prinqual data=x std singular=1e-50;
   var ide(x:);
run;


You can always output anything produced by any procedure and rearrange it in any way to make a graph.  Do an ODS OUTPUT on the data object that underlies prinqual's graph.  Then use princomp pieces to make something like that.  Again be aware that prinqual scales things in a nice way.  The documentation has details.
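
As a rough sketch of that idea (not the poster's own code): PROC PRINCOMP can write the scores to an OUT= data set and the eigenvectors to an OUTSTAT= data set, and the two can be stacked and drawn with PROC SGPLOT. The names plants, species, and X1-X244 come from the question; scale, scores, stats, loadings, and biplot are made-up names, and the stretch factor is an arbitrary assumption. The vectors here are not scaled the way PRINQUAL scales them.

```sas
/* Sketch only: build a score/loading overlay from PROC PRINCOMP output. */
%let scale = 5;   /* hypothetical stretch factor for the loading vectors */

proc princomp data=plants n=2 out=scores outstat=stats noprint;
   var X1-X244;
run;

/* Rows with _TYPE_='SCORE' hold the eigenvectors; _NAME_ is Prin1, Prin2.
   Transposing gives one row per variable with columns Load1 and Load2. */
proc transpose data=stats(where=(_type_='SCORE'))
               out=loadings(rename=(Prin1=Load1 Prin2=Load2))
               name=Variable;
   id _name_;
   var X1-X244;
run;

/* Stack scores and loadings so that one SGPLOT call can draw both. */
data biplot;
   set scores(keep=species Prin1 Prin2)
       loadings(in=isvec);
   if isvec then do;
      Load1 = &scale * Load1;   /* stretch vectors to the score range */
      Load2 = &scale * Load2;
   end;
run;

proc sgplot data=biplot;
   scatter x=Prin1 y=Prin2 / datalabel=species;    /* observations */
   vector  x=Load1 y=Load2 / datalabel=Variable;   /* variables    */
   xaxis label='Component 1';
   yaxis label='Component 2';
run;
```

With 244 variables the labels will overlap heavily, so dropping datalabel= on the VECTOR statement may be the more readable choice.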



All Replies
SAS Super FREQ
Posts: 496

Re: Is there a way to overlay score plot on loading plot in proc princomp?

The easiest way to display scores and loadings simultaneously is to use proc prinqual.
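
A minimal sketch of such a call, assuming the plants data set and variable names from the question and mirroring the MDPREF option and identity transformation used in the longer example in this thread:

```sas
ods graphics on;

/* Sketch: with identity transformations PRINQUAL leaves the variables
   untransformed, and MDPREF requests the plot that shows
   observations as points and variables as vectors. */
proc prinqual data=plants mdpref;
   transform identity(X1-X244);
   id species;
run;
```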

 

If no one has shown you how to change the colors for some variables by then, I will help you with it tomorrow.  I'll probably recreate the graph outside the proc with a modified data object that has a group variable that determines the colors.  Some of the techniques I use are in https://support.sas.com/documentation/prod-p/grstat/9.4/en/PDF/odsadvg.pdf

Contributor
Posts: 25

Re: Is there a way to overlay score plot on loading plot in proc princomp?

Posted in reply to WarrenKuhfeld
Hi WarrenKuhfeld, Thank you so much for your reply. I will review the link and will post again if I have any further questions. Yes, please do show me how to change colors of some variables.
SAS Super FREQ
Posts: 4,124

Re: Is there a way to overlay score plot on loading plot in proc princomp?

The overlay of those two plots is often called a 'biplot' in the literature. Various ways to produce a biplot in SAS are shown at http://support.sas.com/kb/54/993.html
Contributor
Posts: 25

Re: Is there a way to overlay score plot on loading plot in proc princomp?

Hi Rick_SAS, thanks for your help. I will look into the literature. Is there any literature on coloring variables differently in a loading plot?
SAS Super FREQ
Posts: 496

Re: Is there a way to overlay score plot on loading plot in proc princomp?

[ Edited ]

This answer has undergone several edits. Sorry for the churn. I replaced my earlier answer around 4:00 EDT on Nov 1.  This is probably the most technical answer that I have ever posted in SAS Communities, and I learned a few things in the process.  Ask questions if you are confused and I will try to help.  Please understand that you don't have to understand everything that I do to use it.

 

If you have never seen code like this, it is going to be hard to digest. I examined the graph template and modified it. I also added a new ID variable. For you, it will contain groups: 1s for all the variables that you want to be the first color, 2s for all the variables you want to be the second color, and so on. You can change the colors, but I did not get into that. http://go.documentation.sas.com/?docsetId=grstatgraph&docsetTarget=n0mx6gudpcurt7n1ubv4fvbhj4fc.htm&...
@Rick_SAS recently did a blog on modifying the template. https://blogs.sas.com/content/iml/2017/10/30/programming-modify-ods-templates.html  Take a look if you are confused by what I did. For more advanced examples, see  https://support.sas.com/resources/papers/proceedings16/SAS1800-2016.pdf

 

Note that PRINQUAL does not make a "true" biplot. It scales the vector lengths in a nicer way. You now have all the tools you need to do other customizations.

 


title 'Ratings for Automobiles Manufactured in 1980';
data cars;
   input Origin $ 1-8 Make $ 10-19 Model $ 21-36
         (MPG Reliability Acceleration Braking Handling Ride
          Visibility Comfort Quiet Cargo) (1.);
   datalines;
GMC      Buick      Century         3334444544
GMC      Buick      Electra         2434453555
GMC      Buick      Lesabre         2354353545
GMC      Buick      Regal           3244443424
GMC      Buick      Riviera         2354553543
GMC      Buick      Skyhawk         3232423224
GMC      Buick      Skylark         4145555422
GMC      Chevrolet  Camaro          2254541241
GMC      Chevrolet  Caprice Classic 2445353555
GMC      Chevrolet  Chevette        5335425223
GMC      Chevrolet  Citation        4155555525
GMC      Chevrolet  Corvette        2153542242
GMC      Chevrolet  Malibu          3333444544
GMC      Chevrolet  Monte Carlo     3253353544
GMC      Chevrolet  Monza           2142233114
Chrysler Dodge      Aspen           2143333424
Chrysler Dodge      Colt Hatchback  5544445434
Chrysler Dodge      Diplomat        2153343434
Chrysler Dodge      Mirada          2143432434
Chrysler Dodge      Omni 024        4345535225
Chrysler Dodge      St Regis        1154353545
Ford     Ford       Fairmont        3324345434
Ford     Ford       Fiesta          5445344414
Ford     Ford       Granada         2233233233
Ford     Ford       LTD             3354354555
Ford     Ford       Mustang         3244323222
Ford     Ford       Pinto           4134313222
Ford     Ford       Thunderbird     2354344444
Ford     Mercury    Bobcat          4134313212
Ford     Mercury    Capri           3154322222
Ford     Mercury    Cougar XR7      2454444444
Ford     Mercury    Marquis         3354354555
Ford     Mercury    Monarch         2353232232
Ford     Mercury    Zephyr          3124345434
GMC      Oldsmobile Cutlass         3443444544
GMC      Oldsmobile Delta 88        2435353555
GMC      Oldsmobile 98              2445353555
GMC      Oldsmobile Omega           4155555522
GMC      Oldsmobile Starfire        2133522154
GMC      Oldsmobile Toronado        3323443544
Chrysler Plymouth   Champ           5544445434
Chrysler Plymouth   Gran Fury       2134353535
Chrysler Plymouth   Horizon         4345535235
Chrysler Plymouth   Volare          2153333424
GMC      Pontiac    Bonneville      2345353555
GMC      Pontiac    Firebird        1153551231
GMC      Pontiac    Grand Prix      3224432434
GMC      Pontiac    Lemans          3333444544
GMC      Pontiac    Phoenix         4155554415
GMC      Pontiac    Sunbird         3134533234
;

* Ensure we are not using the modified template (if we run this more than once).
  OK if you get a warning the first time that it does not exist.;
proc template;
   delete Stat.Prinqual.Graphics.MDPref /  store=work.modtemp;
quit;

ods graphics on;
ods listing close;
ods html body='b.html';
* Default analysis;
proc prinqual data=cars mdpref;
   ods output mdprefplot=m;
   transform ide(mpg -- cargo);
   id model;
run;

* Default data object;
proc print data=m; run;

* Store template in a file.  Look at it.;
proc template;
   source Stat.Prinqual.Graphics.MDPref / file='tpl.tpl' store=sashelp.tmplstat;
quit;

* Store modified template in WORK so it disappears when SAS closes;
ods path (prepend) work.modtemp(update);
ods path show;

* Modify template.  You cannot write code like this in a vacuum.
  You must look at the original template.;
data _null_;
   infile 'tpl.tpl' end=eof;
   input;
   * Add proc call;
   if _n_ = 1 then call execute('proc template;');
   * Remove Store option;
   i = index(lowcase(_infile_), '/ store = ');
   if i then substr(_infile_, i) = ';';
   * Skip using the group variable;
   _infile_ = tranwrd(_infile_, '_id2=IDLAB2', ' ');
   _infile_ = tranwrd(_infile_, ' _id2 ', ' ');
   * Find start of vectorplot;
   v + index(lowcase(_infile_), 'vectorplot');
   if v then do;
      * Add group var;
      _infile_ = tranwrd(_infile_, '/', '/ group=idlab2');
      * remove current label attributes;
      _infile_ = tranwrd(_infile_, 'datalabelattrs', 'primary=true; *');
      * flag end of vectorplot statement;
      if index(_infile_, ';') then v = 0;
   end;
   * write out line;
   call execute(_infile_);
   * end;
   if eof then call execute('run;');
run;

proc template;
   source Stat.Prinqual.Graphics.MDPref;
quit;

* Map the variable names to group numbers.
  You will want some simpler logic based on groups of variable names.
  You will not even need to make an informat.;
data cntlin;
   length start $ 24;
   Type    = 'i';
   FmtName = "CarFmt";
   input start $ label;
   datalines;
MPG           1
Acceleration  1
Reliability   2
Braking       3
Handling      3
Ride          3
Visibility    4
Comfort       4
Quiet         4
Cargo         5
;

proc format cntlin=cntlin; run;

* Add a new group variable.  Make it reflect the color groups that you want.
  It will become idlab2 in the data object;
data cars2;
   set cars;
   array __x[*] _numeric_;
   * Here is where you will substitute your logic for making the group names.;
   if _n_ le 10 then group = input(vname(__x[_n_]), carfmt.);
   run;

* Need the ODS Document to access the dynamics.;
ods document name=MyDoc (write);
* Add group as an ID variable.  Use the modified data set and template.
  We will ignore this graph, but it will provide the pieces that we need
  for the real graph.;
proc prinqual data=cars2 mdpref;
   ods output mdprefplot=m;
   transform ide(mpg -- cargo);
   id model group;
run;
* The warning is because of my two-data object trick.
  ODS Graphics does not accept variables from different data objects
  (although in some ways it appears that it does).
  This is a bit complicated, so let us leave it at that.
  I need one more major step, then I can create the graph outside PRINQUAL.;
ods document close;

* Need to see the path for the graph.;
proc document name=MyDoc;
   list / levels=all;
quit;

* Display and store the dynamics;
proc document name=MyDoc;
   ods output dynamics=dynamics;
   obdynam \Prinqual#1\MDPREF#1\MDPrefPlot#1;
quit;

* Examine the new data object just to better see what is going on;
proc print data=m; run;

* This calls SGRENDER using the modified template and populates the
  DYNAMIC statement with all of the dynamic name/value pairs.;
 data _null_;
   set dynamics(where=(label1 ne '___NOBS___')) end=eof;
   if _n_ = 1 then do;
      call execute('proc sgrender data=m ' ||
                   'template=Stat.Prinqual.Graphics.MDPref;');
      call execute('dynamic');
   end;
   if cvalue1 ne ' ' then
      call execute(catx(' ', label1, '=',
                   ifc(n(nvalue1), cvalue1, quote(trim(cvalue1)))));
   if eof then call execute('; run;');
run;
  
* Unfortunately for what you want, your problem requires *all* of the 
  steps for making highly customized graphs.  I initially thought I 
  could get by with just a template modification.
  1) Modify the template.
  2) Output the data object.  I modified it by using an ID variable beforehand,
     (just because that was how I started approaching this)
     but I could have modified it after the fact.
  3) Access the dynamic variables.
  4) Use CALL EXECUTE to make SGRENDER code that uses the data object, 
     modified template, and dynamic variables.
  ODS Graphics gives you *incredible* *flexibility* through programming,
  but everything you might want is not available by flipping an option.
  And yes, there is a long learning curve for some of this.;

ods html close;
ods listing;
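
For the variables in the original question, the "simpler logic based on groups of variable names" mentioned in the comments above could look something like the following sketch. It assumes the names are exactly X1-X244 and uses made-up names (vargroups, Variable, Group); how the groups are attached to the data object is left open here.

```sas
/* Sketch: one row per variable name with its color group.
   Assumes variables named X1-X244, as in the original question. */
data vargroups;
   length Variable $ 32 Group 8;
   do k = 1 to 244;
      Variable = cats('X', k);
      if      k <=  75 then Group = 1;   /* first color  (e.g. green) */
      else if k <= 150 then Group = 2;   /* second color (e.g. blue)  */
      else                  Group = 3;   /* third color  (e.g. red)   */
      output;
   end;
   drop k;
run;

/* Quick check of the group sizes: 75, 75, and 94 variables. */
proc freq data=vargroups;
   tables Group;
run;
```

A table like this can be merged by variable name onto whatever data set holds the vectors, so the grouping rule lives in one place.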

Contributor
Posts: 25

Re: Is there a way to overlay score plot on loading plot in proc princomp?

Posted in reply to WarrenKuhfeld
Hi @WarrenKuhfeld,

This concept is completely new to me. So I am still trying to wrap my mind around it. I am trying to tailor the code to my data. I will follow up within a week if I have any more questions.

Thanks a lot for all your help.
SAS Super FREQ
Posts: 496

Re: Is there a way to overlay score plot on loading plot in proc princomp?

You are welcome.  Yes, there is a long learning curve associated with what I did.  Again, start with @Rick_SAS's blog if you want to understand the process. The main things you need to customize are the creation of the group variable and the variables and data set for PROC PRINQUAL.  If you have questions, post them.

PS. This question inspired my next blog in Graphically Speaking.  I am still working on it.

SAS Super FREQ
Posts: 4,124

Re: Is there a way to overlay score plot on loading plot in proc princomp?

Posted in reply to WarrenKuhfeld

If the concept is "completely new", you might want to start by copying the template into a text editor, making the changes that Warren recommends, and then running PROC TEMPLATE to save the new template. Some people find that process conceptually easier to understand because they can see and understand the template edits. 
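
To make the idea of a template with a group variable on the vectors more concrete, here is a small hand-written template. It is not the PRINQUAL template (which is not reproduced in this thread), only an illustrative sketch: the template name MyBiplot, the data set toy, and its values are all made up.

```sas
/* Toy data, made up purely for illustration: 3 points and 3 vectors. */
data toy;
   length Variable $ 8;
   input Prin1 Prin2 Load1 Load2 Variable $ ColorGroup;
   datalines;
 1.2 -0.4   .    .   .    .
-0.8  0.9   .    .   .    .
 0.3  0.1   .    .   .    .
  .    .   0.9  0.2  X1   1
  .    .  -0.5  0.7  X76  2
  .    .   0.1 -0.8  X151 3
;

/* Store the template in WORK so that it disappears when SAS closes. */
ods path (prepend) work.modtemp(update);

proc template;
   define statgraph MyBiplot;            /* hypothetical template name */
      begingraph;
         layout overlay;
            scatterplot x=Prin1 y=Prin2;
            /* GROUP= makes each group of vectors use a different color */
            vectorplot x=Load1 y=Load2 xorigin=0 yorigin=0 /
                       group=ColorGroup datalabel=Variable;
         endlayout;
      endgraph;
   end;
run;

proc sgrender data=toy template=MyBiplot;
run;
```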

SAS Super FREQ
Posts: 496

Re: Is there a way to overlay score plot on loading plot in proc princomp?

Sorry.  Hold off on this.  While what I did apparently worked--it affected the graph in the way I had in mind--that WARNING is a signal that what I did will not generalize to your exact problem.  I will repost when I have this all resolved.

Contributor
Posts: 25

Re: Is there a way to overlay score plot on loading plot in proc princomp?

Posted in reply to WarrenKuhfeld
Thanks.
Contributor
Posts: 25

Re: Is there a way to overlay score plot on loading plot in proc princomp?

[ Edited ]

Thanks for the suggestion. 

Contributor
Posts: 25

Re: Is there a way to overlay score plot on loading plot in proc princomp?

Posted in reply to WarrenKuhfeld
Please post the link after you post your next blog.
SAS Super FREQ
Posts: 4,124

Re: Is there a way to overlay score plot on loading plot in proc princomp?

I feel compelled to point out that biplots were developed in the days when "big data" meant thousands of observations and dozens of variables. They don't scale well to hundreds of variables and tens of thousands of observations.  For example, have you considered how you expect to visualize 244 vectors on one graph? Almost surely the labels will be impossible to read when the vectors are projected onto the first two principal components. Similarly, the observations are likely to suffer from overplotting, although using transparency can alleviate this somewhat.
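
As a small illustration of the transparency option, here is a sketch that uses the SASHELP.CARS sample data rather than the poster's data; the value 0.7 is an arbitrary choice.

```sas
/* Sketch: semi-transparent markers make overplotted regions look darker. */
proc sgplot data=sashelp.cars;
   scatter x=Weight y=Horsepower /
           transparency=0.7
           markerattrs=(symbol=CircleFilled);
run;
```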

 

 

Contributor
Posts: 25

Re: Is there a way to overlay score plot on loading plot in proc princomp?

[ Edited ]

I had wondered how to visualize that many vectors on one graph. Thanks for suggesting transparency.

One more question: when I run proc princomp, the explained variance is 34.66% for component 1 and 23.52% for component 2.

But when I run proc prinqual, the explained variance is 38.35% for component 1 and 25.88% for component 2.

I am using 'transform ide' in proc prinqual, as suggested by WarrenKuhfeld.

Why is there a difference in explained variance between princomp and prinqual?

 

I am trying to work on one problem at a time. That is why I didn't post these questions earlier.

Thanks.
