Hi,
Apologies if this topic has already been covered, but I've searched the forum and haven't found a resolution.
I've generated some residual plots using ODS Graphics.
There are a number of potentially influential points on the output graphs, but because the dataset is large (c. 10k observations) the observation numbers are stacked on top of one another, and I consequently can't read them.
Is there some way to drill into points of interest, zoom into the graph, or increase the resolution of the graph to a point that makes the observation numbers legible?
I'm using base SAS 9.2, but I should mention that it's remotely hosted, so I'm limited in my ability to install any third-party graphing packages that might help. (I read about using editable graphics by specifying the SGE=ON option with the LISTING destination, but I wasn't able to open the editable files that were produced.)
The code I've used to generate the graphs so far:
/* Residuals & Influential Statistics: OfferHolderTRAIN */
ods graphics on;
proc logistic data= OfferHolderTRAIN plots(unpack label)=
 (dfbetas influence leverage phat dpc);
CLASS Comms_Engagement/PARAM=REF DESCENDING;
MODEL Applicant_Enrolled(EVENT='1')= Comms_Engagement Offer_Place_email_Opened Offer_Place_email_Clked; 
run;
ods graphics off;
quit;
Example below of the issue I have trying to read the observation numbers:
I'd greatly appreciate any guidance.
R,
Jon
You may have some luck by saving the appropriate data and routing that to another graphics procedure where you have more control than in the LOGISTIC procedure. But if that graph is supposed to contain 10K+ observations, that kind of apparent cluster density may require some work.
If there is an area of the data that is of particular interest, you could use a WHERE clause in SGPLOT to display only the values within a given range, such as
where difference > 4 and probability le 0.1;
So fewer records are displayed.
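A rough sketch of that approach follows, not a tested solution. It assumes the OUTPUT statement of PROC LOGISTIC is used to save the diagnostics; the dataset name diag and the variable names phat, difchisq, difdev, lev, cdist and obsnum are made up for illustration, and the cut-offs in the WHERE statement are placeholders only. It also assumes that the label shown on the PROC LOGISTIC plots matches the row order of the input data, which may not hold exactly if some observations are dropped for missing values.

```sas
/* Fit the same model, but save the diagnostics to a dataset. */
proc logistic data=OfferHolderTRAIN;
   class Comms_Engagement / param=ref descending;
   model Applicant_Enrolled(event='1') = Comms_Engagement
         Offer_Place_email_Opened Offer_Place_email_Clked;
   /* diag and the variable names on the right are illustrative choices */
   output out=diag p=phat difchisq=difchisq difdev=difdev h=lev c=cdist;
run;

/* Add a row counter to use as the point label (assumed to match the plot labels). */
data diag;
   set diag;
   obsnum = _n_;
run;

/* Plot only the region of interest so the labels no longer overlap. */
proc sgplot data=diag;
   where difchisq > 4 and phat le 0.1;   /* placeholder cut-offs */
   scatter x=phat y=difchisq / datalabel=obsnum;
run;

/* Alternatively, skip the plot and list the most extreme points directly. */
proc sort data=diag out=diag_sorted;
   by descending difchisq;
run;

proc print data=diag_sorted(obs=20);
   var obsnum phat difchisq difdev lev cdist;
run;
```

Once the diagnostics are in a dataset, the observation numbers can be read from the printed listing even where the plot is too dense to label.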
Another option may be to make the graphics area huge with an ODS graphics statement:
ods graphics / height=24in width=24in;
would produce a 24 by 24 inch result. I don't know how large you may have to make things to read the labels if your clusters of values are close together, and it may not even be possible.
Hi @ballardw, many thanks for your help. I've tried increasing the size of the plot area as you suggest, but this unfortunately doesn't make the observation numbers any more legible. I also tried increasing the resolution to 300dpi, but again this made no difference (when I tried to increase both the size and the resolution simultaneously, it crashed SAS).
It just occurs to me that maybe I'll try sampling just 500 observations from my data set, and see if plotting this subset recreates the same outliers.
Good reasoning.
Shortly after my last post I started thinking a sample might work, but I wasn't sure whether your request was actually to identify ALL outliers or not.
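For the sampling idea, something along these lines might do as a starting point. It is only a sketch: it assumes PROC SURVEYSELECT (part of SAS/STAT, like PROC LOGISTIC) is available on the hosted install, and the output dataset name OfferHolder_sample and the seed value are arbitrary choices. Bear in mind that a simple random sample of 500 may not include the rare points that stood out in the full data, and the diagnostics from a model refitted on the sample will differ from those of the full fit.

```sas
/* Draw a simple random sample of 500 observations (dataset name and seed are arbitrary). */
proc surveyselect data=OfferHolderTRAIN out=OfferHolder_sample
                  method=srs sampsize=500 seed=12345;
run;

/* Re-run the original diagnostic plots on the smaller dataset. */
ods graphics on;
proc logistic data=OfferHolder_sample plots(unpack label)=
   (dfbetas influence leverage phat dpc);
   class Comms_Engagement / param=ref descending;
   model Applicant_Enrolled(event='1') = Comms_Engagement
         Offer_Place_email_Opened Offer_Place_email_Clked;
run;
ods graphics off;
```

Fixing the seed makes the sample reproducible, so the same 500 rows are plotted on each run.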
