Hi,
I am analyzing data from different strains with different genotypes, and I am trying to format the x axis a certain way. My data is categorical and I would prefer not to change it to numbers. I added an XAXIS VALUES= statement and now my graph looks crazy. Any suggestions? I attached photos from before and after I added the XAXIS VALUES= statement. This is my code:
*Creating figure plotting estimated probs & CI**;
proc logistic data=estimated2 alpha=0.2 noprint;
class mosquitostrain (ref= 'NO') genotype / param=ref;
model Alive(Event='1')= mosquitostrain genotype;
/* 1. Use a procedure or DATA step to write Pred, Lower, and Upper limits */
output out=LogiOut predicted=estprob2 l=lower95 u=upper95;
run;
/* 2. Be sure to SORT! */
proc sort data=LogiOut;
by mosquitostrain genotype;
run;
/* 3. Use a BAND statement. If more than one band, use transparency */
title "Predicted Probabilities with 95% Confidence Limits";
title2 "Four Strains";
proc sgplot data=LogiOut;
band x=genotype lower=Lower95 upper=upper95 / group=mosquitostrain transparency=0.75;
series x=genotype y=estprob2 / group=mosquitostrain curvelabel;
xaxis grid;
xaxis values= ('WT,WT' 'WT,HET' 'HET,WT' 'HET,HET' 'MUT,WT' 'WT,MUT' 'HET,MUT' 'MUT,HET' 'MUT,MUT');
yaxis grid label="Predicted Probability of Survival" max=1;
keylegend "L" / location=inside position=NW title="Genotype" across=1 opaque;
run;
Does a line chart make sense here at all?
How do I interpret the connection between HET/HET and HET/MUT? I would have assumed a forest plot was more appropriate. See the examples at the link below:
https://blogs.sas.com/content/tag/forest-plot/
Ignoring the visualization choice, code your categories as numbers, ordered so that 1 is your first category, 2 is your second, and so on.
Then graph the numeric variable, but apply a format to it so that the X axis displays 1 as HET/HET and 2 as HET/MUT.
You can use an informat to map the genotype text to numbers and then a format to map the numbers back to the text.
Assuming your data now looks like the table below, use the code that follows, which applies the format with the FORMAT statement.
https://support.sas.com/resources/papers/proceedings/proceedings/sugi30/001-30.pdf
GenoCode | GenoType | Probability | UCLM | LCLM
1        | HET,HET  | 0.2         | 0.07 | 0.8
2        | HET,MUT  | 0.2         | 0.07 | 0.8
3        | HET,WT   | 0.2         | 0.07 | 0.8
title "Predicted Probabilities with 95% Confidence Limits";
title2 "Four Strains";
proc sgplot data=LogiOut;
band x=genocode lower=Lower95 upper=upper95 / group=mosquitostrain transparency=0.75;
series x=genocode y=estprob2 / group=mosquitostrain curvelabel;
format genocode genofmt.;
/* genocode is numeric, so list the tick values as numbers; genofmt. supplies the labels */
xaxis grid values=(1 to 9 by 1);
yaxis grid label="Predicted Probability of Survival" max=1;
keylegend "L" / location=inside position=NW title="Genotype" across=1 opaque;
run;
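The thread never shows how genocode and genofmt. are defined, so here is a minimal, untested sketch of the informat/format round trip described above. The names genoinf. and LogiOut2 are assumptions, and so is the 1-9 coding, which simply follows the order of the original XAXIS VALUES= list. It also assumes the stored genotype values match these strings exactly.

```sas
proc format;
   /* Assumed informat: genotype text -> numeric code in the desired axis order */
   invalue genoinf
      'WT,WT'   = 1
      'WT,HET'  = 2
      'HET,WT'  = 3
      'HET,HET' = 4
      'MUT,WT'  = 5
      'WT,MUT'  = 6
      'HET,MUT' = 7
      'MUT,HET' = 8
      'MUT,MUT' = 9
      other     = .;
   /* Format: numeric code -> genotype text, used for the tick labels */
   value genofmt
      1 = 'WT,WT'
      2 = 'WT,HET'
      3 = 'HET,WT'
      4 = 'HET,HET'
      5 = 'MUT,WT'
      6 = 'WT,MUT'
      7 = 'HET,MUT'
      8 = 'MUT,HET'
      9 = 'MUT,MUT';
run;

/* Add the numeric code to the PROC LOGISTIC output (LogiOut2 is an assumed name) */
data LogiOut2;
   set LogiOut;
   genocode = input(genotype, genoinf.);
run;

/* Sort so that x increases within each strain before drawing bands */
proc sort data=LogiOut2;
   by mosquitostrain genocode;
run;
```

With this in place, the PROC SGPLOT step above would read data=LogiOut2 instead of data=LogiOut. Any genotype value that is not in the list gets a missing code, which makes unexpected spellings easy to spot.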
In which SAS product are you submitting your code?
Bands are drawn as polygons. This works best when the x coordinate increases monotonically, which is what happens with the original data, so you get good polygons. When you reorder the x-axis values, the polygons no longer have monotonically increasing x values, so they self-intersect.
To get non-intersecting polygons, the data set should have the x values in the same order as the x-axis values you want. Alternatively, make the x axis linear (1-9) and use a user-defined format (UDF) to label the values.
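The first option, keeping genotype as a character variable and putting the data in the desired order, might look like the untested sketch below. The data set name LogiOrd and the sort key genoorder are assumptions. The genotype list is copied from the original XAXIS VALUES= statement and is assumed to match the stored values exactly.

```sas
/* Assumed sort key: position of each genotype in the desired axis order */
data LogiOrd;
   set LogiOut;
   genoorder = whichc(strip(genotype),
      'WT,WT', 'WT,HET', 'HET,WT', 'HET,HET', 'MUT,WT',
      'WT,MUT', 'HET,MUT', 'MUT,HET', 'MUT,MUT');
run;

/* Put the rows in the order the axis should show */
proc sort data=LogiOrd;
   by genoorder mosquitostrain;
run;

title "Predicted Probabilities with 95% Confidence Limits";
proc sgplot data=LogiOrd;
   band x=genotype lower=lower95 upper=upper95 / group=mosquitostrain transparency=0.75;
   series x=genotype y=estprob2 / group=mosquitostrain curvelabel;
   /* Place the categories on the axis in the order they appear in the data */
   xaxis grid discreteorder=data;
   yaxis grid label="Predicted Probability of Survival" max=1;
run;
```

Because the axis order and the row order now agree, each strain's band is traced left to right and should not fold back on itself.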
As Reeza says, a format is the way to go. For an example, see "Method 2" in the article "Visualize an ANOVA with two-way interactions."
Hi Reeza,
Do you know if it is possible to use a format created with PROC FORMAT in PROC LOGISTIC? I would also like to label the x axis the same way, but on a graph created by PROC LOGISTIC:
ods graphics on;
proc logistic data=Round3E plots(only)=(effect (clband) oddsratio (type=horizontalstat));
class Genotype MosquitoStrain(param=ref ref="NO");
model Alive (event="1") = Genotype MosquitoStrain/clodds=pl;
run;
ods graphics off;
Hello,
I tried this code and got this error message
**trying another way to graph estimated prob & CI**;
ods graphics on;
proc logistic data=Round3E plots(only)=(effect (clband) oddsratio (type=horizontalstat));
format Genotype $GenotFmt.;
class Genotype MosquitoStrain(param=ref ref="NO")
ORDER='WT,WT' 'WT,HET' 'HET,WT' 'HET,HET' 'MUT,WT' 'WT,MUT' 'HET,MUT' 'MUT,HET' 'MUT,MUT';
model Alive (event="1") = Genotype MosquitoStrain/clodds=pl;
run;
ods graphics off;
Hello,
I deleted the ORDER= line, and that changed the variable labels, but I would still like to display the levels in a specific order (based on their genotype).
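One possible way to get that ordering, following the format suggestion earlier in the thread, is to give PROC LOGISTIC a numeric code that carries a format and to sort the CLASS levels by the unformatted code. This is an untested sketch: GenoCode, genofmt., and Round3E_coded are assumed names, the genotype strings are assumed to match the raw values in Round3E, and it is an assumption that the effect plot follows the CLASS level order.

```sas
proc format;
   /* Assumed format: numeric code -> genotype label */
   value genofmt
      1 = 'WT,WT'   2 = 'WT,HET'  3 = 'HET,WT'
      4 = 'HET,HET' 5 = 'MUT,WT'  6 = 'WT,MUT'
      7 = 'HET,MUT' 8 = 'MUT,HET' 9 = 'MUT,MUT';
run;

data Round3E_coded;
   set Round3E;
   /* Code each genotype by its position in the desired order */
   GenoCode = whichc(strip(Genotype),
      'WT,WT', 'WT,HET', 'HET,WT', 'HET,HET', 'MUT,WT',
      'WT,MUT', 'HET,MUT', 'MUT,HET', 'MUT,MUT');
   format GenoCode genofmt.;
run;

ods graphics on;
proc logistic data=Round3E_coded plots(only)=(effect(clband) oddsratio(type=horizontalstat));
   /* ORDER=INTERNAL sorts the levels by the unformatted codes 1-9 */
   class GenoCode(order=internal) MosquitoStrain(param=ref ref="NO");
   model Alive(event="1") = GenoCode MosquitoStrain / clodds=pl;
run;
ods graphics off;
```

ORDER= is an option of the CLASS statement that takes a keyword (DATA, FORMATTED, FREQ, or INTERNAL), not a list of values, which is likely why the earlier attempt produced an error.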