Hello,
I am a beginner in bioinformatics and I need to make an MA plot. This is possible in R, but I am not an experienced R user.
Can I do this in SAS?
I could not find an answer in the community.
Thank you for your help.
Thanks for providing the missing third variable. It is not obvious (at least to me) what values of edgemyoPV indicate "significance", because -- contrary to my expectations --
But once you know the criterion for "significant" edgemyoPV values, the implementation is easy: Just replace the assignment statement r=choosec(...) in my suggested DATA step by
if <criterion for significant edgemyoPV> then r=choosec(sign(edgemyofc)+2,'Down','None','Up'); else r='None';
where <criterion for significant edgemyoPV> stands for a logical expression involving edgemyoPV, for example edgemyoPV>=1.
(Edit: Made the definition of r more similar to the earlier suggestion.)
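As a minimal sketch of that replacement: the DATA step below assumes, purely for illustration, that edgemyoPV holds p-values and that values below 0.05 count as significant. The cutoff, the macro variable name sig_criterion and the few rows in the dataset test are all made up, so substitute your own criterion and data.

```sas
/* Dummy stand-in for the real table TEST (values are invented) */
data test;
  input logcpm edgemyofc edgemyopv;
  datalines;
2.1  1.80 0.001
5.4 -2.30 0.010
7.9  0.15 0.620
3.3 -0.40 0.300
6.0  2.90 0.049
;
run;

/* Hypothetical significance criterion; replace with the real one */
%let sig_criterion = edgemyopv < 0.05;

data have;
  set test;
  /* Keep complete rows only: a missing value would otherwise
     compare as smaller than 0.05 and be flagged as significant */
  if n(edgemyofc, logcpm, edgemyopv) = 3;
  length r $4;
  if &sig_criterion then
    r = choosec(sign(edgemyofc) + 2, 'Down', 'None', 'Up');
  else r = 'None';
  label logcpm    = 'Average Expression'
        edgemyofc = 'Log2 Fold Change'
        r         = 'Regulated';
run;
```

The PROC SORT and PROC SGPLOT steps from the earlier suggestion can then be run unchanged on the resulting dataset have.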
Can you point us to at least an example of an "MA" plot?
MA is jargon for your field. It may be known as something else to us not in bioinformatics.
Be prepared to discuss your data and to provide examples of it. Dummy data is okay as long as it provides similar information to what you have.
The instructions at https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... show how to turn an existing SAS data set into DATA step code. You can paste that code into a forum code box (opened with the </> icon) or attach it as a text file, so that we can see exactly what you have and test code against it.
Thank you for your response.
The MA plot is shown in this picture. The x axis is LogCPM (average expression) and the y axis is edgemyoFC (log2 fold change).
I have attached a table with the two variables. Each point is a gene.
Hello @Nathalie1,
I think you will need a third variable defining the color of the scatter points. Below is an example using an arbitrarily defined variable r just for demonstration.
Then you can use the SCATTER statement of PROC SGPLOT with LogCPM as the x variable, edgemyoFC as the y variable and the third variable as the group variable:
/* Create sample data for demonstration */
data have;
  set test;
  if n(edgemyofc,logcpm)=2;  /* keep only rows where both values are non-missing */
  length r $4;
  r=choosec(sign(round(edgemyofc*exp(logcpm/3)))+2,'Down','None','Up'); /* arbitrary definition */
  label logcpm='Average Expression'
        edgemyofc='Log2 Fold Change'
        r='Regulated';
run;

/* Sort so that the groups appear in the order Down, None, Up,
   matching the color list in the STYLEATTRS statement below */
proc sort data=have;
  by r;
run;

title 'Average Expression vs. Log2 Fold Change';
proc sgplot data=have;
  styleattrs datacontrastcolors=(lime gray red);
  scatter x=logcpm y=edgemyofc / group=r markerattrs=(symbol=circlefilled size=5pt);
  keylegend / position=right noborder valueattrs=(size=8);
run;
title;
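MA plots conventionally also show a horizontal line at y=0 (no change). A minimal sketch of how that could be added with a REFLINE statement follows. The dataset demo is simulated only so that the step runs on its own, the grouping is as arbitrary as above, and the line color and pattern are my own choices.

```sas
/* Simulated stand-in data (not real expression values) */
data demo;
  call streaminit(1);
  length r $4;
  do i = 1 to 300;
    logcpm    = rand('uniform') * 10;
    edgemyofc = rand('normal');
    r = choosec(sign(round(edgemyofc)) + 2, 'Down', 'None', 'Up'); /* arbitrary */
    output;
  end;
  drop i;
run;

/* Group order Down, None, Up matches the color list below */
proc sort data=demo;
  by r;
run;

proc sgplot data=demo;
  styleattrs datacontrastcolors=(lime gray red);
  scatter x=logcpm y=edgemyofc / group=r
          markerattrs=(symbol=circlefilled size=5pt);
  /* Horizontal reference line at log2 fold change = 0 */
  refline 0 / axis=y lineattrs=(color=black pattern=dash);
  xaxis label='Average Expression';
  yaxis label='Log2 Fold Change';
run;
```

With your own data, only the REFLINE statement needs to be added to the PROC SGPLOT step shown earlier.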
Thank you very much for your response. You have already done a lot for me.
One column was missing: the p-value (table test).
I get a graph, but since the p-value was missing I don't know whether it is correct.
Here is the documentation of the R function.
Generate a plot of log fold change versus mean expression (MA plot)
## S4 method for signature 'data.frame'
plotMA(object, ylim = NULL, colNonSig = "gray32", colSig = "red3",
       colLine = "#ff000080", log = "x", cex = 0.45,
       xlab = "mean expression", ylab = "log fold change", ...)

Arguments:
object: A [...]
ylim: The limits for the y-axis. If missing, an attempt is made to choose a sensible value. Dots exceeding the limits will be displayed as triangles at the limits, pointing outwards.
colNonSig: colour to use for non-significant data points.
colSig: colour to use for significant data points.
colLine: colour to use for the horizontal (y=0) line.
log: which axis/axes should be logarithmic; will be passed to [...]
cex: The [...]
xlab: The x-axis label.
ylab: The y-axis label.
...: Further parameters to be passed through to [...]