sgplot with multiple regression lines


12-05-2012 07:48 PM

Hi,

I am trying to create a plot for my data, but I am struggling to get the expected output. PROC SGPLOT produces a graph with 50 regression lines, one for each of the 50 groups in my data set. My problem is that the lines do not always extend to the end of the X axis. I would like all the lines to stop at the same x. I think the issue is that max(x) differs from group to group in my data set: sometimes max(x)=50, sometimes max(x)=80, and so on.

Is there any way to have SAS draw the regression line over a specific interval? I thought about drawing a line from the intercept and slope, but I am not sure how to do it.

proc sgplot data=listall;

reg x=var1 y=var2 / nomarkers group=t;

run;

Thanks a lot for any help


12-05-2012 08:42 PM

Could you first find the smallest per-group maximum of x, and then cap x at that value, with something like if x>50 then x=50?


12-05-2012 08:53 PM

Thanks, I thought about that, but I am also plotting the data as a scatter plot. If I add a value to my data set, the plot will show a point that does not exist.


12-05-2012 10:28 PM

SGPLOT will always use all the data provided to the regression statement to compute the fit line. There is no way to draw only part of the fit line. One possibility would be to compute a new Y2 column that has missing values for x>50. Then, provide Y2 to the reg statement, while you still provide the original Y column to the scatter plot. However, this may not be correct since the new regression line will not be the same as the one with full data.
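A minimal sketch of the Y2 idea, reusing the names from the original post (listall, var1, var2, t) and assuming a cutoff of 50. The data set name listall2 is made up for illustration, and the caveat above still applies: the fit is computed only from the points with var1 <= 50.

```sas
/* Sketch only: listall, var1, var2 and t come from the original post; */
/* listall2 and the cutoff of 50 are assumptions for illustration.     */
data listall2;
  set listall;
  y2 = var2;
  if var1 > 50 then y2 = .;   /* exclude these points from the fit only */
run;

proc sgplot data=listall2;
  scatter x=var1 y=var2 / group=t;        /* all observed points         */
  reg x=var1 y=y2 / group=t nomarkers;    /* fit lines use y2, x <= 50   */
run;
```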

Now, SGPLOT computes a new data set with the values needed to draw the regression line as a series plot. So, you could use ODS Output to get this computed data set, remove the points of the fit line for x > 50, and then use a SERIES statement to draw the truncated fit line.

Note, this idea works with degree=2, as many points are computed. With default degree=1, only the two end points are computed, so that would get tricky. Here is a program using class data to illustrate the idea:

ods output sgplot=Reg;

proc sgplot data=sashelp.class;

reg x=height y=weight / group=sex degree=2;

run;

proc print data=reg;run;

data reg2;

set reg;

y2=REGRESSION_HEIGHT_WEIGHT_GROU__Y;

if REGRESSION_HEIGHT_WEIGHT_GROU__X >65 then y2=.;

run;

proc print data=reg2;run;

proc sgplot data=reg2;

scatter x=height y=weight / group=sex;

series x=REGRESSION_HEIGHT_WEIGHT_GROU__X y=y2 /

group=REGRESSION_HEIGHT_WEIGHT_GROU_GP;

run;


12-06-2012 09:58 AM

I think Sanjay has the right idea, but you should do the analysis in PROC GLM and then plot the predicted curves overlaid on the data.

Here is a rough outline:

1) Find the global max and min of the x variable. Save those values in macro variables. For example:

proc sql noprint;

select min(x) into :min from MyData;

select max(x) into :max from MyData;

quit;

2) Construct a new data set, A, that consists of 2 obs for each level of the GROUP variable. For example:

data A;

Group=1;

x=&min; y=.; output;

x=&max; y=.; output;

Group=2;

x=&min; y=.; output;

x=&max; y=.; output;

...

run;

You probably want to program this step.
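One way to program step 2, as a sketch: it assumes the grouping variable in MyData is named Group, that the macro variables &min and &max already exist from step 1, and it introduces a helper data set named Levels that is not part of the outline above.

```sas
/* Sketch only: assumes MyData has variables Group, x, y and that */
/* &min and &max were created in step 1. Levels is a made-up name. */
proc sql;
  create table Levels as
  select distinct Group from MyData;   /* one row per group level */
quit;

data A;
  set Levels;
  y = .;                 /* missing response: not used in estimation */
  x = &min; output;      /* left end point for this group            */
  x = &max; output;      /* right end point for this group           */
run;
```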

3) Concatenate the original data and A. Call the new data set B. Run PROC GLM on B and use the OUTPUT statement to get linear predictions. BECAUSE THE Y VARIABLE IS MISSING, the observations from A are not used in the model estimation, but they DO receive predicted values.

4) Use PROC SGPLOT to plot a scatter plot of the observations and a series plot of the predicted values. The scatter plot does not contain any points from A because the Y value is missing. The series plot contains all points from A because the predicted values are nonmissing.
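Steps 3 and 4 might look like the following sketch. The names B, Pred and yhat are assumptions, as is the variable name Group. The model with Group, x and Group*x gives every group its own intercept and slope, which is one way to get a separate line per group; running the regression with a BY statement would be another.

```sas
/* Sketch only: MyData and A as in steps 1-2; B, Pred, yhat are made-up names. */
data B;
  set MyData A;          /* original data plus the two end points per group */
run;

proc glm data=B noprint;
  class Group;
  model y = Group x Group*x;       /* separate intercept and slope per group */
  output out=Pred predicted=yhat;  /* rows from A get predictions too        */
run;
quit;

proc sort data=Pred;
  by Group x;            /* series plots connect points in data order */
run;

proc sgplot data=Pred;
  scatter x=x y=y / group=Group;      /* rows from A drop out: y is missing */
  series  x=x y=yhat / group=Group;   /* each line spans &min to &max       */
run;
```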


12-06-2012 02:44 PM

Thanks everyone for your great suggestions.

I ended up using GTL and drawing the lines from the intercept and slope. I followed this logic:

- Calculated the slope and intercept by group. This gave me 50 records for my 50 experiments

- Combined (SET) the data set containing all the data with the data set containing the slopes and intercepts

- Created a graph template to plot the data and draw the regression lines using the intercept and slope

Thanks again
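The poster did not share the code, so the following is only a guess at how such a template could look, using the names from the first post (listall, var1, var2, t). The data set names est and plot, the template name fitlines, and the use of PROC REG with OUTEST= to get the coefficients are all assumptions. LINEPARM draws each line across the whole plot area, so every line ends at the same x.

```sas
/* Sketch only: est, plot and fitlines are made-up names. */
proc sort data=listall;
  by t;
run;

proc reg data=listall outest=est noprint;
  by t;
  model var2 = var1;     /* OUTEST= stores the slope in a variable named var1 */
run;
quit;

data plot;
  /* keep only the coefficients; rename the slope so it does not clash with x */
  set listall est(keep=t Intercept var1 rename=(var1=slope));
run;

proc template;
  define statgraph fitlines;
    begingraph;
      layout overlay;
        scatterplot x=var1 y=var2 / group=t;
        /* line through (0, Intercept) with the fitted slope, one per group;  */
        /* clip=true is meant to keep that anchor point out of the axis range */
        lineparm x=0 y=Intercept slope=slope / group=t clip=true;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=plot template=fitlines;
run;
```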


12-06-2012 06:24 PM

If you compute the fit parameters yourself (a point (x, y) and the slope), you could just as easily compute a second point for each group and use the SGPLOT SERIES statement to get the same result. You may have more options with a SERIES plot than with LINEPARM.
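A sketch of that SERIES alternative, again with the names from the first post (listall, var1, var2, t). The data sets est, lines and plot, the variables xline and yline, and the macro variables xmin and xmax are made up for illustration, and PROC REG with OUTEST= is just one assumed way to obtain the coefficients.

```sas
/* Sketch only: est, lines, plot, xline, yline, xmin, xmax are made-up names. */
proc sort data=listall;
  by t;
run;

proc reg data=listall outest=est noprint;
  by t;
  model var2 = var1;     /* slope is stored in a variable named var1 */
run;
quit;

proc sql noprint;
  select min(var1), max(var1) into :xmin, :xmax from listall;
quit;

data lines;
  set est(keep=t Intercept var1 rename=(var1=slope));
  do xline = &xmin, &xmax;              /* same two x values for every group */
    yline = Intercept + slope * xline;
    output;
  end;
  keep t xline yline;
run;

data plot;
  set listall lines;
run;

proc sgplot data=plot;
  scatter x=var1 y=var2 / group=t;
  series  x=xline y=yline / group=t;    /* every line runs from xmin to xmax */
run;
```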