sgplot with multiple regression lines
Posted 12-05-2012 07:48 PM
(5361 views)

Hi,

I am trying to create a plot for my data, but I am struggling to get the expected output. PROC SGPLOT produces a graph with 50 regression lines, one for each of the 50 groups in my data set. My problem is that the lines do not always extend to the end of the X axis. I would like all the lines to stop at the same x. I think the issue is that max(x) differs from group to group: sometimes max(x)=50, sometimes max(x)=80, and so on.

Is there any way to have SAS draw the regression line over a specific interval? I thought about drawing a line using the intercept and slope, but I am not sure how to do it.

proc sgplot data=listall;
  reg x=var1 y=var2 / NOMARKERS group=t;
run;

Thanks a lot for any help

6 REPLIES

SGPLOT will always use all the data provided to the regression statement to compute the fit line. There is no way to draw only part of the fit line. One possibility would be to compute a new Y2 column that has missing values for x>50. Then, provide Y2 to the reg statement, while you still provide the original Y column to the scatter plot. However, this may not be correct since the new regression line will not be the same as the one with full data.

Now, SGPLOT computes a new data set with the values needed to draw the regression line as a series plot. So, you could use ODS Output to get this computed data set, remove the points of the fit line for x > 50, and then use a SERIES statement to draw the truncated fit line.

Note that this idea works with degree=2 because many points are computed along the curve. With the default degree=1, only the two end points are computed, so truncating the line would be trickier. Here is a program using the class data to illustrate the idea:

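/* Capture the data set that SGPLOT computes to draw the fit lines */
ods output sgplot=Reg;
proc sgplot data=sashelp.class;
  reg x=height y=weight / group=sex degree=2;
run;

proc print data=Reg; run;

/* Set the fitted Y to missing beyond the cutoff (65 here) to truncate the curve */
data reg2;
  set Reg;
  y2 = REGRESSION_HEIGHT_WEIGHT_GROU__Y;
  if REGRESSION_HEIGHT_WEIGHT_GROU__X > 65 then y2 = .;
run;

proc print data=reg2; run;

/* Overlay the truncated fit lines on the original points */
proc sgplot data=reg2;
  scatter x=height y=weight / group=sex;
  series x=REGRESSION_HEIGHT_WEIGHT_GROU__X y=y2 /
         group=REGRESSION_HEIGHT_WEIGHT_GROU_GP;
run;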


I think Sanjay has the right idea, but you should do the analysis in PROC GLM and then plot the predicted curves overlaid on the data.

Here is a rough outline:

1) Find the global max and min of the x variable. Save those values in macro variables. For example:

proc sql noprint;
  select min(x) into :min from MyData;
  select max(x) into :max from MyData;
quit;

2) Construct a new data set, A, that consists of 2 obs for each level of the GROUP variable. For example:

data A;
  Group=1;
  x=&min; y=.; output;
  x=&max; y=.; output;
  Group=2;
  x=&min; y=.; output;
  x=&max; y=.; output;
  ...
run;

You probably want to program this step.
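One way to program this step is sketched below. It assumes the grouping variable is named Group, as in the example above, and that the macro variables &min and &max from step 1 already exist. The data set name Levels is made up for illustration.

```sas
/* One row per distinct level of the grouping variable */
proc sort data=MyData(keep=Group) out=Levels nodupkey;
  by Group;
run;

/* Two scoring rows per level: the global min and max of x, with y missing */
data A;
  set Levels;
  y = .;
  x = &min; output;
  x = &max; output;
run;
```

Because A is built from the levels found in the data, it does not need to be edited when groups are added or removed.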

3) Concatenate the original data and A. Call the new data set B. Run PROC GLM on B and use the OUTPUT statement to get linear predictions. BECAUSE THE Y VARIABLE IS MISSING, the observations from A are not used in the model estimation, but they DO receive predicted values.

4) Use PROC SGPLOT to plot a scatter plot of the observations and a series plot of the predicted values. The scatter plot does not contain any points from A because the Y value is missing. The series plot contains all points from A because the predicted values are nonmissing.
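Steps 3 and 4 might look roughly like the following. This is a sketch, not tested against the original poster's data: it assumes the variables are named x, y, and Group as in the earlier steps, and the names B, Pred, and yhat are hypothetical. It fits a separate line per group with a BY statement, which is one way to mirror what the grouped REG statement does.

```sas
/* Step 3: append the scoring rows and sort for BY-group processing */
data B;
  set MyData A;
run;

proc sort data=B;
  by Group x;
run;

/* Rows with missing y are excluded from the fit but still get predictions */
proc glm data=B noprint;
  by Group;
  model y = x;
  output out=Pred predicted=yhat;
run;
quit;

/* Step 4: observed points plus predicted lines spanning &min to &max */
proc sgplot data=Pred;
  scatter x=x y=y / group=Group;
  series  x=x y=yhat / group=Group;
run;
```

Sorting by x within each group keeps the SERIES statement from connecting the points out of order.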


Thanks everyone for your great suggestions.

I ended up using GTL and drawing the lines using the intercept and slope. I followed this logic:

- Calculated the slope and intercept by group. This gave me 50 records for my 50 experiments

- Combined the data set containing all the data with the data set containing the slope and intercept

- Created a graph template to plot the data and draw the regression line using the intercept and slope
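For readers who want to try the same approach, here is a minimal sketch of those three steps. The poster did not share code, so everything here is an assumption: PROC REG with OUTEST= is used to get the estimates, the GTL LINEPARM statement is used to draw the lines, and the names Est, PlotData, X0, Slope, and RegLines are hypothetical. The data set and variable names (listall, var1, var2, t) come from the original question.

```sas
/* 1. Slope and intercept per group: one row per level of t */
proc sort data=listall;
  by t;
run;

proc reg data=listall outest=Est noprint;
  by t;
  model var2 = var1;
run;
quit;

/* 2. Stack the estimates under the raw data. In OUTEST=, the slope is
      stored in a column named after the regressor, so rename it. */
data PlotData;
  set listall
      Est(keep=t Intercept var1 rename=(var1=Slope));
  if not missing(Slope) then X0 = 0;  /* each line passes through (0, Intercept) */
run;

/* 3. Template: scatter of the data plus one parametric line per group */
proc template;
  define statgraph RegLines;
    begingraph;
      layout overlay;
        scatterplot x=var1 y=var2 / group=t;
        lineparm x=X0 y=Intercept slope=Slope / group=t;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=PlotData template=RegLines;
run;
```

A line drawn from a point and a slope is not tied to each group's own range of x, so all the lines should span the same extent of the plot.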

Thanks again
