mrom34
Calcite | Level 5

Hi,

I am trying to create a plot for my data, but I am struggling to get the expected output. PROC SGPLOT produces a graph with 50 regression lines, one for each of the 50 groups in my data set. My problem is that the lines do not always extend to the end of the X axis. I would like all the lines to stop at the same x. I think the issue is that max(x) in my data set is different for each group: sometimes max(x)=50, sometimes max(x)=80, and so on.

Is there any way to have SAS draw the regression line over a specific interval? I thought about drawing a line using the intercept and slope, but I am not sure how to do it.

proc sgplot data=listall;
  reg x=var1 y=var2 / nomarkers group=t;
run;

Thanks a lot for any help

6 REPLIES
Linlin
Lapis Lazuli | Level 10

Could you first find the smallest of the per-group maximum x values, and then do something like if x>50 then x=50?
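
A minimal sketch of the first half of that suggestion, assuming the data set and variable names from the original post (listall, var1 as x, t as the group). The macro variable name common_max is made up for illustration.

```sas
/* Sketch: take the largest x within each group, then the smallest  */
/* of those per-group maxima. Names listall, var1, and t come from  */
/* the original post; common_max is a hypothetical macro variable.  */
proc sql noprint;
  select min(max_x) into :common_max trimmed
  from (select t, max(var1) as max_x
        from listall
        group by t);
quit;

%put NOTE: smallest per-group maximum of var1 = &common_max;
```

The value in &common_max is the largest x that every group reaches, so it is a natural common cutoff for the fit lines.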

mrom34
Calcite | Level 5

Thanks, I thought about it, but I am also plotting the data as a scatter plot. If I add a value to my data set, the plot will show a point that does not exist.

Jay54
Meteorite | Level 14

SGPLOT will always use all the data provided to the regression statement to compute the fit line.  There is no way to draw only part of the fit line.  One possibility would be to compute a new Y2 column that has missing values for x>50.  Then, provide Y2 to the reg statement, while you still provide the original Y column to the scatter plot.  However, this may not be correct since the new regression line will not be the same as the one with full data.

Now, SGPLOT computes a new data set with the values needed to draw the regression line as a series plot.  So, you could use ODS Output to get this computed data set, remove the points of the fit line for x > 50, and then use a SERIES statement to draw the truncated fit line.

Note, this idea works with degree=2, as many points are computed.  With default degree=1, only the two end points are computed, so that would get tricky.  Here is a program using class data to illustrate the idea:

ods output sgplot=Reg;
proc sgplot data=sashelp.class;
  reg x=height y=weight / group=sex degree=2;
  run;

proc print data=reg;run;

data reg2;
  set reg;
  y2=REGRESSION_HEIGHT_WEIGHT_GROU__Y;
  if REGRESSION_HEIGHT_WEIGHT_GROU__X >65 then y2=.;
  run;
proc print data=reg2;run;

proc sgplot data=reg2;
  scatter x=height y=weight / group=sex;
  series x=REGRESSION_HEIGHT_WEIGHT_GROU__X y=y2 /
         group=REGRESSION_HEIGHT_WEIGHT_GROU_GP;
  run;

Rick_SAS
SAS Super FREQ

I think Sanjay has the right idea, but you should do the analysis in PROC GLM and then plot the predicted curves overlaid on the data.

Here is a rough outline:

1) Find the global max and min of the x variable. Save those values in macro variables. For example:

proc sql noprint;
  select min(x) into :min from MyData;
  select max(x) into :max from MyData;
quit;

2) Construct a new data set, A,  that consists of 2 obs for each level of the GROUP variable. For example:

data A;
  Group=1;
  x=&min; y=.; output;
  x=&max; y=.; output;
  Group=2;
  x=&min; y=.; output;
  x=&max; y=.; output;
  ...
run;

You probably want to program this step.

3) Concatenate the original data and A.  Call the new data set B.  Run PROC GLM on B and use the OUTPUT statement to get linear predictions.  BECAUSE THE Y VARIABLE IS MISSING, the observations from A are not used in the model estimation, but they DO receive predicted values.

4) Use PROC SGPLOT to plot a scatter plot of the observations and a series plot of the predicted values.  The scatter plot does not contain any points from A because the Y value is missing. The series plot contains all points from A because the predicted values are nonmissing.
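
The four steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: it uses sashelp.class (height as x, weight as y, sex as the group) in place of the real data, and the data set names levels, A, B, and Pred are made up. The model with a group effect and a group-by-x interaction is one way to get a separate intercept and slope for each group from a single PROC GLM call.

```sas
/* Step 1: global min and max of the x variable (height).           */
proc sql noprint;
  select min(height), max(height)
    into :min trimmed, :max trimmed
  from sashelp.class;
quit;

/* Step 2: two scoring observations per group, with the response    */
/* (weight) missing. One row per group level is found first, so the */
/* groups do not have to be typed by hand.                          */
proc sort data=sashelp.class(keep=sex) out=levels nodupkey;
  by sex;
run;

data A;
  set levels;
  weight = .;
  height = &min; output;
  height = &max; output;
run;

/* Step 3: concatenate, then fit. Rows with missing weight are not  */
/* used in estimation but still receive predicted values.           */
data B;
  set sashelp.class(keep=sex height weight) A;
run;

proc glm data=B noprint;
  class sex;
  model weight = sex height sex*height;  /* separate line per group */
  output out=Pred predicted=pred;
run;
quit;

/* Step 4: sort so each series is drawn left to right, then overlay */
/* the observed points and the predicted lines.                     */
proc sort data=Pred;
  by sex height;
run;

proc sgplot data=Pred;
  scatter x=height y=weight / group=sex;
  series  x=height y=pred   / group=sex;
run;
```

Every fit line now runs from the global minimum to the global maximum of height. One caveat: macro variables store the min and max as formatted text, so for x values with many significant digits it may be safer to carry them in a data set instead.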

mrom34
Calcite | Level 5

Thanks everyone for your great suggestions.

I ended up using GTL and drawing lines using the intercept and slope. I followed this logic:

- Calculated the slope and intercept by group. This gave me 50 records for my 50 experiments.

- Set the data set with all the data together with the data set holding the slope and intercept

- Created a graph template to plot the data and draw the regression line using intercept and slope
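
For readers who want to try the same logic, here is a rough sketch of those three steps. It is not the poster's actual code: it assumes sashelp.class as stand-in data, PROC REG with OUTEST= as the way to get one intercept and slope per group, and made-up names (class, est, params, plotdata, fitlines, x0, y0). The anchor point x0=60 is an arbitrary height inside the range of the data, chosen so that the point given to LINEPARM does not stretch the axes.

```sas
/* Fit one line per group; OUTEST= keeps one row per BY group.      */
proc sort data=sashelp.class out=class;
  by sex;
run;

proc reg data=class outest=est noprint;
  by sex;
  model weight = height;
run;
quit;

/* In the OUTEST= data set, the column named after the regressor    */
/* (height) holds the slope. Rename it so that it does not collide  */
/* with the data column of the same name.                           */
data params;
  set est(keep=sex Intercept height rename=(height=slope));
  x0 = 60;                      /* arbitrary in-range anchor point  */
  y0 = Intercept + slope*x0;    /* point on the fitted line         */
  keep sex x0 y0 slope;
run;

/* "Set" the original data together with the parameter rows.        */
data plotdata;
  set class params;
run;

proc template;
  define statgraph fitlines;
    begingraph;
      layout overlay;
        scatterplot x=height y=weight / group=sex;
        lineparm x=x0 y=y0 slope=slope / group=sex;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=plotdata template=fitlines;
run;
```

Because LINEPARM draws each line across the whole plot area, all the lines end at the same x regardless of each group's own maximum.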

Thanks again

Jay54
Meteorite | Level 14

If you compute the fit parameters yourself (a point (x, y) and the slope), you could just as easily compute a second point for each group and use an SGPLOT SERIES statement to get the same result. You may have more options with a SERIES plot than with LINEPARM.
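
A sketch of that alternative, again on sashelp.class as stand-in data. The data set names (class, est, ends, plotdata) and the variables fx and fy are assumptions for illustration. The two end points of each line are placed at the global minimum and maximum of x.

```sas
proc sort data=sashelp.class out=class;
  by sex;
run;

/* One intercept and slope per group. */
proc reg data=class outest=est noprint;
  by sex;
  model weight = height;
run;
quit;

/* Common x range shared by every fit line. */
proc sql noprint;
  select min(height), max(height)
    into :xmin trimmed, :xmax trimmed
  from class;
quit;

/* Two end points per group, computed from intercept and slope. */
data ends;
  set est(keep=sex Intercept height rename=(height=slope));
  fx = &xmin; fy = Intercept + slope*fx; output;
  fx = &xmax; fy = Intercept + slope*fx; output;
  keep sex fx fy;
run;

data plotdata;
  set class ends;
run;

proc sgplot data=plotdata;
  scatter x=height y=weight / group=sex;
  series  x=fx y=fy / group=sex;
run;
```

With a SERIES plot the end points are ordinary data, so the lines can be cut at any x you choose (for example, a fixed value instead of &xmax) and styled with the usual line options.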
