turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- BI
- /
- Enterprise Guide
- /
- Slope Calculation

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

06-08-2012 04:24 PM

Attached file is a small subset of my large data file. It is the data for one employee over months.

Column A = Month the work was completed

Column B = An integer assigned to the month - this is used to calculate the slope via excel

Column C = Employee #.

Column L = The slope I calculated in Excel. The formula is: =SLOPE(IF($C$2:$C$25000=C2,$M$2:$M$25000),IF($C$2:$C$25000=C2,$B$2:$B$25000))

As I mentioned the excel file is quite large. Thus, it is a very slow to try to calculate the slope with these many data rows and I hope to be able to do this in SAS EG. I am looking for help / direction on how to calculate the SLOPE in SAS EG.

There are up to 12-months worth of data for each employee. I would like to be able to write a program that says, for each employee_ID, take the monthly_weighted_approval_rate for all the months the employee worked and calculate the slope, and, if possible, the R-Square.

I researched the forum and found PROC REG and PROC GLM, but frankly I do not know where to how to start with these.

Any help is appreciated!

Thanks,

Jenne

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to jen123

06-09-2012 04:41 PM

Your workbook has only 70 lines of non missing data. This is how to do the regressions with proc reg :

**libname xl excel "&SASForum.\datasets\slope.xls";**

**/* Keep a single value of Monthly_Weighted_Approval per month */****proc sql;****create table test as****select Emp_ID, Event_month, mean(Monthly_Weighted_Approval) as MWA****from xl.'sheet2$'n****where Monthly_Weighted_Approval is not missing****group by Emp_ID, Event_month;****quit;**

**libname xl clear;**

**/* Do the regression, request output statistics in PE and FS datasets */****proc reg data=test;****by Emp_ID notsorted;****model MWA = Event_Month;****ods output parameterEstimates=PE FitStatistics=FS;****run;**

**/* Join statistics into a table. **** Multiply the slope by 30 to express it per month */****proc sql;****create table stats as****select PE.Emp_ID, PE.Estimate*30 as slope, PE.ProbT, FS.nValue2 as AdjRSquare****from PE, FS****where pe.Emp_ID=fs.Emp_ID and PE.Variable="Event_Month" and FS.Label2="Adj R-Sq";****select * from stats;****quit;**

**PG**

Corrected by PG (replaced divide by multiply)

PG

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to PGStats

06-12-2012 03:21 PM

Thank you PGStats! Your response is very helpful and gives me the other stats I need.

However, when ran, I am not getting the same answer had I calculated manually via excel formula.

1. The slope in table STATS is different compared to the result using the excel formula: =SLOPE(IF($C$2:C$45=C2,$L$2:$L$45),IF($C$2:$C$45=C2,$B$2:$B$45))

- Using the Excel formula, slope = -0.0185881530393469
- Using the code above, the slope = -0.000023351791:

OPR_ID | slope | Probt | AdjRSquare |

abc123 | -0.000023351791 | 0.04577284 | 0.4313875 |

2. "Divide the slope by 30 to express it per month" - Why is this done?

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to jen123

06-12-2012 04:16 PM

Oops, major oops! Sorry. It should be MULTIPLIED by 30. I already did the correction in the code pasted above.

Variable Event_month is imported as a SAS date which is expressed in days. The slope from the MWA = Event_Month regression is thus expressed in MWA per day. MWA per month is thus 30 times as much. You shouldn't expect the slopes to be exactly the same as your Excel regression is on a varying number of observations every month. In my suggested code, only one value is kept per month. The two slopes should be close though.

PG

PG

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to PGStats

06-13-2012 08:18 AM

Hi PG - This worked! Thank you! Few more follow up questions:

1. What is Adjusted R-Square?

2. What is the code to calculate R-Sqare instead of AR2?

3. I got this warning when I ran the /* Do the regression, request output statistics in PE and FS datasets */ portion of the code. Thoughts?

WARNING: The range of variable Event_Month is so small relative to its mean that there may be loss of accuracy in the computations.

You may need to rescale the variable to have a larger value of RANGE/abs(MEAN), for example, by using PROC STANDARD M=0;

Thank you again!!!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to jen123

06-13-2012 09:40 AM

Jen,

Adjusted R-Squared is the R-squared penalized for the number of variable in the model. When you have just one predictor, I recall it as being the same as R-Squared.

The formula's for both are in the PROC REG documentation on support.sas.com.

BTW, since you posted this to the EGuide forum, you can do the pieces that PG described using the EGuide GUI as well as with code.

Doc Muhlbaier

Duke

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Doc_Duke

06-13-2012 09:58 AM

Hi Duke,

Thanks for the direction to the support.sas.com documentation.

Using the code that PG described, I was able to get a slope very close to the one calculated via excel. However, the R-squared I got from the Linear Regression via excel and the Adjusted R-Squared from PG's code are very different. I will see if I can find the code in the documentation and then do a comparison.

I haven't tried the process via the EG GUI interface yet. My project process is via the GUI except the slope calculation. I am on a time crunch to get this done for work, but will try it via the GUI later in the week and let you know.

Thanks!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to jen123

06-13-2012 10:04 AM

Hi Duke - Did I mess up and should have posted my question to a different forum?

Thanks!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to jen123

06-13-2012 10:47 AM

Jen,

On the forum question, your question fit here or it could have fit on the Stat Procedures forum. I made the side comment because many analysts (as opposed to stat programmers) need to use the GUI because they aren't very knowledgeable coders. EGuide is getting more "programmer friendly," though you will see from some other posts that many think it has a ways to go.

Doc

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to jen123

06-13-2012 05:49 PM

I guess question 1 was already answered. As for question 2 and 3, the following version including calculations with and without duplicate

monthly weighted approval rates should answer :

**libname xl excel "&SASForum.\datasets\slope.xls";**

**/* Flag the first observation of every month (variable first). ****Express the month as the number of months since the beginning of the century to avoid****loss of accuracy */ **** **

**data test(keep=Emp_ID month Monthly_Weighted_Approval first);****set xl.'sheet2$'n;****by Event_month notsorted;****if Monthly_Weighted_Approval ne .;****first = first.Event_month;****month = intck("MONTH", '01JAN2000'd, Event_Month);****run;**

** **

**libname xl clear;**

** **

**/* Do the regression on a single observation per month, ****request output statistics in PEfirst and FSfirst datasets */****proc reg data=test(where=(first)) plots=none;****by Emp_ID notsorted;****model Monthly_Weighted_Approval = month;****ods output parameterEstimates=PEfirst FitStatistics=FSfirst;****run;**

** **

**/* Do the regression on all duplicate observations, ****request output statistics in PEall and FSall datasets */****proc reg data=test plots=none;****by Emp_ID notsorted;****model Monthly_Weighted_Approval = month;****ods output parameterEstimates=PEall FitStatistics=FSall;****run;**

** **

**/* Join statistics into tables. */****proc sql;****create table statsFirst as****select PE.Emp_ID, PE.Estimate as slope, PE.ProbT, FS.nValue2 as RSquare****from PEfirst as PE, FSfirst as FS****where pe.Emp_ID=fs.Emp_ID and PE.Variable="Month" and FS.Label2="R-Square";****create table statsAll as****select PE.Emp_ID, PE.Estimate as slope, PE.ProbT, FS.nValue2 as RSquare****from PEall as PE, FSall as FS****where pe.Emp_ID=fs.Emp_ID and PE.Variable="Month" and FS.Label2="R-Square";**** **

**title "One observation per month";****select * from statsFirst;****title "All duplicate observations";****select * from statsAll;****quit;**

** **

**PG**

PG