SAS Code list and explanations

12-11-2016 11:03 AM

I am learning this program for the first time for one of my biology classes. My professor hasn't really explained much to us. He just puts the code into SAS and the whole class rushes to scribble it down. Needless to say, I'm completely lost. I'm not sure what each of these codes is for or whether I'm even using them correctly. Is there a current list of SAS codes for biology, with explanations of why we use each specific test? I have a short list of tests he mentioned in passing during class.

Tests:

One Sample T-test

2 Sample T-test

Paired T-test

Wilcoxon Test

ANOVA 1 and 2 way

Factorial ANOVA

Mann-Whitney U test

Chi-squared

Fisher's Exact

Tukey HSD and LSD

Regression

Correlation - Spearman

I've been looking through the SAS help website and I'm still not finding all the answers I need. I have this final exam on Wednesday. If anyone could point me in a direction or knows the answer, it would be appreciated. Thank you!

Posted in reply to dleighton

12-11-2016 12:08 PM

You can find at least some of these references in:

These aren't specific to biology at all, but each stat listed does have a link to the SAS procedure or technique you can use to calculate it.

Posted in reply to dleighton

12-11-2016 12:49 PM

If you line up each test with a SAS Procedure you could read the overview section and put together a list of definitions.

I would recommend the STATSOFT textbook though, to quickly get an idea of what each test is. Have you taken a statistics course at university? A first-level stats course covers most of what you're mentioning.

Most are under Basic and Descriptive statistics but you can also look them up in the Glossary.

http://www.statsoft.com/Textbook/Basic-Statistics

This is a lot to learn for Wednesday...Good Luck

Posted in reply to Reeza

12-11-2016 12:58 PM

I did take stats at my university, but our professor pushed more for Excel or doing the problems by hand. I have a basic foundation of terms and procedures. The SAS system is completely new to me, and I'm not quite sure why some of the coding is needed.

Thank you for all the responses with the links. I am looking through them now. If there was a better background from my professor I might understand the output a little better.

Posted in reply to dleighton

12-11-2016 01:03 PM

Hi:

The Statistics 1 e-learning class is free as self-paced e-learning for any adult learner. You can activate the class by setting up a SAS Profile and clicking the Start button on this page: https://support.sas.com/edu/elearning.html?ctry=us&productType=library and the first lesson has an overview of statistics.

cynthia

Posted in reply to dleighton

12-11-2016 02:21 PM

Another way to do this is to look at what PROCs your code contains and review the documentation for those procedures. Specifically, walking through the examples is a good idea. I've tried to identify the relevant tests for you below, but reviewing the code samples is a better bet.

Tests:

PROC TTEST
- One Sample T-test
- 2 Sample T-test
- Paired T-test

PROC ANOVA
- ANOVA 1 and 2 way
- Factorial ANOVA

PROC NPAR1WAY
- Mann-Whitney U test

PROC FREQ
- Chi-squared
- Fisher's Exact

Not sure??
- Tukey HSD and LSD

PROC REG/GLM?
- Regression

PROC CORR
- Correlation - Spearman
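
To make the mapping above concrete, here is a minimal sketch of one call per test. It is only an illustration, not the professor's code: it assumes the SASHELP.CLASS and SASHELP.CARS sample data sets that ship with SAS, plus a small made-up WORK.PAIRS data set for the paired test, and the null value of 60 is arbitrary. As far as I know, Tukey HSD and LSD comparisons are requested as options on the MEANS statement of PROC ANOVA or PROC GLM. PROC ANOVA is intended for balanced designs, and SASHELP.CARS is not balanced, so the sketch uses PROC GLM for the ANOVA examples.

```sas
/* Hypothetical before/after measurements for the paired test */
data work.pairs;
  input before after;
  datalines;
12.1 13.4
11.8 12.9
13.0 13.1
12.4 13.8
11.5 12.2
;
run;

/* PROC TTEST: one-sample (H0: mean height = 60, an arbitrary value) */
proc ttest data=sashelp.class h0=60;
  var height;
run;

/* PROC TTEST: two-sample, comparing the two levels of SEX */
proc ttest data=sashelp.class;
  class sex;
  var height;
run;

/* PROC TTEST: paired, using the two columns of the made-up data */
proc ttest data=work.pairs;
  paired before*after;
run;

/* One-way ANOVA with Tukey HSD and LSD comparisons on the MEANS statement */
proc glm data=sashelp.cars;
  class origin;
  model mpg_city = origin;
  means origin / tukey lsd;
run; quit;

/* Two-way factorial ANOVA: two main effects plus their interaction */
proc glm data=sashelp.cars;
  class origin drivetrain;
  model mpg_city = origin drivetrain origin*drivetrain;
run; quit;

/* PROC NPAR1WAY: Wilcoxon rank-sum (Mann-Whitney U) test */
proc npar1way data=sashelp.class wilcoxon;
  class sex;
  var height;
run;

/* PROC FREQ: chi-squared and Fisher's exact test on a two-way table */
proc freq data=sashelp.class;
  tables sex*age / chisq fisher;
run;

/* PROC REG: simple linear regression */
proc reg data=sashelp.class;
  model weight = height;
run; quit;

/* PROC CORR: Spearman rank correlation */
proc corr data=sashelp.class spearman;
  var height weight;
run;
```

With only 19 rows in SASHELP.CLASS, the chi-squared output will likely warn about small expected cell counts, which is the usual reason to look at Fisher's exact test instead.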

Posted in reply to Reeza

12-11-2016 02:41 PM

That is extremely helpful! In the part of the code that our professor posted it included:

Data assumps;

set assumps;

absresid = abs(resid);

Run;

Proc plot data = assumps vpercent = 50 50;

plot resid * pred

absresid * pred / VREF = 0

Quit;

proc univariate normal plot data= assumps;

var resid;

quit;

This all follows a PROC GLM step that includes the statement [output out = assumps p=pred r=resid;]. I know the PROC UNIVARIATE function is supposed to help determine normality, but I'm not sure what these functions actually do. They are nowhere in my notes and just appeared one day in the SAS code he brought up so we could copy. Are these codes supposed to be included in every GLM function?

Posted in reply to dleighton

12-11-2016 04:23 PM

Learning to use the documentation is worth some time, in general when learning the language.

PROC GLM - OUTPUT statement.

OUT=<SAS-data-set> -> This is the name of the data set created.

P(PREDICTED) = <variable-name> -> name of the variable created that corresponds to the predicted value

R(RESIDUAL) = <variable-name> -> name of the variable created that corresponds to the residual value

So the GLM procedure creates a dataset called ASSUMPS that contains at least two variables, one for the predicted value and one for the residual value of the model.
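
As a rough sketch of where ASSUMPS comes from, the step below fits a model and writes the predicted and residual values to a new data set. The professor's actual MODEL statement isn't shown in this thread, so the SASHELP.CLASS data and the weight = height age model are assumptions used only for illustration. The OUTPUT statement itself is the one quoted in the question.

```sas
/* Fit the model and save predictions and residuals.          */
/* Data set and model are illustrative assumptions.           */
proc glm data=sashelp.class;
  model weight = height age;
  output out=assumps p=pred r=resid;  /* PRED and RESID are new variables */
run; quit;

/* Look at the first few rows: original variables plus PRED and RESID */
proc print data=assumps(obs=5);
  var name weight height age pred resid;
run;
```

ASSUMPS keeps every variable from the input data and adds the two new ones, which is why the later steps can refer to RESID and PRED directly.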

A data step can be used for a variety of things, in this case it's creating a variable using the ABS() function.

ABS() -> Returns the absolute value.

So in this step you're creating a variable called ABSRESID that contains the absolute value of the residual.

```
Data assumps;
set assumps;
absresid = abs(resid);
Run;
```

Are you sure it was PROC PLOT and not SGPLOT? I can't find a reference to PROC PLOT but believe it was used back in the day.

This procedure is creating two graphs, one for residual by predicted and one for the absolute value of the residual by the predicted value.

It's also missing a semi-colon.

```
Proc plot data = assumps vpercent = 50 50;
plot resid * pred absresid * pred / VREF = 0;
Quit;
```
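
For what it's worth, PROC PLOT is a Base SAS procedure that draws text-based (line-printer) plots. If you would rather have graphical output, a roughly equivalent sketch with PROC SGPLOT is below. It assumes ASSUMPS already contains PRED, RESID, and ABSRESID from the earlier steps, and it draws the zero reference line only on the residual plot.

```sas
/* Residuals against predicted values, with a horizontal line at zero */
proc sgplot data=assumps;
  scatter x=pred y=resid;
  refline 0 / axis=y;
run;

/* Absolute residuals against predicted values */
proc sgplot data=assumps;
  scatter x=pred y=absresid;
run;
```

In both plots you are looking for a shapeless cloud of points. A funnel or curve in either one suggests the equal-variance or linearity assumption may not hold.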

The next step, PROC UNIVARIATE, checks whether the variable RESID is normally distributed. This is one of the assumptions of linear regression: the errors are normally distributed.

```
proc univariate normal plot data=assumps;
  var resid;
run; /* PROC UNIVARIATE is not an interactive procedure, so RUN ends the step */
```

If you need to do this in real life, I would consider the following. It produces diagnostic and residual plots that help assess both normality and any issues with your regression, e.g. outliers.

```
proc glm data=sashelp.class plots=(diagnostics residuals);
model weight = height age;
run;quit;
```

If you want help understanding the output you can look at either the examples in the PROC or at this page, which has annotated output for many common procs you're looking at, based on your questions.

Posted in reply to dleighton

12-12-2016 10:47 AM

A warning about coding following someone else's examples:

dleighton wrote:

That is extremely helpful! In the part of the code that our professor posted it included:

Data assumps;

set assumps;

absresid = abs(resid);

Run;

Please be **extremely cautious** with the code structure Data datasetname; Set datasetname; I would recommend almost never using it.

The step replaces the original data. Depending on possible code logic errors (high probability when learning a programming language) or typos, it is very easy to destroy your original data. If you use this approach, you should make sure that you can always get back to the original data. One example from code I inherited involved recoding data from a 1 and 2 value coding to a 0 and 1.

data example; set example; var = var -1; /* other code was here*/ run;

Looks simple and much like your professor's example. However the code had a change needed in the "other code" indicated above. So the change was made and rerun. Now the values that had been 2, after the FIRST pass were reduced to 1 and with the second pass further reduced to 0. So all values of var became either missing or 0.

You have been warned. Less obvious is the case where the code switches 0 to 1 and 1 to 0 (which I also inherited). That resulted in the rate of a reportable item changing from 56 percent to 44 percent, and everyone in the organization thought there had been a drastic change from the previous year's 55 percent.
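
A minimal sketch of a safer pattern, assuming made-up data set and variable names (WORK.EXAMPLE_RAW, WORK.EXAMPLE_RECODED, CODE, CODE01): read from the original data set, write to a new one, and put the recoded value in a new variable.

```sas
/* Hypothetical raw data coded 1/2 */
data work.example_raw;
  input id code;
  datalines;
1 1
2 2
3 1
4 2
;
run;

/* Input and output data sets differ, and the original variable is kept. */
/* Rerunning this step always starts from the untouched raw values.      */
data work.example_recoded;
  set work.example_raw;
  code01 = code - 1;
run;
```

Because the raw data set is never overwritten, running the second step again after editing it gives the same recoded values instead of subtracting 1 a second time.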