Creating a new variable from existing

03-21-2014 07:16 PM

I feel bad for having to ask for help, since I've been trying to teach myself SAS 9.3, but I have run into a syntax problem I cannot solve alone. I am trying to obtain all values of X where X < X - STD(X) without calculating STD(X) separately and then returning all values below that constant, but I think it is more complicated than I have figured out so far. A simplified version of the code I am trying is shown below. Thank you to anyone who helps; I greatly appreciate it.

data dataset1;

set dataset;

X1 = X < X-STD(X);

run;

The current error I am getting is that STD does not have enough arguments.

Accepted Solutions


03-21-2014 07:43 PM

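The STD function works across the arguments you give it within a single row of data, since the DATA step processes data row by row. It needs at least two arguments, which is why STD(X) produces the "not enough arguments" error.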

You'll have to pre-calculate the STD and merge it in.

Also, the expression X1 = X < X - STD(X) probably isn't what you want, because it can never be true.

Subtracting x from both sides, x < x - std(x) reduces to 0 < -std(x), and a standard deviation is never negative.

In SAS, subtraction is evaluated before the comparison, so the expression is read as x < (x - std(x)), and X1 would be 0 for every row even if STD(X) were a valid call.

I thought I'd just explain why what you're trying won't work.
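
To illustrate the row-by-row behaviour described above, here is a minimal sketch. The dataset name `scores` and the variables `t1`, `t2`, `t3` are made up for illustration and are not from the original question.

```sas
/* Hypothetical data: three scores per row */
data scores;
  input t1 t2 t3;
  datalines;
10 12 14
20 20 20
5 9 13
;
run;

/* In a DATA step, STD() is computed ACROSS its arguments, one row at a time */
data rowwise;
  set scores;
  row_std = std(t1, t2, t3);
run;

/* For the standard deviation DOWN a column, use a procedure instead */
proc means data=scores std;
  var t1;
run;
```

Here `row_std` is a per-row value (2, 0 and 4 for the three rows), whereas PROC MEANS reports a single standard deviation for the whole `t1` column.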

All Replies


03-21-2014 08:37 PM

Thank you very much. I was hoping there was a way to calculate it within SAS to save time, but I guess not.


03-22-2014 12:48 PM

My point is that your condition, as written, doesn't make sense: it would never evaluate to true. It's very easy to calculate the standard deviation within SAS.

proc sql;

create table want as

select a.*, std(weight) as std_weight, std(height) as std_height

from sashelp.class as a;

quit;
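
Building on the query above, one possible way to keep only the rows that fall more than one standard deviation below the mean is to let PROC SQL remerge the summary statistics through a HAVING clause. This is only a sketch; the table name `low_weight` and the column names `mean_weight` and `std_weight` are made up for illustration.

```sas
proc sql;
  /* Summary functions without GROUP BY are computed over the whole table
     and remerged onto every row; SAS writes a NOTE about this in the log */
  create table low_weight as
  select name,
         weight,
         mean(weight) as mean_weight,
         std(weight)  as std_weight
  from sashelp.class
  having weight < mean(weight) - std(weight);
quit;
```

The same pattern with `weight > mean(weight) + std(weight)` would select the high values instead.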


03-22-2014 02:06 PM

You might want to look at using SAS/IML (if you have it licensed) if you want to treat your data as if it was a matrix instead of individual observations.

SAS has many tools for calculating statistics like STD. For example you can use PROC SUMMARY. Or you can roll your own using PROC SQL.

But perhaps what you want is already available in a PROC? Did you look at PROC STDIZE?
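
As a rough sketch of the PROC STDIZE idea, assuming SAS/STAT is licensed: standardizing with METHOD=STD subtracts the mean and divides by the standard deviation, so a standardized value below -1 corresponds to "more than one standard deviation below the mean". The names `class_z`, `z_weight` and `flagged` are made up for illustration.

```sas
/* Copy the variable so the original values are kept alongside the z-score */
data class_z;
  set sashelp.class;
  z_weight = weight;
run;

/* METHOD=STD: location = mean, scale = standard deviation */
proc stdize data=class_z out=class_z method=std;
  var z_weight;
run;

/* Flag values more than one standard deviation from the mean */
data flagged;
  set class_z;
  lowvalue = z_weight < -1;
  hivalue  = z_weight > 1;
run;
```

Note that a missing value compares as smaller than any number in SAS, so data with missing values would need an extra check before setting `lowvalue`.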


03-22-2014 02:22 PM

It looks like you want to flag the outliers. Here is how you could write a program to do that, using PROC SUMMARY to find the mean and standard deviation of your variable.

* generate some random data for testing ;

data dataset ;

do _n_=1 to 10 ;

x=rand('normal',5,10);

output;

end;

run;

* Find the mean and standard deviation ;

proc summary data=dataset ;

var x;

output out=means(drop=_freq_ _type_) mean= std= /autoname ;

run;

* Combine and create new LOWVALUE and HIVALUE boolean flag variables ;

data want ;

set dataset ;

if _n_=1 then set means ;

lowvalue = x < x_mean - x_stddev ;

hivalue = x > x_mean + x_stddev ;

run;

Obs          x     x_Mean    x_StdDev    lowvalue    hivalue

  1    15.2342    6.79499     10.7221           0          0
  2    26.3757    6.79499     10.7221           0          1
  3     5.6843    6.79499     10.7221           0          0
  4    -3.9084    6.79499     10.7221           0          0
  5     8.8976    6.79499     10.7221           0          0
  6    -8.4245    6.79499     10.7221           1          0
  7     0.2006    6.79499     10.7221           0          0
  8    16.1158    6.79499     10.7221           0          0
  9    10.2483    6.79499     10.7221           0          0
 10    -2.4737    6.79499     10.7221           0          0
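
If the goal is to return only the low values rather than flag them, the same approach can be turned into a filter. This sketch assumes the `dataset` and `means` datasets created above already exist; the output name `low_only` is made up for illustration.

```sas
data low_only;
  set dataset;
  /* Read the one-row MEANS dataset once; its values are retained for every row */
  if _n_ = 1 then set means;
  /* Subsetting IF: keep only rows more than one standard deviation below the mean */
  if x < x_mean - x_stddev;
run;
```

With the sample output shown above, only observation 6 would be kept.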