10-18-2015 11:12 AM

Good day to you,

I am trying to trim a data set that is sorted in ascending order, but I have no idea how to remove k% of the data from the start and the end. Most of the posts I could find explain how to compute a trimmed mean, but I just want to trim the data set itself.

```
XN = DAT[POS1:ENDPOS];
CALL SORT(XN);
...
```

I am very new to SAS/IML and hope someone can help me out.

Thanks in advance!

Accepted Solutions

Solution

10-20-2015 03:25 AM

Posted in reply to vince_tsp

10-19-2015 06:25 AM

Trimming is the act of truncating the upper and lower tails of an empirical UNIVARIATE distribution, so we don't usually talk about trimming a data set; we talk about trimming a variable.

To trim a variable, look at pp. 2-3 of my 2010 SAS Global Forum paper, which has an algorithm for computing the trimmed mean and variance of every column in a matrix. You can modify it to extract the "middle" observations:

```
proc iml;
/* Assume v is a column vector with no missing values. Return the sorted
   elements that remain after trimming the d = ceil(prop*n) smallest and
   the d largest values. Choose prop so that 2*d < n (in particular,
   prop < 0.5); otherwise no observations remain. */
start TrimVec(v, prop);
   n = nrow(v);           /* number of rows */
   d = ceil(prop*n);      /* number of observations to trim from each tail */
   z = v;                 /* copy the vector so the caller's data is unchanged */
   call sort(z, 1);       /* sort the copy in ascending order */
   w = z[d+1:n-d, ];      /* keep the middle: drop d smallest and d largest */
   return (w);
finish;

use sashelp.cars;
read all var "mpg_city" into x;
close sashelp.cars;

trimX = TrimVec(x, 0.12);
```

The key is the statement

w = z[ d+1:n-d, ];

The expression d+1:n-d uses the index operator (:) to represent the observations to keep. You can use the same syntax to extract those rows from an entire matrix:

smallerMatrix = bigMatrix[ d+1:n-d, ];

See the article "Creating vectors that contain evenly spaced values" for a description of the index operator (:).
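To make the index arithmetic concrete, here is a minimal sketch that applies the same row subscript to a small matrix. The matrix values are made up for illustration, and the sketch assumes you want to trim by the first column, so the rows are sorted by that column before subscripting. The check on `2*d < n` is an added safeguard, not part of the original answer.

```
proc iml;
/* hypothetical 10 x 2 matrix: column 1 is the variable to trim by */
bigMatrix = {7 70, 3 30, 9 90, 1 10, 5 50,
             8 80, 2 20, 6 60, 4 40, 10 100};
prop = 0.2;                    /* proportion to trim from each tail */

call sort(bigMatrix, 1);       /* sort the rows by column 1 (in place) */
n = nrow(bigMatrix);
d = ceil(prop*n);              /* d = 2 for these values */

if 2*d < n then do;            /* make sure some rows remain */
   smallerMatrix = bigMatrix[d+1:n-d, ];   /* rows 3 through 8 */
   print smallerMatrix;
end;
else print "prop is too large: no rows would remain after trimming";
quit;
```

With these values, the six rows whose first column runs from 3 through 8 should remain. The guard matters because the index operator counts downward when its first operand is larger than its second, so an oversized `prop` would otherwise select rows in reverse order instead of raising an error.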

All Replies

Posted in reply to vince_tsp

10-19-2015 03:51 AM

G'day!

I am not sure if or why you need IML. Regular DATA step code would be something like:

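```
/* total percentage to trim; half is removed from each end */
%let percentage = 20;

/* assumes BASE is already sorted by the variable of interest */
data trimmed;
   set base nobs=size;
   /* strict < on the lower bound so both ends lose the same number of rows */
   if (size * &percentage / 200) < _N_ <= size - (size * &percentage / 200) then output;
run;
```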

Hope this helps,

Eric
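The DATA step approach above relies on `base` being sorted by the variable that is being trimmed. A minimal, self-contained sketch of the whole sequence follows. The use of `sashelp.cars` and `mpg_city` (borrowed from the IML answer), the `keep=` list, and the temporary variable `d` are illustrative assumptions, and `&percentage` is read as the total percentage removed, half from each tail.

```
%let percentage = 20;   /* total percentage to trim (assumed: half per tail) */

/* sort a copy of the data by the variable being trimmed */
proc sort data=sashelp.cars(keep=make model mpg_city) out=base;
   by mpg_city;
run;

data trimmed;
   set base nobs=size;                    /* size = number of observations in BASE */
   d = ceil(size * &percentage / 200);    /* observations to drop from each tail */
   if d < _N_ <= size - d;                /* keep only the middle observations */
   drop d;
run;
```

Rounding `d` up with `ceil` mirrors the IML answer, so both approaches drop the same number of observations from each end for a given proportion.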

10-20-2015 03:26 AM

Thanks for trying to help me. I really appreciate it.

Posted in reply to Rick_SAS

10-20-2015 03:28 AM

Thanks, Rick. I finally got my solution after a week of trial and error.