Solved
New Contributor
Posts: 3

# Trim data set

Good day to you,

I am trying to trim a data set that is sorted in ascending order. I do not know how to trim k% of the data from the start and from the end. Most of the posts I could find explain how to compute a trimmed mean, but I just want the trimmed data itself.

XN = DAT[POS1:ENDPOS];
CALL SORT(XN);

...

I am very new to SAS/IML and hope someone can help me out.

Thanks in advance!

Accepted Solutions
Solution
‎10-20-2015 03:25 AM
SAS Super FREQ
Posts: 4,171

## Re: Trim data set

Posted in reply to vince_tsp

Trimming is the act of truncating the upper and lower tails of an empirical UNIVARIATE distribution, so we don't usually talk about trimming a data set, we talk about trimming a variable.

To trim a variable, look at p. 2-3 of my 2010 SAS Global Forum paper, which has an algorithm for computing the trimmed mean and variance of every column in a matrix.  You can modify it to extract the "middle" observations:

```
proc iml;
/* Assume v is a column vector. Return the sorted elements that
   remain after trimming the largest and smallest proportion of
   values. 'prop' is the proportion trimmed from EACH tail, so it
   must satisfy 0 < prop < 0.5. */
start TrimVec(v, prop);
   n = nrow(v);       /* num rows (assume no missing values) */
   d = ceil(prop*n);  /* number of observations to trim from each tail */
   z = v;             /* copy it */
   call sort(z, 1);   /* sort it */
   w = z[d+1:n-d, ];  /* trim d largest and d smallest values */
   return (w);
finish;

use sashelp.cars;
read all var "mpg_city" into x;
close;
trimX = TrimVec(x, 0.12);
```

The key is the statement

w = z[ d+1:n-d, ];

The expression d+1:n-d uses the index operator (:) to represent the observations to keep. You can use the same syntax to extract those rows from an entire matrix:

smallerMatrix = bigMatrix[ d+1:n-d, ];

See the article "Creating vectors that contain evenly spaced values" for a description of the index operator (:).
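As a small illustration of the row-extraction idea, here is a sketch on a made-up 10 x 2 matrix (the data and the `prop` value are hypothetical, not from the original post). It sorts the matrix by its first column and keeps the middle rows. The guard is there because the index operator counts downward when the first value is larger than the second, so an oversized `prop` would silently return the wrong rows.

```sas
proc iml;
/* Hypothetical 10 x 2 matrix: column 1 is the variable to trim */
bigMatrix = {7 70, 2 20, 9 90, 1 10, 5 50,
             10 100, 3 30, 8 80, 4 40, 6 60};
call sort(bigMatrix, 1);      /* sort rows by column 1 */
n = nrow(bigMatrix);
prop = 0.12;                  /* proportion trimmed from EACH tail (assumed) */
d = ceil(prop*n);             /* ceil(1.2) = 2 rows from each end */
if d+1 <= n-d then do;
   /* keeps rows 3 through 8 of the sorted matrix */
   smallerMatrix = bigMatrix[d+1:n-d, ];
   print d smallerMatrix;
end;
else print "prop is too large: no rows would remain";
quit;
```

With these numbers, two rows are dropped from each end and six rows remain.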

All Replies
Contributor
Posts: 32

## Re: Trim data set

Posted in reply to vince_tsp

G'day!

I am not sure if/why you need IML. Regular DATA step code would be something like:

%let percentage = 20;

data trimmed;

set base nobs=size;

if (size * &percentage / 200) < _N_ <= size - (size * &percentage / 200) then output;

run;

Hope this helps,

Eric
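The DATA step approach only trims tails if the input is already sorted by the variable of interest. The following is a minimal sketch under that reading, with some assumptions of mine: it uses `sashelp.cars` sorted by `mpg_city` as a stand-in for `base`, rounds the per-tail count up with `ceil` to match the IML answer, and uses a strict lower bound so that the same number of rows is dropped from each end.

```sas
/* Hypothetical input: sort a sample data set by the variable to trim */
proc sort data=sashelp.cars out=base;
   by mpg_city;
run;

%let percentage = 20;   /* total percent trimmed; half from each tail */

data trimmed;
   set base nobs=size;
   d = ceil(size * &percentage / 200);   /* rows to drop from each end */
   if d < _N_ <= size - d then output;   /* keep rows d+1 through size-d */
   drop d;
run;
```

Whether the per-tail count should be rounded up or down is a design choice; `ceil` trims at least the requested share.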

New Contributor
Posts: 3

## Re: Trim data set

Thanks for trying to help me. I really appreciate it.

New Contributor
Posts: 3

## Re: Trim data set

Thanks Rick, I finally got my solution after a week of trial and error.
🔒 This topic is solved and locked.
