turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- Do Until - Trying to increment by units of 5 until...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

01-02-2014 03:37 PM

Greetings,

I have a data set with 904 observations.

All of the observations have a distance field, that shows the distance from a particular point. (The range of the distances is 23 miles to 3,807 miles)

I am attempting to run this data set through a macro do loop, that increments the mileage requirements by 5 miles, in order to find the approximately 100 nearest values.

If there are 99 values within a certain mileage range, and 103 within the previous range + 5 - then I want to keep the 103 observations.

The code I am currently trying is this:

%macro FILTER_OBSERVATIONS(howmany);

%LET SAMPLE_SIZE=0;

%LET MILES=0;

%DO %UNTIL (&SAMPLE_SIZE >= &howmany);

DATA CIP_DATA;

RETAIN DISTANCE /*USE*/ ;

SET Ps_final_cip4;

if cip4='52.02';

RUN;

%IF 0 < DISTANCE <= &MILES %THEN %DO;

%LET SAMPLE_SIZE=%EVAL(&SAMPLE_SIZE + 1);

*USE=1;

%END;

%LET MILES=%EVAL(&MILES + 5);

%END;

%mend FILTER_OBSERVATIONS;

I have a feeling that there is A LOT wrong with the above code, so please, let me know how I should go about doing this.

Accepted Solutions

Solution

01-02-2014
04:33 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

01-02-2014 04:33 PM

You should be able do it with a simple data step. If you want to make the target number of observations or granularity variable then replace the hard coded 100 and 5 to macro variable references.

data want ;

if _n_=1 then do;

if 100 < nobs then do;

set have (keep=distance rename=(distance=cutoff) firstobs=100);

put cutoff=;

cutoff = 5*ceil(cutoff/5);

end;

else cutoff = constant('big');

put cutoff=;

retain cutoff ;

drop cutoff ;

end;

set have nobs=nobs ;

by distance;

if distance <= cutoff;

run;

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

01-02-2014 03:50 PM

I'm curious - why not just sort the data by "distance" and output the first 100 observations?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

01-02-2014 04:04 PM

Well, because the obs directly after 100 could be close enough to also be considered.

Say for example that obs 100 is 403.25 miles away, obs 101 is 403.26, obs 102 is 404.10 obs 103 is 406.07. We are find dropping obs 103 and higher, but would want to keep 101 and 102. Starting at 0 miles and incrementing by 5 miles until we get a number >= 100.

We want at least 100 observations, but want to stop once at least 100 is reached - in increments of 5 miles.

Solution

01-02-2014
04:33 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

01-02-2014 04:33 PM

You should be able do it with a simple data step. If you want to make the target number of observations or granularity variable then replace the hard coded 100 and 5 to macro variable references.

data want ;

if _n_=1 then do;

if 100 < nobs then do;

set have (keep=distance rename=(distance=cutoff) firstobs=100);

put cutoff=;

cutoff = 5*ceil(cutoff/5);

end;

else cutoff = constant('big');

put cutoff=;

retain cutoff ;

drop cutoff ;

end;

set have nobs=nobs ;

by distance;

if distance <= cutoff;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

01-02-2014 05:07 PM

Tom,

Can you please explain what the constant('big') is doing?

Thank you for the answer, it appears to do exactly what I want.

Greg

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

01-02-2014 05:48 PM

Look up the CONSTANT() function in the help pages. Or read this blog entry. Constant function in SAS | Statistical Programming

If there are less than 100 observations in the source dataset the program uses the CONSTANT() function to set CUTOFF to the maximum possible number so that the IF statement does not eliminate any observations.