BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
GregG
Quartz | Level 8

Greetings,

I have a data set with 904 observations.

All of the observations have a distance field, that shows the distance from a particular point. (The range of the distances is 23 miles to 3,807 miles)

I am attempting to run this data set through a macro do loop, that increments the mileage requirements by 5 miles, in order to find the approximately 100 nearest values.

If there are 99 values within a certain mileage range, and 103 within the previous range + 5 - then I want to keep the 103 observations.

The code I am currently trying is this:

%macro FILTER_OBSERVATIONS(howmany);

%LET SAMPLE_SIZE=0;    

%LET MILES=0;

%DO %UNTIL (&SAMPLE_SIZE >= &howmany);

    DATA CIP_DATA;

        RETAIN DISTANCE /*USE*/ ;

        SET Ps_final_cip4;

        if cip4='52.02';

    RUN;

     

    %IF 0 < DISTANCE <= &MILES %THEN %DO;

        %LET SAMPLE_SIZE=%EVAL(&SAMPLE_SIZE + 1);

        *USE=1;

       %END;

    %LET MILES=%EVAL(&MILES + 5);

%END;

%mend FILTER_OBSERVATIONS;

I have a feeling that there is A LOT wrong with the above code, so please, let me know how I should go about doing this.

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You should be able do it with a simple data step.  If you want to make the target number of observations or granularity variable then replace the hard coded 100 and 5 to macro variable references.

data want ;

  if _n_=1 then do;

    if 100 < nobs then do;

      set have (keep=distance rename=(distance=cutoff) firstobs=100);

      put cutoff=;

      cutoff = 5*ceil(cutoff/5);

    end;

    else cutoff = constant('big');

    put cutoff=;

    retain cutoff ;

    drop cutoff ;

  end;

  set have nobs=nobs ;

  by distance;

  if distance <= cutoff;

run;

View solution in original post

5 REPLIES 5
Fugue
Quartz | Level 8

I'm curious - why not just sort the data by "distance" and output the first 100 observations?

GregG
Quartz | Level 8

Well, because the obs directly after 100 could be close enough to also be considered.

Say for example that obs 100 is 403.25 miles away, obs 101 is 403.26, obs 102 is 404.10 obs 103 is 406.07. We are find dropping obs 103 and higher, but would want to keep 101 and 102. Starting at 0 miles and incrementing by 5 miles until we get a number >= 100.

We want at least 100 observations, but want to stop once at least 100 is reached - in increments of 5 miles.

Tom
Super User Tom
Super User

You should be able do it with a simple data step.  If you want to make the target number of observations or granularity variable then replace the hard coded 100 and 5 to macro variable references.

data want ;

  if _n_=1 then do;

    if 100 < nobs then do;

      set have (keep=distance rename=(distance=cutoff) firstobs=100);

      put cutoff=;

      cutoff = 5*ceil(cutoff/5);

    end;

    else cutoff = constant('big');

    put cutoff=;

    retain cutoff ;

    drop cutoff ;

  end;

  set have nobs=nobs ;

  by distance;

  if distance <= cutoff;

run;

GregG
Quartz | Level 8

Tom,

Can you please explain what the constant('big') is doing?

Thank you for the answer, it appears to do exactly what I want.

Greg

Tom
Super User Tom
Super User

Look up the CONSTANT() function in the help pages.  Or read this blog entry.  Constant function in SAS | Statistical Programming

If there are less than 100 observations in the source dataset the program uses the CONSTANT() function to set CUTOFF to the maximum possible number so that the IF statement does not eliminate any observations.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1192 views
  • 3 likes
  • 3 in conversation