DATA Step, Macro, Functions and more

Do Until - Trying to increment by units of 5 until a certain number is reached

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 77
Accepted Solution

Do Until - Trying to increment by units of 5 until a certain number is reached

Greetings,

I have a data set with 904 observations.

All of the observations have a distance field, that shows the distance from a particular point. (The range of the distances is 23 miles to 3,807 miles)

I am attempting to run this data set through a macro do loop, that increments the mileage requirements by 5 miles, in order to find the approximately 100 nearest values.

If there are 99 values within a certain mileage range, and 103 within the previous range + 5 - then I want to keep the 103 observations.

The code I am currently trying is this:

%macro FILTER_OBSERVATIONS(howmany);

%LET SAMPLE_SIZE=0;    

%LET MILES=0;

%DO %UNTIL (&SAMPLE_SIZE >= &howmany);

    DATA CIP_DATA;

        RETAIN DISTANCE /*USE*/ ;

        SET Ps_final_cip4;

        if cip4='52.02';

    RUN;

     

    %IF 0 < DISTANCE <= &MILES %THEN %DO;

        %LET SAMPLE_SIZE=%EVAL(&SAMPLE_SIZE + 1);

        *USE=1;

       %END;

    %LET MILES=%EVAL(&MILES + 5);

%END;

%mend FILTER_OBSERVATIONS;

I have a feeling that there is A LOT wrong with the above code, so please, let me know how I should go about doing this.


Accepted Solutions
Solution
‎01-02-2014 04:33 PM
Super User
Super User
Posts: 6,499

Re: Do Until - Trying to increment by units of 5 until a certain number is reached

You should be able do it with a simple data step.  If you want to make the target number of observations or granularity variable then replace the hard coded 100 and 5 to macro variable references.

data want ;

  if _n_=1 then do;

    if 100 < nobs then do;

      set have (keep=distance rename=(distance=cutoff) firstobs=100);

      put cutoff=;

      cutoff = 5*ceil(cutoff/5);

    end;

    else cutoff = constant('big');

    put cutoff=;

    retain cutoff ;

    drop cutoff ;

  end;

  set have nobs=nobs ;

  by distance;

  if distance <= cutoff;

run;

View solution in original post


All Replies
Super Contributor
Posts: 307

Re: Do Until - Trying to increment by units of 5 until a certain number is reached

I'm curious - why not just sort the data by "distance" and output the first 100 observations?

Frequent Contributor
Posts: 77

Re: Do Until - Trying to increment by units of 5 until a certain number is reached

Well, because the obs directly after 100 could be close enough to also be considered.

Say for example that obs 100 is 403.25 miles away, obs 101 is 403.26, obs 102 is 404.10 obs 103 is 406.07. We are find dropping obs 103 and higher, but would want to keep 101 and 102. Starting at 0 miles and incrementing by 5 miles until we get a number >= 100.

We want at least 100 observations, but want to stop once at least 100 is reached - in increments of 5 miles.

Solution
‎01-02-2014 04:33 PM
Super User
Super User
Posts: 6,499

Re: Do Until - Trying to increment by units of 5 until a certain number is reached

You should be able do it with a simple data step.  If you want to make the target number of observations or granularity variable then replace the hard coded 100 and 5 to macro variable references.

data want ;

  if _n_=1 then do;

    if 100 < nobs then do;

      set have (keep=distance rename=(distance=cutoff) firstobs=100);

      put cutoff=;

      cutoff = 5*ceil(cutoff/5);

    end;

    else cutoff = constant('big');

    put cutoff=;

    retain cutoff ;

    drop cutoff ;

  end;

  set have nobs=nobs ;

  by distance;

  if distance <= cutoff;

run;

Frequent Contributor
Posts: 77

Re: Do Until - Trying to increment by units of 5 until a certain number is reached

Tom,

Can you please explain what the constant('big') is doing?

Thank you for the answer, it appears to do exactly what I want.

Greg

Super User
Super User
Posts: 6,499

Re: Do Until - Trying to increment by units of 5 until a certain number is reached

Look up the CONSTANT() function in the help pages.  Or read this blog entry.  Constant function in SAS | Statistical Programming

If there are less than 100 observations in the source dataset the program uses the CONSTANT() function to set CUTOFF to the maximum possible number so that the IF statement does not eliminate any observations.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 362 views
  • 3 likes
  • 3 in conversation