Apply two types of floors to the dataset

fwu811 · Posted 05-15-2024 12:08 PM

Hi,

I would like to apply two types of floors, respectively for lower and upper segments of a dataset.

Assume the dataset has two columns: Col1 which is unique and asc sorted; Col2 is the actual data the floor applies to

1) Floor 1 on the lower band of Col1 : for any values in Col1 less than3, which is set to by the user - replace their values in Col2 with the Col2 value of Col1 =3

2)Floor 2 on the upper band of Col1: for any values in Col1 greater than 4, which is set to by the user - replace their values in Col2 with the Col2 value of previous Col1 to ensure they are not decreasing when Col1 values increase

Below is an example of the dataset I have and what I want.

Many thanks in advance.

data have;
input Col1 Col2;
datalines;
1 1
2 2
3 3
4 4
5 3
6 6
7 1
8 3
;

data want;
Col1 Col2
1 3
2 3
3 3
4 4
5 4
6 6
7 6
8 6
;

Tom · Posted 05-15-2024 12:35 PM

I do NOT understand your data structure.

Why would using the "Col2 value of Col1 =3" make any sense at all? What is the meaning of COL1? What is the meaning of COL2. What is the relationship between COL1 and COL2? What is the meaning of 3?

If that value is so important why not just put it into its own variable, or perhaps its own dataset?

Why are you calling this operation a FLOOR? That word usually means a lower bound on the possible value. So if the floor is set at 23 and the current value is only 20 you would need to raise it to 23 do that it does not drop below the floor.

Are you trying to set a BAND or RANGE that the values cannot be outside of? So if the minimum value should be 20 and maximum value should be 40 then the adjustment might be something like:

want = max(20,min(have,40));

So that values over 40, like 45, would map to 40 and values under 20, like 15, would map to 20 but values between 20 to 40, like 25, would be unchanged.

Tom · Posted 05-15-2024 12:59 PM

One possible interpretation of what you appear to be asking for is to calculate a running maximum values of COL2. With the added feature of starting with the third value

data want ;
  if _n_=1 then set have(firstobs=3 obs=3 rename=(col2=running_max));
  set have;
  running_max=max(running_max,col2);
run;

or possibly the value where col1=3.

data want ;
  if _n_=1 then do;
    set have(obs=1 rename=(col2=running_max));
    where col1=3;
  end;
  set have;
  running_max=max(running_max,col2);
run;

Result

               running_
Obs    Col1       max      Col2

 1       1         3         1
 2       2         3         2
 3       3         3         3
 4       4         4         4
 5       5         4         3
 6       6         6         6
 7       7         6         1
 8       8         6         3

fwu811 · Posted 05-16-2024 08:50 AM

Hi Tom,

Thanks for your reply, which is useful for all Col1 values less than 3.

The purpose of these checks are due to poor quality of data at both ends.

However, I only want to apply the max Col2 value check for a certain range of Col1 values, e.g., when it is greater than 6, so the data I want are:

data want:

1 3
2 3
3 3
4 4
5 3
6 6
7 6
8 6

Can you amend your code for this purpose?

Thank you again.

Apply two types of floors to the dataset

Re: Apply two types of floors to the dataset

Re: Apply two types of floors to the dataset

Re: Apply two types of floors to the dataset

Apply two types of floors to the dataset

Re: Apply two types of floors to the dataset

Re: Apply two types of floors to the dataset

Re: Apply two types of floors to the dataset

SAS Innovate 2025: Call for Content

Click image to register for webinar

Classroom Training Available!