BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
MACRO_LOOP
Obsidian | Level 7

Hi Everyone,

 

I have data like the below. I need to check that the data is Monotonic (i.e. it increases as we go from top to bottom and as we go from left top right)

 

I need to perform two checks on the data:

1.Check that the values are increasing across rows (from left to right).

2. Check that the values are increasing within columns (from top to bottom).

 

 

Based on the results of the above checks, I would like to write a MACRO or Routine that performs a calculation if any issues with the data are identified. For example:

 

1. Identify instances where the data is not monotonic (i.e if there are values that do not increase as we move from left to right within a row, or from top to bottom within a column.

2 If there is a value that is not increasing, take the average of the two neighboring values

e.g. a) if values are not increasing within a row (take the average of the value to the left of it and to the right),

       b) if a value is not increasing within a column, take the average of the value above and below it.)

 

 

Try as I might, I can't seem to write a Macro that will perform this efficiently.

 

Thus Far, I have been trying to create a macro that combines the LAG and DIF functions, similar to the calculation that is performed in this thread https://communities.sas.com/t5/Base-SAS-Programming/Select-increasing-values-over-time/td-p/308887

 

If any of the above is unclear or there is any additional information that I can provide, please feel free to ask.

 

Best Wishes,

 

ML

 

 

 

Data Have;
INPUT Maturity AAA AA1 AA2 AA3 A1 A2 A3 BBB1 BBB2 BBB3 Diff;
Cards;
2 -34.4327692307692 -28.9316769940566 -23.430584757344 -15.0160480358146 -6.60151131428524 1.81302540724413 4.21578385122723 6.61854229521034 9.02130073919344 47.1116233198386 47.1116233198386
4 -34.6538461538462 -28.6250581859124 -22.5962702179786 -11.5922512756875 -0.588232333396363 10.4157866088948 18.6684725074859 26.921158406077 35.1738443046681 73.2641668853133 73.2641668853133
6 -32.5453846153846 -26.2079144328958 -19.8704442504071 -7.35167531268618 5.16709362503472 17.6858625627556 29.3605366542316 41.0352107457076 52.7098848371835 90.8002074178287 90.8002074178287
8 -28.2504615384615 -21.6939778393066 -15.1374941401517 -1.54399297709882 12.0495081859541 25.6430093490069 39.7456227022061 53.8482360554052 67.9508494086044 106.04117198925 106.04117198925
12 -18.8621538461538 -11.9969879324439 -5.13182201873401 9.97642913974865 25.0846802982313 40.192931456714 57.7175330027979 75.2421345488819 92.7667360949659 130.857058675611 130.857058675611
15 -15.4470815535988 -7.58191563988893 -0.716749726179012 14.3915014323036 29.4997525907863 44.608003749269 62.1326052953529 79.6572068414369 97.1818083875209 135.272130968166 135.272130968166
20 -13.3016276291591 -4.43646171544923 2.42870419826069 17.5369553567433 32.645206515226 47.7534576737087 65.2780592197927 82.8026607658766 100.327262311961 138.417584892606 138.417584892606
30 -7.45893847823167 2.40622743547825 9.27139334918817 24.3796445076708 39.4878956661535 54.5961468246361 72.1207483707201 89.6453499168041 107.169951462888 145.260274043533 145.260274043533

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

See my expanded code. It merges the dataset to itself, starting at the second observation, which provides the look ahead:

data want (drop=_: helpvars:);
merge
  have
  have (
    firstobs=2
    rename=(
      AAA=_AAA
      AA1=_AA1
      AA2=_AA2
      AA3=_AA3
      A1=_A1
      A2=_A2
      A3=_A3
      BBB1=_BBB1
      BBB2=_BBB2
      BBB3=_BBB3
    )
  )
;
array vars {*} AAA AA1 AA2 AA3 A1 A2 A3 BBB1 BBB2 BBB3;
array _vars {*} _:;
array helpvars {10};
predif = 0;
indif = 0;
do i = 1 to dim(vars);
  if i > 1 then if vars{i-1} > vars{i}
  then do;
    indif = 1;
    if i = dim(vars)
    then vars{i} = vars{i-1};
    else vars{i} = (vars{i-1} + vars{i+1}) / 2;
  end;
  helpvars{i} = lag(vars{i});
  if helpvars{i} > vars{i}
  then do;
    predif = 1;
    vars{i} = (helpvars{i} + _vars{i}) / 2;
  end;
end;
drop i;
run;

View solution in original post

10 REPLIES 10
Kurt_Bremser
Super User

Try this:

data want;
set have;
array vars {*} AAA AA1 AA2 AA3 A1 A2 A3 BBB1 BBB2 BBB3;
predif = 0;
indif = 0;
do i = 1 to dim(vars);
  if i > 1 then if vars{i-1} > vars{i} then indif = 1;
  if lag(vars{i}) > vars{i} then predif = 1;
end;
drop i;
run;
MACRO_LOOP
Obsidian | Level 7

Hi Kurt,

 

Thank You for the very helpful response.

 

Your code works perfectly at identifying the column/variable where there is an issue.

 

The challenge I face is that I now need to create a new column that replaces the value which is not increasing and replaces it with the Average of the neighboring values. In this instance, there is an issue with the AAA column when the Maturity is equal to 4.

 

My hope is that I can create a new column (something like "CORRECTED_AAA") that will fix the issue with the Maturity=4 value and merge this with the values from the AAA column that are not causing an issue.

 

I am quite new to SAS and while I have tried using the lag function to do this, I've not been able to get it working exactly as I need.

 

Any suggestions on how I might modify your code to get this outcome?

MACRO_LOOP
Obsidian | Level 7

 

 

 

My intention is to correct instances where the values are not increasing from top to bottom. When this happens, I would like the code to replace the issue value with the following code AVERAGE(Below value, Above Value).

 

In the below table, the variable AAA represents the values that I have. Notice that there is an issue with the value for Maturity=4 NEW_AAA represents the new variable that I am trying to create. It takes the AVERAGE(Maturity=2,Maturity=6) to create a new value that replaces the issue value.

 

The code I have doesn't seem to working as I hoped. It seems to be taking the average of the values that are to the right and left of the problem value, rather than those that are above and below.

Kurt_Bremser
Super User

See my expanded code. It merges the dataset to itself, starting at the second observation, which provides the look ahead:

data want (drop=_: helpvars:);
merge
  have
  have (
    firstobs=2
    rename=(
      AAA=_AAA
      AA1=_AA1
      AA2=_AA2
      AA3=_AA3
      A1=_A1
      A2=_A2
      A3=_A3
      BBB1=_BBB1
      BBB2=_BBB2
      BBB3=_BBB3
    )
  )
;
array vars {*} AAA AA1 AA2 AA3 A1 A2 A3 BBB1 BBB2 BBB3;
array _vars {*} _:;
array helpvars {10};
predif = 0;
indif = 0;
do i = 1 to dim(vars);
  if i > 1 then if vars{i-1} > vars{i}
  then do;
    indif = 1;
    if i = dim(vars)
    then vars{i} = vars{i-1};
    else vars{i} = (vars{i-1} + vars{i+1}) / 2;
  end;
  helpvars{i} = lag(vars{i});
  if helpvars{i} > vars{i}
  then do;
    predif = 1;
    vars{i} = (helpvars{i} + _vars{i}) / 2;
  end;
end;
drop i;
run;
MACRO_LOOP
Obsidian | Level 7

Kurt,

 

I cannot thank you enough!

 

I have been working with your code for the past hour and it is working a treat!

 

I had spent quite a long time working through different approaches to find a solution  and the method that you have provided will prove to be very useful for the future projects. I hope that the code that you have provided is of help to many other users.

 

Thank you very much for all of your time on this. All of the help is highly appreciated!

 

Best Wishes!

 

ML 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Its a good idea to use code window (its the {i} above post) for clarity when posting code. 

No need to use macro (ever in fact, Base SAS can do everything and macro is just a basic find/replace to create Base SAS!).

 

So across, you can use arrays and for down you can use lag() - although make sure you do not alter the variables in the lag, create a new variable:

data have;
  inpu maturity AAA AA1 AA2 AA3 A1 A2 A3 BBB1 BBB2 BBB3 Diff;
...
;
run;

data want;
  set have;
  array vals{11} aaa--diff;
  do i=2 to 11;
    if vals{i-1} >=vals{i} then do;
      /* do some code here as the current value is not greater than previous */
    end;
  end;
do i=1 to 11;
if lag(vals{i}) >= vals{i}) then do;
/* or
if diff(vals{i}) >= 0 then do;
*/
/* do some code here as the current value is not greater than previous record */
end;
end; run;
MACRO_LOOP
Obsidian | Level 7

Hi RW9,

 

Thank You for the quick reply and the tip on entering the SAS code in the window. It has been very helpful.

 

I've been trying to modify your code so that a new column is created with the corrected value when an issue arises.

 

For example, In my data there is an issue with the AAA variable where the values do not increase as we go from top to bottom. My hope is that I can create a new column that retains the values which are increasing in value (Maturities 2,6,8,12,15,20,30) and replace the value that is causing the issue (Maturity=4) with a new value computed as MEAN(AAA value where Maturity=2, AAA value where Maturity=4).

 

My hope is that a new column/variable  is created for any variable/column where this issue arises.

 

Below is my attempt at modifying the code. Any tips on how I might find a solution?

Data Have;
INPUT Maturity AAA AA1 AA2 AA3 A1 A2 A3 BBB1 BBB2 BBB3 Diff;
Cards;
2 -34.4327692307692 -28.9316769940566 -23.430584757344 -15.0160480358146 -6.60151131428524 1.81302540724413 4.21578385122723 6.61854229521034 9.02130073919344 47.1116233198386 47.1116233198386
4 -34.6538461538462 -28.6250581859124 -22.5962702179786 -11.5922512756875 -0.588232333396363 10.4157866088948 18.6684725074859 26.921158406077 35.1738443046681 73.2641668853133 73.2641668853133
6 -32.5453846153846 -26.2079144328958 -19.8704442504071 -7.35167531268618 5.16709362503472 17.6858625627556 29.3605366542316 41.0352107457076 52.7098848371835 90.8002074178287 90.8002074178287
8 -28.2504615384615 -21.6939778393066 -15.1374941401517 -1.54399297709882 12.0495081859541 25.6430093490069 39.7456227022061 53.8482360554052 67.9508494086044 106.04117198925 106.04117198925
12 -18.8621538461538 -11.9969879324439 -5.13182201873401 9.97642913974865 25.0846802982313 40.192931456714 57.7175330027979 75.2421345488819 92.7667360949659 130.857058675611 130.857058675611
15 -15.4470815535988 -7.58191563988893 -0.716749726179012 14.3915014323036 29.4997525907863 44.608003749269 62.1326052953529 79.6572068414369 97.1818083875209 135.272130968166 135.272130968166
20 -13.3016276291591 -4.43646171544923 2.42870419826069 17.5369553567433 32.645206515226 47.7534576737087 65.2780592197927 82.8026607658766 100.327262311961 138.417584892606 138.417584892606
30 -7.45893847823167 2.40622743547825 9.27139334918817 24.3796445076708 39.4878956661535 54.5961468246361 72.1207483707201 89.6453499168041 107.169951462888 145.260274043533 145.260274043533


data want;
  set have;
  array vals{11} aaa--diff;
  do i=2 to 10;
    if vals{i-1} >=vals{i} then do;

      /* do some code here as the current value is not greater than previous */
	NEW_COL_VAL=MEAN(vals{i-1},vals{i});
    end;
  end;
  do i=1 to 10;
    if lag(vals{i}) >= vals{i} then do;

/* or 
    if diff(vals{i}) >= 0 then do;
*/
      /* do some code here as the current value is not greater than previous record */

	NEW_ROW_VAL=MEAN(LAG(vals{i}),(vals{i}));
	NEW_ROW_VAL1=MEAN(vals{i},vals{i+1});
    end;
	end;

run; 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

I would start by showing some simple test data, just a couple of rows, with simple numbers like 1 or 2, then show what the output should look like.  Your code seems to work, but obviously doesn't function as you intend, but I can't really tell what you intend.

MACRO_LOOP
Obsidian | Level 7

Hi RW9,

 

Thank you for the reply.

 

I have simplified my code.

 

My intention is to correct instances where the values are not increasing from top to bottom. When this happens, I would like the code to replace the issue value with the following code AVERAGE(Below value, Above Value).

 

In the below table, the variable AAA represents the values that I have. Notice that there is an issue with the value for Maturity=4 NEW_AAA represents the new variable that I am trying to create. It takes the AVERAGE(Maturity=2,Maturity=6) to create a new value that replaces the issue value.

 

The code I have doesn't seem to working as I hoped. It seems to be taking the average of the values that are to the right and left of the problem value, rather than those that are above and below.

 

 


Data Have; INPUT Maturity AAA NEW_AAA; Cards; 2 1 1 4 5 2.5 6 4 4 8 7 7 12 9 9 15 11 11 20 13 13 30 15 15 data want; set have; array vals{2} aaa--New_AAA; do i=2 to 2; if vals{i-1} >=vals{i} then do; /* do some code here as the current value is not greater than previous */ NEW_COL_VAL=MEAN(vals{i-1},vals{i}); end; end; do i=1 to 2; if lag(vals{i}) >= vals{i} then do; /* or if diff(vals{i}) >= 0 then do; */ /* do some code here as the current value is not greater than previous record */ NEW_ROW_VAL=MEAN(LAG(vals{i}),(vals{i})); NEW_ROW_VAL1=MEAN(vals{i},vals{i+1}); end; end; run;

 

 

It identifies that there is an issue with one of the values and corrects it by taking the AVERAGE(Below value, Above Value) and merges this newly created value with the existing data from the AAA column.

 

I hope 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 2355 views
  • 0 likes
  • 3 in conversation