Solved: Re: compare 1st row 3rd column value to 2nd row 4th column value and c...

haurspvybintek · Posted 09-08-2021 11:38 AM

Hi All,

I'm new to SAS and tried exploring multiple options but not able to find the right one. Hence reaching to the experts here.

I have Data like below:

UniqueID tier disc min_amt max_amt adj_amt_min

1 5 28 1278 2055

2 4 25 849 1295

3 3 25 545 794

4 2 20 369 520

5 1 19 320 320

I did group by unique id

I need to iterate from the top to bottom and update the adj_amt_min, adj_amt_max and calculate adj_disc by comparing in below way.

1. If 1st row third cell value is less than the 2nd row fourth cell value then adj_amt_min should be set as 2nd row fourth cell value else values for adj_amt_min should be same as min_amt.

2. If 2nd row third cell value is less than the 3rd row fourth cell value then adj_amt_min should be set as 3rd row fourth cell value else values for adj_amt_min should be same as min_amt.

3. If 3rd row third cell value is less than the 4th row fourth cell value then adj_amt_min should be set as 4th row fourth cell value else values for adj_amt_min should be same as min_amt.

4. If 4th row third cell value is less than the 5th row fourth cell value then adj_amt_min should be set as 5th row fourth cell value else values for adj_amt_min should be same as min_amt.

5. For fifth row the values for adj_amt_min should be same as min_amt.

My output should look like this:

In the above data we found that in first row 3rd cell is less than the second row 4th cell value, so here in the first row for adj_amt_min will be set as 2nd row 4th cell value

UniqueID tier disc min_amt max_amt adj_amt_min

1 5 28 1278 2055 1295

2 4 25 849 1295 849

3 3 25 545 794 545

4 2 20 369 520 369

5 1 19 320 320 320

Thanks in Advance

Tom · Posted 09-09-2021 02:29 PM

So you want to find the MINimum of the current MAX_AMT and previous MIN_AMT (actually the next since processing in reverse order).

data have;
  input UniqueID $ tier disc min_amt max_amt expected ;
cards;
A1 5 28 1278 2055 2055
A1 4 25 849 1295 1278
A1 3 25 545 794 794
A1 2 20 369 520 520
A1 1 19 320 320 320
B2 5 27 724 1179 1179
B2 4 26 627 859 724
B2 3 20 456 691 627
B2 2 24 353 470 456
B2 1 19 332 332 332
C2 5 30 900 1279 1279
C2 4 35 697 859 859
C2 3 35 476 691 691
C2 2 30 363 470 470
C2 1 29 332 332 332
;

data want;
  set have;
  by uniqueid descending tier;
  previous = lag(min_amt);
  if first.uniqueid then previous=.;
  adj_amt = min(max_amt,previous);
  *drop previous;
  same = adj_amt = expected;
run;

proc print;
run;

       Unique
Obs      ID      tier    disc    min_amt    max_amt    expected    previous    adj_amt    same

  1      A1        5      28       1278       2055       2055           .        2055       1
  2      A1        4      25        849       1295       1278        1278        1278       1
  3      A1        3      25        545        794        794         849         794       1
  4      A1        2      20        369        520        520         545         520       1
  5      A1        1      19        320        320        320         369         320       1
  6      B2        5      27        724       1179       1179           .        1179       1
  7      B2        4      26        627        859        724         724         724       1
  8      B2        3      20        456        691        627         627         627       1
  9      B2        2      24        353        470        456         456         456       1
 10      B2        1      19        332        332        332         353         332       1
 11      C2        5      30        900       1279       1279           .        1279       1
 12      C2        4      35        697        859        859         900         859       1
 13      C2        3      35        476        691        691         697         691       1
 14      C2        2      30        363        470        470         476         470       1
 15      C2        1      29        332        332        332         363         332       1

View solution in original post

ballardw · Posted 09-08-2021 12:38 PM

Since SAS data sets do not use "cells" you want to cast questions in the terms of variable name and observation numbers.

If your data has more than 5 observations then what happens to the rest of the data?

haurspvybintek · Posted 09-08-2021 12:58 PM

1. Corrected the cell to column or reference variable, test data and expected results in the below question posted.

2. There will be different unique id's , Each unique id is linked to tier and tier will have certain discount, minimum amount and maximum amount

I have Data like below:

UniqueID tier disc min_amt max_amt adj_amt_min

1 5 28 1278 2055

1 4 25 849 1295

1 3 25 545 794

1 2 20 369 520

1 1 19 320 320

2 5 25 1400 2095

2 4 24 749 1395

2 3 23 545 794

2 2 23 349 520

2 1 12 330 330

I did group by unique id

I need to iterate from the top to bottom and update the adj_amt_min by comparing in below way.

1. If 1st row third column 'min_amt' value is less than the 2nd row fourth cell value then adj_amt_min should be set as 2nd row fourth column 'max_amt ' value else values for adj_amt_min should be same as min_amt.

2. If 2nd row third column 'min_amt' value is less than the 3rd row fourth cell value then adj_amt_min should be set as 3rd row fourth column 'max_amt ' value else values for adj_amt_min should be same as min_amt.

3. If 3rd row third column 'min_amt' value is less than the 4th row fourth cell value then adj_amt_min should be set as 4th row fourth column 'max_amt ' value else values for adj_amt_min should be same as min_amt.

4. If 4th row third column 'min_amt' value is less than the 5th row fourth cell value then adj_amt_min should be set as 5th row fourth column 'max_amt ' value else values for adj_amt_min should be same as min_amt.

5. For fifth row the values for adj_amt_min should be same as min_amt.

My output should look like this:

In the above data we found that in first row 3rd cell is less than the second row 4th cell value, so here in the first row for adj_amt_min will be set as 2nd row 4th cell value

UniqueID tier disc min_amt max_amt adj_amt_min

1 5 28 1278 2055 1295

1 4 25 849 1295 849

1 3 25 545 794 545

1 2 20 369 520 369

1 1 19 320 320 320

2 5 25 1400 2095 1400

2 4 24 749 1395 794

2 3 23 545 794 545

2 2 23 349 520 349

2 1 12 330 330 330

Tom · Posted 09-09-2021 11:29 AM

Forget the column # or the row # describe it in terms of the values of the variables.

It looks like your data is sorted by UNIQUEID descending TIER.

So you are asking to compare the value of MIN_AMT to the value MAX_AMT on the previous TIER for this UNIQUEID.

If you sort the data by increasing order of TIER instead you can use LAG() to make the comparison.

data have;
  input UniqueID tier disc min_amt max_amt expected ;
cards;
1 5 28 1278 2055 1295
1 4 25 849 1295 849
1 3 25 545 794 545
1 2 20 369 520 369
1 1 19 320 320 320
2 5 25 1400 2095 1400
2 4 24 749 1395 794
2 3 23 545 794 545
2 2 23 349 520 349
2 1 12 330 330 330
;
  
proc sort data=have out=sorted;
  by uniqueid tier;
run;

data want;
  set sorted;
  by uniqueid tier;
  previous = lag(max_amt);
  if first.uniqueid then previous=.;
  adj_amt_min = max(min_amt,previous);
  drop previous;
  same = adj_amt_min = expected;
run;

Results:

       Unique                                                      adj_amt_
Obs      ID      tier    disc    min_amt    max_amt    expected       min      same

  1       1        1      19        320        320        320         320        1
  2       1        2      20        369        520        369         369        1
  3       1        3      25        545        794        545         545        1
  4       1        4      25        849       1295        849         849        1
  5       1        5      28       1278       2055       1295        1295        1
  6       2        1      12        330        330        330         330        1
  7       2        2      23        349        520        349         349        1
  8       2        3      23        545        794        545         545        1
  9       2        4      24        749       1395        794         794        1
 10       2        5      25       1400       2095       1400        1400        1

If you really have to keep the data sorted backwards then you have to work a little harder to "look ahead".

data want;
  set have ;
  by uniqueid descending tier;
  set have(keep=max_amt rename=(max_amt=next) firstobs=2) have(obs=1 drop=_all_);
  if last.uniqueid then next=.;
  adj_amt_min = max(min_amt,next);
  drop next;
  same = adj_amt_min = expected;
run;

haurspvybintek · Posted 09-09-2021 12:36 PM

Thanks Tom for your reply.

Yes i have to sort it backwards and i tried using your code. but still able to see the incorrect values in adj_amt_min.

Thanks,

Tom · Posted 09-09-2021 12:44 PM

The code works on the data you posted. The value of SAME is 1 (TRUE) for all of the example values you provided.

Please provide a updated version of the data (using the data step code in my answer) which will yield results where the calculated value does not match the EXPECTED value.

haurspvybintek · Posted 09-09-2021 02:12 PM

Hi Tom,

Please find my below Test Data and Expected result.

Note: for the first group in the expected result i have mentioned the logic to be implemented in comments. Please let me know if it is not clear.

My Test Data:

UniqueID tier disc min_amt max_amt ;
A1 5 28 1278 2055
A1 4 25 849 1295
A1 3 25 545 794
A1 2 20 369 520
A1 1 19 320 320
B2 5 27 724 1179
B2 4 26 627 859
B2 3 20 456 691
B2 2 24 353 470
B2 1 19 332 332

C2 5 30 900 1279
C2 4 35 697 859
C2 3 35 476 691
C2 2 30 363 470
C2 1 29 332 332

Expected Output:

UniqueID tier disc min_amt max_amt adj_amt;
A1 5 28 1278 2055 2055 /*first.adj_amt in each group should be set to max_amt*/
A1 4 25 849 1295 1278 /*max amt 1295 should be compared to previous row min_amt 1278 and display the min_amt of previous row if 1278< 1295 */
A1 3 25 545 794 794 /*max amt 794 should be compared to previous row min_amt 849 and display the min_amt of previous row if 849 < 794 else adj_amt=max_amt */
A1 2 20 369 520 520 /*max amt 520 should be compared to previous row min_amt 545 and display the min_amt of previous row if 545 < 520 else adj_amt=max_amt*/
A1 1 19 320 320 320 /*max amt 320 should be compared to previous row min_amt 369 and display the min_amt of previous row if 369 < 320 else adj_amt=max_amt*/
B2 5 27 724 1179 1179
B2 4 26 627 859 724
B2 3 20 456 691 627
B2 2 24 353 470 456
B2 1 19 332 332 332

C2 5 30 900 1279 1279
C2 4 35 697 859 859
C2 3 35 476 691 691
C2 2 30 363 470 470
C2 1 29 332 332 332

Tom · Posted 09-09-2021 02:29 PM

So you want to find the MINimum of the current MAX_AMT and previous MIN_AMT (actually the next since processing in reverse order).

data have;
  input UniqueID $ tier disc min_amt max_amt expected ;
cards;
A1 5 28 1278 2055 2055
A1 4 25 849 1295 1278
A1 3 25 545 794 794
A1 2 20 369 520 520
A1 1 19 320 320 320
B2 5 27 724 1179 1179
B2 4 26 627 859 724
B2 3 20 456 691 627
B2 2 24 353 470 456
B2 1 19 332 332 332
C2 5 30 900 1279 1279
C2 4 35 697 859 859
C2 3 35 476 691 691
C2 2 30 363 470 470
C2 1 29 332 332 332
;

data want;
  set have;
  by uniqueid descending tier;
  previous = lag(min_amt);
  if first.uniqueid then previous=.;
  adj_amt = min(max_amt,previous);
  *drop previous;
  same = adj_amt = expected;
run;

proc print;
run;

       Unique
Obs      ID      tier    disc    min_amt    max_amt    expected    previous    adj_amt    same

  1      A1        5      28       1278       2055       2055           .        2055       1
  2      A1        4      25        849       1295       1278        1278        1278       1
  3      A1        3      25        545        794        794         849         794       1
  4      A1        2      20        369        520        520         545         520       1
  5      A1        1      19        320        320        320         369         320       1
  6      B2        5      27        724       1179       1179           .        1179       1
  7      B2        4      26        627        859        724         724         724       1
  8      B2        3      20        456        691        627         627         627       1
  9      B2        2      24        353        470        456         456         456       1
 10      B2        1      19        332        332        332         353         332       1
 11      C2        5      30        900       1279       1279           .        1279       1
 12      C2        4      35        697        859        859         900         859       1
 13      C2        3      35        476        691        691         697         691       1
 14      C2        2      30        363        470        470         476         470       1
 15      C2        1      29        332        332        332         363         332       1

haurspvybintek · Posted 09-09-2021 07:10 PM

Thank You so much. It really worked.

Ksharp · Posted 09-09-2021 10:16 AM

data have;
input UniqueID tier  disc    min_amt         max_amt ;
cards;
1                5    28       1278              2055
2                4    25        849               1295
3                3    25        545               794                
4                2    20        369               520
5                1    19        320               320
;
data want;
 merge have have(keep=max_amt rename=(max_amt=_max_amt) firstobs=2);
 adj_amt_min=max(min_amt,_max_amt);
 drop _max_amt;
run;

haurspvybintek · Posted 09-09-2021 11:02 AM

Thank You so much for taking your time and reply. It worked for one group of data however i'm facing issue when there is multiple groups.

I have many groups that has different tier, disc,min_amt and max_amt. Within Each group the tier's min_amt should be compared with the next tier's max_amt. And this iteration should be done from tier 5 to tier 1 of each group.

I have grouped the data set HAVE by unique id and used your code but in the results _max_amt is displaying the max(max_amt) for tier 1 in uniqueID A1 as 1179. but it should display the max_amt as adj_amt_min since it is the last row within that group. Can you please share what could happened for this incorrect results.

My Test data:

UniqueID tier disc min_amt max_amt ;
A1 5 28 1278 2055
A1 4 25 849 1295
A1 3 25 545 794
A1 2 20 369 520
A1 1 19 320 320
B2 5 28 724 1179
B2 4 25 627 859
B2 3 25 456 691
B2 2 20 353 470
B2 1 19 332 332

Thanks in Advance.

Ksharp · Posted 09-10-2021 07:44 AM

data have;
input UniqueID $ tier  disc    min_amt         max_amt ;
cards;
A1               5    28     1278       2055
A1               4    25      849         1295
A1               3    25      545          794
A1               2    20      369          520
A1               1   19       320           320
B2               5    28      724         1179
B2               4    25      627         859
B2               3    25      456          691
B2               2    20      353          470
B2               1   19       332           332
;
data want;
 merge have have(keep=UniqueID max_amt rename=(UniqueID=_UniqueID max_amt=_max_amt) firstobs=2);
if UniqueID=_UniqueID then adj_amt_min=max(min_amt,_max_amt);
 drop _UniqueID _max_amt;
run;

compare 1st row 3rd cell value to 2nd row 4th cell value and create new column in 1st row & iterate

Re: compare 1st row 3rd column value to 2nd row 4th column value and create new column in 1st row &a

Re: compare 1st row 3rd cell value to 2nd row 4th cell value and create new column in 1st row &&

Re: compare 1st row 3rd column value to 2nd row 4th column value and create new column in 1st row &&

Re: compare 1st row 3rd column value to 2nd row 4th column value and create new column in 1st row &a

Re: compare 1st row 3rd column value to 2nd row 4th column value and create new column in 1st row &a

Re: compare 1st row 3rd column value to 2nd row 4th column value and create new column in 1st row &a

Re: compare 1st row 3rd column value to 2nd row 4th column value and create new column in 1st row &a

Re: compare 1st row 3rd column value to 2nd row 4th column value and create new column in 1st row &a

Re: compare 1st row 3rd column value to 2nd row 4th column value and create new column in 1st row &a

Re: compare 1st row 3rd cell value to 2nd row 4th cell value and create new column in 1st row &&

Re: compare 1st row 3rd cell value to 2nd row 4th cell value and create new column in 1st row &&

Re: compare 1st row 3rd cell value to 2nd row 4th cell value and create new column in 1st row &&

Ready to join fellow brilliant minds for the SAS Hackathon?