BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
haurspvybintek
Calcite | Level 5

Hi All,

 

I'm new to SAS and tried exploring multiple options but not able to find the right one. Hence reaching to the experts here. 

 

I have Data like below:

UniqueID tier  disc    min_amt         max_amt  adj_amt_min

1                5    28       1278              2055

2                4    25        849               1295

3                3    25        545               794                

4                2    20        369               520

5                1    19        320               320

I did group by unique id 

 

I need to iterate from the top to bottom and update the adj_amt_min, adj_amt_max and calculate adj_disc by  comparing in below way.

1. If 1st row third cell value is less than the 2nd row fourth cell value then adj_amt_min should be set as 2nd row fourth cell value else values for adj_amt_min should be same as min_amt.

2. If 2nd row third cell value is less than the 3rd row fourth cell value then adj_amt_min should be set as 3rd row fourth cell value else values for adj_amt_min should be same as min_amt.

3. If 3rd row third cell value is less than the 4th row fourth cell value then adj_amt_min should be set as 4th row fourth cell value else values for adj_amt_min should be same as min_amt.

4. If 4th row third cell value is less than the 5th row fourth cell value then adj_amt_min should be set as 5th row fourth cell value else values for adj_amt_min should be same as min_amt.

5. For fifth row the values for adj_amt_min should be same as min_amt.

 

My output should look like this: 

In the above data we found that in first row  3rd cell is less than the second row 4th cell value, so here in the first row for adj_amt_min will be set as 2nd row 4th cell value

UniqueID tier  disc    min_amt         max_amt  adj_amt_min  

1                5    28       1278              2055           1295

2                4    25        849               1295             849

3                3    25        545               794               545              

4                2    20        369               520               369

5                1    19        320               320               320

 

Thanks in Advance

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

So you want to find the MINimum of the current MAX_AMT and previous MIN_AMT (actually the next since processing in reverse order).

data have;
  input UniqueID $ tier disc min_amt max_amt expected ;
cards;
A1 5 28 1278 2055 2055
A1 4 25 849 1295 1278
A1 3 25 545 794 794
A1 2 20 369 520 520
A1 1 19 320 320 320
B2 5 27 724 1179 1179
B2 4 26 627 859 724
B2 3 20 456 691 627
B2 2 24 353 470 456
B2 1 19 332 332 332
C2 5 30 900 1279 1279
C2 4 35 697 859 859
C2 3 35 476 691 691
C2 2 30 363 470 470
C2 1 29 332 332 332
;

data want;
  set have;
  by uniqueid descending tier;
  previous = lag(min_amt);
  if first.uniqueid then previous=.;
  adj_amt = min(max_amt,previous);
  *drop previous;
  same = adj_amt = expected;
run;

proc print;
run;
       Unique
Obs      ID      tier    disc    min_amt    max_amt    expected    previous    adj_amt    same

  1      A1        5      28       1278       2055       2055           .        2055       1
  2      A1        4      25        849       1295       1278        1278        1278       1
  3      A1        3      25        545        794        794         849         794       1
  4      A1        2      20        369        520        520         545         520       1
  5      A1        1      19        320        320        320         369         320       1
  6      B2        5      27        724       1179       1179           .        1179       1
  7      B2        4      26        627        859        724         724         724       1
  8      B2        3      20        456        691        627         627         627       1
  9      B2        2      24        353        470        456         456         456       1
 10      B2        1      19        332        332        332         353         332       1
 11      C2        5      30        900       1279       1279           .        1279       1
 12      C2        4      35        697        859        859         900         859       1
 13      C2        3      35        476        691        691         697         691       1
 14      C2        2      30        363        470        470         476         470       1
 15      C2        1      29        332        332        332         363         332       1

View solution in original post

11 REPLIES 11
ballardw
Super User

Since SAS data sets do not use "cells" you want to cast questions in the terms of variable name and observation numbers.

If your data has more than 5 observations then what happens to the rest of the data?

haurspvybintek
Calcite | Level 5

1. Corrected the cell to column or reference variable, test data and expected results in the below question posted.

2. There will be different unique id's , Each unique id is linked to tier and tier will have certain discount, minimum amount and maximum amount

I have Data like below:

UniqueID tier  disc    min_amt         max_amt  adj_amt_min

1                5    28       1278              2055

1                4    25        849               1295

1                3    25        545               794                

1                2    20        369               520

1                1    19        320               320

2                5    25       1400              2095

2                4    24        749               1395

2                3    23        545               794                

2                2    23        349               520

2                1    12        330               330

I did group by unique id 

 

I need to iterate from the top to bottom and update the adj_amt_min by  comparing in below way.

1. If 1st row third column 'min_amt' value is less than the 2nd row fourth cell value then adj_amt_min should be set as 2nd row fourth column 'max_amt ' value else values for adj_amt_min should be same as min_amt.

2. If 2nd row third column 'min_amt' value is less than the 3rd row fourth cell value then adj_amt_min should be set as 3rd row fourth column 'max_amt ' value else values for adj_amt_min should be same as min_amt.

3. If 3rd row third column 'min_amt' value is less than the 4th row fourth cell value then adj_amt_min should be set as 4th row fourth column 'max_amt ' value else values for adj_amt_min should be same as min_amt.

4. If 4th row third column 'min_amt' value is less than the 5th row fourth cell value then adj_amt_min should be set as 5th row fourth column 'max_amt ' value else values for adj_amt_min should be same as min_amt.

5. For fifth row the values for adj_amt_min should be same as min_amt.

 

My output should look like this: 

In the above data we found that in first row  3rd cell is less than the second row 4th cell value, so here in the first row for adj_amt_min will be set as 2nd row 4th cell value

UniqueID tier  disc    min_amt         max_amt  adj_amt_min  

1                5    28       1278              2055           1295

1                4    25        849               1295             849

1                3    25        545               794               545              

1                2    20        369               520               369

1                1    19        320               320               320

 

2                5    25       1400              2095           1400

2                4    24        749               1395           794

2                3    23        545               794            545                

2                2    23        349               520            349

2                1    12        330               330            330

Tom
Super User Tom
Super User

Forget the column # or the row # describe it in terms of the values of the variables.

It looks like your data is sorted by UNIQUEID descending TIER.

So you are asking to compare the value of MIN_AMT to the value MAX_AMT on the previous TIER for this UNIQUEID.

 

If you sort the data by increasing order of TIER instead you can use LAG() to make the comparison.

data have;
  input UniqueID tier disc min_amt max_amt expected ;
cards;
1 5 28 1278 2055 1295
1 4 25 849 1295 849
1 3 25 545 794 545
1 2 20 369 520 369
1 1 19 320 320 320
2 5 25 1400 2095 1400
2 4 24 749 1395 794
2 3 23 545 794 545
2 2 23 349 520 349
2 1 12 330 330 330
;
  
proc sort data=have out=sorted;
  by uniqueid tier;
run;

data want;
  set sorted;
  by uniqueid tier;
  previous = lag(max_amt);
  if first.uniqueid then previous=.;
  adj_amt_min = max(min_amt,previous);
  drop previous;
  same = adj_amt_min = expected;
run;

Results:

       Unique                                                      adj_amt_
Obs      ID      tier    disc    min_amt    max_amt    expected       min      same

  1       1        1      19        320        320        320         320        1
  2       1        2      20        369        520        369         369        1
  3       1        3      25        545        794        545         545        1
  4       1        4      25        849       1295        849         849        1
  5       1        5      28       1278       2055       1295        1295        1
  6       2        1      12        330        330        330         330        1
  7       2        2      23        349        520        349         349        1
  8       2        3      23        545        794        545         545        1
  9       2        4      24        749       1395        794         794        1
 10       2        5      25       1400       2095       1400        1400        1

If you really have to keep the data sorted backwards then you have to work a little harder to "look ahead".

data want;
  set have ;
  by uniqueid descending tier;
  set have(keep=max_amt rename=(max_amt=next) firstobs=2) have(obs=1 drop=_all_);
  if last.uniqueid then next=.;
  adj_amt_min = max(min_amt,next);
  drop next;
  same = adj_amt_min = expected;
run;
haurspvybintek
Calcite | Level 5

Thanks Tom for your reply.

 

Yes i have to sort it backwards and i tried using your code. but still able to see the incorrect values in adj_amt_min.

 

Thanks, 

Tom
Super User Tom
Super User

The code works on the data you posted. The value of SAME is 1 (TRUE) for all of the example values you provided.

 

Please provide a updated version of the data (using the data step code in my answer) which will yield results where the calculated value does not match the EXPECTED value.

haurspvybintek
Calcite | Level 5

Hi Tom,

Please find my below Test Data and Expected result.

 

Note:  for the first group in the expected result i have mentioned the logic to be implemented in comments. Please let me know if it is not clear.

 

My Test Data:

UniqueID tier disc min_amt max_amt ;
A1               5    28     1278       2055
A1               4    25      849         1295
A1               3    25      545          794
A1               2    20      369          520
A1               1   19       320           320
B2               5    27      724         1179
B2               4    26      627         859
B2               3    20      456          691
B2               2    24      353          470
B2               1   19       332           332

C2               5    30      900         1279
C2               4    35      697         859
C2               3    35      476          691
C2               2    30      363          470
C2               1    29       332           332

 

Expected Output:

UniqueID tier disc min_amt max_amt    adj_amt;
A1               5    28     1278       2055          2055 /*first.adj_amt in each group should be set to max_amt*/
A1               4    25      849         1295          1278 /*max amt 1295 should be compared to previous row min_amt 1278 and display the min_amt of previous row if 1278< 1295 */
A1               3    25      545          794           794 /*max amt 794 should be compared to previous row min_amt 849 and display the min_amt of previous row if 849 < 794 else adj_amt=max_amt */
A1               2    20      369          520          520 /*max amt 520 should be compared to previous row min_amt 545 and display the min_amt of previous row if 545 < 520 else adj_amt=max_amt*/
A1               1   19       320           320          320 /*max amt 320 should be compared to previous row min_amt 369 and display the min_amt of previous row if 369 < 320 else adj_amt=max_amt*/
B2               5    27      724         1179          1179
B2               4    26      627         859           724
B2               3    20      456          691          627
B2               2    24      353          470          456
B2               1   19       332           332          332

C2               5    30      900         1279          1279
C2               4    35      697         859           859
C2               3    35      476          691          691
C2               2    30      363          470          470
C2               1    29       332           332          332

Tom
Super User Tom
Super User

So you want to find the MINimum of the current MAX_AMT and previous MIN_AMT (actually the next since processing in reverse order).

data have;
  input UniqueID $ tier disc min_amt max_amt expected ;
cards;
A1 5 28 1278 2055 2055
A1 4 25 849 1295 1278
A1 3 25 545 794 794
A1 2 20 369 520 520
A1 1 19 320 320 320
B2 5 27 724 1179 1179
B2 4 26 627 859 724
B2 3 20 456 691 627
B2 2 24 353 470 456
B2 1 19 332 332 332
C2 5 30 900 1279 1279
C2 4 35 697 859 859
C2 3 35 476 691 691
C2 2 30 363 470 470
C2 1 29 332 332 332
;

data want;
  set have;
  by uniqueid descending tier;
  previous = lag(min_amt);
  if first.uniqueid then previous=.;
  adj_amt = min(max_amt,previous);
  *drop previous;
  same = adj_amt = expected;
run;

proc print;
run;
       Unique
Obs      ID      tier    disc    min_amt    max_amt    expected    previous    adj_amt    same

  1      A1        5      28       1278       2055       2055           .        2055       1
  2      A1        4      25        849       1295       1278        1278        1278       1
  3      A1        3      25        545        794        794         849         794       1
  4      A1        2      20        369        520        520         545         520       1
  5      A1        1      19        320        320        320         369         320       1
  6      B2        5      27        724       1179       1179           .        1179       1
  7      B2        4      26        627        859        724         724         724       1
  8      B2        3      20        456        691        627         627         627       1
  9      B2        2      24        353        470        456         456         456       1
 10      B2        1      19        332        332        332         353         332       1
 11      C2        5      30        900       1279       1279           .        1279       1
 12      C2        4      35        697        859        859         900         859       1
 13      C2        3      35        476        691        691         697         691       1
 14      C2        2      30        363        470        470         476         470       1
 15      C2        1      29        332        332        332         363         332       1
Ksharp
Super User
data have;
input UniqueID tier  disc    min_amt         max_amt ;
cards;
1                5    28       1278              2055
2                4    25        849               1295
3                3    25        545               794                
4                2    20        369               520
5                1    19        320               320
;
data want;
 merge have have(keep=max_amt rename=(max_amt=_max_amt) firstobs=2);
 adj_amt_min=max(min_amt,_max_amt);
 drop _max_amt;
run;
haurspvybintek
Calcite | Level 5

Thank You so much for taking your time and reply. It worked for one group of data however i'm facing issue when there is multiple groups.

 

I have many groups that has different tier, disc,min_amt and max_amt. Within Each group the tier's min_amt should be compared with the next tier's max_amt. And this iteration should be done from tier 5 to tier 1 of each group. 

 

I have grouped the data set HAVE by unique id  and used your code but in the results  _max_amt is displaying the max(max_amt) for tier 1 in uniqueID A1 as 1179. but it should display the max_amt as adj_amt_min since it is the last row within that group. Can you please share what could happened for this incorrect results.

 

My Test data:

UniqueID tier disc min_amt max_amt ;
A1               5    28     1278       2055
A1               4    25      849         1295
A1               3    25      545          794
A1               2    20      369          520
A1               1   19       320           320
B2               5    28      724         1179
B2               4    25      627         859
B2               3    25      456          691
B2               2    20      353          470
B2               1   19       332           332

 

Thanks in Advance.

 

 

Ksharp
Super User
data have;
input UniqueID $ tier  disc    min_amt         max_amt ;
cards;
A1               5    28     1278       2055
A1               4    25      849         1295
A1               3    25      545          794
A1               2    20      369          520
A1               1   19       320           320
B2               5    28      724         1179
B2               4    25      627         859
B2               3    25      456          691
B2               2    20      353          470
B2               1   19       332           332
;
data want;
 merge have have(keep=UniqueID max_amt rename=(UniqueID=_UniqueID max_amt=_max_amt) firstobs=2);
if UniqueID=_UniqueID then adj_amt_min=max(min_amt,_max_amt);
 drop _UniqueID _max_amt;
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 11 replies
  • 1803 views
  • 1 like
  • 4 in conversation