Hi All,
I'm new to SAS and tried exploring multiple options but not able to find the right one. Hence reaching to the experts here.
I have Data like below:
UniqueID tier disc min_amt max_amt adj_amt_min
1 5 28 1278 2055
2 4 25 849 1295
3 3 25 545 794
4 2 20 369 520
5 1 19 320 320
I did group by unique id
I need to iterate from the top to bottom and update the adj_amt_min, adj_amt_max and calculate adj_disc by comparing in below way.
1. If 1st row third cell value is less than the 2nd row fourth cell value then adj_amt_min should be set as 2nd row fourth cell value else values for adj_amt_min should be same as min_amt.
2. If 2nd row third cell value is less than the 3rd row fourth cell value then adj_amt_min should be set as 3rd row fourth cell value else values for adj_amt_min should be same as min_amt.
3. If 3rd row third cell value is less than the 4th row fourth cell value then adj_amt_min should be set as 4th row fourth cell value else values for adj_amt_min should be same as min_amt.
4. If 4th row third cell value is less than the 5th row fourth cell value then adj_amt_min should be set as 5th row fourth cell value else values for adj_amt_min should be same as min_amt.
5. For fifth row the values for adj_amt_min should be same as min_amt.
My output should look like this:
In the above data we found that in first row 3rd cell is less than the second row 4th cell value, so here in the first row for adj_amt_min will be set as 2nd row 4th cell value
UniqueID tier disc min_amt max_amt adj_amt_min
1 5 28 1278 2055 1295
2 4 25 849 1295 849
3 3 25 545 794 545
4 2 20 369 520 369
5 1 19 320 320 320
Thanks in Advance
So you want to find the MINimum of the current MAX_AMT and previous MIN_AMT (actually the next since processing in reverse order).
data have;
input UniqueID $ tier disc min_amt max_amt expected ;
cards;
A1 5 28 1278 2055 2055
A1 4 25 849 1295 1278
A1 3 25 545 794 794
A1 2 20 369 520 520
A1 1 19 320 320 320
B2 5 27 724 1179 1179
B2 4 26 627 859 724
B2 3 20 456 691 627
B2 2 24 353 470 456
B2 1 19 332 332 332
C2 5 30 900 1279 1279
C2 4 35 697 859 859
C2 3 35 476 691 691
C2 2 30 363 470 470
C2 1 29 332 332 332
;
data want;
set have;
by uniqueid descending tier;
previous = lag(min_amt);
if first.uniqueid then previous=.;
adj_amt = min(max_amt,previous);
*drop previous;
same = adj_amt = expected;
run;
proc print;
run;
Unique Obs ID tier disc min_amt max_amt expected previous adj_amt same 1 A1 5 28 1278 2055 2055 . 2055 1 2 A1 4 25 849 1295 1278 1278 1278 1 3 A1 3 25 545 794 794 849 794 1 4 A1 2 20 369 520 520 545 520 1 5 A1 1 19 320 320 320 369 320 1 6 B2 5 27 724 1179 1179 . 1179 1 7 B2 4 26 627 859 724 724 724 1 8 B2 3 20 456 691 627 627 627 1 9 B2 2 24 353 470 456 456 456 1 10 B2 1 19 332 332 332 353 332 1 11 C2 5 30 900 1279 1279 . 1279 1 12 C2 4 35 697 859 859 900 859 1 13 C2 3 35 476 691 691 697 691 1 14 C2 2 30 363 470 470 476 470 1 15 C2 1 29 332 332 332 363 332 1
Since SAS data sets do not use "cells" you want to cast questions in the terms of variable name and observation numbers.
If your data has more than 5 observations then what happens to the rest of the data?
1. Corrected the cell to column or reference variable, test data and expected results in the below question posted.
2. There will be different unique id's , Each unique id is linked to tier and tier will have certain discount, minimum amount and maximum amount
I have Data like below:
UniqueID tier disc min_amt max_amt adj_amt_min
1 5 28 1278 2055
1 4 25 849 1295
1 3 25 545 794
1 2 20 369 520
1 1 19 320 320
2 5 25 1400 2095
2 4 24 749 1395
2 3 23 545 794
2 2 23 349 520
2 1 12 330 330
I did group by unique id
I need to iterate from the top to bottom and update the adj_amt_min by comparing in below way.
1. If 1st row third column 'min_amt' value is less than the 2nd row fourth cell value then adj_amt_min should be set as 2nd row fourth column 'max_amt ' value else values for adj_amt_min should be same as min_amt.
2. If 2nd row third column 'min_amt' value is less than the 3rd row fourth cell value then adj_amt_min should be set as 3rd row fourth column 'max_amt ' value else values for adj_amt_min should be same as min_amt.
3. If 3rd row third column 'min_amt' value is less than the 4th row fourth cell value then adj_amt_min should be set as 4th row fourth column 'max_amt ' value else values for adj_amt_min should be same as min_amt.
4. If 4th row third column 'min_amt' value is less than the 5th row fourth cell value then adj_amt_min should be set as 5th row fourth column 'max_amt ' value else values for adj_amt_min should be same as min_amt.
5. For fifth row the values for adj_amt_min should be same as min_amt.
My output should look like this:
In the above data we found that in first row 3rd cell is less than the second row 4th cell value, so here in the first row for adj_amt_min will be set as 2nd row 4th cell value
UniqueID tier disc min_amt max_amt adj_amt_min
1 5 28 1278 2055 1295
1 4 25 849 1295 849
1 3 25 545 794 545
1 2 20 369 520 369
1 1 19 320 320 320
2 5 25 1400 2095 1400
2 4 24 749 1395 794
2 3 23 545 794 545
2 2 23 349 520 349
2 1 12 330 330 330
Forget the column # or the row # describe it in terms of the values of the variables.
It looks like your data is sorted by UNIQUEID descending TIER.
So you are asking to compare the value of MIN_AMT to the value MAX_AMT on the previous TIER for this UNIQUEID.
If you sort the data by increasing order of TIER instead you can use LAG() to make the comparison.
data have;
input UniqueID tier disc min_amt max_amt expected ;
cards;
1 5 28 1278 2055 1295
1 4 25 849 1295 849
1 3 25 545 794 545
1 2 20 369 520 369
1 1 19 320 320 320
2 5 25 1400 2095 1400
2 4 24 749 1395 794
2 3 23 545 794 545
2 2 23 349 520 349
2 1 12 330 330 330
;
proc sort data=have out=sorted;
by uniqueid tier;
run;
data want;
set sorted;
by uniqueid tier;
previous = lag(max_amt);
if first.uniqueid then previous=.;
adj_amt_min = max(min_amt,previous);
drop previous;
same = adj_amt_min = expected;
run;
Results:
Unique adj_amt_ Obs ID tier disc min_amt max_amt expected min same 1 1 1 19 320 320 320 320 1 2 1 2 20 369 520 369 369 1 3 1 3 25 545 794 545 545 1 4 1 4 25 849 1295 849 849 1 5 1 5 28 1278 2055 1295 1295 1 6 2 1 12 330 330 330 330 1 7 2 2 23 349 520 349 349 1 8 2 3 23 545 794 545 545 1 9 2 4 24 749 1395 794 794 1 10 2 5 25 1400 2095 1400 1400 1
If you really have to keep the data sorted backwards then you have to work a little harder to "look ahead".
data want;
set have ;
by uniqueid descending tier;
set have(keep=max_amt rename=(max_amt=next) firstobs=2) have(obs=1 drop=_all_);
if last.uniqueid then next=.;
adj_amt_min = max(min_amt,next);
drop next;
same = adj_amt_min = expected;
run;
Thanks Tom for your reply.
Yes i have to sort it backwards and i tried using your code. but still able to see the incorrect values in adj_amt_min.
Thanks,
The code works on the data you posted. The value of SAME is 1 (TRUE) for all of the example values you provided.
Please provide a updated version of the data (using the data step code in my answer) which will yield results where the calculated value does not match the EXPECTED value.
Hi Tom,
Please find my below Test Data and Expected result.
Note: for the first group in the expected result i have mentioned the logic to be implemented in comments. Please let me know if it is not clear.
My Test Data:
UniqueID tier disc min_amt max_amt ;
A1 5 28 1278 2055
A1 4 25 849 1295
A1 3 25 545 794
A1 2 20 369 520
A1 1 19 320 320
B2 5 27 724 1179
B2 4 26 627 859
B2 3 20 456 691
B2 2 24 353 470
B2 1 19 332 332
C2 5 30 900 1279
C2 4 35 697 859
C2 3 35 476 691
C2 2 30 363 470
C2 1 29 332 332
Expected Output:
UniqueID tier disc min_amt max_amt adj_amt;
A1 5 28 1278 2055 2055 /*first.adj_amt in each group should be set to max_amt*/
A1 4 25 849 1295 1278 /*max amt 1295 should be compared to previous row min_amt 1278 and display the min_amt of previous row if 1278< 1295 */
A1 3 25 545 794 794 /*max amt 794 should be compared to previous row min_amt 849 and display the min_amt of previous row if 849 < 794 else adj_amt=max_amt */
A1 2 20 369 520 520 /*max amt 520 should be compared to previous row min_amt 545 and display the min_amt of previous row if 545 < 520 else adj_amt=max_amt*/
A1 1 19 320 320 320 /*max amt 320 should be compared to previous row min_amt 369 and display the min_amt of previous row if 369 < 320 else adj_amt=max_amt*/
B2 5 27 724 1179 1179
B2 4 26 627 859 724
B2 3 20 456 691 627
B2 2 24 353 470 456
B2 1 19 332 332 332
C2 5 30 900 1279 1279
C2 4 35 697 859 859
C2 3 35 476 691 691
C2 2 30 363 470 470
C2 1 29 332 332 332
So you want to find the MINimum of the current MAX_AMT and previous MIN_AMT (actually the next since processing in reverse order).
data have;
input UniqueID $ tier disc min_amt max_amt expected ;
cards;
A1 5 28 1278 2055 2055
A1 4 25 849 1295 1278
A1 3 25 545 794 794
A1 2 20 369 520 520
A1 1 19 320 320 320
B2 5 27 724 1179 1179
B2 4 26 627 859 724
B2 3 20 456 691 627
B2 2 24 353 470 456
B2 1 19 332 332 332
C2 5 30 900 1279 1279
C2 4 35 697 859 859
C2 3 35 476 691 691
C2 2 30 363 470 470
C2 1 29 332 332 332
;
data want;
set have;
by uniqueid descending tier;
previous = lag(min_amt);
if first.uniqueid then previous=.;
adj_amt = min(max_amt,previous);
*drop previous;
same = adj_amt = expected;
run;
proc print;
run;
Unique Obs ID tier disc min_amt max_amt expected previous adj_amt same 1 A1 5 28 1278 2055 2055 . 2055 1 2 A1 4 25 849 1295 1278 1278 1278 1 3 A1 3 25 545 794 794 849 794 1 4 A1 2 20 369 520 520 545 520 1 5 A1 1 19 320 320 320 369 320 1 6 B2 5 27 724 1179 1179 . 1179 1 7 B2 4 26 627 859 724 724 724 1 8 B2 3 20 456 691 627 627 627 1 9 B2 2 24 353 470 456 456 456 1 10 B2 1 19 332 332 332 353 332 1 11 C2 5 30 900 1279 1279 . 1279 1 12 C2 4 35 697 859 859 900 859 1 13 C2 3 35 476 691 691 697 691 1 14 C2 2 30 363 470 470 476 470 1 15 C2 1 29 332 332 332 363 332 1
Thank You so much. It really worked.
data have;
input UniqueID tier disc min_amt max_amt ;
cards;
1 5 28 1278 2055
2 4 25 849 1295
3 3 25 545 794
4 2 20 369 520
5 1 19 320 320
;
data want;
merge have have(keep=max_amt rename=(max_amt=_max_amt) firstobs=2);
adj_amt_min=max(min_amt,_max_amt);
drop _max_amt;
run;
Thank You so much for taking your time and reply. It worked for one group of data however i'm facing issue when there is multiple groups.
I have many groups that has different tier, disc,min_amt and max_amt. Within Each group the tier's min_amt should be compared with the next tier's max_amt. And this iteration should be done from tier 5 to tier 1 of each group.
I have grouped the data set HAVE by unique id and used your code but in the results _max_amt is displaying the max(max_amt) for tier 1 in uniqueID A1 as 1179. but it should display the max_amt as adj_amt_min since it is the last row within that group. Can you please share what could happened for this incorrect results.
My Test data:
UniqueID tier disc min_amt max_amt ;
A1 5 28 1278 2055
A1 4 25 849 1295
A1 3 25 545 794
A1 2 20 369 520
A1 1 19 320 320
B2 5 28 724 1179
B2 4 25 627 859
B2 3 25 456 691
B2 2 20 353 470
B2 1 19 332 332
Thanks in Advance.
data have;
input UniqueID $ tier disc min_amt max_amt ;
cards;
A1 5 28 1278 2055
A1 4 25 849 1295
A1 3 25 545 794
A1 2 20 369 520
A1 1 19 320 320
B2 5 28 724 1179
B2 4 25 627 859
B2 3 25 456 691
B2 2 20 353 470
B2 1 19 332 332
;
data want;
merge have have(keep=UniqueID max_amt rename=(UniqueID=_UniqueID max_amt=_max_amt) firstobs=2);
if UniqueID=_UniqueID then adj_amt_min=max(min_amt,_max_amt);
drop _UniqueID _max_amt;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.