BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sas_newbie94
Obsidian | Level 7

Hello i would like to derive column E and F in SAS. I have done it via Excel and this is as follows.

 

Column E: =IF(D2=1,1,IF(OR(C2=1,A2<>A1),"",E1))

 

Column F: =IF(AND(E1="",E2=1),B2,IF(AND(E2=1,E1=1),IF(B2>F1,B2,F1),""))

 

help.jpg

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

I overlooked a detail in my code; the new condition needs to be implemented in another place:

data want;
set have;
by accno;
retain newflag newamt;
if first.accno or sequence = 1
then do;
  newflag = .;
  newamt = .;
end;
if flag = 1
then do;
  newflag = 1;
  newamt = max(amt,newamt);
end;
run;

View solution in original post

22 REPLIES 22
PaigeMiller
Diamond | Level 26

Speaking as someone who is not particularly good at Excel, I can't really interpret your Excel code, but ... YOU can! Write down your code in plain English as IF THEN statements, and show us those. From there, show your plain English to us. It should be relatively easy to translate plain English IF THEN statements into SAS IF THEN statements.

--
Paige Miller
sas_newbie94
Obsidian | Level 7

Ah okay! So for Column E, the new flag column is derived basing on Column D the flag column. As long as there's a flag indicator where Column D hits 1, it will carry forward to subsequent rows in Column E until the point where Column C becomes 1 or when Column A is a new account number. It will loop and check for column D whether there's a flag for 1 and then once hit, it will add 1 to that row and the subsequent rows in Column E. I tried looking up and using the LAG function but it does not seems to work in this scenario. 

 

For Column F, it's slightly more complicated where at the very first instance when Column E (derived above) has a flag 1, we will take that row $ Amt in Column B. For the subsequent rows, we are required to compare the newly derived amount in Column F with Column B, the higher of the two prevails and will be the new value in Column F which can be seen in the image uploaded.

mklangley
Lapis Lazuli | Level 10

Like @PaigeMiller suggested, converting your Excel formulas to plain English is a good first step.  As another approach, I attempted to convert your formulas to pseudo-code.  You can revise it if there are any errors.

 

if D2 = 1
    then column_E = "1"

else if C2 = 1 or A2 ne A1

    then column_E = " "

else column_E = E1

 

if (E1 = " " and E2 = 1)

    then column_F = B2

else if (E2 = 1 and E1 = 1)

    then do

        if B2 > F1 then column_F = B2

        else column_F = F1

    end

else column_F = " "

 

(As a side note, make sure the formulas correspond to the data you provided.  The formulas you included reference cells in row 1, which contains the headers (e.g. E1 is NEW FLAG).  I don't think you intended those to be part of it.)

sas_newbie94
Obsidian | Level 7

Hi thanks for the replies guys! Really appreciate it.

 

For SAS, unfortunately there's no cell reference like what I can do in Excel. Not sure if you have any idea how we can compare a single cell record in SAS with the subsequent records? I have tried using LAG in SAS but it's only able to compare previous record for Column B. For my case i will need a dynamic LAG to compare with my newly derived values in Column F against Column B row value.

Kurt_Bremser
Super User

You do have column names: accno, amt, sequence and flag. Use these, and add "from previous row" or "from first row in group" where adequate.

 

You can also describe your algorithm this way:

"in the first row of each accno I want to do X, in the next row I want to do Y, in subsequent rows I want to do Z".

sas_newbie94
Obsidian | Level 7

Thanks for the tips! 

 

For the first derived column, [NEW FLAG; Column E]:

In the first row of each account number (Column A), if there's no flag indication with the value 1 (Column D), my newly derived value (Column E) will be a blank. If an account number has an indication of value 1 (Column D). my new Flag column will inherit the value of 1 (Column E). This value will be inherited to subsequent rows with the value of 1 until there's a new account number (it will stop inheriting the value of 1 and check for the new account number if there's an indication of value 1 (Column D) again, in a way it will be 'resetted') OR if it's the same account number, the Sequence column (Column C) indicates a 1, this will also 'reset' the check and stop inheriting the value of 1 until there's a flag indication with the value of 1 (Column D) again.

 

For the second derived column, [NEW $ AMT; Column F]:

This part it will tend to be slightly trickier. In the first row where it's either a new accno OR same accno but sequence of 1, where the NEW FLAG Column E has a value of 1, we will take the first $ Amt and placed it under NEW $AMT which can be seen highlighted in yellow. For subsequent rows, the newly derived value in Column F we will have to compare with Column B and take the higher of the two which can be seen in the image attached.

 

help2.jpg

Tom
Super User Tom
Super User

I don't see any need to look ahead in the photo you just posted.

ballardw
Super User

@sas_newbie94 wrote:

Hi thanks for the replies guys! Really appreciate it.

 

For SAS, unfortunately there's no cell reference like what I can do in Excel. Not sure if you have any idea how we can compare a single cell record in SAS with the subsequent records? I have tried using LAG in SAS but it's only able to compare previous record for Column B. For my case i will need a dynamic LAG to compare with my newly derived values in Column F against Column B row value.


RETAIN would likely be a better approach. That makes a variable whose values are maintained across iterations of the data step. You have to set the value for the retained value, likely copying an existing variable to the retained one as needed.

 

A not particularly useful example of the syntax where the Retain statement sets an initial value:

data work.class;
   set sashelp.class;
   retain beforejane 'Yes';
   if name='Jane' then beforejane='No';
run;

You should specify a length for retained character variables before the retain statement if you do not initialize them, or the initial value is shorter than some expected values that need to be assigned.

TomKari
Onyx | Level 15

I think this will get you started. Note that the important part is the "retain" statement; it allows carrying a result forward. See if this is going in the direction you want, and we can keep working on it.

 

Tom

 

data Have;
	input AccNo AMT Sequence FLAG;
	cards;
1 111663 29 .
1 111663 30 .
1 111663 31 1
1 111663 32 .
1 111663 1 .
1 111663 2 1
2 111663 1 1
3 111663 1 .
3 111663 2 1
run;

data Want;
	set Have;
	retain OldAcc . Switch 0 NEWFLAG . NEWAMT .;

	if AccNo > OldAcc | Sequence = 1 then
		Switch = 0;

	if FLAG = 1 then
		Switch = 1;

	if Switch = 1 then
		do;
			NEWFLAG = 1;
			NEWAMT = AMT;
		end;
	else
		do;
			NEWFLAG = .;
			NEWAMT = .;
		end;

	OldAcc = AccNo;
run;
mklangley
Lapis Lazuli | Level 10

@sas_newbie94   Give this a try:

 

data have;
    infile datalines missover;
    input accNo AMT Sequence Flag;
    datalines;
    1 111663 29  
    1 111760 30  
    1 111986 31 1
    1 108676 32  
    1 108774 33  
    1 108348 34  
    1 269278 35  
    1 269004 36  
    1 265680 1   
    1 265120 2  1
    1 264504 3   
    1 263420 4   
    2 194655 1  1
    3 114645 1   
    3 114378 2  1
    3 200000 3   
    3 15000 4    
    ;
run;


options mprint;
data want (drop=start x y);
    set have;
    retain start y;

    /* New_Flag logic */
    if flag = 1
        then do;
            new_flag = 1;
            start = 1;
        end;
    else if sequence = 1 or accNo ne lag(accNo)
        then do;
            new_flag = .;
            start = 0;
        end;
    else if start = 1
        then new_flag = 1;
    else new_flag = lag(new_flag);

    x = new_flag;

    /* New_Amt logic */
    if x = . and new_flag = 1
        then new_amt = amt;
    else if (new_flag = 1 and x = 1)
        then new_amt = max(amt, y);
    else new_amt = .;

    y = new_amt;
run;

mklangley_0-1601915755728.png

 

sas_newbie94
Obsidian | Level 7

Thanks @mklangley for your help! It's really close to getting the desired output! However apologies i may have flagged incorrectly for one of the rows for New $ Amt (Column F) which I have corrected the values and highlighted in green. Since it's a new account number, it should take the $ Amt at Column B as the NEW $ Amt and not the prior $ Amt from Account 1. Thankful for your help! 

 

help4.jpg

 

sas_newbie94_0-1601917335248.png

 

 

 

 

mklangley
Lapis Lazuli | Level 10

Okay, how about this?

options mprint;
data want (drop=start x y);
    set have;
    retain start y;

    /* New_Flag logic */
    if flag = 1
        then do;
            new_flag = 1;
            start = 1;
        end;
    else if sequence = 1 or accNo ne lag(accNo)
        then do;
            new_flag = .;
            start = 0;
        end;
    else if start = 1
        then new_flag = 1;

    x = new_flag;

    /* New_Amt logic */
    if x = . and new_flag = 1
        then new_amt = amt;
    else if (new_flag = 1 and x = 1 and accNo = lag(accNo))
        then new_amt = max(amt, y);
    else if (new_flag = 1 and x = 1 and accNo ne lag(accNo))
        then new_amt = amt;
    else new_amt = .;

    y = new_amt;
run;

proc print; run;
sas_newbie94
Obsidian | Level 7

Dear @mklangley ,

 

Thanks for your prompt replies and speedy updates. Your code is really good and I appreciate your help. However, I realised that I have another scenario where there's another logic to deriving new $ AMT column F. I would like to seek your help once again.

 

For Column F, new $ Amt, at the first time where Column D hits a value of 1, it will take the $ AMT for that row and inherit across subsequent rows until it hits another value of 1 in column D flag, that's where we will compare the two $ AMT and take the higher of the two. Take note that the new $ AMT needs to only be included for new flag = 1 and do the checks accordingly.

 

Desired Output as follows:

 

sas_newbie94_0-1601977737327.png

 

Kurt_Bremser
Super User

Please post your data in a data step with datalines, as @mklangley has already shown. We cannot test SAS code against pictures. Data steps with datalines are not rocket science, but a very useful skill, and help greatly in illustrating issues.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Creating Custom Steps in SAS Studio

Check out this tutorial series to learn how to build your own steps in SAS Studio.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 22 replies
  • 3542 views
  • 5 likes
  • 7 in conversation