Multiple IF, THEN, DO, ELSE not producing consistent results

Accepted Solution Solved
Reply
Contributor
Posts: 33
Accepted Solution

Multiple IF, THEN, DO, ELSE not producing consistent results

Good morning,

    I have spent days trying to work the logic on some IF, THEN, DO, ELSE statements and every time I get one to work, the other doesn't.

I am using SAS 9.2. I have tried consecutive IF, THEN, DO statements to try to make the IF THEN ELSE statements read certain blocks of data but that hasn't been working for me. I have tried everything I know how to try.

 

Background:

The data is a result of comparing two datasets for differences using PROC COMPARE.

 

Need:

I need to be able to differentiate between a difference as a result of having a value missing in one dataset and not the other and when the difference is as a result of an actual difference (no missing values). 

 

Issues:

I found I needed to remove the X's from difference because sometimes the if statements would work if the DIFF value was blank. I included two WANT2 tables to show a couple of examples of what I've tried. But I could have included 50. I can get everything to = 1 or MISSING, but not either.

 

 

data have;
infile datalines dlm = ':' truncover;
length SOURCE $10. ID1 $2. ID2 $2. PHONE $15.;
input SOURCE $ ID1 $ ID2 $ PHONE $ ;
datalines;
SOURCE1: 1: 1: 5555555:
SOURCE2: 1: 1: 5555555:
DIFF: 1: 1: .:
SOURCE1: 1: 2: :
SOURCE2: 1: 2: :
DIFF: 1: 2: .:
SOURCE1: 2: 1: 9999999:
SOURCE2: 2: 1: 
DIFF: 2: 1: XXXXXXX:
SOURCE1: 3: 1:  :
SOURCE2: 3: 1: 4444444:
DIFF: 3: 1: XXXXXXX:
SOURCE1: 4: 1: 4448888:
SOURCE2: 4: 1: 4444444:
DIFF: 4: 1: ...XXXX:
;
run;

 
data want;
set have;
if SOURCE = "DIFF" and index(PHONE,'X') then PHONE = "";
else PHONE = PHONE;
run;

data want2;
set have;
by 
ID1 ID2;
if SOURCE = "SOURCE2" then do;
if PHONE = "" and lag(PHONE) ne "" then do;
if SOURCE = "DIFF" then PHONE = "MISSING";
end;

else if PHONE ne "" and lag(PHONE) = "" then do;
if SOURCE = "DIFF" then PHONE = "MISSING";
end;

else if PHONE ne lag(PHONE) then do;
if SOURCE = "DIFF" then PHONE = '1';
end;

end;

run;

 

 

data want2;
set have;
by 
ID1 ID2;
if SOURCE = "SOURCE2" and PHONE = "" and lag(PHONE) ne "" then do;
if SOURCE = "DIFF" then PHONE = "MISSING";
end;

else if SOURCE = "SOURCE2" and PHONE ne "" and lag(PHONE) = "" then do;
if SOURCE = "DIFF" then PHONE = "MISSING";
end;

else if SOURCE = "SOURCE2" and PHONE ne lag(PHONE) then do;
if SOURCE = "DIFF" then PHONE = '1';
end;

run;

 I would love the end result to be this:

 

 

SOURCE    ID1  ID2   PHONE

SOURCE1  1     1       5555555

SOURCE2  1     1       5555555

DIFF           1     1       

SOURCE1  1     2       

SOURCE2  1     2

DIFF            1     2

SOURCE1   2    1       9999999

SOURCE2   2    1     

DIFF            2    1        MISSING

SOURCE1   3     1

SOURCE2   3     1      4444444

DIFF            3      1     MISSING

SOURCE1  4      1     4448888

SOURCE2  4      1     4447777

DIFF            4      1     1 

 

 

Where am I going wrong?


Accepted Solutions
Solution
‎11-20-2017 11:29 AM
Respected Advisor
Posts: 2,155

Re: Multiple IF, THEN, DO, ELSE not producing consistent results

[ Edited ]
Posted in reply to sas-inquirer

While I haven't gone through this in a lot of detail, the LAG function inside an IF statement doesn't work the way you think it should work.

 

You need to use the LAG function in a command that always executes and is not inside an IF statement.

 

For example, in dataset WANT2, try this immediately after the BY command:

 

prevphone=lag(phone);

and of course use prevphone in the rest of the code.

--
Paige Miller

View solution in original post


All Replies
Solution
‎11-20-2017 11:29 AM
Respected Advisor
Posts: 2,155

Re: Multiple IF, THEN, DO, ELSE not producing consistent results

[ Edited ]
Posted in reply to sas-inquirer

While I haven't gone through this in a lot of detail, the LAG function inside an IF statement doesn't work the way you think it should work.

 

You need to use the LAG function in a command that always executes and is not inside an IF statement.

 

For example, in dataset WANT2, try this immediately after the BY command:

 

prevphone=lag(phone);

and of course use prevphone in the rest of the code.

--
Paige Miller
Contributor
Posts: 33

Re: Multiple IF, THEN, DO, ELSE not producing consistent results

Posted in reply to PaigeMiller

Thank you PaigeMiller. I have learned my lesson with lag. Your suggestion worked great. I had to add some coding to get my desired results, but everything looks just the way it should. Thank you so much. I cannot wait to put this issue to rest!

data want3 (drop = phone_result_fill prevphone phone_result);
set want;
by 
ID1 ID2;

prevphone = lag(phone);

if source = "source2" then do;
if phone = "" and prevphone ne "" or phone ne "" and prevphone = "" then phone_result = "MISSING";
else if phone ne prevphone then phone_result = "1";
end;

retain phone_result_fill;
if not missing (phone_result) then phone_result_fill = phone_result;
phone_result = phone_result_fill;

if source = "DIFF" then phone = phone_result;

run;
Super User
Super User
Posts: 8,634

Re: Multiple IF, THEN, DO, ELSE not producing consistent results

Posted in reply to sas-inquirer

The fact that you do not format your code in a readable way, and use upcase coding really makes your code hard to read.  From what I gather from your post, I would imagine your issue is caused by using the lag() functions in if's.  You can read several posts on here about why it is not a good idea to do this.  For my money I would find it far simpler to transpose (yes, I know against what I normally suggest) and then do the conditionals:

data have;
  infile datalines dlm = ':' truncover;
  length source $10. id1 $2. id2 $2. phone $15.;
  input source $ id1 $ id2 $ phone $ ;
datalines;
SOURCE1: 1: 1: 5555555:
SOURCE2: 1: 1: 5555555:
DIFF: 1: 1: .:
SOURCE1: 1: 2: :
SOURCE2: 1: 2: :
DIFF: 1: 2: .:
SOURCE1: 2: 1: 9999999:
SOURCE2: 2: 1: 
DIFF: 2: 1: XXXXXXX:
SOURCE1: 3: 1:  :
SOURCE2: 3: 1: 4444444:
DIFF: 3: 1: XXXXXXX:
SOURCE1: 4: 1: 4448888:
SOURCE2: 4: 1: 4444444:
DIFF: 4: 1: ...XXXX:
;
run;

proc transpose data=have out=want;
  by id1 id2;
  var phone;
  id source;
run;

This should put all the necessary information on one row.

Contributor
Posts: 33

Re: Multiple IF, THEN, DO, ELSE not producing consistent results

Thank you for your response RW9. I will certainly not be using lag in IF statements any more. I like your idea of using transpose, but my actual data set is way too large for transpose to work. Thanks again (and I'll work on making my code more readable).

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 81 views
  • 0 likes
  • 3 in conversation