BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sas-inquirer
Quartz | Level 8

Good morning,

    I have spent days trying to work the logic on some IF, THEN, DO, ELSE statements and every time I get one to work, the other doesn't.

I am using SAS 9.2. I have tried consecutive IF, THEN, DO statements to try to make the IF THEN ELSE statements read certain blocks of data but that hasn't been working for me. I have tried everything I know how to try.

 

Background:

The data is a result of comparing two datasets for differences using PROC COMPARE.

 

Need:

I need to be able to differentiate between a difference as a result of having a value missing in one dataset and not the other and when the difference is as a result of an actual difference (no missing values). 

 

Issues:

I found I needed to remove the X's from difference because sometimes the if statements would work if the DIFF value was blank. I included two WANT2 tables to show a couple of examples of what I've tried. But I could have included 50. I can get everything to = 1 or MISSING, but not either.

 

 

data have;
infile datalines dlm = ':' truncover;
length SOURCE $10. ID1 $2. ID2 $2. PHONE $15.;
input SOURCE $ ID1 $ ID2 $ PHONE $ ;
datalines;
SOURCE1: 1: 1: 5555555:
SOURCE2: 1: 1: 5555555:
DIFF: 1: 1: .:
SOURCE1: 1: 2: :
SOURCE2: 1: 2: :
DIFF: 1: 2: .:
SOURCE1: 2: 1: 9999999:
SOURCE2: 2: 1: 
DIFF: 2: 1: XXXXXXX:
SOURCE1: 3: 1:  :
SOURCE2: 3: 1: 4444444:
DIFF: 3: 1: XXXXXXX:
SOURCE1: 4: 1: 4448888:
SOURCE2: 4: 1: 4444444:
DIFF: 4: 1: ...XXXX:
;
run;

 
data want;
set have;
if SOURCE = "DIFF" and index(PHONE,'X') then PHONE = "";
else PHONE = PHONE;
run;

data want2;
set have;
by 
ID1 ID2;
if SOURCE = "SOURCE2" then do;
if PHONE = "" and lag(PHONE) ne "" then do;
if SOURCE = "DIFF" then PHONE = "MISSING";
end;

else if PHONE ne "" and lag(PHONE) = "" then do;
if SOURCE = "DIFF" then PHONE = "MISSING";
end;

else if PHONE ne lag(PHONE) then do;
if SOURCE = "DIFF" then PHONE = '1';
end;

end;

run;

 

 

data want2;
set have;
by 
ID1 ID2;
if SOURCE = "SOURCE2" and PHONE = "" and lag(PHONE) ne "" then do;
if SOURCE = "DIFF" then PHONE = "MISSING";
end;

else if SOURCE = "SOURCE2" and PHONE ne "" and lag(PHONE) = "" then do;
if SOURCE = "DIFF" then PHONE = "MISSING";
end;

else if SOURCE = "SOURCE2" and PHONE ne lag(PHONE) then do;
if SOURCE = "DIFF" then PHONE = '1';
end;

run;

 I would love the end result to be this:

 

 

SOURCE    ID1  ID2   PHONE

SOURCE1  1     1       5555555

SOURCE2  1     1       5555555

DIFF           1     1       

SOURCE1  1     2       

SOURCE2  1     2

DIFF            1     2

SOURCE1   2    1       9999999

SOURCE2   2    1     

DIFF            2    1        MISSING

SOURCE1   3     1

SOURCE2   3     1      4444444

DIFF            3      1     MISSING

SOURCE1  4      1     4448888

SOURCE2  4      1     4447777

DIFF            4      1     1 

 

 

Where am I going wrong?

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

While I haven't gone through this in a lot of detail, the LAG function inside an IF statement doesn't work the way you think it should work.

 

You need to use the LAG function in a command that always executes and is not inside an IF statement.

 

For example, in dataset WANT2, try this immediately after the BY command:

 

prevphone=lag(phone);

and of course use prevphone in the rest of the code.

--
Paige Miller

View solution in original post

4 REPLIES 4
PaigeMiller
Diamond | Level 26

While I haven't gone through this in a lot of detail, the LAG function inside an IF statement doesn't work the way you think it should work.

 

You need to use the LAG function in a command that always executes and is not inside an IF statement.

 

For example, in dataset WANT2, try this immediately after the BY command:

 

prevphone=lag(phone);

and of course use prevphone in the rest of the code.

--
Paige Miller
sas-inquirer
Quartz | Level 8

Thank you PaigeMiller. I have learned my lesson with lag. Your suggestion worked great. I had to add some coding to get my desired results, but everything looks just the way it should. Thank you so much. I cannot wait to put this issue to rest!

data want3 (drop = phone_result_fill prevphone phone_result);
set want;
by 
ID1 ID2;

prevphone = lag(phone);

if source = "source2" then do;
if phone = "" and prevphone ne "" or phone ne "" and prevphone = "" then phone_result = "MISSING";
else if phone ne prevphone then phone_result = "1";
end;

retain phone_result_fill;
if not missing (phone_result) then phone_result_fill = phone_result;
phone_result = phone_result_fill;

if source = "DIFF" then phone = phone_result;

run;
RW9
Diamond | Level 26 RW9
Diamond | Level 26

The fact that you do not format your code in a readable way, and use upcase coding really makes your code hard to read.  From what I gather from your post, I would imagine your issue is caused by using the lag() functions in if's.  You can read several posts on here about why it is not a good idea to do this.  For my money I would find it far simpler to transpose (yes, I know against what I normally suggest) and then do the conditionals:

data have;
  infile datalines dlm = ':' truncover;
  length source $10. id1 $2. id2 $2. phone $15.;
  input source $ id1 $ id2 $ phone $ ;
datalines;
SOURCE1: 1: 1: 5555555:
SOURCE2: 1: 1: 5555555:
DIFF: 1: 1: .:
SOURCE1: 1: 2: :
SOURCE2: 1: 2: :
DIFF: 1: 2: .:
SOURCE1: 2: 1: 9999999:
SOURCE2: 2: 1: 
DIFF: 2: 1: XXXXXXX:
SOURCE1: 3: 1:  :
SOURCE2: 3: 1: 4444444:
DIFF: 3: 1: XXXXXXX:
SOURCE1: 4: 1: 4448888:
SOURCE2: 4: 1: 4444444:
DIFF: 4: 1: ...XXXX:
;
run;

proc transpose data=have out=want;
  by id1 id2;
  var phone;
  id source;
run;

This should put all the necessary information on one row.

sas-inquirer
Quartz | Level 8

Thank you for your response RW9. I will certainly not be using lag in IF statements any more. I like your idea of using transpose, but my actual data set is way too large for transpose to work. Thanks again (and I'll work on making my code more readable).

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1025 views
  • 0 likes
  • 3 in conversation