BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
alexdamado
Calcite | Level 5

Hello guys,

 

I have a table containning different names for the same registry. The error is due to the name variable being pulled from different databases and is simply a different spelling since the ID is definitely the same.

 

I would like to correct this error by simply substituting one of the versions of the name by the other.

 

I came up with the following table to exemplify my problem:

 

DATA HAVE;
INFILE DATALINES DLM='	';
LENGTH NAME $ 20;
INPUT NAME $ ID;
DATALINES;
ALEX	123456
ALEX	123456
ALEXANDER	123456
;
RUN;

To try and correct it I wanted to use the lag function to check if the ID from the current row is the same from the row above but with a different name and, when that is the case, substitute the name in the current row with the name on the previous row. I came up with the following code.

 

DATA WANT;
 SET HAVE;
 IF ID = LAG(ID)
     AND NAME NE LAG(NAME)
 THEN NAME = LAG(NAME);
RUN;

But this code gives me a blank when the result I was expecting would be Alex. I was able to determine that the problem is in the IF statement, because using only the the lag(name) as a new variable, without the if statement, it works fine.

 

I am at a loss of what the problem might be and would appreciate any help or pointers.

 

Thanks in advance for the help

 

Alex (Alexander : ) )

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Don't use conditional assignments with lag.  See introduction in:

http://support.sas.com/resources/papers/proceedings09/055-2009.pdf

 

The easiest way to fix your data is to first extract the id and decode you want, be it the first or some other calculation, then merge that back onto the original data.  Alternatively you could retain a value, e.g:

data want;
  set have;
  by id;
  length lstname $200;
  retain lstname;
  if first.id then lstname=name;
  else if name ne lstname then name=lstname;
run;

Note how I don't code all in uppercase as I don't want to shout the code at you.

 

 

View solution in original post

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Don't use conditional assignments with lag.  See introduction in:

http://support.sas.com/resources/papers/proceedings09/055-2009.pdf

 

The easiest way to fix your data is to first extract the id and decode you want, be it the first or some other calculation, then merge that back onto the original data.  Alternatively you could retain a value, e.g:

data want;
  set have;
  by id;
  length lstname $200;
  retain lstname;
  if first.id then lstname=name;
  else if name ne lstname then name=lstname;
run;

Note how I don't code all in uppercase as I don't want to shout the code at you.

 

 

alexdamado
Calcite | Level 5

Thank you for the quick reply and apologies for the shouting : )

Tom
Super User Tom
Super User

Your two different calls to LAG(NAME) are maintaining two different stacks of values.  Since the second one only executes occasionally the values it pulls back off the stack are totally different.

It looks like you are trying to keep the first copy of the name.  So just tell SAS that is what you are doing.

data want ;
  set have ;
  by id;
  if first.id then name1 = name;
  name=name1;
  retain name1 ;
  drop name1;
run;

 

novinosrin
Tourmaline | Level 20

You can use lag in conditional statements lag but indeed it is tricky, either you populate the lagged var in another variable or use ifc/ifn

 

So your logic is correct and requires a very minor tweak as i mentioned, populate lagged var in another variable as illustrated below

DATA HAVE;
INFILE DATALINES DLM='	';
LENGTH NAME $ 20;
INPUT NAME $ ID;
DATALINES;
ALEX	123456
ALEX	123456
ALEXANDER	123456
;
RUN;


DATA WANT;
 SET HAVE;
 temp=lag(name);
 IF ID = LAG(ID)
     AND NAME NE LAG(NAME)
 THEN NAME =temp;
 drop temp;
RUN;

 or


DATA HAVE;
INFILE DATALINES DLM='	';
LENGTH NAME $ 20;
INPUT NAME $ ID;
DATALINES;
ALEX	123456
ALEX	123456
ALEXANDER	123456
;
RUN;

DATA WANT;
 SET HAVE;
 name=ifc( ID = LAG(ID) and  NAME NE LAG(NAME),lag(name),name);
RUN;

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 7711 views
  • 2 likes
  • 4 in conversation