DATA Step, Macro, Functions and more

LAG FUNCTION VS EXCEL IF FUNCTION

Accepted Solution Solved
Reply
Contributor
Posts: 32
Accepted Solution

LAG FUNCTION VS EXCEL IF FUNCTION

HI, 

 

Need some help, i have a list of code and need to do some comparison between first and second and so on. My code as below:-

 

 

data have;
infile cards ;
input code $16.;
cards;
10017290064CONE
10019220069CFCP
10019220069PONE
10022640063CLADY
10031300063CONE
10031420069CFCP
10031420069PGEN
10031420069PGEN
10031420069PONE
10031770067CONE
10034410067CONE
10034480060CONE
10044510062PGEN
10044510062PGEN
10046570064CONE
10046570064CONE
10049240061CONE
10051110061LADY
10051230067PGEN
10051230067PONE
;

data want;
	SET have;
	IF FIRST.CODE THEN 
		SGCHK = "DIFF";
	ELSE
		IF lag(CODE) = CODE THEN 
			SGCHK = "DUP";
		ELSE 
			SGCHK = "DIFF";
RUN;

and output as below:- 

 

codeSGCHK
10017290064CONEDIFF
10019220069CFCPDIFF
10019220069PONEDIFF
10022640063CLADYDIFF
10031300063CONEDIFF
10031420069CFCPDIFF
10031420069PGENDIFF
10031420069PGENDUP
10031420069PONEDIFF
10031770067CONEDIFF
10034410067CONEDIFF
10034480060CONEDIFF
10044510062PGENDIFF
10044510062PGENDUP
10046570064CONEDIFF
10046570064CONEDUP
10049240061CONEDIFF
10051110061LADYDIFF
10051230067PGENDIFF
10051230067PONEDIFF

 

 

Is there a way to get output as below?

 

codeSGCHK
10017290064CONEDIFF
10019220069CFCPDIFF
10019220069PONEDIFF
10022640063CLADYDIFF
10031300063CONEDIFF
10031420069CFCPDIFF
10031420069PGENDUP
10031420069PGENDIFF
10031420069PONEDIFF
10031770067CONEDIFF
10034410067CONEDIFF
10034480060CONEDIFF
10044510062PGENDUP
10044510062PGENDIFF
10046570064CONEDUP
10046570064CONEDIFF
10049240061CONEDIFF
10051110061LADYDIFF
10051230067PGENDIFF
10051230067PONEDIFF

 

Thank you


Accepted Solutions
Solution
‎06-30-2017 04:37 AM
Super User
Posts: 6,936

Re: LAG FUNCTION VS EXCEL IF FUNCTION

Then you "simply" read dataset "have" two times side-by-side, but have one read shifted by one observation:

data want;
merge
  have
  have (firstobs=2 rename=(code=code2))
;
if code = code2
then sgchk = 'DUP';
else sgchk = 'DIF';
drop code2;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers

View solution in original post


All Replies
Super User
Posts: 17,824

Re: LAG FUNCTION VS EXCEL IF FUNCTION

Use the BY groups first/last values. 

 

data want;
Set have;
 
By code;

If first.code and not last.code then SGCHK='DUP';
Else sgchk='DIFF';

Run;
Contributor
Posts: 32

Re: LAG FUNCTION VS EXCEL IF FUNCTION

[ Edited ]

Thank Reeza,

 

it seem not working if code appear more than 2  as per below:-

 

10034410067CONE - appear 3 times

10051230067PONE - appear 4 times

 

data have;
infile cards ;
input code $16.;
cards;
10017290064CONE
10019220069CFCP
10019220069PONE
10022640063CLADY
10031300063CONE
10031420069CFCP
10031420069PGEN
10031420069PGEN
10031420069PONE
10031770067CONE
10034410067CONE
10034410067CONE
10034410067CONE 10034480060CONE 10044510062PGEN 10044510062PGEN 10046570064CONE 10046570064CONE 10049240061CONE 10051110061LADY 10051230067PGEN 10051230067PONE
10051230067PONE
10051230067PONE
10051230067PONE ; data want; Set have; By code; If first.code and not last.code then SGCHK='DUP'; Else sgchk='DIFF'; Run;

and output is 

 

codeSGCHK
10017290064CONEDIF
10019220069CFCPDIF
10019220069PONEDIF
10022640063CLADYDIF
10031300063CONEDIF
10031420069CFCPDIF
10031420069PGENDUP
10031420069PGENDIF
10031420069PONEDIF
10031770067CONEDIF
10034410067CONEDUP
10034410067CONEDIF
10034410067CONEDIF
10034480060CONEDIF
10044510062PGENDUP
10044510062PGENDIF
10046570064CONEDUP
10046570064CONEDIF
10049240061CONEDIF
10051110061LADYDIF
10051230067PGENDIF
10051230067PONEDUP
10051230067PONEDIF
10051230067PONEDIF
10051230067PONEDIF


The output should be 

 

codeSGCHK
10017290064CONEDIF
10019220069CFCPDIF
10019220069PONEDIF
10022640063CLADYDIF
10031300063CONEDIF
10031420069CFCPDIF
10031420069PGENDUP
10031420069PGENDIF
10031420069PONEDIF
10031770067CONEDIF
10034410067CONEDUP
10034410067CONEDUP
10034410067CONEDIF
10034480060CONEDIF
10044510062PGENDUP
10044510062PGENDIF
10046570064CONEDUP
10046570064CONEDIF
10049240061CONEDIF
10051110061LADYDIF
10051230067PGENDIF
10051230067PONEDUP
10051230067PONEDUP
10051230067PONEDUP
10051230067PONEDIF
Super User
Posts: 6,936

Re: LAG FUNCTION VS EXCEL IF FUNCTION

I miss the logic here:

10031420069CFCP DIFF
10031420069PGEN DUP
10031420069PGEN DIFF
10031420069PONE DIFF

Why should the first appearance of 10031420069PGEN be marked DUP, and the second(!) DIFF?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 32

Re: LAG FUNCTION VS EXCEL IF FUNCTION

Hi KurtBremser,

 

I'm trying to move my task from Ms Excel to Sas, that DUP and DIFF is for checking purposes(when i was using Ms Excel, formula is =IF(A2=A3,"DUP","DIF") ), I wanted to keep that in SAS for checking purposes before fully use SAS to do my task.

Super User
Posts: 6,936

Re: LAG FUNCTION VS EXCEL IF FUNCTION


sagulolo wrote:

Hi KurtBremser,

 

I'm trying to move my task from Ms Excel to Sas, that DUP and DIFF is for checking purposes(when i was using Ms Excel, formula is =IF(A2=A3,"DUP","DIF") ), I wanted to keep that in SAS for checking purposes before fully use SAS to do my task.


That does not explain the flaw (as I perceive it) in your logic. Calling a record DIF when it's a duplicate of the preceding record makes my toenails curl.

You might want to review the logic in your Excel.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 32

Re: LAG FUNCTION VS EXCEL IF FUNCTION

Hi KurtBremser,

 

when i have this listing

10017290064CONEDIF
10019220069CFCPDIF
10019220069PONEDIF
10022640063CLADYDIF
10031300063CONEDIF
10031420069CFCPDIF
10031420069PGENDUP
10031420069PGENDIF
10031420069PONEDIF
10031770067CONEDIF
10034410067CONEDUP
10034410067CONEDUP
10034410067CONEDIF
10034480060CONEDIF
10044510062PGENDUP
10044510062PGENDIF
10046570064CONEDUP
10046570064CONEDIF
10049240061CONEDIF
10051110061LADYDIF
10051230067PGENDIF
10051230067PONEDUP
10051230067PONEDUP
10051230067PONEDUP
10051230067PONEDIF

 

and im going to produce a new list 

 

10017290064CONEDIF
10019220069CFCPDIF
10019220069PONEDIF
10022640063CLADYDIF
10031300063CONEDIF
10031420069PGENDIF
10031420069PONEDIF
10034410067CONEDIF
10046570064CONEDIF
10049240061CONEDIF
10051110061LADYDIF
10051230067PONEDIF

 

of course, it will always have more variable and im always wanted the record with DIF(last row instead of first row)

Super User
Posts: 6,936

Re: LAG FUNCTION VS EXCEL IF FUNCTION

So you want to set "DUP" when an observation is a duplicate of the following record?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 32

Re: LAG FUNCTION VS EXCEL IF FUNCTION

Dear KurtBremser,

 

Yes, sorry for my bad english.

 

 

Solution
‎06-30-2017 04:37 AM
Super User
Posts: 6,936

Re: LAG FUNCTION VS EXCEL IF FUNCTION

Then you "simply" read dataset "have" two times side-by-side, but have one read shifted by one observation:

data want;
merge
  have
  have (firstobs=2 rename=(code=code2))
;
if code = code2
then sgchk = 'DUP';
else sgchk = 'DIF';
drop code2;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 32

Re: LAG FUNCTION VS EXCEL IF FUNCTION

Dear KurtBremser,

Thank you so much, this is really help me a lot.
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 150 views
  • 0 likes
  • 3 in conversation