## Comparing rows in data step

Solved
Frequent Contributor
Posts: 82

# Comparing rows in data step

Hello,

I’m trying to create a flag named prod in the below code which basically should compare column Status in each row by EMPID.  The prod flag should be ‘Y’ if the EMPID has different values in status columns or else it should be ‘N’, meaning they have the same status in all the rows.

Source Data Set:

Data aa;

infile cards;

input EMPID \$3. @5 Status \$3.;

cards;

123 Opn

123 Opn

345 Cld

345 Opn

567 Cld

567 Cld

567 Cld

999 Opn

984 Cld

236 Cld

236 Cld

236 Opn

;

Run;

Desired output:

EMPID Status  Prod

123   Opn     N

123   Opn     N

345   Cld     Y

345   Opn     Y

567   Cld     N

567   Cld     N

567   Cld     N

999   Opn     N

984   Cld     N

236   Cld     Y

236   Cld     Y

236   Opn     Y

My Code:

Data aa1;

retain PROD;

set aa;

by EMPID Status;

if first.Status = last.Status then PROD = 'N';

else PROD = 'N';

Run;

So in the above case, EMPID 236 Prod is ‘Y’ because they have 2 diff status values(Cld,Opn) whereas EMPID 123 is ‘N’ because the status is same for both the rows. I tried using first and last method but the output wasn’t the way I expected. I am beginner SAS programmer so wondering if I could get some help to code this one. Thanks in advance for your help!

Accepted Solutions
Solution
‎07-30-2014 06:00 PM
Super User
Posts: 13,542

## Re: Comparing rows in data step

proc sql;

create table want

as select a.*, case

when b.prodcount=1 then 'N'

when b.prodcount=2 then 'Y'

end as prod

from AA as a left join (select empid,count(*) as prodcount from (select distinct empid,status from aa)

group by empid) as b

on AA.Empid=b.empid;

quit;

Does the match but will result in data sorted on EmpID if that is a problem.

All Replies
Solution
‎07-30-2014 06:00 PM
Super User
Posts: 13,542

## Re: Comparing rows in data step

proc sql;

create table want

as select a.*, case

when b.prodcount=1 then 'N'

when b.prodcount=2 then 'Y'

end as prod

from AA as a left join (select empid,count(*) as prodcount from (select distinct empid,status from aa)

group by empid) as b

on AA.Empid=b.empid;

quit;

Does the match but will result in data sorted on EmpID if that is a problem.

Frequent Contributor
Posts: 82

## Re: Comparing rows in data step

I am getting output as expected. Thank you!. Wondering if this can be acheived using data step, asking just for my own learning.

Super Contributor
Posts: 275

## Re: Comparing rows in data step

data want;

do until (last.empid);

set aa;

by empid notsorted;

proc='N';

if not first.empid and status^=lag(status) then do;

proc='Y';

leave;

end;

end;

do until(last.empid);

set aa;

by empid notsorted;

output;

end;

run;

Posts: 3,167

## Re: Comparing rows in data step

Alternative:

proc sql;

create table want as

select *,

case

when (select count(distinct status) from aa where a.empid=empid) >1 then 'Y'

else 'N'

end

as flag from aa a;

quit;

Haikuo

Frequent Contributor
Posts: 82

## Re: Comparing rows in data step

Haikuo - Thank you for your  response. I ran your solution, it keeps running and never ends. My dataset size is close to 800K records.

Posts: 3,167

## Re: Comparing rows in data step

You are right. Mine only makes academic sense, it is very very inefficient, it pulls the whole table to match each obs.

Haikuo

Frequent Contributor
Posts: 82

## Re: Comparing rows in data step

No worries. Thanks anyways for your prompt response.

Posts: 3,167

## Re: Comparing rows in data step

Well, you just made me feel I owe you something , here is a data step solution, not necessarily slower than Proc SQL:

Data aa;

infile cards;

input EMPID \$3. @5 Status \$3.;

cards;

123 Opn

123 Opn

345 Cld

345 Opn

567 Cld

567 Cld

567 Cld

999 Opn

984 Cld

236 Cld

236 Cld

236 Opn

;

Run;

data want;

do until (last.empid);

set aa;

by empid notsorted;

if first.empid then DO; _s=status; flag='N';end;

if status ne _s then flag='Y';

end;

do until (last.empid);

set aa;

by empid notsorted;

output;

end;

drop _s;

run;

If you would, please bench mark both solutions ( 's and this one), let us know.

Thanks,

Haikuo

New Contributor
Posts: 3

## Re: Comparing rows in data step

With some modification , make the statement of b.prodcount=2 into  else 'Y'

Super User
Posts: 10,778

## Re: Comparing rows in data step

Not sure how fast this code would be .

```Data aa;
infile cards;
input EMPID \$3. @5 Status \$3.;
cards;
123 Opn
123 Opn
345 Cld
345 Opn
567 Cld
567 Cld
567 Cld
999 Opn
984 Cld
236 Cld
236 Cld
236 Opn
;
Run;
proc sql;
create table want as
select *,count(distinct Status) as flag
from aa
group by empid;
quit;

```

Xia Keshan

Super Contributor
Posts: 319

## Re: Comparing rows in data step

Hello,

here is a data step solution:

data want(drop=i j statusnew);
set aa;
by empid notsorted ;
retain flag statusnew j;

i+1;

if statusnew ne status and flag ne 'Y' then flag='Y';

if first.empid then do;
flag='N';
j=i;
end;

statusnew=status;

if last.empid then
do k=j to i;
set aa point=k;
output;
end;

run;

🔒 This topic is solved and locked.