DATA Step, Macro, Functions and more

Comparing rows in data step

Accepted Solution Solved
Reply
Contributor
Posts: 71
Accepted Solution

Comparing rows in data step

Hello,

I’m trying to create a flag named prod in the below code which basically should compare column Status in each row by EMPID.  The prod flag should be ‘Y’ if the EMPID has different values in status columns or else it should be ‘N’, meaning they have the same status in all the rows.

Source Data Set:  

Data aa;

infile cards;

input EMPID $3. @5 Status $3.;

cards;

123 Opn

123 Opn

345 Cld

345 Opn

567 Cld

567 Cld

567 Cld

999 Opn

984 Cld

236 Cld

236 Cld

236 Opn

;

Run;

Desired output:

EMPID Status  Prod

123   Opn     N

123   Opn     N

345   Cld     Y

345   Opn     Y

567   Cld     N

567   Cld     N

567   Cld     N

999   Opn     N

984   Cld     N

236   Cld     Y

236   Cld     Y

236   Opn     Y

My Code:

Data aa1;

retain PROD;

set aa;

by EMPID Status;

if first.Status = last.Status then PROD = 'N';

else PROD = 'N';

Run;

So in the above case, EMPID 236 Prod is ‘Y’ because they have 2 diff status values(Cld,Opn) whereas EMPID 123 is ‘N’ because the status is same for both the rows. I tried using first and last method but the output wasn’t the way I expected. I am beginner SAS programmer so wondering if I could get some help to code this one. Thanks in advance for your help!


Accepted Solutions
Solution
‎07-30-2014 06:00 PM
Super User
Posts: 10,500

Re: Comparing rows in data step

proc sql;                                                                                                                              

create table want                                                                                                                     

as select a.*, case                                                                                                                   

                           when b.prodcount=1 then 'N'                                                                       

                            when b.prodcount=2 then 'Y'                                                                        

                         end as prod                                                                                              

from AA as a left join (select empid,count(*) as prodcount from (select distinct empid,status from aa)                                

                         group by empid) as b                                                

                        on AA.Empid=b.empid;                                                                                           

quit;

Does the match but will result in data sorted on EmpID if that is a problem.

View solution in original post


All Replies
Solution
‎07-30-2014 06:00 PM
Super User
Posts: 10,500

Re: Comparing rows in data step

proc sql;                                                                                                                              

create table want                                                                                                                     

as select a.*, case                                                                                                                   

                           when b.prodcount=1 then 'N'                                                                       

                            when b.prodcount=2 then 'Y'                                                                        

                         end as prod                                                                                              

from AA as a left join (select empid,count(*) as prodcount from (select distinct empid,status from aa)                                

                         group by empid) as b                                                

                        on AA.Empid=b.empid;                                                                                           

quit;

Does the match but will result in data sorted on EmpID if that is a problem.

Contributor
Posts: 71

Re: Comparing rows in data step

I am getting output as expected. Thank you!. Wondering if this can be acheived using data step, asking just for my own learning.Smiley Happy

Super Contributor
Posts: 275

Re: Comparing rows in data step

data want;

  do until (last.empid);

     set aa;

  by empid notsorted;

  proc='N';

  if not first.empid and status^=lag(status) then do;

      proc='Y';

      leave;

   end;

   end;

   do until(last.empid);

      set aa;

      by empid notsorted;

      output;

  end;

run;

Respected Advisor
Posts: 3,124

Re: Comparing rows in data step

Alternative:

proc sql;

     create table want as

           select *,

                case

                     when (select count(distinct status) from aa where a.empid=empid) >1 then 'Y'

                     else 'N'

                end

           as flag from aa a;

quit;


Haikuo

Contributor
Posts: 71

Re: Comparing rows in data step

Haikuo - Thank you for your  response. I ran your solution, it keeps running and never ends. My dataset size is close to 800K records.

Respected Advisor
Posts: 3,124

Re: Comparing rows in data step

You are right. Mine only makes academic sense, it is very very inefficient, it pulls the whole table to match each obs.

Haikuo

Contributor
Posts: 71

Re: Comparing rows in data step

No worries. Thanks anyways for your prompt response.

Respected Advisor
Posts: 3,124

Re: Comparing rows in data step

Well, you just made me feel I owe you something Smiley Happy, here is a data step solution, not necessarily slower than Proc SQL:

Data aa;

     infile cards;

     input EMPID $3. @5 Status $3.;

     cards;

123 Opn

123 Opn

345 Cld

345 Opn

567 Cld

567 Cld

567 Cld

999 Opn

984 Cld

236 Cld

236 Cld

236 Opn

;

Run;

data want;

do until (last.empid);

set aa;

       by empid notsorted;

         if first.empid then DO; _s=status; flag='N';end;

           if status ne _s then flag='Y';

     end;

do until (last.empid);

set aa;

       by empid notsorted;

output;

end;

drop _s;

run;

If you would, please bench mark both solutions ( 's and this one), let us know.

Thanks,

Haikuo

New Contributor
Posts: 3

Re: Comparing rows in data step

With some modification , make the statement of b.prodcount=2 into  else 'Y'

Super User
Posts: 9,681

Re: Comparing rows in data step

Not sure how fast this code would be .

Data aa;
infile cards;
input EMPID $3. @5 Status $3.;
cards;
123 Opn
123 Opn
345 Cld
345 Opn
567 Cld
567 Cld
567 Cld
999 Opn
984 Cld
236 Cld
236 Cld
236 Opn
;
Run;
proc sql;
create table want as
 select *,count(distinct Status) as flag
  from aa
   group by empid;
quit;

Xia Keshan

Super Contributor
Posts: 305

Re: Comparing rows in data step

Hello,

here is a data step solution:

data want(drop=i j statusnew);
set aa;
by empid notsorted ;
retain flag statusnew j;

i+1;

if statusnew ne status and flag ne 'Y' then flag='Y';

if first.empid then do;
flag='N';
j=i;
end;

statusnew=status;


if last.empid then
do k=j to i;
set aa point=k;
output;
end;

run;

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 529 views
  • 3 likes
  • 7 in conversation