BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
vicky07
Quartz | Level 8

Hello,

I’m trying to create a flag named prod in the below code which basically should compare column Status in each row by EMPID.  The prod flag should be ‘Y’ if the EMPID has different values in status columns or else it should be ‘N’, meaning they have the same status in all the rows.

Source Data Set:  

Data aa;

infile cards;

input EMPID $3. @5 Status $3.;

cards;

123 Opn

123 Opn

345 Cld

345 Opn

567 Cld

567 Cld

567 Cld

999 Opn

984 Cld

236 Cld

236 Cld

236 Opn

;

Run;

Desired output:

EMPID Status  Prod

123   Opn     N

123   Opn     N

345   Cld     Y

345   Opn     Y

567   Cld     N

567   Cld     N

567   Cld     N

999   Opn     N

984   Cld     N

236   Cld     Y

236   Cld     Y

236   Opn     Y

My Code:

Data aa1;

retain PROD;

set aa;

by EMPID Status;

if first.Status = last.Status then PROD = 'N';

else PROD = 'N';

Run;

So in the above case, EMPID 236 Prod is ‘Y’ because they have 2 diff status values(Cld,Opn) whereas EMPID 123 is ‘N’ because the status is same for both the rows. I tried using first and last method but the output wasn’t the way I expected. I am beginner SAS programmer so wondering if I could get some help to code this one. Thanks in advance for your help!

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

proc sql;                                                                                                                              

create table want                                                                                                                     

as select a.*, case                                                                                                                   

                           when b.prodcount=1 then 'N'                                                                       

                            when b.prodcount=2 then 'Y'                                                                        

                         end as prod                                                                                              

from AA as a left join (select empid,count(*) as prodcount from (select distinct empid,status from aa)                                

                         group by empid) as b                                                

                        on AA.Empid=b.empid;                                                                                           

quit;

Does the match but will result in data sorted on EmpID if that is a problem.

View solution in original post

11 REPLIES 11
ballardw
Super User

proc sql;                                                                                                                              

create table want                                                                                                                     

as select a.*, case                                                                                                                   

                           when b.prodcount=1 then 'N'                                                                       

                            when b.prodcount=2 then 'Y'                                                                        

                         end as prod                                                                                              

from AA as a left join (select empid,count(*) as prodcount from (select distinct empid,status from aa)                                

                         group by empid) as b                                                

                        on AA.Empid=b.empid;                                                                                           

quit;

Does the match but will result in data sorted on EmpID if that is a problem.

vicky07
Quartz | Level 8

I am getting output as expected. Thank you!. Wondering if this can be acheived using data step, asking just for my own learning.Smiley Happy

slchen
Lapis Lazuli | Level 10

data want;

  do until (last.empid);

     set aa;

  by empid notsorted;

  proc='N';

  if not first.empid and status^=lag(status) then do;

      proc='Y';

      leave;

   end;

   end;

   do until(last.empid);

      set aa;

      by empid notsorted;

      output;

  end;

run;

Haikuo
Onyx | Level 15

Alternative:

proc sql;

     create table want as

           select *,

                case

                     when (select count(distinct status) from aa where a.empid=empid) >1 then 'Y'

                     else 'N'

                end

           as flag from aa a;

quit;


Haikuo

vicky07
Quartz | Level 8

Haikuo - Thank you for your  response. I ran your solution, it keeps running and never ends. My dataset size is close to 800K records.

Haikuo
Onyx | Level 15

You are right. Mine only makes academic sense, it is very very inefficient, it pulls the whole table to match each obs.

Haikuo

vicky07
Quartz | Level 8

No worries. Thanks anyways for your prompt response.

Haikuo
Onyx | Level 15

Well, you just made me feel I owe you something Smiley Happy, here is a data step solution, not necessarily slower than Proc SQL:

Data aa;

     infile cards;

     input EMPID $3. @5 Status $3.;

     cards;

123 Opn

123 Opn

345 Cld

345 Opn

567 Cld

567 Cld

567 Cld

999 Opn

984 Cld

236 Cld

236 Cld

236 Opn

;

Run;

data want;

do until (last.empid);

set aa;

       by empid notsorted;

         if first.empid then DO; _s=status; flag='N';end;

           if status ne _s then flag='Y';

     end;

do until (last.empid);

set aa;

       by empid notsorted;

output;

end;

drop _s;

run;

If you would, please bench mark both solutions ( 's and this one), let us know.

Thanks,

Haikuo

hello_fj
Calcite | Level 5

With some modification , make the statement of b.prodcount=2 into  else 'Y'

Ksharp
Super User

Not sure how fast this code would be .

Data aa;
infile cards;
input EMPID $3. @5 Status $3.;
cards;
123 Opn
123 Opn
345 Cld
345 Opn
567 Cld
567 Cld
567 Cld
999 Opn
984 Cld
236 Cld
236 Cld
236 Opn
;
Run;
proc sql;
create table want as
 select *,count(distinct Status) as flag
  from aa
   group by empid;
quit;

Xia Keshan

Loko
Barite | Level 11

Hello,

here is a data step solution:

data want(drop=i j statusnew);
set aa;
by empid notsorted ;
retain flag statusnew j;

i+1;

if statusnew ne status and flag ne 'Y' then flag='Y';

if first.empid then do;
flag='N';
j=i;
end;

statusnew=status;


if last.empid then
do k=j to i;
set aa point=k;
output;
end;

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1563 views
  • 3 likes
  • 7 in conversation