Help using Base SAS procedures

mark series if another variable meets criteria

Accepted Solution Solved
Reply
Super Contributor
Posts: 401
Accepted Solution

mark series if another variable meets criteria

I have a set of accts that I'd like to be marked if a criteria is met by other variables.I tried retain but can't quite get it, I wonder if index needs to beused..

This is what I mean

Have:

Acct     ST1   ST2

205       01      PC

205       02      AD

206       02      BC

207       01      AD

208       01      PC

208       03      AD

208       04      AD

209       02      PC

209       03      BC

Want:

I need all the Acct series to be marked that has a 02 in the ST and PC in ST2.

Acct     ST1   ST2     USE

205       01      PC       N

205       02      AD       N

206       02      BC       Y

207       01      AD       Y

208       01      PC       Y

208       03      AD       Y

208       04      AD       Y

209       02      PC       N

209       03      BC       N

Thanks.


Accepted Solutions
Solution
‎10-27-2011 02:55 PM
PROC Star
Posts: 7,471

Re: mark series if another variable meets criteria

Then just use something like:

data Have;

  input (Acct     ST1   ST2) ($);

  cards;

205       01      PC

205       02      AD

206       02      BC

207       01      AD

208       01      PC

208       03      AD

208       04      AD

209       02      PC

209       03      BC

;

data want (drop=_Smiley Happy;

  do until (last.acct);

    set have;

    by acct;

    if st1 eq "02" and

     st2 eq "PC" then _test=1;

  end;

  do until (last.acct);

    set have;

    by acct;

    if _test then use="N";

    else use="Y";

    output;

  end;

run;

View solution in original post


All Replies
PROC Star
Posts: 7,471

mark series if another variable meets criteria

Are you sure your "want" is correct?  From it, I don't understand why you assign the values you've assigned.

Super Contributor
Posts: 401

Re: mark series if another variable meets criteria

Yes, I have a large list of accounts (many duplicates) that I'd like to mark as N if they have ST1 = 02 and ST2 = PC.. but mark by their corresponding acct... I want to eventually create a rate from the overall population where USE = Y and the PC's and 02 are not to be used in the calculation.

Super User
Super User
Posts: 7,042

mark series if another variable meets criteria

I find boolean flags easier to create and work with.

PROC SQL ;

   create table want as select *,max( st='02' ) as use

      from have

      group by acct

   ;

quit;

Or a data step using DOW loops

data want ;

    do until (last.acct) ;

        set have;

        by acct;

        use = max(use,st='02');

    end;

    use = max(0,use);

    do until (last.acct);

      set have;

      by acct ;

    end;

run;

Or merge it back with itself.

data want ;

    merge have have(rename=(st=st2) where=(st2='02') );

    by acct ;

    if first.acct then use=(st2='02');

   drop st2;

run;

PROC Star
Posts: 7,471

mark series if another variable meets criteria

I think that the following does what you want:

data want (drop=_Smiley Happy;

  do until (last.acct);

    set have;

    by acct;

    if st1 eq "02" then _02=1;

    if st2 eq "PC" then _PC=1;

  end;

  do until (last.acct);

    set have;

    by acct;

    if _02 and _PC then use="N";

    else use="Y";

    output;

  end;

run;

Super Contributor
Posts: 401

Re: mark series if another variable meets criteria

Thanks this is extremely helpful..

The only thing different I'd add is that these 2 statements should be used collectively and joined wiht an 'AND' .. right now the results I'm gettin gis individually either a 02 or a PC

  if st1 eq "02" then _02=1;

  if st2 eq "PC" then _PC=1;

PROC Star
Posts: 7,471

Re: mark series if another variable meets criteria

The code I suggested provides the result that you showed in your example, i.e., if any acct had both a value of st2="PC" in at least one if its records and a value of st1="02" in at least one of its records it was assigned a value of "N", otherwise it was assigned a value of "Y".

Super Contributor
Posts: 401

Re: mark series if another variable meets criteria

You are correct.. my mistake.  I just noticed I wanted:  if it has ST1=02 and ST2=PC

Acct     ST1   ST2     USE

205       01      PC       Y

205       02      AD       Y

206       02      BC       Y

207       01      AD       Y

208       01      PC       Y

208       03      AD       Y

208       04      AD       Y

209       02      PC       N

209       03      BC       N

Solution
‎10-27-2011 02:55 PM
PROC Star
Posts: 7,471

Re: mark series if another variable meets criteria

Then just use something like:

data Have;

  input (Acct     ST1   ST2) ($);

  cards;

205       01      PC

205       02      AD

206       02      BC

207       01      AD

208       01      PC

208       03      AD

208       04      AD

209       02      PC

209       03      BC

;

data want (drop=_Smiley Happy;

  do until (last.acct);

    set have;

    by acct;

    if st1 eq "02" and

     st2 eq "PC" then _test=1;

  end;

  do until (last.acct);

    set have;

    by acct;

    if _test then use="N";

    else use="Y";

    output;

  end;

run;

Super Contributor
Posts: 401

Re: mark series if another variable meets criteria

Thank you.  I'm not even gonna try to figure out how you guys come up with these correct responses so quickly.. It's impressive.

Valued Guide
Posts: 765

Re: mark series if another variable meets criteria

Hi ... here's another idea ...

data want;

use = 'Y';

do until(last.acct);

   set have (in=one) have;

   by acct;

   if one and st1 eq '02' and st2 eq 'PC' then use = 'N';

   if ^one then output;

end;

run;

PROC Star
Posts: 7,471

Re: mark series if another variable meets criteria

Mike, Definitely an interesting approach.  You know, of course, that I had to see which method ran quicker.

The approach you suggested took 25% more cpu time to run (and, yes, I would also have posted it if it went the results went the other way).  In fact, I was expecting your approach to run more quickly.

Valued Guide
Posts: 765

Re: mark series if another variable meets criteria

Hi ... OK, and I'm sure you tried it on a larger data set ... here's another,

a variation on Tom's MERGE ...

data want;

merge have(in=one where=(st1 eq '02' and st2 eq 'PC')) have;

by acct;

use = ifc(one, 'N', 'Y');

run;

ps  the self-interleave would be a bit faster if I had added an ELSE ...

if one and st1 eq '02' and st2 eq 'PC' then use = 'N';

else

if ^one then output;

pps  for further reading ... Howard Schreier's "Interleaving a Dataset with Itself: How and Why"

http://www.nesug.org/proceedings/nesug03/cc/cc002.pdf

Super User
Posts: 10,023

Re: mark series if another variable meets criteria

How about;

data Have;
  input (Acct     ST1   ST2) ($);
  cards;
205       01      PC
205       02      AD
206       02      BC
207       01      AD
208       01      PC
208       03      AD
208       04      AD
209       02      PC
209       03      BC
;
run;
proc sql;
 create table want(drop=a b) as
  select *,sum(st1='02') as a ,sum(st2='PC') as b,
        case when calculated a ge 1 and calculated b ge 1 then 'N' else 'Y' end as use
   from have
    group by acct;quit;

Ksharp

Message was edited by: xia keshan

Super Contributor
Posts: 401

Re: mark series if another variable meets criteria

Very well done, Ksharp

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 14 replies
  • 165 views
  • 3 likes
  • 5 in conversation