BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
podarum
Quartz | Level 8

I have a set of accts that I'd like to be marked if a criteria is met by other variables.I tried retain but can't quite get it, I wonder if index needs to beused..

This is what I mean

Have:

Acct     ST1   ST2

205       01      PC

205       02      AD

206       02      BC

207       01      AD

208       01      PC

208       03      AD

208       04      AD

209       02      PC

209       03      BC

Want:

I need all the Acct series to be marked that has a 02 in the ST and PC in ST2.

Acct     ST1   ST2     USE

205       01      PC       N

205       02      AD       N

206       02      BC       Y

207       01      AD       Y

208       01      PC       Y

208       03      AD       Y

208       04      AD       Y

209       02      PC       N

209       03      BC       N

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Then just use something like:

data Have;

  input (Acct     ST1   ST2) ($);

  cards;

205       01      PC

205       02      AD

206       02      BC

207       01      AD

208       01      PC

208       03      AD

208       04      AD

209       02      PC

209       03      BC

;

data want (drop=_:);

  do until (last.acct);

    set have;

    by acct;

    if st1 eq "02" and

     st2 eq "PC" then _test=1;

  end;

  do until (last.acct);

    set have;

    by acct;

    if _test then use="N";

    else use="Y";

    output;

  end;

run;

View solution in original post

14 REPLIES 14
art297
Opal | Level 21

Are you sure your "want" is correct?  From it, I don't understand why you assign the values you've assigned.

podarum
Quartz | Level 8

Yes, I have a large list of accounts (many duplicates) that I'd like to mark as N if they have ST1 = 02 and ST2 = PC.. but mark by their corresponding acct... I want to eventually create a rate from the overall population where USE = Y and the PC's and 02 are not to be used in the calculation.

Tom
Super User Tom
Super User

I find boolean flags easier to create and work with.

PROC SQL ;

   create table want as select *,max( st='02' ) as use

      from have

      group by acct

   ;

quit;

Or a data step using DOW loops

data want ;

    do until (last.acct) ;

        set have;

        by acct;

        use = max(use,st='02');

    end;

    use = max(0,use);

    do until (last.acct);

      set have;

      by acct ;

    end;

run;

Or merge it back with itself.

data want ;

    merge have have(rename=(st=st2) where=(st2='02') );

    by acct ;

    if first.acct then use=(st2='02');

   drop st2;

run;

art297
Opal | Level 21

I think that the following does what you want:

data want (drop=_:);

  do until (last.acct);

    set have;

    by acct;

    if st1 eq "02" then _02=1;

    if st2 eq "PC" then _PC=1;

  end;

  do until (last.acct);

    set have;

    by acct;

    if _02 and _PC then use="N";

    else use="Y";

    output;

  end;

run;

podarum
Quartz | Level 8

Thanks this is extremely helpful..

The only thing different I'd add is that these 2 statements should be used collectively and joined wiht an 'AND' .. right now the results I'm gettin gis individually either a 02 or a PC

  if st1 eq "02" then _02=1;

  if st2 eq "PC" then _PC=1;

art297
Opal | Level 21

The code I suggested provides the result that you showed in your example, i.e., if any acct had both a value of st2="PC" in at least one if its records and a value of st1="02" in at least one of its records it was assigned a value of "N", otherwise it was assigned a value of "Y".

podarum
Quartz | Level 8

You are correct.. my mistake.  I just noticed I wanted:  if it has ST1=02 and ST2=PC

Acct     ST1   ST2     USE

205       01      PC       Y

205       02      AD       Y

206       02      BC       Y

207       01      AD       Y

208       01      PC       Y

208       03      AD       Y

208       04      AD       Y

209       02      PC       N

209       03      BC       N

art297
Opal | Level 21

Then just use something like:

data Have;

  input (Acct     ST1   ST2) ($);

  cards;

205       01      PC

205       02      AD

206       02      BC

207       01      AD

208       01      PC

208       03      AD

208       04      AD

209       02      PC

209       03      BC

;

data want (drop=_:);

  do until (last.acct);

    set have;

    by acct;

    if st1 eq "02" and

     st2 eq "PC" then _test=1;

  end;

  do until (last.acct);

    set have;

    by acct;

    if _test then use="N";

    else use="Y";

    output;

  end;

run;

podarum
Quartz | Level 8

Thank you.  I'm not even gonna try to figure out how you guys come up with these correct responses so quickly.. It's impressive.

MikeZdeb
Rhodochrosite | Level 12

Hi ... here's another idea ...

data want;

use = 'Y';

do until(last.acct);

   set have (in=one) have;

   by acct;

   if one and st1 eq '02' and st2 eq 'PC' then use = 'N';

   if ^one then output;

end;

run;

art297
Opal | Level 21

Mike, Definitely an interesting approach.  You know, of course, that I had to see which method ran quicker.

The approach you suggested took 25% more cpu time to run (and, yes, I would also have posted it if it went the results went the other way).  In fact, I was expecting your approach to run more quickly.

MikeZdeb
Rhodochrosite | Level 12

Hi ... OK, and I'm sure you tried it on a larger data set ... here's another,

a variation on Tom's MERGE ...

data want;

merge have(in=one where=(st1 eq '02' and st2 eq 'PC')) have;

by acct;

use = ifc(one, 'N', 'Y');

run;

ps  the self-interleave would be a bit faster if I had added an ELSE ...

if one and st1 eq '02' and st2 eq 'PC' then use = 'N';

else

if ^one then output;

pps  for further reading ... Howard Schreier's "Interleaving a Dataset with Itself: How and Why"

http://www.nesug.org/proceedings/nesug03/cc/cc002.pdf

Ksharp
Super User

How about;

data Have;
  input (Acct     ST1   ST2) ($);
  cards;
205       01      PC
205       02      AD
206       02      BC
207       01      AD
208       01      PC
208       03      AD
208       04      AD
209       02      PC
209       03      BC
;
run;
proc sql;
 create table want(drop=a b) as
  select *,sum(st1='02') as a ,sum(st2='PC') as b,
        case when calculated a ge 1 and calculated b ge 1 then 'N' else 'Y' end as use
   from have
    group by acct;quit;

Ksharp

Message was edited by: xia keshan

podarum
Quartz | Level 8

Very well done, Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 1034 views
  • 3 likes
  • 5 in conversation