BookmarkSubscribeRSS Feed
user1
Calcite | Level 5

Can anyone help on this?  For example, I want to delete all persons' records who have score of 2. Many thanks!

data new;

input ID Score;

datalines;

1 2

1 7

1 .

2 9

2 6

2 5

3 2

3 9

3 6

;

run;

The the new table will be:

IDScore
29
26
25
10 REPLIES 10
tish
Calcite | Level 5

If you want to have a table including all the people with the score of 2, then:

data score2;

set new;

where (score = 2);

run;

If you want to have the table as above where you want all the scores for the person with the ID of 2, then:

data ID2;

set new;

where (ID = 2);

run;

Haikuo
Onyx | Level 15

There would be many way to do it, below just a few among them:

data new;

input ID Score;

datalines;

1 2

1 7

1 .

2 9

2 6

2 5

3 2

3 9

3 6

;

run;

/*sql*/

proc sql;

create table want as

select * from new group by id having sum(score=2)=0;

quit;

/*dow*/

data want;

  do until (last.id);

    set new;

     by id notsorted;

       if score=2 then flag=1;

  end;

 

  do until (last.id);

    set new;

      by id notsorted;

        if flag ne 1 then output;

  end;

  drop flag;

  run;

  /*stack*/

data want;

retain flag;

  set new (in=up) new;

    by id;

    if first.id then call missing(flag);

    if score=2 then flag=1;

    if not up and flag ne 1 then output;

    drop flag;

run;

Haikuo

Amir
PROC Star

Hi,

You could try the following:

proc sql noprint;

  create table want as

  select *

  from new

  where id not in (select id

                   from new

                   where score=2)

  ;

quit;

Regards,

Amir

user1
Calcite | Level 5

Thanks for your help, everyone!

Regards.

ballardw
Super User

if score=2 then delete;

would also work in a datastep.

I might not actually delete them but when I go to the next step in analysis I would remove them using a dataset option.

proc means data=new (where=(score ne 2));

for example.

KoMartin66
Obsidian | Level 7

You could also implement a hash table solution.

DATA NEW2(DROP=RC ) ;

DECLARE  HASH  AA(DATASET:' work.new(where=(score=2) ) ' ) ;  /* only the offering IDs are ready into hash object */

RC=AA.DEFINEKEY( 'ID' ) ;   RC=AA.DEFINEDONE() ;

DO UNTIL(EOF1) ;

   SET WORK.NEW  END=EOF1 ;

   RC=AA.FIND() ;

   IF RC NE 0 THEN OUTPUT ;   /* RC=0 indicates a match, which you don't want in this instance */

END ;

STOP ;

RUN ;

Astounding
PROC Star

All the hashing posts are forcing me to learn it.  Here's a slight improvement, applicable if the data are already sorted:

do until (eof1);

   set work.new end=eof1;

   by id;

   if first.id then rc = aa.find();

   if rc ne 0 then output;

end;

Search the hash table just once per ID.

KoMartin66
Obsidian | Level 7

The beauty of the hash is it doesn't require the data to be pre-sorted.  Had the original post entered the datalines in some non-sorted order, your suggestion would then have to incur a sort procedure in order to make the BY processing approach work and your overall run time increases, based on the additional I/O required by the sort. 

Patrick
Opal | Level 21

Or just this way:

DATA NEW2(DROP=RC ) ;

if _n_=1 then

do;

   DECLARE  HASH  AA(DATASET:' work.new(where=(score=2) ) ' ) ;  /* only the offering IDs are ready into hash object */

   RC=AA.DEFINEKEY( 'ID' ) ; 

   RC=AA.DEFINEDONE() ;

end;

set new;

if aa.check();

RUN ;

Haikuo
Onyx | Level 15

I guess the more, the merrier. Here is an array() approach, with the assumption that data is presorted by state and each state only has less than 1000 obs.

/*array*/

data want;

array t(1:1000,1:2) _temporary_;

do _n_=1 by 1 until (last.state);

set new;

by state;

t(_n_, 1)=state;

t(_n_,2)=score;

if score=2 then flag=1;

end;

if flag ne 1 then do i=1 by 1 until (i=_n_);

    state=t(i,1);

    score=t(i,2);

    output;

    end;

call missing (of t(*));

drop flag i

run;

Haikuo

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 2764 views
  • 3 likes
  • 8 in conversation