## deleting groups

Occasional Contributor
Posts: 7

# deleting groups

Group              Item              Date

Group              Item              Date

A                       25              31/05/2000

A                       27              31/05/2000

A                       30              31/05/2000

A                       32              31/05/2000

A                       12              31/05/2001

A                       24              31/05/2001

A                       13              31/05/2001

A                       14              31/05/2001

B                       26              30/09/2000

B                       28              30/09/2000

B                       36              30/09/2000

B                       34              30/09/2001

B                       39              30/09/2001

B                       45              30/09/2001

B                       46              30/09/2001

C                       15              31/12/1999

C                       26              31/12/1999

C                       90              31/12/1999

C                       91              31/12/1999

RESULTS I NEED:

A                       12              31/05/2001

A                       24              31/05/2001

A                       13              31/05/2001

A                       14              31/05/2001

B                       34              30/09/2001

B                       39              30/09/2001

B                       45              30/09/2001

B                       46              30/09/2001

C                       90              31/12/1999

C                       91              31/12/1999

Hi Everyone,

Q1) Does anyone know how can I get the results above? Basically my criteria is to eliminate a entire group in that particular year if item between 25-30 exist in that group in that year. For example, there are items between number 25-30 for group A in year 2000, therefore my goal is to eliminate all of group A in that year. The same goes for group B and group C.

Q2)  I have an additional question here (just curious to know). What if I want to delete all groups with item 26 regardless of year. That is if item 26 exist in one of the year for that group, I want to delete that group from my entire data (regardless of the year). THANK YOU!

Posts: 3,167

## Re: deleting groups

By using 2X DOW:

Q1:

data have;

input Group\$           Item              Date  :ddmmyy10.;

format date ddmmyy10.;

_y=year(date);

cards;

A                      25              31/05/2000

A                       27              31/05/2000

A                       30              31/05/2000

A                       32              31/05/2000

A                       12              31/05/2001

A                       24              31/05/2001

A                       13              31/05/2001

A                       14              31/05/2001

B                       26              30/09/2000

B                       28              30/09/2000

B                       36              30/09/2000

B                       34              30/09/2001

B                       39              30/09/2001

B                       45              30/09/2001

B                       46              30/09/2001

C                       15              31/12/1999

C                       26              31/12/1999

C                       90              31/12/1999

C                       91              31/12/1999

;

data want (drop=_;

do until (last._y);

set have;

by group _y;

if  25<=item<=30 then _f=1;

end;

do until (last._y);

set have;

by group _y;

if _f ne 1 then  output;

end;

run;

proc print;run;

Q2:

data want (drop=_;

do until (last.group);

set have;

by group ;

if  item=26 then _f=1;

end;

do until (last.group);

set have;

by group;

if _f ne 1 then  output;

end;

run;

Regards,

Haikuo

Posts: 3,167

## Re: deleting groups

If go for a SQL solution, you could have:

Q1

proc sql;

create table want (drop=_ as

select * from have

group by group, _y

having sum(25<=item<=30)=0;

quit;

Q2:

proc sql;

create table want (drop=_ as

select * from have

group by group

having sum(item=26)=0;

quit;

Regards,

Haikuo

Occasional Contributor
Posts: 7

## Re: deleting groups

Hi Hai.Kuo,

Your solutions provided above is absolutely correct. However, when I used those sets of codes on my larger sample, I couldn't get the results I need. Some are eliminated while some are not. Any idea as to what might have caused this? What I did was that:

I tried to change the code a little bit in a sense that I want groups in years that strictly have items between 25 and 30 (opposite to the one on top). What I did was to change [ if _f ne 1 then  output] to [ if _f = 1 then  output]. However, some of the years that did not meet this criteria remained because _f=1 is assigned. For e.g. using the data below, for group A, 2001 is supposed to be eliminate because besides item 25, they have items outside 25-30 range. I want years with items 25-30 only. Where have I gone wrong. Your help is appreciated. Thank You.

Group              Item              Date

A                       25              31/05/2000

A                       27              31/05/2000

A                       30              31/05/2000

A                       30              31/05/2000

A                       25              31/05/2001

A                       24              31/05/2001

A                       13              31/05/2001

A                       14              31/05/2001

B                       26              30/09/2000

B                       28              30/09/2000

B                       29              30/09/2000

B                       89              30/09/2001

B                       39              30/09/2001

B                       27              30/09/2001

B                       46              30/09/2001

C                       15              31/12/1999

C                       26              31/12/1999

C                       90              31/12/1999

C                       91              31/12/1999

RESULTS I NEED:

A                       25              31/05/2000

A                       27              31/05/2000

A                       30              31/05/2000

A                       30              31/05/2000

B                       26              30/09/2000

B                       28              30/09/2000

B                       29              30/09/2000

Posts: 5,543

## Re: deleting groups

The SQL solution provided by Haikuo is much simpler to understand and modify. For your new requirement, use:

proc sql;

create table want (drop=_y) as

select * from have

group by group, _y

having max(item<25 or item>30)=0;

quit;

PG

PG
Occasional Contributor
Posts: 7

## Re: deleting groups

Hi,

Thanks for the help. Did what u said but the message 'ERROR: Expression using IN has components that are of different data types.' keeps appearing under the 'having max(item<25 or item>30)=0;' line. I don't understand what seems to be the problem. Could it be blanks along the way?

Occasional Contributor
Posts: 17

## Re: deleting groups

I had the same error too when using SQL. All of my data for that column are numbers. I wonder how could they be of different data types.

Posts: 5,543

## Re: deleting groups

I simplified the query some more. I do not encounter any problem with the following test run with version 9.3 :

data have;

input Group\$ Item Date :ddmmyy10.;

format date ddmmyy10.;

cards;

A                       25              31/05/2000

A                       27              31/05/2000

A                       30              31/05/2000

A                       30              31/05/2000

A                       25              31/05/2001

A                       24              31/05/2001

A                       13              31/05/2001

A                       14              31/05/2001

B                       26              30/09/2000

B                       28              30/09/2000

B                       29              30/09/2000

B                       89              30/09/2001

B                       39              30/09/2001

B                       27              30/09/2001

B                       46              30/09/2001

C                       15              31/12/1999

C                       26              31/12/1999

C                       90              31/12/1999

C                       91              31/12/1999

;

proc sql;

create table want as

select group, item, date from have

group by group, year(date)

having max(item<25)=0 and max(item>30)=0;

quit;

as expected, observations from groups A and B of year 2000 are retained.

PG

PG
Posts: 5,543

## Re: deleting groups

Or, if you want to avoid remerging and remain within standard SQL :

proc sql;

create table want as

select H.*

from have as H inner join

(select group, year(date) as y from have

group by group, calculated y

having min(item)>= 25 and max(item)<=30) as G

on H.group=G.group and year(H.date)=G.y;

quit;

PG

PG
Super Contributor
Posts: 1,636

## Re: deleting groups

Nice!  "calculated" is not required.   Linlin

Occasional Contributor
Posts: 7

## Re: deleting groups

I can't get the sql method to work in both cases but the non-sql method works fine.

Super Contributor
Posts: 1,636

## Re: deleting groups

Hi,

my code is long, but it is easy to understand:

data have;

input Group\$           Item              Date  :ddmmyy10.;

format date ddmmyy10.;

_y=year(date);

cards;

A                       25              31/05/2000

A                       27              31/05/2000

A                       30              31/05/2000

A                       30              31/05/2000

A                       25              31/05/2001

A                       24              31/05/2001

A                       13              31/05/2001

A                       14              31/05/2001

B                       26              30/09/2000

B                       28              30/09/2000

B                       29              30/09/2000

B                       89              30/09/2001

B                       39              30/09/2001

B                       27              30/09/2001

B                       46              30/09/2001

C                       15              31/12/1999

C                       26              31/12/1999

C                       90              31/12/1999

C                       91              31/12/1999

;

proc sql;

create table temp  as

select group,_y from have

group by group, _y

having min(item)>24 and max(item)<31;

create table want as

select a.group,a.item,date from have a,temp b

where a.group=b.group and a._y=b._y;

quit;

proc print;run;

obs    Group    Item          Date

1       A       25     31/05/2000

2       A       27     31/05/2000

3       A       30     31/05/2000

4       A       30     31/05/2000

5       B       26     30/09/2000

6       B       28     30/09/2000

7       B       29     30/09/2000

New Contributor
Posts: 2

## Re: deleting groups

Re: your original question. If you want to remove the whole group because items 25-30 fall in a particular year, then the entire group C should be gone as in the following:

data a;

input Group \$ Item Date ddmmyy10.;

format Date ddmmyy10.;

cards;

A  25 31/05/2000

A  27 31/05/2000

A  30 31/05/2000

A  32 31/05/2000

A  12 31/05/2001

A  24 31/05/2001

A  13 31/05/2001

A  14 31/05/2001

B  26 30/09/2000

B  28 30/09/2000

B  36 30/09/2000

B  34 30/09/2001

B  39 30/09/2001

B  45 30/09/2001

B  46 30/09/2001

C  15 31/12/1999

C  26 31/12/1999

C  90 31/12/1999

C  91 31/12/1999

; run;

proc sql;

delete * from a where year(Date) in

(select year(Date) from a where Item between 25 and 30);

proc print data=a; *entire group C is gone;

5 6 7 A 12 31/05/2001 A 24 31/05/2001 A 13 31/05/2001 A 14 31/05/2001 B 34 30/09/2001 B 39 30/09/2001 B 45 30/09/2001 B 46 30/09/2001

Not sure if this is what you want.

New Contributor
Posts: 2

## Re: deleting groups

Re: 2nd question of your original post:

data a;

input Group \$ Item Date ddmmyy10.;

format Date ddmmyy10.;

cards;

A  25 31/05/2000

A  27 31/05/2000

A  30 31/05/2000

A  32 31/05/2000

A  12 31/05/2001

A  24 31/05/2001

A  13 31/05/2001

A  14 31/05/2001

B  26 30/09/2000

B  28 30/09/2000

B  36 30/09/2000

B  34 30/09/2001

B  39 30/09/2001

B  45 30/09/2001

B  46 30/09/2001

C  15 31/12/1999

C  26 31/12/1999

C  90 31/12/1999

C  91 31/12/1999

; run;

proc sql;

delete * from a where Group in

(select Group from a where Item = 26);

proc print; *Only group a is left;

1 2 3 A 25 31/05/2000 A 27 31/05/2000 A 30 31/05/2000 A 32 31/05/2000 A 12 31/05/2001 A 24 31/05/2001 A 13 31/05/2001 A 14 31/05/2001
Discussion stats
• 13 replies
• 1169 views
• 0 likes
• 6 in conversation