deleting groups

Reply
Occasional Contributor
Posts: 7

deleting groups

Group              Item              Date                                                                                            

Group              Item              Date                                                                 

A                       25              31/05/2000

A                       27              31/05/2000

A                       30              31/05/2000

A                       32              31/05/2000

A                       12              31/05/2001

A                       24              31/05/2001

A                       13              31/05/2001

A                       14              31/05/2001

B                       26              30/09/2000

B                       28              30/09/2000

B                       36              30/09/2000

B                       34              30/09/2001

B                       39              30/09/2001

B                       45              30/09/2001

B                       46              30/09/2001

C                       15              31/12/1999

C                       26              31/12/1999

C                       90              31/12/1999

C                       91              31/12/1999

RESULTS I NEED:

A                       12              31/05/2001

A                       24              31/05/2001

A                       13              31/05/2001

A                       14              31/05/2001

B                       34              30/09/2001

B                       39              30/09/2001

B                       45              30/09/2001

B                       46              30/09/2001

C                       90              31/12/1999

C                       91              31/12/1999

Hi Everyone,

Q1) Does anyone know how can I get the results above? Basically my criteria is to eliminate a entire group in that particular year if item between 25-30 exist in that group in that year. For example, there are items between number 25-30 for group A in year 2000, therefore my goal is to eliminate all of group A in that year. The same goes for group B and group C.

Q2)  I have an additional question here (just curious to know). What if I want to delete all groups with item 26 regardless of year. That is if item 26 exist in one of the year for that group, I want to delete that group from my entire data (regardless of the year). THANK YOU!

Respected Advisor
Posts: 3,124

Re: deleting groups

By using 2X DOW:

Q1:

data have;

input Group$           Item              Date  :ddmmyy10.;

format date ddmmyy10.;

_y=year(date);

cards;

A                      25              31/05/2000

A                       27              31/05/2000

A                       30              31/05/2000

A                       32              31/05/2000

A                       12              31/05/2001

A                       24              31/05/2001

A                       13              31/05/2001

A                       14              31/05/2001

B                       26              30/09/2000

B                       28              30/09/2000

B                       36              30/09/2000

B                       34              30/09/2001

B                       39              30/09/2001

B                       45              30/09/2001

B                       46              30/09/2001

C                       15              31/12/1999

C                       26              31/12/1999

C                       90              31/12/1999

C                       91              31/12/1999

;

data want (drop=_Smiley Happy;

   do until (last._y);

      set have;

       by group _y;

       if  25<=item<=30 then _f=1;

    end;

   do until (last._y);

      set have;

       by group _y;

       if _f ne 1 then  output;

    end;

    run;

    proc print;run;

Q2:

data want (drop=_Smiley Happy;

   do until (last.group);

      set have;

       by group ;

       if  item=26 then _f=1;

    end;

   do until (last.group);

      set have;

       by group;

       if _f ne 1 then  output;

    end;

    run;

Regards,

Haikuo

Respected Advisor
Posts: 3,124

Re: deleting groups

If go for a SQL solution, you could have:

Q1

proc sql;

create table want (drop=_Smiley Happy as

  select * from have

    group by group, _y

    having sum(25<=item<=30)=0;

quit;

Q2:

proc sql;

create table want (drop=_Smiley Happy as

  select * from have

    group by group

    having sum(item=26)=0;

quit;

Regards,

Haikuo

Occasional Contributor
Posts: 7

Re: deleting groups

Hi Hai.Kuo,

Your solutions provided above is absolutely correct. However, when I used those sets of codes on my larger sample, I couldn't get the results I need. Some are eliminated while some are not. Any idea as to what might have caused this? What I did was that:

I tried to change the code a little bit in a sense that I want groups in years that strictly have items between 25 and 30 (opposite to the one on top). What I did was to change [ if _f ne 1 then  output] to [ if _f = 1 then  output]. However, some of the years that did not meet this criteria remained because _f=1 is assigned. For e.g. using the data below, for group A, 2001 is supposed to be eliminate because besides item 25, they have items outside 25-30 range. I want years with items 25-30 only. Where have I gone wrong. Your help is appreciated. Thank You.

Group              Item              Date                                                               

A                       25              31/05/2000

A                       27              31/05/2000

A                       30              31/05/2000

A                       30              31/05/2000

A                       25              31/05/2001

A                       24              31/05/2001

A                       13              31/05/2001

A                       14              31/05/2001

B                       26              30/09/2000

B                       28              30/09/2000

B                       29              30/09/2000

B                       89              30/09/2001

B                       39              30/09/2001

B                       27              30/09/2001

B                       46              30/09/2001

C                       15              31/12/1999

C                       26              31/12/1999

C                       90              31/12/1999

C                       91              31/12/1999

RESULTS I NEED:

A                       25              31/05/2000

A                       27              31/05/2000

A                       30              31/05/2000

A                       30              31/05/2000

B                       26              30/09/2000

B                       28              30/09/2000

B                       29              30/09/2000

Respected Advisor
Posts: 4,646

Re: deleting groups

The SQL solution provided by Haikuo is much simpler to understand and modify. For your new requirement, use:

proc sql;

create table want (drop=_y) as

  select * from have

    group by group, _y

    having max(item<25 or item>30)=0;

quit;

PG

PG
Occasional Contributor
Posts: 7

Re: deleting groups

Hi,

Thanks for the help. Did what u said but the message 'ERROR: Expression using IN has components that are of different data types.' keeps appearing under the 'having max(item<25 or item>30)=0;' line. I don't understand what seems to be the problem. Could it be blanks along the way?

Occasional Contributor
Posts: 17

Re: deleting groups

I had the same error too when using SQL. All of my data for that column are numbers. I wonder how could they be of different data types.

Respected Advisor
Posts: 4,646

Re: deleting groups

I simplified the query some more. I do not encounter any problem with the following test run with version 9.3 :

data have;

input Group$ Item Date :ddmmyy10.;

format date ddmmyy10.;

cards;

A                       25              31/05/2000

A                       27              31/05/2000

A                       30              31/05/2000

A                       30              31/05/2000

A                       25              31/05/2001

A                       24              31/05/2001

A                       13              31/05/2001

A                       14              31/05/2001

B                       26              30/09/2000

B                       28              30/09/2000

B                       29              30/09/2000

B                       89              30/09/2001

B                       39              30/09/2001

B                       27              30/09/2001

B                       46              30/09/2001

C                       15              31/12/1999

C                       26              31/12/1999

C                       90              31/12/1999

C                       91              31/12/1999

;

proc sql;

create table want as

  select group, item, date from have

    group by group, year(date)

    having max(item<25)=0 and max(item>30)=0;

quit;

as expected, observations from groups A and B of year 2000 are retained.

PG

PG
Respected Advisor
Posts: 4,646

Re: deleting groups

Or, if you want to avoid remerging and remain within standard SQL :

proc sql;

create table want as

  select H.*

  from have as H inner join

   (select group, year(date) as y from have

    group by group, calculated y

    having min(item)>= 25 and max(item)<=30) as G

  on H.group=G.group and year(H.date)=G.y;

quit;

PG

PG
Super Contributor
Posts: 1,636

Re: deleting groups

Nice!  "calculated" is not required.   Linlin

Occasional Contributor
Posts: 7

Re: deleting groups

I can't get the sql method to work in both cases but the non-sql method works fine.

Super Contributor
Posts: 1,636

Re: deleting groups

Hi,

my code is long, but it is easy to understand:

data have;

input Group$           Item              Date  :ddmmyy10.;

format date ddmmyy10.;

_y=year(date);

cards;

A                       25              31/05/2000

A                       27              31/05/2000

A                       30              31/05/2000

A                       30              31/05/2000

A                       25              31/05/2001

A                       24              31/05/2001

A                       13              31/05/2001

A                       14              31/05/2001

B                       26              30/09/2000

B                       28              30/09/2000

B                       29              30/09/2000

B                       89              30/09/2001

B                       39              30/09/2001

B                       27              30/09/2001

B                       46              30/09/2001

C                       15              31/12/1999

C                       26              31/12/1999

C                       90              31/12/1999

C                       91              31/12/1999

;

proc sql;

create table temp  as

  select group,_y from have

    group by group, _y

   having min(item)>24 and max(item)<31;

create table want as

  select a.group,a.item,date from have a,temp b

    where a.group=b.group and a._y=b._y;

quit;

proc print;run;

                   obs    Group    Item          Date

                         1       A       25     31/05/2000

                         2       A       27     31/05/2000

                         3       A       30     31/05/2000

                         4       A       30     31/05/2000

                         5       B       26     30/09/2000

                         6       B       28     30/09/2000

                         7       B       29     30/09/2000

New Contributor
Posts: 2

Re: deleting groups

Re: your original question. If you want to remove the whole group because items 25-30 fall in a particular year, then the entire group C should be gone as in the following:

data a;

input Group $ Item Date ddmmyy10.;

format Date ddmmyy10.;

cards;

A  25 31/05/2000

A  27 31/05/2000

A  30 31/05/2000

A  32 31/05/2000

A  12 31/05/2001

A  24 31/05/2001

A  13 31/05/2001

A  14 31/05/2001

B  26 30/09/2000

B  28 30/09/2000

B  36 30/09/2000

B  34 30/09/2001

B  39 30/09/2001

B  45 30/09/2001

B  46 30/09/2001

C  15 31/12/1999

C  26 31/12/1999

C  90 31/12/1999

C  91 31/12/1999

; run;

proc sql;

delete * from a where year(Date) in

  (select year(Date) from a where Item between 25 and 30);

proc print data=a; *entire group C is gone;

5A1231/05/2001
6A2431/05/2001
7A1331/05/2001
8A1431/05/2001
12B3430/09/2001
13B3930/09/2001
14B4530/09/2001
15B4630/09/2001

Not sure if this is what you want.

New Contributor
Posts: 2

Re: deleting groups

Re: 2nd question of your original post:

data a;

input Group $ Item Date ddmmyy10.;

format Date ddmmyy10.;

cards;

A  25 31/05/2000

A  27 31/05/2000

A  30 31/05/2000

A  32 31/05/2000

A  12 31/05/2001

A  24 31/05/2001

A  13 31/05/2001

A  14 31/05/2001

B  26 30/09/2000

B  28 30/09/2000

B  36 30/09/2000

B  34 30/09/2001

B  39 30/09/2001

B  45 30/09/2001

B  46 30/09/2001

C  15 31/12/1999

C  26 31/12/1999

C  90 31/12/1999

C  91 31/12/1999

; run;

proc sql;

delete * from a where Group in

(select Group from a where Item = 26);

proc print; *Only group a is left;

1A2531/05/2000
2A2731/05/2000
3A3031/05/2000
4A3231/05/2000
5A1231/05/2001
6A2431/05/2001
7A1331/05/2001
8A1431/05/2001
Ask a Question
Discussion stats
  • 13 replies
  • 977 views
  • 0 likes
  • 6 in conversation