How to delete specific group of observations

Accepted Solution Solved
Reply
Contributor
Posts: 56
Accepted Solution

How to delete specific group of observations

Good afternoon.

 

I wonder if someone could provide some assistance with this? What I need is to compare groups of records to find the minimum value for each group and delete the record with the minimum value (easy). However, if the group hours are the same over the entire group I need to keep them all. In the example provided in record 9482 both records would need to be kept. For record 28552 only the record for week 1 should be deleted since 1 hour is less than all the other records for 28552 (therefore it's the minimum). Week 8 also has a minimum hours of 1 but only the first should be removed. Once a deletion over the group has been performed group checking ceases. For record group 30799 I would keep both.

 

I am using EG 7.15.

 

I would very much appreciate any help someone out there might be willing to provide.

 

Thanks.


Accepted Solutions
Solution
‎05-23-2018 06:35 PM
PROC Star
Posts: 8,165

Re: How to delete specific group of observations

Not sure if @novinosrin's revised code does what you want.

 

Since his original code did, except for your new criterion, I'd go with:

data need;
  set have;
  by record;
  recnum=_n_;
run;

proc sql;
  create table want as
    select *
      from need
        group by RECORD
          having std(HOURS)=0 or 
          min(hours) ne hours or
          (min(hours) eq hours and recnum ne min(recnum))
  ;
quit;

Art, CEO, AnalystFinder.com

 

View solution in original post


All Replies
PROC Star
Posts: 1,833

Re: How to delete specific group of observations

[ Edited ]
data have;
input RECORD	WEEK	HOURS;
cards;
9482	0	1
9482	1	1
28552	0	1
28552	1	1.5
28552	2	1.5
28552	3	1.5
28552	4	1.5
28552	5	1.5
28552	6	1.5
28552	7	1.5
28552	8	1.5
28552	9	1.5
28552	10	1.5
28552	12	1.5
29198	3	1.5
29198	4	1.5
29198	5	1.5
29198	6	1.5
29198	7	1.5
29198	8	1
29198	9	1.5
29198	10	1.5
29198	12	1.5
30799	5	1.5
30799	6	1.5
30799	7	1.5
30799	8	1.5
30799	9	1.5
30799	10	1.5
30799	12	1.5
33635	0	2
33635	1	2
33635	2	2
33635	3	2
33635	4	2
33635	5	2
33635	6	2
33635	7	2
33635	8	2
33635	9	2
33635	10	2
33635	12	2
38239	12	1.5
49749	3	1.5
49749	4	1.5
49749	5	1.5
49749	6	1.5
49749	7	1.5
49749	8	1.5
49749	9	1.5
50335	2	1.5
50335	3	1.5
50335	4	1.5
50335	5	1.5
50335	6	1.5
50335	7	1.5
50335	8	1.5
50335	9	1.5
50335	10	1.5
50335	12	1.5
59024	0	1.5
59024	1	1.5
59024	2	1.5
59024	3	1.5
59024	4	1.5
59024	5	1.5
59024	6	1.5
59024	7	1.5
59024	8	1.5
59024	9	1.5
59024	10	1.5
59024	12	1.5
62270	1	1.5
62270	2	1.5
62270	3	1.5
62270	4	1.5
62270	5	1.5
62270	6	1.5
62270	7	1.5
62270	8	1.5
62270	9	1.5
;


proc sql;
create table want as
select *
from have
group by RECORD
having std(HOURS)=0 or min(hours) ne hours;
quit;

 Might require slight tweaking but i am gonna wait for your feedback first. I am little too tired to dig deep but will do so interactively

Contributor
Posts: 56

Re: How to delete specific group of observations

Posted in reply to novinosrin

That works great except I gave you incomplete information, sorry. In the group pasted below the provided code removes all the 1 hour records instead of just the first. So I guess it only needs to make one deletion over the group.

 

Sorry about that. My mistake.

 

 

RECORD WEEK HOURS
275487 0 1
275487 1 1
275487 2 1
275487 3 1
275487 4 1
275487 5 1
275487 6 1
275487 7 1
275487 8 2
275487 9 2
275487 10 2
275487 12 2
PROC Star
Posts: 1,833

Re: How to delete specific group of observations

it was my mistake too. Anyway, safe bet is to switch to datastep. Be right back in 30 mins if somebody else hasn;t answered by then

PROC Star
Posts: 1,833

Re: How to delete specific group of observations

[ Edited ]
Posted in reply to novinosrin

Feeling the eyestrain  and tiredness at 5:20pm . so requesting you to Kindly test thoroughly  and let me know. I will clean up the code for you if you want

data have;
input RECORD	WEEK	HOURS;
cards;
9482	0	1
9482	1	1
28552	0	1
28552	1	1.5
28552	2	1.5
28552	3	1.5
28552	4	1.5
28552	5	1.5
28552	6	1.5
28552	7	1.5
28552	8	1.5
28552	9	1.5
28552	10	1.5
28552	12	1.5
29198	3	1.5
29198	4	1.5
29198	5	1.5
29198	6	1.5
29198	7	1.5
29198	8	1
29198	9	1.5
29198	10	1.5
29198	12	1.5
30799	5	1.5
30799	6	1.5
30799	7	1.5
30799	8	1.5
30799	9	1.5
30799	10	1.5
30799	12	1.5
33635	0	2
33635	1	2
33635	2	2
33635	3	2
33635	4	2
33635	5	2
33635	6	2
33635	7	2
33635	8	2
33635	9	2
33635	10	2
33635	12	2
38239	12	1.5
49749	3	1.5
49749	4	1.5
49749	5	1.5
49749	6	1.5
49749	7	1.5
49749	8	1.5
49749	9	1.5
50335	2	1.5
50335	3	1.5
50335	4	1.5
50335	5	1.5
50335	6	1.5
50335	7	1.5
50335	8	1.5
50335	9	1.5
50335	10	1.5
50335	12	1.5
59024	0	1.5
59024	1	1.5
59024	2	1.5
59024	3	1.5
59024	4	1.5
59024	5	1.5
59024	6	1.5
59024	7	1.5
59024	8	1.5
59024	9	1.5
59024	10	1.5
59024	12	1.5
62270	1	1.5
62270	2	1.5
62270	3	1.5
62270	4	1.5
62270	5	1.5
62270	6	1.5
62270	7	1.5
62270	8	1.5
62270	9	1.5
275487	0	1
275487	1	1
275487	2	1
275487	3	1
275487	4	1
275487	5	1
275487	6	1
275487	7	1
275487	8	2
275487	9	2
275487	10	2
275487	12	2
;
data want;
do _n_=1 by 1 until(last.record);
set have;
by record notsorted;
array t(100);
t(_n_)=hours;
min=min(min,hours);
if last.record then std=std(of t(*));
end;
do _n_=1 by 1 until(last.record);
set have;
if _n_=1 then call missing(f);
by record notsorted;
if std=0 then output;
else  if std ne 0 and  min=hours and not f then  f=_n_;
if std ne 0 and f and _n_>f then output;
end;
drop t: f std;
run;

 

 

Contributor
Posts: 56

Re: How to delete specific group of observations

Posted in reply to novinosrin
Thank you so much.
Solution
‎05-23-2018 06:35 PM
PROC Star
Posts: 8,165

Re: How to delete specific group of observations

Not sure if @novinosrin's revised code does what you want.

 

Since his original code did, except for your new criterion, I'd go with:

data need;
  set have;
  by record;
  recnum=_n_;
run;

proc sql;
  create table want as
    select *
      from need
        group by RECORD
          having std(HOURS)=0 or 
          min(hours) ne hours or
          (min(hours) eq hours and recnum ne min(recnum))
  ;
quit;

Art, CEO, AnalystFinder.com

 

Contributor
Posts: 56

Re: How to delete specific group of observations

This works art297. I misread the results. Thank you both very much for all your assistance.

 

Jeff

PROC Star
Posts: 1,833

Re: How to delete specific group of observations

@art297  Thank you. Yours is neat. @art297 solution is great

 

just to correct:

 

data want;
do _n_=1 by 1 until(last.record);
set have;
by record notsorted;
array t(100);
t(_n_)=hours;
min=min(min,hours);
if last.record then std=std(of t(*));
end;
do _n_=1 by 1 until(last.record);
set have;
if _n_=1 then call missing(f);
by record notsorted;
if std=0 then output;
else  if std ne 0 and  min=hours and not f then  f=_n_;
if std ne 0 and f and _n_>f  then output;
else  if std ne 0 and min ne hours then output;
end;
drop t: f std;
run;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 150 views
  • 0 likes
  • 3 in conversation