BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Jeff_DOC
Pyrite | Level 9

Good afternoon.

 

I wonder if someone could provide some assistance with this? What I need is to compare groups of records to find the minimum value for each group and delete the record with the minimum value (easy). However, if the group hours are the same over the entire group I need to keep them all. In the example provided in record 9482 both records would need to be kept. For record 28552 only the record for week 1 should be deleted since 1 hour is less than all the other records for 28552 (therefore it's the minimum). Week 8 also has a minimum hours of 1 but only the first should be removed. Once a deletion over the group has been performed group checking ceases. For record group 30799 I would keep both.

 

I am using EG 7.15.

 

I would very much appreciate any help someone out there might be willing to provide.

 

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Not sure if @novinosrin's revised code does what you want.

 

Since his original code did, except for your new criterion, I'd go with:

data need;
  set have;
  by record;
  recnum=_n_;
run;

proc sql;
  create table want as
    select *
      from need
        group by RECORD
          having std(HOURS)=0 or 
          min(hours) ne hours or
          (min(hours) eq hours and recnum ne min(recnum))
  ;
quit;

Art, CEO, AnalystFinder.com

 

View solution in original post

8 REPLIES 8
novinosrin
Tourmaline | Level 20
data have;
input RECORD	WEEK	HOURS;
cards;
9482	0	1
9482	1	1
28552	0	1
28552	1	1.5
28552	2	1.5
28552	3	1.5
28552	4	1.5
28552	5	1.5
28552	6	1.5
28552	7	1.5
28552	8	1.5
28552	9	1.5
28552	10	1.5
28552	12	1.5
29198	3	1.5
29198	4	1.5
29198	5	1.5
29198	6	1.5
29198	7	1.5
29198	8	1
29198	9	1.5
29198	10	1.5
29198	12	1.5
30799	5	1.5
30799	6	1.5
30799	7	1.5
30799	8	1.5
30799	9	1.5
30799	10	1.5
30799	12	1.5
33635	0	2
33635	1	2
33635	2	2
33635	3	2
33635	4	2
33635	5	2
33635	6	2
33635	7	2
33635	8	2
33635	9	2
33635	10	2
33635	12	2
38239	12	1.5
49749	3	1.5
49749	4	1.5
49749	5	1.5
49749	6	1.5
49749	7	1.5
49749	8	1.5
49749	9	1.5
50335	2	1.5
50335	3	1.5
50335	4	1.5
50335	5	1.5
50335	6	1.5
50335	7	1.5
50335	8	1.5
50335	9	1.5
50335	10	1.5
50335	12	1.5
59024	0	1.5
59024	1	1.5
59024	2	1.5
59024	3	1.5
59024	4	1.5
59024	5	1.5
59024	6	1.5
59024	7	1.5
59024	8	1.5
59024	9	1.5
59024	10	1.5
59024	12	1.5
62270	1	1.5
62270	2	1.5
62270	3	1.5
62270	4	1.5
62270	5	1.5
62270	6	1.5
62270	7	1.5
62270	8	1.5
62270	9	1.5
;


proc sql;
create table want as
select *
from have
group by RECORD
having std(HOURS)=0 or min(hours) ne hours;
quit;

 Might require slight tweaking but i am gonna wait for your feedback first. I am little too tired to dig deep but will do so interactively

Jeff_DOC
Pyrite | Level 9

That works great except I gave you incomplete information, sorry. In the group pasted below the provided code removes all the 1 hour records instead of just the first. So I guess it only needs to make one deletion over the group.

 

Sorry about that. My mistake.

 

 

RECORD WEEK HOURS
275487 0 1
275487 1 1
275487 2 1
275487 3 1
275487 4 1
275487 5 1
275487 6 1
275487 7 1
275487 8 2
275487 9 2
275487 10 2
275487 12 2
novinosrin
Tourmaline | Level 20

it was my mistake too. Anyway, safe bet is to switch to datastep. Be right back in 30 mins if somebody else hasn;t answered by then

novinosrin
Tourmaline | Level 20

Feeling the eyestrain  and tiredness at 5:20pm . so requesting you to Kindly test thoroughly  and let me know. I will clean up the code for you if you want

data have;
input RECORD	WEEK	HOURS;
cards;
9482	0	1
9482	1	1
28552	0	1
28552	1	1.5
28552	2	1.5
28552	3	1.5
28552	4	1.5
28552	5	1.5
28552	6	1.5
28552	7	1.5
28552	8	1.5
28552	9	1.5
28552	10	1.5
28552	12	1.5
29198	3	1.5
29198	4	1.5
29198	5	1.5
29198	6	1.5
29198	7	1.5
29198	8	1
29198	9	1.5
29198	10	1.5
29198	12	1.5
30799	5	1.5
30799	6	1.5
30799	7	1.5
30799	8	1.5
30799	9	1.5
30799	10	1.5
30799	12	1.5
33635	0	2
33635	1	2
33635	2	2
33635	3	2
33635	4	2
33635	5	2
33635	6	2
33635	7	2
33635	8	2
33635	9	2
33635	10	2
33635	12	2
38239	12	1.5
49749	3	1.5
49749	4	1.5
49749	5	1.5
49749	6	1.5
49749	7	1.5
49749	8	1.5
49749	9	1.5
50335	2	1.5
50335	3	1.5
50335	4	1.5
50335	5	1.5
50335	6	1.5
50335	7	1.5
50335	8	1.5
50335	9	1.5
50335	10	1.5
50335	12	1.5
59024	0	1.5
59024	1	1.5
59024	2	1.5
59024	3	1.5
59024	4	1.5
59024	5	1.5
59024	6	1.5
59024	7	1.5
59024	8	1.5
59024	9	1.5
59024	10	1.5
59024	12	1.5
62270	1	1.5
62270	2	1.5
62270	3	1.5
62270	4	1.5
62270	5	1.5
62270	6	1.5
62270	7	1.5
62270	8	1.5
62270	9	1.5
275487	0	1
275487	1	1
275487	2	1
275487	3	1
275487	4	1
275487	5	1
275487	6	1
275487	7	1
275487	8	2
275487	9	2
275487	10	2
275487	12	2
;
data want;
do _n_=1 by 1 until(last.record);
set have;
by record notsorted;
array t(100);
t(_n_)=hours;
min=min(min,hours);
if last.record then std=std(of t(*));
end;
do _n_=1 by 1 until(last.record);
set have;
if _n_=1 then call missing(f);
by record notsorted;
if std=0 then output;
else  if std ne 0 and  min=hours and not f then  f=_n_;
if std ne 0 and f and _n_>f then output;
end;
drop t: f std;
run;

 

 

Jeff_DOC
Pyrite | Level 9
Thank you so much.
art297
Opal | Level 21

Not sure if @novinosrin's revised code does what you want.

 

Since his original code did, except for your new criterion, I'd go with:

data need;
  set have;
  by record;
  recnum=_n_;
run;

proc sql;
  create table want as
    select *
      from need
        group by RECORD
          having std(HOURS)=0 or 
          min(hours) ne hours or
          (min(hours) eq hours and recnum ne min(recnum))
  ;
quit;

Art, CEO, AnalystFinder.com

 

Jeff_DOC
Pyrite | Level 9

This works art297. I misread the results. Thank you both very much for all your assistance.

 

Jeff

novinosrin
Tourmaline | Level 20

@art297  Thank you. Yours is neat. @art297 solution is great

 

just to correct:

 

data want;
do _n_=1 by 1 until(last.record);
set have;
by record notsorted;
array t(100);
t(_n_)=hours;
min=min(min,hours);
if last.record then std=std(of t(*));
end;
do _n_=1 by 1 until(last.record);
set have;
if _n_=1 then call missing(f);
by record notsorted;
if std=0 then output;
else  if std ne 0 and  min=hours and not f then  f=_n_;
if std ne 0 and f and _n_>f  then output;
else  if std ne 0 and min ne hours then output;
end;
drop t: f std;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 755 views
  • 0 likes
  • 3 in conversation