Good afternoon.
I wonder if someone could provide some assistance with this? What I need is to compare groups of records to find the minimum value for each group and delete the record with the minimum value (easy). However, if the group hours are the same over the entire group I need to keep them all. In the example provided in record 9482 both records would need to be kept. For record 28552 only the record for week 1 should be deleted since 1 hour is less than all the other records for 28552 (therefore it's the minimum). Week 8 also has a minimum hours of 1 but only the first should be removed. Once a deletion over the group has been performed group checking ceases. For record group 30799 I would keep both.
I am using EG 7.15.
I would very much appreciate any help someone out there might be willing to provide.
Thanks.
Not sure if @novinosrin's revised code does what you want.
Since his original code did, except for your new criterion, I'd go with:
data need;
set have;
by record;
recnum=_n_;
run;
proc sql;
create table want as
select *
from need
group by RECORD
having std(HOURS)=0 or
min(hours) ne hours or
(min(hours) eq hours and recnum ne min(recnum))
;
quit;
Art, CEO, AnalystFinder.com
data have;
input RECORD WEEK HOURS;
cards;
9482 0 1
9482 1 1
28552 0 1
28552 1 1.5
28552 2 1.5
28552 3 1.5
28552 4 1.5
28552 5 1.5
28552 6 1.5
28552 7 1.5
28552 8 1.5
28552 9 1.5
28552 10 1.5
28552 12 1.5
29198 3 1.5
29198 4 1.5
29198 5 1.5
29198 6 1.5
29198 7 1.5
29198 8 1
29198 9 1.5
29198 10 1.5
29198 12 1.5
30799 5 1.5
30799 6 1.5
30799 7 1.5
30799 8 1.5
30799 9 1.5
30799 10 1.5
30799 12 1.5
33635 0 2
33635 1 2
33635 2 2
33635 3 2
33635 4 2
33635 5 2
33635 6 2
33635 7 2
33635 8 2
33635 9 2
33635 10 2
33635 12 2
38239 12 1.5
49749 3 1.5
49749 4 1.5
49749 5 1.5
49749 6 1.5
49749 7 1.5
49749 8 1.5
49749 9 1.5
50335 2 1.5
50335 3 1.5
50335 4 1.5
50335 5 1.5
50335 6 1.5
50335 7 1.5
50335 8 1.5
50335 9 1.5
50335 10 1.5
50335 12 1.5
59024 0 1.5
59024 1 1.5
59024 2 1.5
59024 3 1.5
59024 4 1.5
59024 5 1.5
59024 6 1.5
59024 7 1.5
59024 8 1.5
59024 9 1.5
59024 10 1.5
59024 12 1.5
62270 1 1.5
62270 2 1.5
62270 3 1.5
62270 4 1.5
62270 5 1.5
62270 6 1.5
62270 7 1.5
62270 8 1.5
62270 9 1.5
;
proc sql;
create table want as
select *
from have
group by RECORD
having std(HOURS)=0 or min(hours) ne hours;
quit;
Might require slight tweaking but i am gonna wait for your feedback first. I am little too tired to dig deep but will do so interactively
That works great except I gave you incomplete information, sorry. In the group pasted below the provided code removes all the 1 hour records instead of just the first. So I guess it only needs to make one deletion over the group.
Sorry about that. My mistake.
RECORD | WEEK | HOURS |
275487 | 0 | 1 |
275487 | 1 | 1 |
275487 | 2 | 1 |
275487 | 3 | 1 |
275487 | 4 | 1 |
275487 | 5 | 1 |
275487 | 6 | 1 |
275487 | 7 | 1 |
275487 | 8 | 2 |
275487 | 9 | 2 |
275487 | 10 | 2 |
275487 | 12 | 2 |
it was my mistake too. Anyway, safe bet is to switch to datastep. Be right back in 30 mins if somebody else hasn;t answered by then
Feeling the eyestrain and tiredness at 5:20pm . so requesting you to Kindly test thoroughly and let me know. I will clean up the code for you if you want
data have;
input RECORD WEEK HOURS;
cards;
9482 0 1
9482 1 1
28552 0 1
28552 1 1.5
28552 2 1.5
28552 3 1.5
28552 4 1.5
28552 5 1.5
28552 6 1.5
28552 7 1.5
28552 8 1.5
28552 9 1.5
28552 10 1.5
28552 12 1.5
29198 3 1.5
29198 4 1.5
29198 5 1.5
29198 6 1.5
29198 7 1.5
29198 8 1
29198 9 1.5
29198 10 1.5
29198 12 1.5
30799 5 1.5
30799 6 1.5
30799 7 1.5
30799 8 1.5
30799 9 1.5
30799 10 1.5
30799 12 1.5
33635 0 2
33635 1 2
33635 2 2
33635 3 2
33635 4 2
33635 5 2
33635 6 2
33635 7 2
33635 8 2
33635 9 2
33635 10 2
33635 12 2
38239 12 1.5
49749 3 1.5
49749 4 1.5
49749 5 1.5
49749 6 1.5
49749 7 1.5
49749 8 1.5
49749 9 1.5
50335 2 1.5
50335 3 1.5
50335 4 1.5
50335 5 1.5
50335 6 1.5
50335 7 1.5
50335 8 1.5
50335 9 1.5
50335 10 1.5
50335 12 1.5
59024 0 1.5
59024 1 1.5
59024 2 1.5
59024 3 1.5
59024 4 1.5
59024 5 1.5
59024 6 1.5
59024 7 1.5
59024 8 1.5
59024 9 1.5
59024 10 1.5
59024 12 1.5
62270 1 1.5
62270 2 1.5
62270 3 1.5
62270 4 1.5
62270 5 1.5
62270 6 1.5
62270 7 1.5
62270 8 1.5
62270 9 1.5
275487 0 1
275487 1 1
275487 2 1
275487 3 1
275487 4 1
275487 5 1
275487 6 1
275487 7 1
275487 8 2
275487 9 2
275487 10 2
275487 12 2
;
data want;
do _n_=1 by 1 until(last.record);
set have;
by record notsorted;
array t(100);
t(_n_)=hours;
min=min(min,hours);
if last.record then std=std(of t(*));
end;
do _n_=1 by 1 until(last.record);
set have;
if _n_=1 then call missing(f);
by record notsorted;
if std=0 then output;
else if std ne 0 and min=hours and not f then f=_n_;
if std ne 0 and f and _n_>f then output;
end;
drop t: f std;
run;
Not sure if @novinosrin's revised code does what you want.
Since his original code did, except for your new criterion, I'd go with:
data need;
set have;
by record;
recnum=_n_;
run;
proc sql;
create table want as
select *
from need
group by RECORD
having std(HOURS)=0 or
min(hours) ne hours or
(min(hours) eq hours and recnum ne min(recnum))
;
quit;
Art, CEO, AnalystFinder.com
This works art297. I misread the results. Thank you both very much for all your assistance.
Jeff
@art297 Thank you. Yours is neat. @art297 solution is great
just to correct:
data want;
do _n_=1 by 1 until(last.record);
set have;
by record notsorted;
array t(100);
t(_n_)=hours;
min=min(min,hours);
if last.record then std=std(of t(*));
end;
do _n_=1 by 1 until(last.record);
set have;
if _n_=1 then call missing(f);
by record notsorted;
if std=0 then output;
else if std ne 0 and min=hours and not f then f=_n_;
if std ne 0 and f and _n_>f then output;
else if std ne 0 and min ne hours then output;
end;
drop t: f std;
run;
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.