Dear all,
What I have:
data have; input subject $ seq aval date $10. @@; datalines; A 1 3 2020-01-01 A 2 4 2020-01-04 A 3 3 2020-03-01 A 4 5 2020-04-01 B 1 4 2020-04-05 B 2 1 2020-03-03 B 3 2 2020-01-03 B 4 1 2020-02-03 C 2 4 2020-02-04 C 1 4 2020-02-03 ;
And what I want:
And my code please see as after:
data a;
if _N_=1 then do;
dcl hash h();
h.definekey('subject','seq');
h.definedata('min');
h.definedone();
call missing(min);
end;
do _N_=1 by 1 until(last.subject);
set have;
by subject;
if first.subject or .<aval<min then h.replace();
min=min(min,aval);
end;
do until(last.subject);
set have;
by subject;
if h.check()=0 and aval=min then output;;
end;
run;
Running results are not what I want,and how to modify the code in order to put the smallest date record into hash and output in the second DOW?
Thanks
Just for fun, a hash of hashes approach if you have unsorted data
data have;
input subject $ seq aval date :yymmdd10.;
format date yymmdd10.;
datalines;
A 1 3 2020-01-01
B 3 2 2020-01-03
A 2 4 2020-01-04
C 2 4 2020-02-04
A 3 3 2020-03-01
B 4 1 2020-02-03
A 4 5 2020-04-01
B 1 4 2020-04-05
C 1 4 2020-02-03
B 2 1 2020-03-03
;
data want;
dcl hash HoH(ordered : 'A');
HoH.definekey('subject');
HoH.definedata('h', 'hi', 'subject');
HoH.definedone();
dcl hiter HoHiter('HoH');
do until (lr);
set have end=lr;
if HoH.find() ne 0 then do;
dcl hash h(dataset : 'have(obs=0)', multidata : 'Y', ordered : 'A');
h.definekey('aval', 'date');
h.definedata(all : 'Y');
h.definedone();
dcl hiter hi('h');
HoH.add();
end;
h.add();
end;
do while(HoHiter.next() = 0);
_N_ = hi.next();
_N_ = hi.prev();
output;
end;
run;
Result
subject seq aval date A 1 3 2020-01-01 B 4 1 2020-02-03 C 1 4 2020-02-03
You only need one DO loop:
data want (
keep=subject _seq _aval _date
rename=(_seq=seq _aval=aval _date=date)
);
_aval = 1e200;
do until (last.subject);
set have;
by subject;
if aval < _aval
then do;
_aval = aval;
_seq = seq;
_date = date;
end;
end;
run;
Untested, posted from my tablet.
Thanks for your reply, but the result is not I want. The date is not right. I want to take the earliest date with the lowest aval at the same time.
Slightly expand the code:
data want (
keep=subject _seq _aval _date
rename=(_seq=seq _aval=aval _date=date)
);
_aval = 1e200;
do until (last.subject);
set have;
by subject;
if aval < _aval
then do;
_aval = aval;
_seq = seq;
_date = date;
end;
else if aval = _aval and date < _date
then do;
_date = date;
_seq = seq;
end;
end;
run;
If you really want to use the Hash Object to find the min values by group, here is an approach
data have;
input subject $ seq aval date :yymmdd10.;
format date yymmdd10.;
datalines;
A 1 3 2020-01-01
A 2 4 2020-01-04
A 3 3 2020-03-01
A 4 5 2020-04-01
B 1 4 2020-04-05
B 2 1 2020-03-03
B 3 2 2020-01-03
B 4 1 2020-02-03
C 2 4 2020-02-04
C 1 4 2020-02-03
;
data want;
if _N_ = 1 then do;
declare hash h (dataset : 'have(obs=0)', ordered : 'A', multidata : 'Y');
h.definekey ('aval', 'date');
h.definedata (all : 'Y');
h.definedone();
declare hiter hi ('h');
end;
do until (last.subject);
set have;
by subject;
h.add();
end;
_N_ = hi.next();
_N_ = hi.prev();
h.clear();
run;
Result:
subject seq aval date A 1 3 2020-01-01 B 4 1 2020-02-03 C 1 4 2020-02-03
Just for fun, a hash of hashes approach if you have unsorted data
data have;
input subject $ seq aval date :yymmdd10.;
format date yymmdd10.;
datalines;
A 1 3 2020-01-01
B 3 2 2020-01-03
A 2 4 2020-01-04
C 2 4 2020-02-04
A 3 3 2020-03-01
B 4 1 2020-02-03
A 4 5 2020-04-01
B 1 4 2020-04-05
C 1 4 2020-02-03
B 2 1 2020-03-03
;
data want;
dcl hash HoH(ordered : 'A');
HoH.definekey('subject');
HoH.definedata('h', 'hi', 'subject');
HoH.definedone();
dcl hiter HoHiter('HoH');
do until (lr);
set have end=lr;
if HoH.find() ne 0 then do;
dcl hash h(dataset : 'have(obs=0)', multidata : 'Y', ordered : 'A');
h.definekey('aval', 'date');
h.definedata(all : 'Y');
h.definedone();
dcl hiter hi('h');
HoH.add();
end;
h.add();
end;
do while(HoHiter.next() = 0);
_N_ = hi.next();
_N_ = hi.prev();
output;
end;
run;
Result
subject seq aval date A 1 3 2020-01-01 B 4 1 2020-02-03 C 1 4 2020-02-03
Do you really have to use Hash Table ? Why not PROC SORT ?
data have;
input subject $ seq aval date $10. @@;
datalines;
A 1 3 2020-01-01
A 2 4 2020-01-04
A 3 3 2020-03-01
A 4 5 2020-04-01
B 1 4 2020-04-05
B 2 1 2020-03-03
B 3 2 2020-01-03
B 4 1 2020-02-03
C 2 4 2020-02-04
C 1 4 2020-02-03
;
data a;
dcl hash h(dataset:'have(obs=0)',ordered:'a');
dcl hiter hi('h');
h.definekey('aval','date');
h.definedata(all:'y');
h.definedone();
do until(last.subject);
set have;
by subject;
h.ref();
end;
hi.first(); output;
h.delete(); hi.delete();
run;
data have;
input subject $ seq aval date $10. @@;
datalines;
A 1 3 2020-01-01
A 2 4 2020-01-04
A 3 3 2020-03-01
A 4 5 2020-04-01
B 1 4 2020-04-05
B 2 1 2020-03-03
B 3 2 2020-01-03
B 4 1 2020-02-03
C 2 4 2020-02-04
C 1 4 2020-02-03
;
data a;
if _N_=1 then do;
dcl hash h(dataset:'have(obs=0)',ordered:'d');
dcl hiter hi('h');
h.definekey('aval','date');
h.definedata(all:'y');
h.definedone();
end;
do until(last.subject);
set have;
by subject;
h.ref();
end;
hi.last(); output;
rc=hi.next();h.clear();
drop rc;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.