Dear all,
What I have:
data have; input subject $ seq aval date $10. @@; datalines; A 1 3 2020-01-01 A 2 4 2020-01-04 A 3 3 2020-03-01 A 4 5 2020-04-01 B 1 4 2020-04-05 B 2 1 2020-03-03 B 3 2 2020-01-03 B 4 1 2020-02-03 C 2 4 2020-02-04 C 1 4 2020-02-03 ;
And what I want:
And my code please see as after:
data a; if _N_=1 then do; dcl hash h(); h.definekey('subject','seq'); h.definedata('min'); h.definedone(); call missing(min); end; do _N_=1 by 1 until(last.subject); set have; by subject; if first.subject or .<aval<min then h.replace(); min=min(min,aval); end; do until(last.subject); set have; by subject; if h.check()=0 and aval=min then output;; end; run;
Running results are not what I want,and how to modify the code in order to put the smallest date record into hash and output in the second DOW?
Thanks
Just for fun, a hash of hashes approach if you have unsorted data
data have;
input subject $ seq aval date :yymmdd10.;
format date yymmdd10.;
datalines;
A 1 3 2020-01-01
B 3 2 2020-01-03
A 2 4 2020-01-04
C 2 4 2020-02-04
A 3 3 2020-03-01
B 4 1 2020-02-03
A 4 5 2020-04-01
B 1 4 2020-04-05
C 1 4 2020-02-03
B 2 1 2020-03-03
;
data want;
dcl hash HoH(ordered : 'A');
HoH.definekey('subject');
HoH.definedata('h', 'hi', 'subject');
HoH.definedone();
dcl hiter HoHiter('HoH');
do until (lr);
set have end=lr;
if HoH.find() ne 0 then do;
dcl hash h(dataset : 'have(obs=0)', multidata : 'Y', ordered : 'A');
h.definekey('aval', 'date');
h.definedata(all : 'Y');
h.definedone();
dcl hiter hi('h');
HoH.add();
end;
h.add();
end;
do while(HoHiter.next() = 0);
_N_ = hi.next();
_N_ = hi.prev();
output;
end;
run;
Result
subject seq aval date A 1 3 2020-01-01 B 4 1 2020-02-03 C 1 4 2020-02-03
You only need one DO loop:
data want (
keep=subject _seq _aval _date
rename=(_seq=seq _aval=aval _date=date)
);
_aval = 1e200;
do until (last.subject);
set have;
by subject;
if aval < _aval
then do;
_aval = aval;
_seq = seq;
_date = date;
end;
end;
run;
Untested, posted from my tablet.
Thanks for your reply, but the result is not I want. The date is not right. I want to take the earliest date with the lowest aval at the same time.
Slightly expand the code:
data want (
keep=subject _seq _aval _date
rename=(_seq=seq _aval=aval _date=date)
);
_aval = 1e200;
do until (last.subject);
set have;
by subject;
if aval < _aval
then do;
_aval = aval;
_seq = seq;
_date = date;
end;
else if aval = _aval and date < _date
then do;
_date = date;
_seq = seq;
end;
end;
run;
If you really want to use the Hash Object to find the min values by group, here is an approach
data have;
input subject $ seq aval date :yymmdd10.;
format date yymmdd10.;
datalines;
A 1 3 2020-01-01
A 2 4 2020-01-04
A 3 3 2020-03-01
A 4 5 2020-04-01
B 1 4 2020-04-05
B 2 1 2020-03-03
B 3 2 2020-01-03
B 4 1 2020-02-03
C 2 4 2020-02-04
C 1 4 2020-02-03
;
data want;
if _N_ = 1 then do;
declare hash h (dataset : 'have(obs=0)', ordered : 'A', multidata : 'Y');
h.definekey ('aval', 'date');
h.definedata (all : 'Y');
h.definedone();
declare hiter hi ('h');
end;
do until (last.subject);
set have;
by subject;
h.add();
end;
_N_ = hi.next();
_N_ = hi.prev();
h.clear();
run;
Result:
subject seq aval date A 1 3 2020-01-01 B 4 1 2020-02-03 C 1 4 2020-02-03
Just for fun, a hash of hashes approach if you have unsorted data
data have;
input subject $ seq aval date :yymmdd10.;
format date yymmdd10.;
datalines;
A 1 3 2020-01-01
B 3 2 2020-01-03
A 2 4 2020-01-04
C 2 4 2020-02-04
A 3 3 2020-03-01
B 4 1 2020-02-03
A 4 5 2020-04-01
B 1 4 2020-04-05
C 1 4 2020-02-03
B 2 1 2020-03-03
;
data want;
dcl hash HoH(ordered : 'A');
HoH.definekey('subject');
HoH.definedata('h', 'hi', 'subject');
HoH.definedone();
dcl hiter HoHiter('HoH');
do until (lr);
set have end=lr;
if HoH.find() ne 0 then do;
dcl hash h(dataset : 'have(obs=0)', multidata : 'Y', ordered : 'A');
h.definekey('aval', 'date');
h.definedata(all : 'Y');
h.definedone();
dcl hiter hi('h');
HoH.add();
end;
h.add();
end;
do while(HoHiter.next() = 0);
_N_ = hi.next();
_N_ = hi.prev();
output;
end;
run;
Result
subject seq aval date A 1 3 2020-01-01 B 4 1 2020-02-03 C 1 4 2020-02-03
Do you really have to use Hash Table ? Why not PROC SORT ?
data have;
input subject $ seq aval date $10. @@;
datalines;
A 1 3 2020-01-01
A 2 4 2020-01-04
A 3 3 2020-03-01
A 4 5 2020-04-01
B 1 4 2020-04-05
B 2 1 2020-03-03
B 3 2 2020-01-03
B 4 1 2020-02-03
C 2 4 2020-02-04
C 1 4 2020-02-03
;
data a;
dcl hash h(dataset:'have(obs=0)',ordered:'a');
dcl hiter hi('h');
h.definekey('aval','date');
h.definedata(all:'y');
h.definedone();
do until(last.subject);
set have;
by subject;
h.ref();
end;
hi.first(); output;
h.delete(); hi.delete();
run;
data have; input subject $ seq aval date $10. @@; datalines; A 1 3 2020-01-01 A 2 4 2020-01-04 A 3 3 2020-03-01 A 4 5 2020-04-01 B 1 4 2020-04-05 B 2 1 2020-03-03 B 3 2 2020-01-03 B 4 1 2020-02-03 C 2 4 2020-02-04 C 1 4 2020-02-03 ; data a; if _N_=1 then do; dcl hash h(dataset:'have(obs=0)',ordered:'d'); dcl hiter hi('h'); h.definekey('aval','date'); h.definedata(all:'y'); h.definedone(); end; do until(last.subject); set have; by subject; h.ref(); end; hi.last(); output; rc=hi.next();h.clear(); drop rc; run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.