Dear all,
What I have:
data have; input subject $ seq aval date $10. @@; datalines; A 1 3 2020-01-01 A 2 4 2020-01-04 A 3 3 2020-03-01 A 4 5 2020-04-01 B 1 4 2020-04-05 B 2 1 2020-03-03 B 3 2 2020-01-03 B 4 1 2020-02-03 C 2 4 2020-02-04 C 1 4 2020-02-03 ;
And what I want:
And my code please see as after:
data a; if _N_=1 then do; dcl hash h(); h.definekey('subject','seq'); h.definedata('min'); h.definedone(); call missing(min); end; do _N_=1 by 1 until(last.subject); set have; by subject; if first.subject or .<aval<min then h.replace(); min=min(min,aval); end; do until(last.subject); set have; by subject; if h.check()=0 and aval=min then output;; end; run;
Running results are not what I want,and how to modify the code in order to put the smallest date record into hash and output in the second DOW?
Thanks
Just for fun, a hash of hashes approach if you have unsorted data
data have;
input subject $ seq aval date :yymmdd10.;
format date yymmdd10.;
datalines;
A 1 3 2020-01-01
B 3 2 2020-01-03
A 2 4 2020-01-04
C 2 4 2020-02-04
A 3 3 2020-03-01
B 4 1 2020-02-03
A 4 5 2020-04-01
B 1 4 2020-04-05
C 1 4 2020-02-03
B 2 1 2020-03-03
;
data want;
dcl hash HoH(ordered : 'A');
HoH.definekey('subject');
HoH.definedata('h', 'hi', 'subject');
HoH.definedone();
dcl hiter HoHiter('HoH');
do until (lr);
set have end=lr;
if HoH.find() ne 0 then do;
dcl hash h(dataset : 'have(obs=0)', multidata : 'Y', ordered : 'A');
h.definekey('aval', 'date');
h.definedata(all : 'Y');
h.definedone();
dcl hiter hi('h');
HoH.add();
end;
h.add();
end;
do while(HoHiter.next() = 0);
_N_ = hi.next();
_N_ = hi.prev();
output;
end;
run;
Result
subject seq aval date A 1 3 2020-01-01 B 4 1 2020-02-03 C 1 4 2020-02-03
You only need one DO loop:
data want (
keep=subject _seq _aval _date
rename=(_seq=seq _aval=aval _date=date)
);
_aval = 1e200;
do until (last.subject);
set have;
by subject;
if aval < _aval
then do;
_aval = aval;
_seq = seq;
_date = date;
end;
end;
run;
Untested, posted from my tablet.
Thanks for your reply, but the result is not I want. The date is not right. I want to take the earliest date with the lowest aval at the same time.
Slightly expand the code:
data want (
keep=subject _seq _aval _date
rename=(_seq=seq _aval=aval _date=date)
);
_aval = 1e200;
do until (last.subject);
set have;
by subject;
if aval < _aval
then do;
_aval = aval;
_seq = seq;
_date = date;
end;
else if aval = _aval and date < _date
then do;
_date = date;
_seq = seq;
end;
end;
run;
If you really want to use the Hash Object to find the min values by group, here is an approach
data have;
input subject $ seq aval date :yymmdd10.;
format date yymmdd10.;
datalines;
A 1 3 2020-01-01
A 2 4 2020-01-04
A 3 3 2020-03-01
A 4 5 2020-04-01
B 1 4 2020-04-05
B 2 1 2020-03-03
B 3 2 2020-01-03
B 4 1 2020-02-03
C 2 4 2020-02-04
C 1 4 2020-02-03
;
data want;
if _N_ = 1 then do;
declare hash h (dataset : 'have(obs=0)', ordered : 'A', multidata : 'Y');
h.definekey ('aval', 'date');
h.definedata (all : 'Y');
h.definedone();
declare hiter hi ('h');
end;
do until (last.subject);
set have;
by subject;
h.add();
end;
_N_ = hi.next();
_N_ = hi.prev();
h.clear();
run;
Result:
subject seq aval date A 1 3 2020-01-01 B 4 1 2020-02-03 C 1 4 2020-02-03
Just for fun, a hash of hashes approach if you have unsorted data
data have;
input subject $ seq aval date :yymmdd10.;
format date yymmdd10.;
datalines;
A 1 3 2020-01-01
B 3 2 2020-01-03
A 2 4 2020-01-04
C 2 4 2020-02-04
A 3 3 2020-03-01
B 4 1 2020-02-03
A 4 5 2020-04-01
B 1 4 2020-04-05
C 1 4 2020-02-03
B 2 1 2020-03-03
;
data want;
dcl hash HoH(ordered : 'A');
HoH.definekey('subject');
HoH.definedata('h', 'hi', 'subject');
HoH.definedone();
dcl hiter HoHiter('HoH');
do until (lr);
set have end=lr;
if HoH.find() ne 0 then do;
dcl hash h(dataset : 'have(obs=0)', multidata : 'Y', ordered : 'A');
h.definekey('aval', 'date');
h.definedata(all : 'Y');
h.definedone();
dcl hiter hi('h');
HoH.add();
end;
h.add();
end;
do while(HoHiter.next() = 0);
_N_ = hi.next();
_N_ = hi.prev();
output;
end;
run;
Result
subject seq aval date A 1 3 2020-01-01 B 4 1 2020-02-03 C 1 4 2020-02-03
Do you really have to use Hash Table ? Why not PROC SORT ?
data have;
input subject $ seq aval date $10. @@;
datalines;
A 1 3 2020-01-01
A 2 4 2020-01-04
A 3 3 2020-03-01
A 4 5 2020-04-01
B 1 4 2020-04-05
B 2 1 2020-03-03
B 3 2 2020-01-03
B 4 1 2020-02-03
C 2 4 2020-02-04
C 1 4 2020-02-03
;
data a;
dcl hash h(dataset:'have(obs=0)',ordered:'a');
dcl hiter hi('h');
h.definekey('aval','date');
h.definedata(all:'y');
h.definedone();
do until(last.subject);
set have;
by subject;
h.ref();
end;
hi.first(); output;
h.delete(); hi.delete();
run;
data have; input subject $ seq aval date $10. @@; datalines; A 1 3 2020-01-01 A 2 4 2020-01-04 A 3 3 2020-03-01 A 4 5 2020-04-01 B 1 4 2020-04-05 B 2 1 2020-03-03 B 3 2 2020-01-03 B 4 1 2020-02-03 C 2 4 2020-02-04 C 1 4 2020-02-03 ; data a; if _N_=1 then do; dcl hash h(dataset:'have(obs=0)',ordered:'d'); dcl hiter hi('h'); h.definekey('aval','date'); h.definedata(all:'y'); h.definedone(); end; do until(last.subject); set have; by subject; h.ref(); end; hi.last(); output; rc=hi.next();h.clear(); drop rc; run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.