BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Lee_wan
Obsidian | Level 7

Dear all,

 

What I have:

data have;
input subject $ seq aval date $10. @@;
datalines;
A 1 3 2020-01-01
A 2 4 2020-01-04
A 3 3 2020-03-01
A 4 5 2020-04-01
B 1 4 2020-04-05
B 2 1 2020-03-03
B 3 2 2020-01-03
B 4 1 2020-02-03
C 2 4 2020-02-04
C 1 4 2020-02-03
;

And what I want:

Lee_wan_0-1596863522432.png

And my code please see as after:

data a;
  if _N_=1 then do;
    dcl hash h();
	h.definekey('subject','seq');
	h.definedata('min');
	h.definedone();
	call missing(min);
  end;

  do _N_=1 by 1 until(last.subject);
    set have;
	by subject;
	if first.subject or .<aval<min then h.replace();
	min=min(min,aval);
  end;

  do until(last.subject);
    set have;
	by subject;
    if h.check()=0 and aval=min then output;;
  end;
run;

Running results are not what I want,and how to modify the code in order to put the smallest date record into hash and output in the second DOW?

 

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Just for fun, a hash of hashes approach if you have unsorted data

 

data have;
input subject $ seq aval date :yymmdd10.;
format date yymmdd10.;
datalines;
A 1 3 2020-01-01
B 3 2 2020-01-03
A 2 4 2020-01-04
C 2 4 2020-02-04
A 3 3 2020-03-01
B 4 1 2020-02-03
A 4 5 2020-04-01
B 1 4 2020-04-05
C 1 4 2020-02-03
B 2 1 2020-03-03
;

data want;
   dcl hash HoH(ordered : 'A');
   HoH.definekey('subject');
   HoH.definedata('h', 'hi', 'subject');
   HoH.definedone();
   dcl hiter HoHiter('HoH');

   do until (lr);
      set have end=lr;

      if HoH.find() ne 0 then do;
         dcl hash h(dataset : 'have(obs=0)', multidata : 'Y', ordered : 'A');
         h.definekey('aval', 'date');
         h.definedata(all : 'Y');
         h.definedone();
         dcl hiter hi('h');
         HoH.add();
      end;

      h.add();
   end;

   do while(HoHiter.next() = 0);
      _N_ = hi.next();
      _N_ = hi.prev();
      output;
   end;
run;

 

Result

subject seq aval date 
A       1   3    2020-01-01 
B       4   1    2020-02-03 
C       1   4    2020-02-03 

View solution in original post

9 REPLIES 9
Kurt_Bremser
Super User

You only need one DO loop:

data want (
  keep=subject _seq _aval _date
  rename=(_seq=seq _aval=aval _date=date)
);
_aval = 1e200;
do until (last.subject);
  set have;
  by subject;
  if aval < _aval
  then do;
    _aval = aval;
    _seq = seq;
    _date = date;
  end;
end;
run;

Untested, posted from my tablet.

Lee_wan
Obsidian | Level 7

Thanks for your reply, but the result is not I want. The date is not right. I want to take the earliest date with the lowest aval at the same time.

Kurt_Bremser
Super User

Slightly expand the code:

data want (
  keep=subject _seq _aval _date
  rename=(_seq=seq _aval=aval _date=date)
);
_aval = 1e200;
do until (last.subject);
  set have;
  by subject;
  if aval < _aval
  then do;
    _aval = aval;
    _seq = seq;
    _date = date;
  end;
  else if aval = _aval and date < _date
  then do; 
    _date = date;
    _seq = seq;
  end;
end;
run;
PeterClemmensen
Tourmaline | Level 20

If you really want to use the Hash Object to find the min values by group, here is an approach

 

 

data have;
input subject $ seq aval date :yymmdd10.;
format date yymmdd10.;
datalines;
A 1 3 2020-01-01
A 2 4 2020-01-04
A 3 3 2020-03-01
A 4 5 2020-04-01
B 1 4 2020-04-05
B 2 1 2020-03-03
B 3 2 2020-01-03
B 4 1 2020-02-03
C 2 4 2020-02-04
C 1 4 2020-02-03
;

data want;
   if _N_ = 1 then do;
      declare hash h (dataset : 'have(obs=0)', ordered : 'A', multidata : 'Y');
      h.definekey ('aval', 'date');
      h.definedata (all : 'Y');
      h.definedone();
      declare hiter hi ('h');
   end;
 
   do until (last.subject);
      set have;
      by subject;
      h.add();
   end;
 
   _N_ = hi.next();
   _N_ = hi.prev();
   h.clear(); 
run;

 

 

Result:

 

subject seq aval date 
A       1   3    2020-01-01 
B       4   1    2020-02-03 
C       1   4    2020-02-03 

 

Lee_wan
Obsidian | Level 7
Can I use rc=h.first() to replace _N_ = hi.next();?
And why we can't delete the code for "_N_ = hi.prev();"? what dose this code do?
Thanks.
PeterClemmensen
Tourmaline | Level 20

Just for fun, a hash of hashes approach if you have unsorted data

 

data have;
input subject $ seq aval date :yymmdd10.;
format date yymmdd10.;
datalines;
A 1 3 2020-01-01
B 3 2 2020-01-03
A 2 4 2020-01-04
C 2 4 2020-02-04
A 3 3 2020-03-01
B 4 1 2020-02-03
A 4 5 2020-04-01
B 1 4 2020-04-05
C 1 4 2020-02-03
B 2 1 2020-03-03
;

data want;
   dcl hash HoH(ordered : 'A');
   HoH.definekey('subject');
   HoH.definedata('h', 'hi', 'subject');
   HoH.definedone();
   dcl hiter HoHiter('HoH');

   do until (lr);
      set have end=lr;

      if HoH.find() ne 0 then do;
         dcl hash h(dataset : 'have(obs=0)', multidata : 'Y', ordered : 'A');
         h.definekey('aval', 'date');
         h.definedata(all : 'Y');
         h.definedone();
         dcl hiter hi('h');
         HoH.add();
      end;

      h.add();
   end;

   do while(HoHiter.next() = 0);
      _N_ = hi.next();
      _N_ = hi.prev();
      output;
   end;
run;

 

Result

subject seq aval date 
A       1   3    2020-01-01 
B       4   1    2020-02-03 
C       1   4    2020-02-03 
Lee_wan
Obsidian | Level 7
Amazing!Thanks a lot!
The process is a little difficult for me to understand.
I‘m not good at using hiter and I can't understand "_N_ = hi.next();_N_ = hi.prev();“ in your code.Can you give me a hint?
Ksharp
Super User

Do you really have to use Hash Table ? Why not PROC SORT ?

 

data have;
input subject $ seq aval date $10. @@;
datalines;
A 1 3 2020-01-01
A 2 4 2020-01-04
A 3 3 2020-03-01
A 4 5 2020-04-01
B 1 4 2020-04-05
B 2 1 2020-03-03
B 3 2 2020-01-03
B 4 1 2020-02-03
C 2 4 2020-02-04
C 1 4 2020-02-03
;
data a;
    dcl hash h(dataset:'have(obs=0)',ordered:'a');
	dcl hiter hi('h');
	h.definekey('aval','date');
	h.definedata(all:'y');
	h.definedone();
  do until(last.subject);
    set have;
	by subject;
	h.ref();
  end;
  hi.first(); output;
  h.delete(); hi.delete();
run;
Ksharp
Super User
data have;
input subject $ seq aval date $10. @@;
datalines;
A 1 3 2020-01-01
A 2 4 2020-01-04
A 3 3 2020-03-01
A 4 5 2020-04-01
B 1 4 2020-04-05
B 2 1 2020-03-03
B 3 2 2020-01-03
B 4 1 2020-02-03
C 2 4 2020-02-04
C 1 4 2020-02-03
;
data a;
  if _N_=1 then do;
    dcl hash h(dataset:'have(obs=0)',ordered:'d');
	dcl hiter hi('h');
	h.definekey('aval','date');
	h.definedata(all:'y');
	h.definedone();
  end;
  do until(last.subject);
    set have;
	by subject;
	h.ref();
  end;
  hi.last(); output;
  rc=hi.next();h.clear();
  drop rc;
run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1544 views
  • 2 likes
  • 4 in conversation