You've got a cornucopia of fine solutions, so methought another one, based on the hash object's prowess, wouldn't hurt at least from the standpoint of learning how it can be done. Note that like MEANS/SUMMARY it doesn't need the input file to be sorted, so I use it unsorted and have also added another satellite variable, VAR, to see if it gets transplanted from the input to the output correctly (it does). Despite the unsorted input, the output comes out sorted (again sharing the trait with MEANS/SUMMARY).
data have ;
input personID $ attribute1 $ qty date yymmdd10. var ;
format date yymmdd10. ;
cards ;
A m3 20 2019-02-01 1
B m3 10 2019-02-15 2
A m1 20 2019-03-01 3
B m3 20 2019-02-15 4
A m1 20 2019-02-01 5
C m2 20 2019-01-15 6
D m1 20 2019-01-15 7
A m1 10 2019-03-01 8
D m1 20 2019-02-15 9
A m2 20 2019-01-01 10
;
run;
data _null_ ;
dcl hash h (dataset:"have(obs=0)", ordered:"a") ;
h.definekey ("personID", "attribute1") ;
h.definedata (all:"y") ;
h.definedone () ;
do until (z) ;
set have (rename = (qty=_q date=_d var=_v)) end = z ;
if not (h.find() or (_d < date or _d = date and _q > qty)) then continue ;
date = _d ;
qty = _q ;
var = _v ;
h.replace() ;
end ;
h.output (dataset:"want") ;
format date yymmdd10. ;
run ;
Output:
person ID attribute1 qty date var ------------------------------------------------ A m1 20 2019-02-01 5 A m2 20 2019-01-01 10 A m3 20 2019-02-01 1 B m3 20 2019-02-15 4 C m2 20 2019-01-15 6 D m1 20 2019-01-15 7
Frankly, I prefer the brevity of @FreelanceReinh's MEANS/SUMMARY offering, except for one subtlety: With the hash, bringing another satellite variable like VAR above is a simple business and it works properly. I've tried to do the same by adding the ID statement in SUMMARY; but the values of VAR it generated came out incorrectly in the sense that not all of them aligned with the corresponding output (date,qty) pairs. Hope @FreelanceReinh will chime in on that.
Kind regards
Paul D.
@hashman wrote:Frankly, I prefer the brevity of @FreelanceReinh's MEANS/SUMMARY offering, except for one subtlety: With the hash, bringing another satellite variable like VAR above is a simple business and it works properly. I've tried to do the same by adding the ID statement in SUMMARY; but the values of VAR it generated came out incorrectly in the sense that not all of them aligned with the corresponding output (date,qty) pairs. Hope @FreelanceReinh will chime in on that.
Hi Paul,
Thanks. It's true that adding the ID statement to the suggested code can bring in (unwanted) values from a different observation. What I alluded to in my post was adding more ID variables to the existing list (date qty) in the IDGRP specification, like (date qty var1 var2). In this case the results were correct at least in the tests I performed before posting my suggestion.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.