If all you care about is the average record-day on GroupID level, then It seems to me the key is to figure out the total number of Effective Days for each Groupid, taking consideration of gaps, overlaps etc. The following code is supposed to have better performance by leveraging Hash table: data have; input GroupID$ record Start_Date End_Date; format Start_Date End_Date date9.; cards; A 1 1 5 A 2 1 5 B 3 1 5 B 4 6 10 C 1 1 8 C 2 1 5 C 3 5 10 C 4 14 18 ; data want; dcl hash h(ordered:'a'); h.definekey('dt'); h.definedone(); do until (last.groupid); set have; by groupid notsorted; do dt=start_date to end_date; rc=h.ref(); end; tot_rec_day+intck('day', start_date, end_date)+1; end; avg_record=tot_rec_day/h.num_items; output; tot_rec_day=.; keep groupid tot_rec_day avg_record; run; If you already have the accurate effective days for each groupid, as being presented, the following step may be skipped: tot_rec_day+intck('day', start_date, end_date)+1; instead, just go ahead use the range number came with your incoming data. Message was edited by: haikuo bian
... View more