Hi All
I have a dataset with many records. I want to assign the same batch number to the first 5 records and add 1 to the batch number for the next 5 records and so on. But if the ID after the 5th record is the same as the 5th record's ID then put the 6th record in the same batch as the 5th record. In this example: 1 AA , 2 BB, 3CC, 4 EE and 4AE should be in the same batch 001. However, the 6th record has the same ID as the last record of the first batch hence it also falls into 001.
Please see below example:
ID Name Batch
1 AA 001
2 BB 001
3 CC 001
4 EE 001
4 AE 001
4 BE 001
5 FF 002
5 FF 002
7 GG 002
8 HH 002
9 II 002
22 JJ 003
23 KK 003
24 MM 003
Any help will really be appreciated.
What if the above rules require 7 with the same batch?
If we put 6 in the first batch, do we put 4 in the second batch?
What if the above rules require 7 with the same batch?
If we put 6 in the first batch, do we put 4 in the second batch?
data want;
set have;
retain batch 1;
prev_id=lag(id);
number_in_id+1;
if number_in_id>=5 and id^=prev_id then do;
batch+1;
number_in_id=0;
end;
drop prev_id;
run;
A DOW loop whose embedded SET
has by-groups processed using the NOTSORTED
option of the BY
statement will respond to the end of a contiguous block of values.
Example:
data want;
retain batch_num 1;
/* batch_num # is applied to 5 rows at a time,
* extended for each row of final id in the batch
*/
do _n_ = 1 by 1 until (_n_ >= 5 and last.id);
set have;
by id NOTSORTED;
OUTPUT;
end;
batch_num + 1;
run;
I think you want
... "until (_n_ >= 5 and last.id)"
data have;
input ID Name $;
cards;
1 AA
2 BB
3 CC
4 EE
4 AE
4 BE
5 FF
5 FF
7 GG
8 HH
9 II
22 JJ
23 KK
24 MM
;
data want;
set have;
retain batch 1;
n+1;
if n>5 and id ne lag(id) then do;batch+1;n=1;end;
drop n;
run;
BTW, I like this kind of Q .
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.