BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
lsehlola
Calcite | Level 5

Hi All

 

I have a dataset with many records. I want to assign the same batch number to the first 5 records and add 1 to the batch number for the next 5 records and so on. But if the ID after the 5th record is the same as the 5th record's ID then put the 6th record in the same batch as the 5th record. In this example: 1 AA , 2 BB, 3CC, 4 EE and 4AE should be in the same batch 001. However, the 6th record has the same ID as the last record of the first batch hence it also falls into 001. 

 

Please see below example:

 

ID Name Batch

1  AA  001

2  BB  001

3  CC 001

4  EE 001

4  AE 001

4  BE 001

5  FF 002

5  FF 002

7 GG 002 

8 HH 002

9  II   002

22 JJ 003

23 KK 003

24 MM 003

 

Any help will really be appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

What if the above rules require 7 with the same batch?

 

If we put 6 in the first batch, do we put 4 in the second batch?

 

 

--
Paige Miller

View solution in original post

8 REPLIES 8
PaigeMiller
Diamond | Level 26

What if the above rules require 7 with the same batch?

 

If we put 6 in the first batch, do we put 4 in the second batch?

 

 

--
Paige Miller
lsehlola
Calcite | Level 5
What if the above rules require 7 with the same batch?

Then ID - 5 will be in the same batch 001. Meaning you take 8 records instead of 7.

If we put 6 in the first batch, do we put 4 in the second batch?
You put 5 in the next batch unless the 6th ID is the same as the 5th ID
PaigeMiller
Diamond | Level 26
data want;
    set have;
	retain batch 1;
	prev_id=lag(id);
	number_in_id+1;
	if number_in_id>=5 and id^=prev_id then do;
	    batch+1;
		number_in_id=0;
	end;
	drop prev_id;
run;
--
Paige Miller
RichardDeVen
Barite | Level 11

A DOW loop whose embedded SET has by-groups processed using the NOTSORTED option of the BY statement will respond to the end of a contiguous block of values.

 

Example:

 

data want;
  retain batch_num 1;

  /* batch_num # is applied to 5 rows at a time,
   * extended for each row of final id in the batch
   */

  do _n_ = 1 by 1 until (_n_ >= 5 and last.id);     
    set have;
    by id NOTSORTED;
    OUTPUT;
  end;
  batch_num + 1;
run;

 

 

mkeintz
PROC Star

@RichardDeVen 

I think you want

  ... "until (_n_ >= 5 and last.id)"

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
RichardDeVen
Barite | Level 11
Indeed!
RichardDeVen
Barite | Level 11
Why does the title say "batches of 10" and the question discusses "5 records"
Ksharp
Super User
data have;
input ID Name $;
cards;
1  AA
2  BB
3  CC
4  EE
4  AE
4  BE
5  FF
5  FF
7  GG 
8  HH 
9  II
22 JJ
23 KK
24 MM
;

data want;
 set have;
 retain batch 1;
 n+1;
 if n>5 and id ne lag(id) then do;batch+1;n=1;end;
 drop n;
run;

BTW, I like this kind of Q .

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1156 views
  • 0 likes
  • 5 in conversation