Hi @Patrick, Yes, a bundle is a defined set of SKUs within the same order. To @mkeintz's question, a single SKU could be an element of more than one bundle (which makes this even more "fun"). So, in creating a reference table, I'm struggling with how to accomidate that fact. My current thought is to just define a macro group for each bundle: %let b1 = ('TGD25', 'TGD38', 'XGD02'); %let b2 = ('TGD25', 'TGD29'); ... %let b68 = ('PND56', 'PND59'); A very small sample dataset: data mydata; input order_id line_id SKU $; datalines; 12345 1234 TGD25 12345 1235 TGD38 12345 1236 CLX30 12345 1237 XGD02 12345 1238 BFD01 67891 2345 BLK89 67891 2346 TGD25 67891 2347 CLK89 67891 2348 TGD25 67891 2349 TGD29 56789 4512 TGD38 56789 4513 BLK78 56789 4514 TGD25 56789 4515 CLT98 56789 4516 BLH67 56789 4517 CLX89 ; run; In this sample, ---order 12345 has one bundle (as defined in &b1: 'TGD25', 'TGD38', 'XGD02') ---order 67891 has one bundle (as defined in &b2: 'TGD25', 'TGD29'), but also has an extra sku ('TGD25') that should not be counted ---order 56789 has two valid skus for &b1 ('TGD25', 'TGD38'), but does not contain the third and therefore has no bundle. To both your points, I think the rules are: 1. All SKUs must be on a single order 2. There can be more than one bundle per order, but each SKU still just needs a 1/0 flag 3. Partial bundles (extra SKUs not completing a bundle) should not be flagged I'd like to keep everything at the line level. The very end result, after this gets merged in with lots of other orders, is that I could sum the demand of lines that are flagged as part of a full bundle. For this sample, the desired final dataset would be: order_id line_id sku bundle 12345 1234 TGD25 1 12345 1235 TGD38 1 12345 1236 CLX30 0 12345 1237 XGD02 1 12345 1238 BFD01 0 67891 2345 BLK89 0 67891 2346 TGD25 0 67891 2347 CLK89 0 67891 2348 TGD25 1 67891 2349 TGD29 1 56789 4512 TGD38 0 56789 4513 BLK78 0 56789 4514 TGD25 0 56789 4515 CLT98 0 56789 4516 BLH67 0 56789 4517 CLX89 0 I have been trying to do a datastep that counts the number of each SKU on the order, then takes the minimum for those counts (to get how many valid bundles there are), but struggle with how to keep the unique line_ids through all that so I can go back and flag the lines. For example, with order 67891 above, if I summed by order_id, by SKU, I would have count_TGD25 = 2 and count_TGD29 = 1. If I take the min(count_TGD25, TGD29), and get 1, then I know I have one full bundle...but then how to I take that back to flag the first instance of the 'TGD25' SKU, but not the second instance?
... View more