Desktop productivity for business analysts and programmers

Random select multiple groups without duplicates

Reply
Occasional Contributor
Posts: 5

Random select multiple groups without duplicates

, I want to send a survey to our customers based on a random selection, but i'am stuck.

The problem is that a customers can appear within multiple categories (like customerid 1 and 2 in the example below) what can lead to duplicates in the random selection

What i need is a example how to realise a random selection which selects random three customers a category without duplicaties.

I'm using Enterprise Guide 4.3

input example

category 1customerid
1
2
3
4
5
category 2customerid
1
2
8
7
10

output example

category 1customerid
1
2
3
category 28
9
7
Trusted Advisor
Posts: 2,114

Re: Random select multiple groups without duplicates

One approach would be:

Assign a pseudo-random number between 0 and 1 (the RANUNI function) to every record.

Sort by customerid and that number (the category will now be in a random order within customerid).

Select the first of each customerid (now have one record per customerid).

Re-sort by category and the random number (the customerids will now be in a random order within category).

Select first 3 in each category.

Doc Muhlbaier

Duke

Occasional Contributor
Posts: 5

Re: Random select multiple groups without duplicates

Thanks for your reply.

I only have one more question; how do I select the first 3 rows by each category?

Harm Klaassen

Trusted Advisor
Posts: 2,114

Re: Random select multiple groups without duplicates

Harm,

I would usually use a DATA step and a retain statement for that.  Something like

DATA want;

SELECT have;

BY category;

RETAIN counter;  DROP counter;

IF first.category THEN counter=0;

IF counter <=3 THEN OUTPUT;

RUN;

Doc

Ask a Question
Discussion stats
  • 3 replies
  • 395 views
  • 1 like
  • 2 in conversation