BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
danwarags
Obsidian | Level 7

For a given set of 2000 IDs I want to assign random numbers in the range 100000- 102000. How should I code this. For example:

 

ObsID
11456
23765
335632
489564
           .            .
           .            .
           .            .
2000580058

 

My output should look something like: 

 

ObsIDRandom
11456100000
23765101997
335632101456
489564100002
.. 
.. 
.. 
2000580058102000

 

In the above example, the number of observations in the dataset are 2000. The Random numbers are assigned such that they are in particular range i.e 100000-102000. 

 

Thanks for your help

1 ACCEPTED SOLUTION

Accepted Solutions
Rick_SAS
SAS Super FREQ

It sounds like you want to assign a sequence of 2000 numbers to a set of 2000 observations in a random fashion.  This is different than "assigning random numbers." If you assign a sequence randomly, there are no duplicate values, whereas you will almost surely get a dupicate value if you assign random numbers.

 

If there are 2000 observations, I assume you want the RANDOM variable to be values in [10000, 101999] or 

[10001, 102000].  I'll assume the latter. Try this:

 

data Have;   /* test data set */
set Sashelp.class;
run;

/* generate sequence of numbers */
data ID;
if _N_=0 then set Have nobs=N;
do i = 1 to N;
   sortOrder = rand("Uniform");
   ID = 10000 + i;   /* sequence of numbers */
   output;
end;
keep sortOrder ID;
run;

/* randomly permute the ID values */
proc sort data=ID; 
by sortOrder;
run;

/* merge with original data */
data Want;
merge Have ID(drop=sortOrder);
run;

 

View solution in original post

11 REPLIES 11
Rick_SAS
SAS Super FREQ

It sounds like you want to assign a sequence of 2000 numbers to a set of 2000 observations in a random fashion.  This is different than "assigning random numbers." If you assign a sequence randomly, there are no duplicate values, whereas you will almost surely get a dupicate value if you assign random numbers.

 

If there are 2000 observations, I assume you want the RANDOM variable to be values in [10000, 101999] or 

[10001, 102000].  I'll assume the latter. Try this:

 

data Have;   /* test data set */
set Sashelp.class;
run;

/* generate sequence of numbers */
data ID;
if _N_=0 then set Have nobs=N;
do i = 1 to N;
   sortOrder = rand("Uniform");
   ID = 10000 + i;   /* sequence of numbers */
   output;
end;
keep sortOrder ID;
run;

/* randomly permute the ID values */
proc sort data=ID; 
by sortOrder;
run;

/* merge with original data */
data Want;
merge Have ID(drop=sortOrder);
run;

 

danwarags
Obsidian | Level 7

Hello Rick,

 

Thank you so much for the code. This really helped me. The only thing I am finding hard is to run the last step (/*merge with original data*/). The error says that I need to specify BY statement. I need the ID variable and the random assigned variable in the same dataset. 

 

This is what I used to run the code. The last step gives me error saying I need to specify BY statement. How do I modify the last step. 

 

data have;
input ID;
datalines;
101
103
107
506
845
;

data ID;
if _N_=0 then set have  nobs=N;
do i = 1 to N;
   sortOrder = rand("Uniform");
   RANDOM_ID = 900000 + i;   /* sequence of numbers */
   output;
end;
keep RANDOM_ID sortOrder ;
run;

proc sort data=ID; 
by sortOrder;
run;

data Want;
merge Have ID(drop=sortOrder);
run;

 

Thanks!

Rick_SAS
SAS Super FREQ

Please post your SAS log. The code you posted runs without error on my version of SAS. The BY statement is optional for a one-to-one merge, so It doesn't make sense that you are seeing an error message.

danwarags
Obsidian | Level 7

Thanks for quick reply Rick. Please find my attached log: 

 

284 dm 'clear log'; dm 'clear output';
285
286 data have;
287 input ID;
288 datalines;

NOTE: The data set WORK.HAVE has 5 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds


294 ;
295
296 data ID;
297 if _N_=0 then set have nobs=N;
298 do i = 1 to N;
299 sortOrder = rand("Uniform");
300 RANDOM_ID = 900000 + i; /* sequence of numbers */
301 output;
302 end;
303 keep RANDOM_ID sortOrder ;
304 run;

NOTE: DATA STEP stopped due to looping.
NOTE: The data set WORK.ID has 5 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds


305
306 proc sort data=ID;
307 by sortOrder;
308 run;

NOTE: There were 5 observations read from the data set WORK.ID.
NOTE: SAS sort was used.
NOTE: The data set WORK.ID has 5 observations and 2 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds


309
310 data Want;
311 merge Have ID(drop=sortOrder);
312 run;

ERROR: No BY statement was specified for a MERGE statement.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.WANT may be incomplete. When this step was stopped there were 0
observations and 2 variables.
WARNING: Data set WORK.WANT was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.03 seconds

 

 

Thanks!

 

Rick_SAS
SAS Super FREQ

OK. Your site must be configured to run with the MERGENOBY option set to ERROR. Run the following:

proc options option=mergenoby value; run;

 

On my system the SAS log reports:

Option Value Information For SAS Option MERGENOBY
Value: NOWARN
Scope: Default
How option value set: Shipped Default

 

I'm guessing that your value is ERROR.

 

You have two options, The easiest is to change the option:

options MERGENOBY=NOWARN;

 

If you can't do that, then use

obs = _N_;

to create a common variable for both data sets and then include BY OBS; when you merge.

 

 

danwarags
Obsidian | Level 7
Woww. It worked when I set options MERGENOBY= NOWARN. Thank you Rick. Appreciate your help.
Astounding
PROC Star

Here's one way (there's usually more than one way in SAS):

 

random = 99999 + ceil( 2001 * ranuni(12345) );

 

It is possible to assign the same random number on two separate observations this way.

ballardw
Super User

If your unstated requirement is not to repeat any of the values then one approach:

1) assign a random number using Ranuni or Rand('uniform') functions

2) sort by the random number;

3) Replace the random number with 100000 + _n_;

4) You may want to resort by the Id.

 

Question: Do you want to exhaust all of the numbers from 100000 to 102000? That is 2001 numbers. So you may be looking for 100000 to 101999 (If so then 100000+_n_ -1) or 100001 to 102000 (which the above does)

KachiM
Rhodochrosite | Level 12

A simpler way with the use of Sashelp.class:

 

There are 19 observations just as your 2000. Use an array to hold the random number selected and fill the cell with say 1. Next time if the same random number is generated(check the array), take a new random number,

data want;
   array k[19] _temporary_;
   set sashelp.class;
THERE:
   id = ceil(ranuni(123) * 19);
  if k[id] then goto THERE;
  k[id] = 1;
run;
Rick_SAS
SAS Super FREQ

Unfortunately @KachiM's idea will require many iterations to completely obtain the 2000 ID values.  This is because the "Birthday Problem" says that duplicates appear often in random sequences

KachiM
Rhodochrosite | Level 12

Rick:

 

I know there will be duplicates but never realized that there would be so many duplicates.

 

Thank you for knowing it.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 7476 views
  • 12 likes
  • 5 in conversation