BookmarkSubscribeRSS Feed
grayab
Calcite | Level 5

Hello -

I have a data set that includes a number of buildings, with multiple records for each building.  I want to generate random 5-digit ID for each building, and I want the same ID to apply to each observation from a given building.

I was able to generate IDs for the buildings, like so:

data new; set old;

  if building^=lag(building) then BuildingID+1;

    else BuildingID+0;

run;

However, this gives me IDs that start with 1 and progress consecutively.  I really want the IDs to be randomly generated, 5-digit numbers.  I can't see how to do this while maintaining the same ID across multiple records for a given building. 

Any assistance is appreciated.

9 REPLIES 9
ballardw
Super User

Something like:

 

data new;

set old;

bnum = put( round(100000*Ranuni(327),1),z5.);

run;

Will get you close, though you need to check to see if any duplicates are created in the BNUM.

Since I don't think an identifier variable should have arithmetic performed on it I don't think there should be a problem with a character variable.

Or if you really want a numeric, take out the put, though some of the results will be fewer than 5 digits.

grayab
Calcite | Level 5

Thank you for this suggestion.  I tried it, and it did generate random 5 digit numbers. However, it did not maintain the same buildingID across multiple records from the same building.

Here, in essence, is what I got:

BuildingName              BuildingID

FirstBldgName               23456

FirstBldgName               87567

FirstBldgName               92345

SecondBldgName          87454

SecondBLdgName         67845

ThirdBldgName              46789

Here is what I need:

BuildingName              BuildingID

FirstBldgName               23456

FirstBldgName               23456

FirstBldgName               23456

SecondBldgName          87454

SecondBLdgName         87454

ThirdBldgName              46789

Any thoughts?

Thanks again.

ballardw
Super User

Get the uniquie buildings and assign the number.

Then merge the data.

Note: New will be in building order when finished. If for some reason you need to keep an existing order you may need an order variable OR the create table can have an Order clause after the From to sort the data: order by var1, var2, var3 as needed.

Proc sql;

     create table uniquebldg as

     select distinct building, put( round(100000*Ranuni(327),1),z5.) as bnum

      from old;

     Create table new as

     select a.*, b.bnum

     from old as a left join uniquebldg as b

          on a.building=b.building;

quit;


data_null__
Jade | Level 19

Here is one way to insure you have random list of five digit numbers from 10000 to 99999.  Then to apply that to your data get(set) one from the list when FIRST.BUILDINGNAME.  You may need to sort your building data first if the buildings are not grouped.

proc plan seed=32329;
  
factors randomID=90000 / noprint;
  
output out=randomID randomID nvals=(10000 to 99999);
   run;
  
quit;
  
data buildings;  
  
input buildingName :$32.;
  
cards;  
FirstBldgName             
FirstBldgName             
FirstBldgName             
SecondBldgName         
SecondBldgName         
SecondBldgName         
SecondBLdgName        
SecondBLdgName        
SecondBLdgName        
SecondBLdgName        
ThirdBldgName   
;;;;
   run;
data building2;
   set buildings;
   by buildingname notsorted;
  
if first.buildingname then set randomid;
   run;
4-30-2014 11-14-55 AM.png
grayab
Calcite | Level 5

This worked perfectly.  THank you.

ilikesas
Barite | Level 11

Is it possible that the same random id can be repeated for more than one building name?

Smitha9
Fluorite | Level 6
What is the firstbuildingname comes in the 12th observation? I want the same ID even if it comes later in the observation.
Reeza
Super User

Get a distinct list of building IDs.

Generate a set of random IDs that meet your criteria.

Create a format out of that

Apply the format in dataset to get your new ID.

proc sql;

     create table distinct_id as

select distinct buildingID, put( round(100000*Ranuni(327),1),z5.) as randomID from have as A;

quit;

*create your format;

*apply format;

data want;

set have;

randomID=input(buildingID, $randomID_fmt);

run;

Ksharp
Super User

data new;

set old;

by buildingname;

retain bnum;

if first.buildingname then bnum = put( round(100000*Ranuni(327),1),z5.);

run;

Or another function:

data x;
length uuid $ 5;
do i=1 to 100;
 uuid=compress(uuidgen(), ,'kd');
 output;
 end;
 run;

Xia Keshan

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 10542 views
  • 4 likes
  • 7 in conversation