Hello -
I have a data set that includes a number of buildings, with multiple records for each building. I want to generate random 5-digit ID for each building, and I want the same ID to apply to each observation from a given building.
I was able to generate IDs for the buildings, like so:
if building^=lag(building) then BuildingID+1;
else BuildingID+0;
run;
However, this gives me IDs that start with 1 and progress consecutively. I really want the IDs to be randomly generated, 5-digit numbers. I can't see how to do this while maintaining the same ID across multiple records for a given building.
Any assistance is appreciated.
Something like:
data new;
set old;
bnum = put( round(100000*Ranuni(327),1),z5.);
run;
Will get you close, though you need to check to see if any duplicates are created in the BNUM.
Since I don't think an identifier variable should have arithmetic performed on it I don't think there should be a problem with a character variable.
Or if you really want a numeric, take out the put, though some of the results will be fewer than 5 digits.
Thank you for this suggestion. I tried it, and it did generate random 5 digit numbers. However, it did not maintain the same buildingID across multiple records from the same building.
Here, in essence, is what I got:
BuildingName BuildingID
FirstBldgName 23456
FirstBldgName 87567
FirstBldgName 92345
SecondBldgName 87454
SecondBLdgName 67845
ThirdBldgName 46789
Here is what I need:
BuildingName BuildingID
FirstBldgName 23456
FirstBldgName 23456
FirstBldgName 23456
SecondBldgName 87454
SecondBLdgName 87454
ThirdBldgName 46789
Any thoughts?
Thanks again.
Get the uniquie buildings and assign the number.
Then merge the data.
Note: New will be in building order when finished. If for some reason you need to keep an existing order you may need an order variable OR the create table can have an Order clause after the From to sort the data: order by var1, var2, var3 as needed.
Proc sql;
create table uniquebldg as
select distinct building, put( round(100000*Ranuni(327),1),z5.) as bnum
from old;
Create table new as
select a.*, b.bnum
from old as a left join uniquebldg as b
on a.building=b.building;
quit;
Here is one way to insure you have random list of five digit numbers from 10000 to 99999. Then to apply that to your data get(set) one from the list when FIRST.BUILDINGNAME. You may need to sort your building data first if the buildings are not grouped.
This worked perfectly. THank you.
Is it possible that the same random id can be repeated for more than one building name?
Get a distinct list of building IDs.
Generate a set of random IDs that meet your criteria.
Create a format out of that
Apply the format in dataset to get your new ID.
proc sql;
create table distinct_id as
select distinct buildingID, put( round(100000*Ranuni(327),1),z5.) as randomID from have as A;
quit;
*create your format;
*apply format;
data want;
set have;
randomID=input(buildingID, $randomID_fmt);
run;
data new;
set old;
by buildingname;
retain bnum;
if first.buildingname then bnum = put( round(100000*Ranuni(327),1),z5.);
run;
Or another function:
data x; length uuid $ 5; do i=1 to 100; uuid=compress(uuidgen(), ,'kd'); output; end; run;
Xia Keshan
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.