BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
kbk
Fluorite | Level 6 kbk
Fluorite | Level 6

Hi All, I've created the following macro to generate random IDs/passwords. It essentially constructs a data step with a bunch of output statements. It allows the user to specify the allowed characters, number of passwords and password length. Also included is a check to ensure duplicate IDs/passwords are not included.


My question: Is there a more efficient way to accomplish this? Tweaks to my code or a fundamentally different approach? This macro bogs down when asked to generate 10000 or more IDs/passwords. I originally tried to generate a DATALINES statement with the macro (not allowed) and also tried PROC SQL INSERT, which works but isn't any faster.


*characters to place in the ID, space delimited;

%let allowed = 1 2 3 4 5 6 7 8 9 a b c d e f h i j k m n p r s t u v w x y z;

%let numIDs = 100;    *number of IDs to create;

%let idLen = 6;            *number of characters per ID;

%macro ran6;

     /*delimeters for concatenating macro variables*/

     %let delim_nospace = %str();

     %let delim_space = %str( );

     %let numChars = %length(%sysfunc(compress(&allowed,' '))); *num of total characters available;

     *%put numchars: &numChars;

     %let used = ; *generate IDs go here to check for duplicates;

  

     *output dataset;

  data id_output;

  call streaminit(23);  *set random seed;

  length userID $ &idLen.;

     *this loop controls the number of IDs to generate;

     %do i = 1 %to &numIDs;

     *this loop controls the length of an ID;

     %do j = 1 %to &idLen;

          *follow http://blogs.sas.com/content/iml/2011/08/24/how-to-generate-random-numbers-in-sas/;

          %let u = %sysfunc( rand(UNIFORM) ); *u is random decimal;

          /*sysevalf works because https://communities.sas.com/message/51523#51523*/

          %let x = %sysevalf(1 + (&numChars - 1 ) * &u.);



           *set random into with upper bound eq to number of characters;

          %let k = %sysevalf(&numChars * &u , ceil) ;

           *pull a letter fro the allowed list;

          %let IDPart = %scan(&allowed.,&k, ' ');

          /*%put u: &u  x: &x k: &k IDPart: &IDPart;*/

          %if &j = 1 %then %let userID = &IDPart;

          /*concatenate macro variables, see https://communities.sas.com/message/180272 */

          %else %let userID = %sysfunc(catx(&delim_nospace.,%nrstr(%superq(userID)),%nrstr(%superq(IDPart))));

     %end;

        

     /*if duplicate found, do not increment i*/

     %if %index(&used, &userID) %then %let i = %eval(&i -1);

     /*no duplicat found, write output statement*/

     %else %do;

           userID = "&userID";

           output;

     %end;

     /*add mewest ID to macro variable for duplicate checking*/

     %let used = %sysfunc(catx(&delim_space,%nrstr(%superq(used)),%nrstr(%superq(userID))));

     /*%put used: &used;*/

     /*%put UserID: &userID;*/

     %end;

run;

%mend;

%ran6;

1 ACCEPTED SOLUTION

Accepted Solutions
data_null__
Jade | Level 19

PROC PLAN has some features that make this pretty easy.

%let allowed = '1' '2' '3' '4' '5' '6' '7' '8' '9' 'a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j' 'k' 'l' 'm' 'n' 'o' 'p' 'q' 'r' 's' 't' 'u' 'v' 'w' 'x' 'y' 'z';
%let n = %sysfunc(countw(%superq(allowed)));
%put NOTE: &=n;
%let numIDs = 100;    *number of IDs to create;
%let idLen = 6;   

proc plan;
  
factors rep=&numids ordered v=6 of &n random / noprint;
  
output out=ids v cvals=(&allowed);
   run;
data words;
   do until(last.rep);
      set ids;
      by rep;
      length word $&idlen;
      word = cats(word,v);
     
end;
  
drop v;
   run;
proc sort data=words nodupkey;
  
by word;
   run;
proc sort data=words;
   by rep;
   run;
proc print;
  
run;

1-29-2015 2-36-12 PM.png

View solution in original post

8 REPLIES 8
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Its possible your over complicating things with the macro code:

data temp;

  allowed="123456789abcdefghijklmnopqrstuvwxyz";

  num_ids=100;

  idlen=6;

  length my_id $6;

  num_allowed=length(strip(allowed));

  do i=1 to num_ids;

    do j=1 to 6;

      temp=ceil((num_allowed * rand("uniform")));

      substr(my_id,j,1)=substr(allowed,temp,1);

    end;

    output;

  end;

run;

kbk
Fluorite | Level 6 kbk
Fluorite | Level 6

Hi RW9,

Much more elegant solution. The one item missing though is a guard against duplicate IDs. I realize one could generate extra IDs then dedup and reduce to the number needed but I'm trying to avoid that.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, if its a hundred then you could apply them into an array rather than outputting, then keep looping over until the generated text is not in the array.  Am leaving for the day, but can update the code tomorrow to this effect e.g. array the_ids{num_ids}...

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Just to finish off from where I was yesterday, this should give unique results:

data temp (keep=id_out);

  allowed="123456789abcdefghijklmnopqrstuvwxyz";

  num_ids=100;

  idlen=6;

  array done{100} $200.;

  length my_id $6;

  num_allowed=length(strip(allowed));

  i=1;

  do until (i=num_ids+1);

    do j=1 to 6;

      temp=ceil((num_allowed * rand("uniform")));

      substr(my_id,j,1)=substr(allowed,temp,1);

    end;

    if whichc(my_id,of done{*})=0 then do;

      done{i}=my_id;

      i=i+1;

      id_out=my_id;

      output;

    end;

  end;

run;

ballardw
Super User

Your macro bogs down because you are generating thousands of lines of macro code. You have basically created a program that does this:

data want;

      assign value; output;

      assign value; output;

      assign value; output;

     assign value; output;

     /* repeat*/

run;

where each of those assigns calls a lot of functions

A hint: if you are using that many calls to %sysfunc you might be looking at more of a datastep solution:

%macro ran6(numids= 10, idlen=6, allowed=123456789abcdefhijkmnprstuvwxyz);
  data id_output;
  call streaminit(23);  *set random seed;
  length userID $ &idLen.;
     numchars=length("&allowed");
     *this loop controls the number of IDs to generate;
     do i = 1 to &numIDs;
     *this loop controls the length of an ID;
        do j = 1 to &idLen;
             u =  rand('UNIFORM');
             x = 1 + (numChars - 1 ) * u;
              *set random into with upper bound eq to number of characters;
             k = ceil(numChars * u) ;
   
              *pull a letter fro the allowed list;
             IDPart = substr("&allowed",k,1);
   
             if j = 1 then userID = IDPart;
             /*concatenate macro variables, see https://communities.sas.com/message/180272 */
             else userID = cats(userid,idpart);

       end;/*j*/
    output;
    end; /*i*/
run;
/*remove duplicates */
proc sql noprint;
   create table UserIds as
   select distinct userid
   from id_output;
quit;

%mend;

%ran6;

Change the values after the parameters to customize output for a run of the macro:

%ran6(numid=20); to use the default character list and length.

Also, when writing macros it is not a good idea to use the

*text ; style of comment as they are generated in the macro results. Repeatedly.

Use either %* text ; or /* text */

And for the worry about duplicates, generate more than you "need" as the time spent trying to write extra code probably isn't worth much. And you need to deduplicate across multiple uses of the macro anyway. Hint: sql except.

data_null__
Jade | Level 19

PROC PLAN has some features that make this pretty easy.

%let allowed = '1' '2' '3' '4' '5' '6' '7' '8' '9' 'a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j' 'k' 'l' 'm' 'n' 'o' 'p' 'q' 'r' 's' 't' 'u' 'v' 'w' 'x' 'y' 'z';
%let n = %sysfunc(countw(%superq(allowed)));
%put NOTE: &=n;
%let numIDs = 100;    *number of IDs to create;
%let idLen = 6;   

proc plan;
  
factors rep=&numids ordered v=6 of &n random / noprint;
  
output out=ids v cvals=(&allowed);
   run;
data words;
   do until(last.rep);
      set ids;
      by rep;
      length word $&idlen;
      word = cats(word,v);
     
end;
  
drop v;
   run;
proc sort data=words nodupkey;
  
by word;
   run;
proc sort data=words;
   by rep;
   run;
proc print;
  
run;

1-29-2015 2-36-12 PM.png
jakarman
Barite | Level 11

kbk do not make it too complicated when needing performance. A datastep executes fast
As it is more easy to solve the deduplication afterwards you gains in performance are better.

I you know hashing you could use that also. The overhead will be that check (in memory) and the memory constraint.  

---->-- ja karman --<-----
Ksharp
Super User

Check UUIDGEN()

%let id=%sysfunc(uuidgen());

%put &id ;

Xia Keshan

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 6603 views
  • 6 likes
  • 6 in conversation