SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Randomly generated IDs and sequential IDs

Reply
Contributor
Posts: 22

Randomly generated IDs and sequential IDs

Hello,

 

I have a data set that looks something like this:

 

ID1

1

1

2

3

3

3

4

5

6

 

I want to create two new IDs, one would be a randomly generated ID that is one to one with ID1, and another one that is sequential, as follows:

ID1        ID2        ID3

1        456234     1

1        456234     2

2        128504     1

3        546745     1

3        546745     2

3        546745     3

4        387446     1

5        897655     1

6        342269     1

 

 

I am having problems with grouping these and the general coding as well as generating random numbers.

 

Thanks in advance! Smiley Happy

Super Contributor
Posts: 251

Re: Randomly generated IDs and sequential IDs

How about this?

data have;
infile cards;
attrib id1 length=5 label='Initial ID';
input id1;
cards;
1
1
2
3
3
3
4
5
6
;
run;

data want;
set have;
attrib id2 length=8 label='Random ID';
attrib id3 length=5 label='Sequential ID';
by id1;
retain id2 id3;
if first.id1 then do;
   id2 = int(ranuni(0) * 1e6) + 1;
   id3 = 0;
   end;
id3 + 1;
run;
Contributor
Posts: 22

Re: Randomly generated IDs and sequential IDs

[ Edited ]

Thanks LaurieF.

It somewhat works but not quite.

I get the ID2 only for the first observation in each group and the other ones are blank.

ID2 is also sometimes 5 and sometimes 6 digits, I want it to be 6 digits all the time.

 

Any help would be greatly appreciated!

 

 EDIT: I also have units where ID1 is blank, I want those blanks to each have different ID2s, not the same.

Contributor
Posts: 22

Re: Randomly generated IDs and sequential IDs

And I would also like to be able to generate the same random numbers if I run this program again... Like using a seed value or something... I just don't seem to know how to do all this in SAS.

Thanks in advance

Super Contributor
Posts: 251

Re: Randomly generated IDs and sequential IDs

You didn't say that was what you wanted! But it's easy:

   id2 = int(ranuni(0) * 9e5) + 1e5 + 1;

If you want to reproduce the results, replace 0 with any number. I often use a very old phone number.

 

This won't work consistently though: if you were to insert a new ID 3.5 (for example), the results would break down from that point. If you always want an ID to turn into the same random number, all the time, you need a hashing algorithm. I would start looking at the MD5 function.

Valued Guide
Posts: 505

Re: Randomly generated IDs and sequential IDs

Weak reversible encryption (set up for SSNs)

Using rounded random numbers
does not insure 1 to 1 or easy
reversibility and can be problematic.

If your ids are considerable larger than your
key and your ids are less than 9999999999
then this should work.
key has to be a power of 2  (shift bits)

HAVE
====

Up to 40 obs WORK.HAVE total obs=8

Obs       ID1

 1     104111111
 2     567123456
 3     311555555
 4     442819299
 5     224262144
 6     591048576
 7     661048111
 8     999999999

WANT
====

Up to 40 obs WORK.WANT total obs=8

Obs       ID1         ENCRYPT       DECRYPT

 1     104111111     5717080721    104111111
 2     567123456      867082296     56713984
 3     311555555     3065013083    311555555
 4     442819299     9370055045    442819299
 5     224262144     4416057372    224262144
 6     591048576     8693094127    591048576
 7     661048111     3682049608    661048111
 8     999999999    95520070221    999999999


SOLUTION
========

* create some data;
data have;
infile cards;
input id1;
cards;
104111111
567123456
311555555
442819299
224262144
591048576
661048111
999999999
;
run;quit;

* encrypt and decrypt;
data want;
  set have;
  encrypt=encrypt(id1);
  decrypt=decrypt(encrypt);
  put id1 encrypt decrypt;
run;quit;

options cmplib = (work.functions /* add other libs here */);

proc fcmp outlib=work.functions.hashssn;
function encrypt(ssn);
    length rev $16;
    key=8192;
    ssn_remainder = mod(ssn, key);
    ssn_int       = round((ssn - ssn_remainder)/key,1);
    * 8192  ssn_remainder = 1 and ssn_int = 1 ie 1*key + 1 = original value;
    ssn_big   = ssn_int*100000 + ssn_remainder;
    rev=reverse(putn(ssn_big,16.));
    ssn_encrypt=input(rev,16.);
  return(ssn_encrypt);
endsub;
run;quit;

proc fcmp outlib=work.functions.hashssn;
function decrypt(ssn);
    length rev $16;
    key=8192;
    rev=reverse(putn(ssn,16.));
    ssn_big=input(rev,16.);
    ssn_decrypt   = round(ssn_big/100000,1)*key + mod(ssn_big,10000);
  return(ssn_decrypt);
endsub;
;run;quit;

Valued Guide
Posts: 505

Re: Randomly generated IDs and sequential IDs

There is an issue when encrypting ssns that end in 0
Super Contributor
Posts: 251

Re: Randomly generated IDs and sequential IDs

[ Edited ]

@rogerjdeangelis I think it depends on what the specification is, and what the results are being used for, and whether it has to be reversible If it's just for a few hundred or thousand, it should be fine. But if there are more than that, that's why I suggested md5. I know it is possible to create collisions, but it's not easy (and it's good enough for SAS/DI's SCD-2 generator…)

 

But a proper spec would be nice.

Valued Guide
Posts: 505

Re: Randomly generated IDs and sequential IDs

This seems to work


* create some data;
data have;
infile cards;
input id1;
cards;
567123456
104111111
311555555
442819299
224262144
591048576
661048111
999999999
008111111
008110000
010101010
;
run;quit;

* encrypt and decrypt;
data want;
  set have;
  encrypt=encrypt(id1);
  decrypt=decrypt(encrypt);
  put id1 encrypt decrypt;
run;quit;

options cmplib = (work.functions /* add other libs here */);

proc fcmp outlib=work.functions.hashssn;
function encrypt(ssn);
    length rev $17;
    key=8192;
    ssn_remainder = mod(ssn, key);
    ssn_int       = round((ssn - ssn_remainder)/key,1);
    * 8192  ssn_remainder = 1 and ssn_int = 1 ie 1*key + 1 = original value;
    ssn_big   = ssn_int*100000 + ssn_remainder;
    rev=reverse(put(ssn_big,17.));
    if substr(rev,1,1)=0 then rev=cats('-1',substr(rev,2));
    ssn_encrypt=input(rev,17.);
  return(ssn_encrypt);
endsub;
run;quit;

proc fcmp outlib=work.functions.hashssn;
function decrypt(ssn);
    length rev $17;
    key=8192;
    rev=strip(reverse(put(ssn,17.)));
    if index(rev,'-')>0 then substr(rev,index(rev,'-')-1)='0 ';
    ssn_big=input(rev,17.);
    ssn_decrypt   = round(ssn_big/100000,1)*key + mod(ssn_big,10000);
  return(ssn_decrypt);
endsub;
;run;quit;


Up to 40 obs WORK.WANT total obs=11

Obs       ID1            ENCRYPT     DECRYPT

  1    567123456     -1867082296    567123456
  2    104111111      5717080721    104111111
  3    311555555      3065013083    311555555
  4    442819299      9370055045    442819299
  5    224262144      4416057372    224262144
  6    591048576      8693094127    591048576
  7    661048111      3682049608    661048111
  8    999999999     95520070221    999999999
  9      8111111        13010099      8111111
 10      8110000        21180989      8110000
 11     10101010       472003321     10101010



Valued Guide
Posts: 505

Re: Randomly generated IDs and sequential IDs

Strong Encryption 

sha256 is probably enough but if you are paranoid then do sha512

/* T1002220 SAS/WPS/R: De-identify PII(name) using irrevesible strong hashing

SAS/WPS/R: De-identify PII(name) using irrevesible strong hashing

Joining tables with de-identified names

sha256 and sha512 are 'theoretically' secure

You need SAS Secure for the SAS solution.
For WPS/R you don't;

HAVE Two sas datasets sashelp.class and sashelp.classfit
      need to de-identify name
=========================================================

Up to 40 obs SD1.WPSCLASS total obs=4

Obs    NAME        SEX  AGE    HEIGHT  WEIGHT

 1     Roger        F    14      62.8   102.5
 2     Janet        F    15      62.5   112.5
 3     Judy         F    14      64.3      90
 4     Robert       M    12      64.8     128

And

Up to 40 obs SD1.WPSCLASSFIT total obs=9

Obs    NAME        SEX  AGE    HEIGHT  WEIGHT   PREDICT  LOWERMEAN

 1     Louise       F    12      56.3      77     76.48    67.96
 2     James        M    12      57.3      83     80.38    72.66
 3     John         M    12        59    99.5     87.01    80.47
 4     Janet        F    15      62.5   112.5    100.66    95.22
 5     Carol        F    14      62.8   102.5    101.83    96.37
 6     Judy         F    14      64.3      90    107.68   101.84
 7     Barbara      F    13      65.3      98    111.57   105.26
 8     William      M    15      66.5     112    116.25   109.18
 9     Alfred       M    14        69   112.5    126.00   116.94

WANT ( Join the two datasets but with de-identified name )
==========================================================

Up to 40 obs SD1.WPSCLASSFIT total obs=9

Obs    NAME        SEX  AGE    HEIGHT  WEIGHT   PREDICT  LOWERMEAN

 1     Janet        F    15      62.5   112.5    100.66    95.22
 2     Judy         F    14      64.3      90    107.68   101.84


WITH DE-IDENTIFIED 'NAME' using 256bit hash
===========================================

Up to 40 obs from classwps total obs=4

Obs    NAME                                                                SEX  AGE HEIGHT WEIGHT

 1     689ffebb9149498fb8189d6e3ad1625c8bc1e1c0f0849ca40f2df83eb2930c7b     F    14   62.8 102.5
 2     909428873964ece00ba2d2fbca40035e7685a04687821c9fc7ea4cbb047b70fc     F    15   62.5 112.5
 3     f43da163b5f48c69f909facc9a5880c5666c5ccffa4837890bb4519542f03e45     F    14   64.3    90
 4     7d359106f754613d9354353f2a403b88b76fb1569ea81524f54d479f4c0d610f     M    12   64.8   128



Obs    NAME                                                               SEX  AGE    HEIGHT  WEIGHT   PREDICT  LOWERMEAN

 1     b68d4a840d7e218045d98247e00d79ed7bb1eff3b1fff117dc93db32f4699bfe    F    12      56.3      77     76.48    67.96
 2     7007b18f0d9cb87601ba50c83e761006dc185270a4cf2d55812e0863be8ff566    M    12      57.3      83     80.38    72.66
 3     632dc5c5235dc0b5b411f8f5e4a6f619ce4155ef99d3d8adfcd026288c795187    M    12        59    99.5     87.01    80.47

*4     909428873964ece00ba2d2fbca40035e7685a04687821c9fc7ea4cbb047b70fc    F    15      62.5   112.5    100.66    95.22

 5     f90982571d7ace45426530062c7008fc370be6a7faeb9ecb05733758960b0c39    F    14      62.8   102.5    101.83    96.37

*6     f43da163b5f48c69f909facc9a5880c5666c5ccffa4837890bb4519542f03e45    F    14      64.3      90    107.68   101.84

 7     6615925a48704e803ee5e32ad0209bb348698ce575effe73da268d9e606c3208    F    13      65.3      98    111.57   105.26
 8     2b008dfe3f79c53c59ab5392929c06c970d71b62c58527ef51d07a50616dbbd0    M    15      66.5     112    116.25   109.18
 9     177d26a3ce597f633b7194aee7fa886111ab1f2edfd37854e0a179ea533154c6    M    14        69   112.5    126.00   116.94

OR

Up to 40 obs WORK.NTRSEC total obs=2

Obs    NAME                                                             SEX  AGE    HEIGHT  WEIGHT   PREDICT  LOWERMEAN

 1     909428873964ece00ba2d2fbca40035e7685a04687821c9fc7ea4cbb047b70fc  F    15      62.5   112.5    100.66    95.22
 2     f43da163b5f48c69f909facc9a5880c5666c5ccffa4837890bb4519542f03e45  F    14      64.3      90    107.68   101.84


MAPPING THE HASH TO INTEGERS
============================

Up to 40 obs from CLASSWPSFIX total obs=4

     NAME
     CRYPTO    SEX   AGE  HEIGHT  WEIGHT

          1     F     14    62.8   102.5
          4     M     12    64.8     128
         *2     F     15    62.5   112.5 ** Match 1
         *3     F     14    64.3      90 ** Match 2

Up to 40 obs from CLASSWPS total obs=9

     NAME
     CRYPTO    SEX   AGE  HEIGHT  WEIGHT   PREDICT   LOWERMEAN

          5     F     12    56.3      77    76.48    67.96
          6     M     12    57.3      83    80.38    72.66
          7     M     12      59    99.5    87.01    80.47

         *2     F     15    62.5   112.5    100.6    95.22 ** Match 1

          9     F     14    62.8   102.5    101.8    96.37

         *3     F     14    64.3      90    107.6    101.8 ** Match 2

         11     F     13    65.3      98    111.5    105.2
         12     M     15    66.5     112    116.2    109.1
         13     M     14      69   112.5    126.0    116.9

AFTER MATCH ON NUM_CRYPTO

     NAME
Obs  CRYPTO    SEX  AGE   HEIGHT    WEIGHT  PREDICT   LOWERMEAN

 1        2     F    15     62.5     112.5    100.66   95.225
 2        3     F    14     64.3        90    107.68   101.84


*____    _    ____
/ ___|  / \  / ___|
\___ \ / _ \ \___ \
 ___) / ___ \ ___) |
|____/_/   \_\____/

;


libname sd1 "d:/sd1";
data sd1.sasclass;
  length name $32;
  format name $hex64.;
  set sashelp..class;
  name=sha256(name);
  if mod(_n_,4) = 0 then do;
    if _n_=4 then name=sha256('Roger');
    output;
  end;
run;quit;

data sd1.sasclassfit;
  length name $32;
  format name $hex64.;
  set sashelp.classfit;
  name=sha256(name);
  if mod(_n_,2) = 0;
run;quit;

proc sql;
  create
     table ntrsec as
  select
     l.*
    ,r.predict
    ,r.lowermean
  from
     sd1.sasclass as l, sd1.sasclassfit as r
  where
     l.name = r.name
;quit;
                                                                                                 Predicted  Bound of
                                                                                                   Value of  95% C.I.
NAME                                                              SEX   AGE    HEIGHT    WEIGHT     Weight  for Mean
--------------------------------------------------------------------------------------------------------------------
D4B7A1817B10284BBC7135509F57659210CAEB70C06F91644CDB0D6B19054C47  F      15      62.5     112.5   100.6625  95.22579
E8EE6BCC3464E592F1DD07E0EE6D3FB93EC132AA4D867AA52818A641304475F5  F      14      64.3        90   107.6807  101.8416


#! WPS-R ;

*_        ______  ____        ____
\ \      / /  _ \/ ___|      |  _ \
 \ \ /\ / /| |_) \___ \ _____| |_) |
  \ V  V / |  __/ ___) |_____|  _ <
   \_/\_/  |_|   |____/      |_| \_\

;

proc datasets lib=work kill;
run;quit;

* need at aleast one dataset in work for WPS to switch to sas7bdats
  instead of wpd datasets;

data makesas;
 set sashelp.class;
run;quit;

options validvarname=upcase;
libname sd1 "d:/sd1";
data sd1.wpsclass;
  set sashelp.class;
  if mod(_n_,4) = 0 then do;
    if _n_=4 then name='Roger';
    output;
  end;
run;quit;

data sd1.wpsclassfit;
  set sashelp.classfit;
  if mod(_n_,2) = 0;
run;quit;

* I think there is an issue with binary data and WPS import;
* so I switched to hex;

%utl_submit_wps64('
libname sd1 "d:/sd1";
options set=R_HOME "C:/Program Files/R/R-3.3.2";
libname wrk "%sysfunc(pathname(work))";
libname hlp "C:\Program Files\SASHome\SASFoundation\9.4\core\sashelp";
proc r;
submit;
source("C:/Program Files/R/R-3.3.2/etc/Rprofile.site", echo=T);
library(digest);
library(haven);
class       <- read_sas("d:/sd1/wpsclass.sas7bdat");
shaclass    <- as.character(lapply(class$NAME,function(x) paste0(digest(x, algo=c("sha256"),raw=TRUE),collapse="")));
classwps    <- cbind(name_crypto=c(1:nrow(class)),SHACLASS=shaclass,class[,-1]);
classfit    <- read_sas("d:/sd1/wpsclassfit.sas7bdat");
shaclassfit <- as.character(lapply(classfit$NAME,function(x) paste0(digest(x, algo=c("sha256"),raw=TRUE),collapse="")));
classfitwps <- cbind(name_crypto=c((nrow(class)+1):(nrow(class)+nrow(classfit))),SHACLASS=shaclassfit,classfit[,-1]);
endsubmit;
import r=classwps data=wrk.classwps;
import r=classfitwps data=wrk.classfitwps;
run;quit;
');

/*
The WPS System

  name_crypto                                                         SHACLASS SEX AGE HEIGHT WEIGHT
1           1 689ffebb9149498fb8189d6e3ad1625c8bc1e1c0f0849ca40f2df83eb2930c7b   F  14   62.8  102.5
2           2 909428873964ece00ba2d2fbca40035e7685a04687821c9fc7ea4cbb047b70fc   F  15   62.5  112.5
3           3 f43da163b5f48c69f909facc9a5880c5666c5ccffa4837890bb4519542f03e45   F  14   64.3   90.0
4           4 7d359106f754613d9354353f2a403b88b76fb1569ea81524f54d479f4c0d610f   M  12   64.8  128.0
[1]  5  6  7  8  9 10 11 12 13
  name_crypto                                                         SHACLASS SEX AGE HEIGHT WEIGHT   PREDICT LOWERMEAN
1           5 b68d4a840d7e218045d98247e00d79ed7bb1eff3b1fff117dc93db32f4699bfe   F  12   56.3   77.0  76.48849  67.96005
2           6 7007b18f0d9cb87601ba50c83e761006dc185270a4cf2d55812e0863be8ff566   M  12   57.3   83.0  80.38752  72.66709
3           7 632dc5c5235dc0b5b411f8f5e4a6f619ce4155ef99d3d8adfcd026288c795187   M  12   59.0   99.5  87.01587  80.47925
4           8 909428873964ece00ba2d2fbca40035e7685a04687821c9fc7ea4cbb047b70fc   F  15   62.5  112.5 100.66247  95.22579
5           9 f90982571d7ace45426530062c7008fc370be6a7faeb9ecb05733758960b0c39   F  14   62.8  102.5 101.83218  96.37505
6          10 f43da163b5f48c69f909facc9a5880c5666c5ccffa4837890bb4519542f03e45   F  14   64.3   90.0 107.68073 101.84160
7          11 6615925a48704e803ee5e32ad0209bb348698ce575effe73da268d9e606c3208   F  13   65.3   98.0 111.57976 105.26025
8          12 2b008dfe3f79c53c59ab5392929c06c970d71b62c58527ef51d07a50616dbbd0   M  15   66.5  112.0 116.25859 109.18222
9          13 177d26a3ce597f633b7194aee7fa886111ab1f2edfd37854e0a179ea533154c6   M  14   69.0  112.5 126.00617 116.94168
*/

proc sql;
  select
    l.*
  from
    classfitwps as l, classwps as r
  where
    l.shaclass = r.shaclass
;quit;

 SHACLASS                                                          SEX  AGE    HEIGHT    WEIGHT   PREDICT  LOWERMEAN
---------------------------------------------------------------------------------------------------------------------
 909428873964ece00ba2d2fbca40035e7685a04687821c9fc7ea4cbb047b70fc  F     15      62.5     112.5  100.6625   95.22579
 f43da163b5f48c69f909facc9a5880c5666c5ccffa4837890bb4519542f03e45  F     14      64.3        90  107.6807   101.8416

*__  __    _    ____      ___ _   _ _____
|  \/  |  / \  |  _ \    |_ _| \ | |_   _|
| |\/| | / _ \ | |_) |____| ||  \| | | |
| |  | |/ ___ \|  __/_____| || |\  | | |
|_|  |_/_/   \_\_|       |___|_| \_| |_|
;

data class_classfit;
  set classwps classfitwps;
run;quit;

/*
Up to 40 obs WORK.CLASS_CLASSFIT total obs=13

Obs     NAME_CRYPTO    SHACLASS

  1               1    689ffebb9149498fb8189d6e3ad1625c8bc1e1c0f0849ca40f2df83eb2930c7b

  2               2    909428873964ece00ba2d2fbca40035e7685a04687821c9fc7ea4cbb047b70fc ***** DUP

  3               3    f43da163b5f48c69f909facc9a5880c5666c5ccffa4837890bb4519542f03e45 ** DUP

  4               4    7d359106f754613d9354353f2a403b88b76fb1569ea81524f54d479f4c0d610f
  5               5    b68d4a840d7e218045d98247e00d79ed7bb1eff3b1fff117dc93db32f4699bfe
  6               6    7007b18f0d9cb87601ba50c83e761006dc185270a4cf2d55812e0863be8ff566
  7               7    632dc5c5235dc0b5b411f8f5e4a6f619ce4155ef99d3d8adfcd026288c795187

  8               8    909428873964ece00ba2d2fbca40035e7685a04687821c9fc7ea4cbb047b70fc ***** DUP

  9               9    f90982571d7ace45426530062c7008fc370be6a7faeb9ecb05733758960b0c39

 10              10    f43da163b5f48c69f909facc9a5880c5666c5ccffa4837890bb4519542f03e45 ** DUP

 11              11    6615925a48704e803ee5e32ad0209bb348698ce575effe73da268d9e606c3208
 12              12    2b008dfe3f79c53c59ab5392929c06c970d71b62c58527ef51d07a50616dbbd0
 13              13    177d26a3ce597f633b7194aee7fa886111ab1f2edfd37854e0a179ea533154c6
*/


* remove dups we need 1 to 1
proc sort data=class_classfit
    out=lookup(keep=name_crypto shaclass) nodupkey;
by shaclass;
run;quit;

/*
Up to 40 obs WORK.LOOKUP total obs=11

    NAME
Obs CRYPTO    SHACLASS

  1     13    177d26a3ce597f633b7194aee7fa886111ab1f2edfd37854e0a179ea533154c6
  2     12    2b008dfe3f79c53c59ab5392929c06c970d71b62c58527ef51d07a50616dbbd0
  3      7    632dc5c5235dc0b5b411f8f5e4a6f619ce4155ef99d3d8adfcd026288c795187
  4     11    6615925a48704e803ee5e32ad0209bb348698ce575effe73da268d9e606c3208
  5      1    689ffebb9149498fb8189d6e3ad1625c8bc1e1c0f0849ca40f2df83eb2930c7b
  6      6    7007b18f0d9cb87601ba50c83e761006dc185270a4cf2d55812e0863be8ff566
  7      4    7d359106f754613d9354353f2a403b88b76fb1569ea81524f54d479f4c0d610f
  8      2    909428873964ece00ba2d2fbca40035e7685a04687821c9fc7ea4cbb047b70fc
  9      5    b68d4a840d7e218045d98247e00d79ed7bb1eff3b1fff117dc93db32f4699bfe
 10      3    f43da163b5f48c69f909facc9a5880c5666c5ccffa4837890bb4519542f03e45
 11      9    f90982571d7ace45426530062c7008fc370be6a7faeb9ecb05733758960b0c39
*/

* add mapped integers to each dataset;
data _null_;
  do dsn="classfitwps","classwps";
     call symputx('dsn',dsn);
     rc=dosubl('
       proc sql;
         create
           table &dsn.fix(drop=shaclass) as
         select
           l.name_crypto
          ,r.*
         from
          lookup as l, &dsn as r
         where
          l.shaclass = r.shaclass
       ;quit;
     ');
  end;
run;quit;

* change sha256 to an unique integer for each datset ;

Up to 40 obs from CLASSWPSFIX total obs=4

     NAME
Obs  CRYPTO    SEX     AGE          HEIGHT          WEIGHT

 1        1     F       14            62.8           102.5
 2        4     M       12            64.8             128
 3        2     F       15            62.5           112.5
 4        3     F       14            64.3              90

    NAME
Obs CRYPTO   SEX  AGE HEIGHT  WEIGHT  PREDICT   LOWERMEAN

 1      5     F    12   56.3      77    76.48   67.960
 2      6     M    12   57.3      83    80.38   72.667
 3      7     M    12     59    99.5    87.01     80.4
 4      2     F    15   62.5   112.5    100.6   95.225
 5      9     F    14   62.8   102.5    101.8   96.375
 6      3     F    14   64.3      90    107.6   101.84
 7     11     F    13   65.3      98    111.5   105.26
 8     12     M    15   66.5     112    116.2   109.18
 9     13     M    14     69   112.5    126.0   116.94


proc sql;
  create
     table ntrsec as
  select
     l.*
    ,r.predict
    ,r.lowermean
  from
     classwpsfix as l, classfitwpsfix as r
  where
     l.name_crypto = r.name_crypto
;quit;


Up to 40 obs WORK.NTRSEC total obs=2

     NAME
Obs  CRYPTO    SEX  AGE   HEIGHT    WEIGHT  PREDICT   LOWERMEAN

 1        2     F    15     62.5     112.5    100.66   95.225
 2        3     F    14     64.3        90    107.68   101.84

Contributor
Posts: 22

Re: Randomly generated IDs and sequential IDs

Thank you for all the help.

I have almost figured it out but I seem to have a problem with duplicate IDs, when I do it this way... Because the random IDs are generated separately for those with blank ID1s and those with non blank ID1s, there seems to be duplicate IDs.

 

I just need to fix the problem with duplicate random P_IDs, everything else works perfectly it seems.

 

Data want;
set have;
attrib P_ID length=$10 label='Person ID';
attrib T_ID length=5 label='Transaction ID';
by ID1;
retain P_ID T_ID;
 
if first.ID1 and ID1 NE '' then do;
        P_ID = CATS ("X_", put(int(ranuni(26298)*1e6)+1,z6.) );
        T_ID=1;
        end;
 
else if ID1 = '' then do;
        P_ID = CATS ("X_", put(int(ranuni(58787)*1e6)+1,z6.) );
        T_ID=1;
        end;
 
else T_ID+1;
run;
Contributor
Posts: 22

Re: Randomly generated IDs and sequential IDs

the duplication of my P_IDs isn't only beteween those with blank ID1s and those that have IDs. There are units which have 2 different, non blank IDs and have the same P_IDs. So I guess the ranuni doesn't ensure the IDs will be unique? 

Ask a Question
Discussion stats
  • 11 replies
  • 276 views
  • 0 likes
  • 3 in conversation