BookmarkSubscribeRSS Feed
YukiJudai
Calcite | Level 5

I'm currently working on a project where i'm trying to evaluate differences between 2 tables. The id used for both tables are barcodes but for some reason they sent it in different format. 1 has it like ADC.15462ds32 and the other has the 3 letters at the back like 56sdg56348.ADC.   I'm trying to change it so both barcodes are in the same format so i can inner join the two tables.

 

so to do so i was wondering if using cat substr(a.barcode,last 3 digits,) but then i am not sure how to move the . and the 1st part around since the digets may be different.

4 REPLIES 4
Astounding
PROC Star

You might even consider using two variables instead of 1, and joining by both BARCODE3 and BARCODE8.  But to move things around:

 

barcode = catx('.', scan(barcode, 2, '.'), scan(barcode, 1, '.') );

PGStats
Opal | Level 21

If the position of the period is fixed, then it is simpler to move the prefix to the end than the other way around

 

barcode2 = catx('.', substr(barcode1,5), substr(barcode1,1,3) );

PG
Tom
Super User Tom
Super User

If the parts are separated by period (and periods don't appear in the sub-parts) then use the SCAN() function.

* When the letter part is first ;
barcode = 'ADC.15462ds32';
L_part = scan(barcode,1,'.');
N_part = scan(barcode,2,'.');

* When the letter part is last ;
barcode = '56sdg56348.ADC';
L_part = scan(barcode,2,'.');
N_part = scan(barcode,1,'.');

Then you can rebuild the code from the two parts.

new_barcode=catx('.',L_part,N_part);

If the "number" part is always longer than 3 characters and the "letter" part is always three characters then you could convert a random stream into a consistent format.

if length(scan(barcode,1,'.')) > 3 then 
  barcode=catx('.',scan(barcode,2,'.'),scan(barcode,1,'.'))
;
andreas_lds
Jade | Level 19

If you don't insist on using cat-function, try:

 

data want;
    length barcode newcode $ 20;
    
    input barcode;
    
    newcode = prxchange('s/(.+)\.(.+)/$2.$1/i', -1, trim(barcode));
    
    datalines;
ADC.15462ds32
56sdg56348.ADC
;
run;

As you can see the very same parameters for prxchange are used, regardless of the position of the three letters.