DATA Step, Macro, Functions and more

Converting a character including number to unique number

Reply
Occasional Contributor
Posts: 16

Converting a character including number to unique number

[ Edited ]

Hi experts,

 

I have a party_number varible which may contain records like these: A12345 , 123E45 , 67588H.

 

I want to change the character letters with their alphabetic order numbers.

For example A is the first character of the alphabet and I want to convert A12345 to 112345.

 

123E45  ==> 123545

67588H ==> 675888

 

How can I do this with SAS coding?

 

 

Many thanks,

Onur

Super User
Super User
Posts: 9,456

Re: Converting a character including number to unique number

You can use the rank() procedure:

http://documentation.sas.com/?docsetId=lefunctionsref&docsetTarget=p0m5k2s76pmv9pn1n1lu3vfyq8s4.htm&...

 

This will return the number of the character in its position in the character table, so A is 65, you can then just remove 64 to get 1, B would be 66, so -64 = 2 etc.

PROC Star
Posts: 1,218

Re: Converting a character including number to unique number

There are several ways to do this, though @RW9s solution is probably the best.

 

But you say you want to make the number unique? What if you have two occurrences of the value A12345? Then they will both be converted to 112345 and will not be unique?

Occasional Contributor
Posts: 16

Re: Converting a character including number to unique number

Thank you very much for your answer.

Actually these are passport numbers and I think there are no passport numbers with exactly the same char-number combination.

 

rank function can be used but char values may be in the beggining, middle etc.. how I take the char value, convert to number and put it to record?

 

Many thanks

Super User
Super User
Posts: 9,456

Re: Converting a character including number to unique number

A string is just an array of characters, so:

data want;
  infile datalines;
  length pp_no new_ppno $20;
  input ppno $;
  do i=1 to lengthn(ppno);
    if anyalpha(char(ppno,i)) then substr(new_ppno,i,1)=put(rank(char(ppno,i))-64,1.);
    else substr(new_ppno,i,1)=char(ppno,i);
  end;
datalines;
A1234
23F56
12345
;
run; 
Respected Advisor
Posts: 4,702

Re: Converting a character including number to unique number


@dincooo wrote:

Thank you very much for your answer.

Actually these are passport numbers and I think there are no passport numbers with exactly the same char-number combination.

 


 

Following your logic consider below case. Is that what you want?

Original Passport Number Converted Passport Number
A2345 12345
12C45 12345

 

Occasional Contributor
Posts: 16

Re: Converting a character including number to unique number

No it is a case which i dont want. So i missed that point. Do you have any idea how can make these passport numbers as unique number?

Super User
Super User
Posts: 9,456

Re: Converting a character including number to unique number

Sort a distinct list of passports, then assign a unique number:

proc sort data=have out=uniques nodupkey;
  by passport_no;
run;

data uniques (keep=passport_no passport_unique);
  set uniques;
  passport_unique=_n_;
run;

data want;
  merge have uniques;
  by passport_no;
run;
Super User
Posts: 9,940

Re: Converting a character including number to unique number

It depends on what you want to do. As a unique key, your current passport numbers are already perfect.

And keys are stored as strings, not as numbers.

Numbers are only needed for calculations, and I don't see any useful calculation done with passport numbers.

 

One way to deal with both of the problems I mentioned:

Start your number with the position of the character in the string, and code the character into two digits (01 go 26).

Another: extract the PP numbers to a new dataset and assign them a sequential number (_n_). Then use that lookup dataset to recode.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Respected Advisor
Posts: 4,702

Re: Converting a character including number to unique number


@dincooo wrote:

No it is a case which i dont want. So i missed that point. Do you have any idea how can make these passport numbers as unique number?


 

You could use the passport numbers alphanumeric directly as keys. I don't see any reason why they would need to be numeric.

But you can of course also generate an additional key.

 

One way to go is to just sort by passport number and then count unique groups. 

data want;
  set have;
  by passport_no;
  if first.passport_no then passport_id+1;
run;

This approach is totally fine if that's an adhoc task. If you need to implement some sort of ETL process where you might want to load additional passport records later on, then you would of course need some lookup process to only create additional passport id's if they don't exist already.

 

Alternatively: Generate a hash key. Functions like MD5() or SHA() can be used for this. These functions will always return the same output value for the same input value.

data test;
  passport_no='A1234';
  passport_id=put(md5(pp_no),hex32.);
run;

 

Super User
Posts: 9,940

Re: Converting a character including number to unique number

What if you have these passport numbers:

A54321
15432A

?

And what wiIl you do with characters K and following?

 

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Occasional Contributor
Posts: 16

Re: Converting a character including number to unique number

Posted in reply to KurtBremser

I missed that point you're right..

Do you have any other suggestion how can i make passport numbers unique?

 

All suggestions are more than welcome Smiley Happy

Super User
Posts: 13,358

Re: Converting a character including number to unique number


@dincooo wrote:

Hi experts,

 

I have a party_number varible which may contain records like these: A12345 , 123E45 , 67588H.

 

I want to change the character letters with their alphabetic order numbers.

For example A is the first character of the alphabet and I want to convert A12345 to 112345.

 

123E45  ==> 123545

67588H ==> 675888

 

How can I do this with SAS coding?

 

 

Many thanks,

Onur


What is length assigned to the variable? What is longest (character count) of values in the current variable? I ask because if your variable has a length of 6, as implied by your example data and you have one or more letters J or later in the alphabet then you will be replacing one character with two for each letter. And the result may not fit your original variable and would require either a new variable or additional steps to change the length of your current variable.

Here is a very basic example:

data example;
   x='J'; /* x has length of 1*/
   x='10'; /* attempt to assign 10 instead of J*/
   put x=; /* x has value of '1' in the data set*/
run;

If your value currently has 6 characters and may contain all 6 as letters then you need something that will hold at least 12 characters as the result.

 

Ask a Question
Discussion stats
  • 12 replies
  • 117 views
  • 1 like
  • 6 in conversation