Obsidian | Level 7

## Create a numeric varible based on a character variable

Dear SAS community,

I need your help. Image I have the following column with about 120 different observations

Column1

A

B

C

A

D

E

F

B

C

I want to create a numeric variable (Column2) based on the Column1 like this

Column1       Column2

A                    1

B                    2

C                    3

A                    1

D                    4

E                    5

F                    6

B                    2

C                    3

How can I create the Column2 based on the values from the Column1? I want to have the same number for the same values from the character variable Column1, i.e. A gets the number 1, B the number 2, C the number 3 etc.

Many thanks and best regards

1 ACCEPTED SOLUTION

Accepted Solutions
Tourmaline | Level 20

## Re: Create a numeric varible based on a character variable

Alright. Try this

``````data have;
input Column1 \$;
datalines;
A
B
C
A
D
E
F
B
C
;

data want(drop = c);
if _N_ = 1 then do;
dcl hash h();
h.definekey('Column1');
h.definedata('Column2');
h.definedone();
end;

set have;

if h.find() then do;
c + 1;
Column2 = c;
end;
run;``````

Result:

```Column1 Column2
A       1
B       2
C       3
A       1
D       4
E       5
F       6
B       2
C       3 ```
8 REPLIES 8
Tourmaline | Level 20

## Re: Create a numeric varible based on a character variable

Is this data representable? Is your actual data simply 'A', 'B', .. or more compex strings ?

Obsidian | Level 7

## Re: Create a numeric varible based on a character variable

@PeterClemmensen: Unfortunately, I have complex strings like  HTALMHHGR or 2LBBUIIPN. As I said I have about 120 different observations, the mentioned two are just examples.

Tourmaline | Level 20

## Re: Create a numeric varible based on a character variable

Ok. Does it matter what numeric value is the first? I.e. is it important that A = 1 and not 12345 ?

Obsidian | Level 7

## Re: Create a numeric varible based on a character variable

It is important that A gets 1 B 2 and so on (in ascending order).

Tourmaline | Level 20

## Re: Create a numeric varible based on a character variable

Alright. Try this

``````data have;
input Column1 \$;
datalines;
A
B
C
A
D
E
F
B
C
;

data want(drop = c);
if _N_ = 1 then do;
dcl hash h();
h.definekey('Column1');
h.definedata('Column2');
h.definedone();
end;

set have;

if h.find() then do;
c + 1;
Column2 = c;
end;
run;``````

Result:

```Column1 Column2
A       1
B       2
C       3
A       1
D       4
E       5
F       6
B       2
C       3 ```
Obsidian | Level 7

## Re: Create a numeric varible based on a character variable

@PeterClemmensen: It works, thank you very much.

Tourmaline | Level 20

Anytime 🙂

Super User

## Re: Create a numeric varible based on a character variable

Another approach is a custom informat to read the data into a fixed value.

```proc format;
invalue mystring (upcase)
'ABC' = 3
'PDQ' = 27
'ZZZ' = 18
other=_error_
;

data have;
input x \$;
datalines;
abc
ABC
aBc
pdQ
zzZ
row
;

data use;
set have;
y = input(x,mystring.);
run;```

The infomat approach may be more usable if you are going to have multiple data sets with similar values that you want to always map to the same number. If your data source has entry issues like inconsistent use of letter case the informat option (Upcase) as shown converts all the values to upper case before comparing to the values in the informat and assigning values. Plus, the other=_error_ option will let you know with invalid data messages if one of the values in the data is not defined in the informat (spelling issue or new value entirely). If there are going to be many different data sets the informat may become the preferred choice in the long run for data consistency.

Discussion stats
• 8 replies
• 609 views
• 0 likes
• 3 in conversation