BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ML86
Obsidian | Level 7

Dear SAS community,

I need your help. Image I have the following column with about 120 different observations

 

Column1

A

B

C

A

D

E

F

B

C

 

I want to create a numeric variable (Column2) based on the Column1 like this

 

Column1       Column2

A                    1

B                    2

C                    3

A                    1

D                    4

E                    5

F                    6                 

B                    2

C                    3

 

How can I create the Column2 based on the values from the Column1? I want to have the same number for the same values from the character variable Column1, i.e. A gets the number 1, B the number 2, C the number 3 etc.

 

Many thanks and best regards

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Alright. Try this

 

data have;
input Column1 $;
datalines;
A
B
C
A
D
E
F
B
C
;

data want(drop = c);
   if _N_ = 1 then do;
      dcl hash h();
	  h.definekey('Column1');
	  h.definedata('Column2');
	  h.definedone();
   end;

   set have;

   if h.find() then do;
      c + 1;
	  Column2 = c;
	  h.add();
   end;
run;

 

Result:

 

Column1 Column2 
A       1 
B       2 
C       3 
A       1 
D       4 
E       5 
F       6 
B       2 
C       3 

View solution in original post

8 REPLIES 8
PeterClemmensen
Tourmaline | Level 20

Is this data representable? Is your actual data simply 'A', 'B', .. or more compex strings ?

ML86
Obsidian | Level 7

@PeterClemmensen: Unfortunately, I have complex strings like  HTALMHHGR or 2LBBUIIPN. As I said I have about 120 different observations, the mentioned two are just examples.

PeterClemmensen
Tourmaline | Level 20

Ok. Does it matter what numeric value is the first? I.e. is it important that A = 1 and not 12345 ?

ML86
Obsidian | Level 7

It is important that A gets 1 B 2 and so on (in ascending order).

PeterClemmensen
Tourmaline | Level 20

Alright. Try this

 

data have;
input Column1 $;
datalines;
A
B
C
A
D
E
F
B
C
;

data want(drop = c);
   if _N_ = 1 then do;
      dcl hash h();
	  h.definekey('Column1');
	  h.definedata('Column2');
	  h.definedone();
   end;

   set have;

   if h.find() then do;
      c + 1;
	  Column2 = c;
	  h.add();
   end;
run;

 

Result:

 

Column1 Column2 
A       1 
B       2 
C       3 
A       1 
D       4 
E       5 
F       6 
B       2 
C       3 
ML86
Obsidian | Level 7

@PeterClemmensen: It works, thank you very much.

ballardw
Super User

Another approach is a custom informat to read the data into a fixed value.

proc format;
invalue mystring (upcase)
'ABC' = 3
'PDQ' = 27
'ZZZ' = 18
other=_error_
;

data have;
   input x $;
datalines;
abc
ABC
aBc
pdQ
zzZ
row
;

data use;
   set have;
   y = input(x,mystring.);
run;

The infomat approach may be more usable if you are going to have multiple data sets with similar values that you want to always map to the same number. If your data source has entry issues like inconsistent use of letter case the informat option (Upcase) as shown converts all the values to upper case before comparing to the values in the informat and assigning values. Plus, the other=_error_ option will let you know with invalid data messages if one of the values in the data is not defined in the informat (spelling issue or new value entirely). If there are going to be many different data sets the informat may become the preferred choice in the long run for data consistency.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 939 views
  • 0 likes
  • 3 in conversation