DATA Step, Macro, Functions and more

Removing specific characters from the beginning and end of a variable

Reply
N/A
Posts: 0

Removing specific characters from the beginning and end of a variable

I have IDs in file A as follows:
p704987-1
p789094-22
p-3455-3

In file B I have IDs as follows:
704987
789094
3455

They are same ID, but in file A, a letter has been added to the beginning as well as a (-) and number following that are not needed.

I need to remove the p at the beginning of the ID variable in file A and remove the (-) and any following numbers after the (-) for ID variable in file A as well so I can merge the data from file A with file B using the ID variable from both files, but I cannot seem to figure out the right code. Any help is appreciated. Thank you. Message was edited by: JHale
SAS Super FREQ
Posts: 8,743

Re: Removing specific characters from the beginning and end of a variable

Hi:
In file A the IDs must be character strings. What about file B? Are the IDs numeric or character variables???

If the data is exactly as you describe (that the IDs in file A always start with a 'p' or a 'p-'), then I'd probably recommend that you investigate the SCAN function. What remains to be determined is whether you will be doing a character to character comparison between the IDs in file A and file B or whether you will be doing a numeric to numeric comparison between the IDs.

cynthia
PROC Star
Posts: 7,363

Re: Removing specific characters from the beginning and end of a variable

Regular expressions would probably be the easiest way to solve the problem, but you could also probably do it with something like:

data want;
set have;
id=substr(id,1,findc(id,'-',-length(id))-1);
id=substr(id,findc(id,,'d'));
run;

HTH,
Art
Super User
Posts: 9,687

Re: Removing specific characters from the beginning and end of a variable

Hi.
Just as Cynthia said.
Use Function scan() would be more simpler.

[pre]
data temp;
input id : $20.;
id=scan(id,-2,'-pP');
put id=;
cards;
p704987-1
p789094-22
p-3455-3
run;

[/pre]


Ksharp
PROC Star
Posts: 7,363

Re: Removing specific characters from the beginning and end of a variable

As long as the prefix is always a 'p' I totally agree.

Art
SAS Super FREQ
Posts: 8,743

Re: Removing specific characters from the beginning and end of a variable

Hi:
Actually, the prefix can be either - or p or p- or -p or not there at all -- SCAN ignores leading delimiters so this data:
[pre]
p704987-1
p789094-22
p-3455-3
-1234-p44
34567p-55
-p98765-66
[/pre]

Can all use this program:
[pre]
data char_id;
length char_id $12;
infile datalines;
input char_id $;
new_char_id = scan(char_id,1,'p-');
new_num_id = input(new_char_id,best8.);
put _n_= char_id= new_char_id= new_num_id= ;
return;
datalines;
p704987-1
p789094-22
p-3455-3
-1234-p44
34567p-55
-p98765-66
;
run;
[/pre]

With these results in the log:
[pre]
_N_=1 char_id=p704987-1 new_char_id=704987 new_num_id=704987
_N_=2 char_id=p789094-22 new_char_id=789094 new_num_id=789094
_N_=3 char_id=p-3455-3 new_char_id=3455 new_num_id=3455
_N_=4 char_id=-1234-p44 new_char_id=1234 new_num_id=1234
_N_=5 char_id=34567p-55 new_char_id=34567 new_num_id=34567
_N_=6 char_id=-p98765-66 new_char_id=98765 new_num_id=98765
[/pre]

SCAN is pretty cool for a low-tech, non-PRX function.

cynthia
N/A
Posts: 0

Re: Removing specific characters from the beginning and end of a variable

Thank you so much! This worked perfectly.
Super User
Posts: 9,687

Re: Removing specific characters from the beginning and end of a variable

Hi.
I scan it from right not left,So as long as the right character is to looks like '-23' '-p23',that will be OK.

Ksharp
N/A
Posts: 0

Re: Removing specific characters from the beginning and end of a variable

Thank you to everyone for your advice and help. Much appreciation.
N/A
Posts: 0

Re: Removing specific characters from the beginning and end of a variable

Hello,

If your string begins with other letters than p maybe this solution will fit:

data test;
input x $40.;
y=compress(substr(x,1,find(trim(x),'-',-length(trim(x)))), ,'kd');
datalines;
p704987-1
p789094-22
p-03455-3
-1234-p44
34567p-55
-p98765-66
;

Marius
Ask a Question
Discussion stats
  • 9 replies
  • 240 views
  • 0 likes
  • 4 in conversation