SAS Procedures

dcruik · Posted 06-10-2015 12:53 PM

Hello,

I'm trying to figure out a technique of removing a specific character value from a string where the character value is at a specific position in the string. For example, I have a variable like the following: "N,N,N,Y". And say I want to remove the value in the second position of that string (in this case it would be the N in bold), so that the new value of the variable is now: "N,N,Y".

I've tried using the technique of combining multiple procedures (substr and tranwrd); however, I have not had any luck. Below is an example of how I used the combination:

newvariable=tranwrd(substr(oldvariable,position,1),"N,","");

Unfortunately, the tranwrd function finds all "N," and removes them so I get the new variable value equal to "Y". The similar issue I get is if the variable is equal to "N,N,N" and I want to remove the second position of that string again (in this case it would again be the N in bold), so that the new value is equal to "N,N". However, using the combination of functions previously shown, I get the value "N".

Is there a specific character function (or combination of functions) I could use that can remove/replace a value at a specific position within a string? Thank you in advance for any advice or help!

CTorres · Posted 06-10-2015 04:08 PM

My program works as you want:

data have ;

length ID position 8 Have $ 20;

input ID position have;

cards;

1000 2 N,N,Y

1001 2 Y,N,N,N,Y

1002 4 N,N,N,N,N

1003 6 N,Y,Y,N,Y,Y,N

1004 5 N,Y,N,N,N,Y,N

1005 3 Y,N,Y

;

RUN;

data want;

set have;

length Wanted $20;

do i=1 to countw(have);

if i=position then continue;

wanted=cats(wanted,scan(have,i),',');

end;

wanted=substr(wanted,1,length(wanted)-1);

drop i;

run;

CTorres

View solution in original post

Astounding · Posted 06-10-2015 02:07 PM

Here's one way (untested):

first_comma = find(old_variable, ',');

second_comma = find(old_variable, ',' , first_comma+1);

newvariable = substr(old_variable, 1, first_comma), || substr(old_variable, second_comma+1);

Steelers_In_DC · Posted 06-10-2015 02:20 PM

This suits your example but I'm not sure how dynamic your process has to be. Let me know if this helps:

data have;

infile cards dsd;

input string $;

cards;

nnny

;

run;

data want;

set have;

new_string = substr(string,1,1)||substr(string,3,2);

run;

dcruik · Posted 06-10-2015 02:32 PM

Thank you both for your inputs, I appreciate your help with this. Unfortunately, the value of the string variable could be many different values i.e. N,Y or N,N,Y or N,Y,Y,N or N,N,N,N,Y, etc...

I was hoping there was some kind of function, similar to the tranwrd, that could replace a value at a specific position within the string. So, if I needed the 3rd value removed from the last example given: "N,N,N,N,Y" (the N in bold), I could get the resulting value of the new variable equal to "N,N,N,Y". But, what is dynamic is that the position of the value I want to remove can change by observation; as well as, the value of the actual string variable I'm wanting to replace and change.

Thanks again!

Astounding · Posted 06-10-2015 02:48 PM

In that case, you might have to define the problem a bit more. For example, would POSITION change from one observation to the next? Is it a variable in the incoming data set, or a macro variable that applies to every observation? Are your character strings always one character long, so that the commas are always in the even-numbered positions?

For what it is worth, the answer I posted always removes the second piece of the string, regardless of the lengths between the commas.

dcruik · Posted 06-10-2015 03:58 PM

Sorry for being vague. The position is a variable in the data set extracted from another variable and in the string, the values are all 1 character with commas separating them. Here would be an example of the data set:

ID Position Have Wanted

1000 2 N,N,Y N,Y

1001 2 Y,N,N,N,Y Y,N,N,Y

1002 4 N,N,N,N,N N,N,N,N

1003 6 N,Y,Y,N,Y,Y,N N,Y,Y,N,Y,N

1004 5 N,Y,N,N,N,Y,N N,Y,N,N,Y,N

1005 3 Y,N,Y Y,N

CTorres · Posted 06-10-2015 04:08 PM

My program works as you want:

data have ;

length ID position 8 Have $ 20;

input ID position have;

cards;

1000 2 N,N,Y

1001 2 Y,N,N,N,Y

1002 4 N,N,N,N,N

1003 6 N,Y,Y,N,Y,Y,N

1004 5 N,Y,N,N,N,Y,N

1005 3 Y,N,Y

;

RUN;

data want;

set have;

length Wanted $20;

do i=1 to countw(have);

if i=position then continue;

wanted=cats(wanted,scan(have,i),',');

end;

wanted=substr(wanted,1,length(wanted)-1);

drop i;

run;

CTorres

dcruik · Posted 06-10-2015 04:14 PM

It does. Thank you very much for your help with this!

Astounding · Posted 06-10-2015 04:16 PM

OK, this might do the trick then.

wanted = have || ' ';

if position = 1 then wanted = substr(wanted, 3);

else wanted = substr(wanted, 1, 2*position - 3) || substr(wanted, 2*position);

The purpose of appending a blank on the first line is to guarantee that SUBSTR never tries to read past the end of the original string (such as when POSITION indicates that the last Y/N should be removed).

If the items between commas vary in length, more complex solutions are needed. (See, for example, the CTorres suggestion.)

Good luck.

CTorres · Posted 06-10-2015 03:13 PM

When you say the position in the string is dynamic I assume that this value is known in every observation.

The following program removes the text found in different positions (del variable) in the text separated by commas:

data have ;

length del 8 text $ 20;

input del text;

cards;

2 N,N,Y,Y,N,XX,G

4 YX,N,Y,N,N,Y,AA

1 N,X,T,Y,Y,A,F

3 Y,XX,N,YY,DF

3 M,N,N,Y,YY,XX,Z

8 Y,N,Y,Y,Y,Y,Y,D

;

Run;

data want;

set have;

length newtext $20;

do i=1 to countw(text);

if i=del then continue;

newtext=cats(newtext,scan(text,i),',');

end;

newtext=substr(newtext,1,length(newtext)-1);

drop i;

run;

I hope this is what you want,

CTorres

Ksharp · Posted 06-11-2015 09:23 AM

Why not LEFT substr() ?

Code: Program

data have ;
  length ID position 8 Have $ 20;
  input ID position have;
cards;
1000 2 N,N,Y  
1001 2 Y,N,N,N,Y  
1002 4 N,N,N,N,N  
1003 6 N,Y,Y,N,Y,Y,N 
1004 5 N,Y,N,N,N,Y,N 
1005 3 Y,N,Y  
;
RUN;

data want;
 set have;
 substr(have,ifn(position-1=0,1,position-1)*2,2)='';
run;

Xia Keshan

CTorres · Posted 06-11-2015 10:40 AM

Yes, It works OK but I would improve the program with a compress at the end to remove the blanks:

data want;

set have;

want=have;

substr(want,ifn(position-1=0,1,position-1)*2,2)='';

want=compress(want);

run;

CTorres

Haikuo · Posted 06-11-2015 11:11 AM

Here is one more way to skin the cat, using PRX.

data have;

length ID position 8 Have $ 20;

input ID position have;

cards;

1000 2 1,2,3

1001 3 1,2,3,4,5

1002 4 1,2,3,4,5

1003 6 N,Y,Y,N,Y,Y,N

1004 5 N,Y,N,N,N,Y,N

1005 1 Y,N,Y

;

RUN;

DATA WANT;

SET HAVE;

IF POSITION=1 THEN

WANT=SUBSTR(HAVE,3);

ELSE WANT=PRXCHANGE('s/(^\S{'||cats(position*2-3)||'})(\S{2})(.+$)/$1$3/',-1,have);

run;

SAS Procedures

Removing Character Value from String at Specific Position in String

Re: Removing Character Value from String at Specific Position in String

Re: Removing Character Value from String at Specific Position in String

Re: Removing Character Value from String at Specific Position in String

Re: Removing Character Value from String at Specific Position in String

Re: Removing Character Value from String at Specific Position in String

Re: Removing Character Value from String at Specific Position in String

Re: Removing Character Value from String at Specific Position in String

Re: Removing Character Value from String at Specific Position in String

Re: Removing Character Value from String at Specific Position in String

Re: Removing Character Value from String at Specific Position in String

Re: Removing Character Value from String at Specific Position in String

Code: Program

Re: Removing Character Value from String at Specific Position in String

Re: Removing Character Value from String at Specific Position in String

Follow Us

What is...

SAS Procedures

Code: Program

Our biggest data and AI event of the year.

SAS Training: Just a Click Away

Follow Us

What is...