Hi Experts,
data dsn;
length string $26;
string = 'abcdefghijklmnopqrstuvwxyz';
do i = 1 to length(string);
if i =1 then do;from_string=substr(string,i-1);end;
from_string = cat(substr(string, 1, i-1), substr(string, i+1));
extract_letter = substr(string, i, 1);
output;
end;
drop i;
proc print noobs;
run;
I want output like below: I want to extract each letter from the string variable, and that letter should not appear in the from_string variable.
So what is your question? What is wrong with the code you show?
Is case of the letter to be considered? "A" is not the same as "a" for example.
Is the "letter" you want to remove duplicated in your actual values? Should the process remove just one instance of the letter? All instances? Something else?
I would probably use the COMPRESS function but that depends on the answers above. The Compress function would remove multiple instances of the letter and has options to ignore case, so "a" and "A" could both be removed.
data dsn;
length string $26;
string = 'abcdefghijklmnopqrstuvwxyz';
length from_string $ 26;
do i = 1 to length(string);
extract_letter = substr(string, i, 1);
from_string=compress(string,extract_letter);
output;
end;
drop i;
run;
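To illustrate the case point above, here is a minimal sketch of COMPRESS with and without the 'i' (ignore case) modifier. The value 'Banana' and the variable names cs and ci are made up for the example.

```sas
data _null_;
  string = 'Banana';
  /* default behaviour is case sensitive: only lowercase "a" is removed */
  cs = compress(string, 'a');
  /* the 'i' modifier ignores case: "b" also removes the uppercase "B" */
  ci = compress(string, 'b', 'i');
  put cs= ci=;   /* log shows: cs=Bnn ci=anana */
run;
```

Note that both calls remove every instance of the listed character, which is why the answer to the "duplicated letters" question matters.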
Does the code below return what you're after?
data demo;
length string from_string extract_letters $26;
string = 'abcdefghijklmnopqrstuvwxyz';
from_string='acdefghijklmnopqrstuvwxyz';
extract_letters=compress(string,strip(from_string));
run;
I guess what you want to do is march through STRING one character at a time. For each iteration, extract the corresponding character, and generate FROM_STRING as STRING minus that character.
If each character in STRING is unique, then compress is all you need:
data want (drop=_:);
string='abcdefghijklmnopqrstuvwxyz';
do _i=1 to length(string);
extract_letter=char(string,_i);
from_string=compress(string,extract_letter);
output;
end;
run;
Now if some characters appear more than once in STRING, but you still want to remove characters only one at a time, then:
data want (drop=_:);
string='abcdefghijklmnopqrstuvwxyz';
do _i=1 to length(string);
extract_letter=char(string,_i);
substr(string,_i,1)='*';
from_string=compress(string,'*');
substr(string,_i,1)=extract_letter;
output;
end;
run;
The latter code inserts an asterisk in the location of the extracted letter, followed by compressing out the asterisk in generating FROM_STRING. Then the extract_letter is restored to its original position in STRING. Of course, this assumes that an asterisk never appears in the original string.
The "trick" here is the use of the SUBSTR function on the left side (i.e. the result side) of the equals sign. It's not a usage one would assume exists.
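As a small standalone sketch of that left-side usage (the variable s and its value are made up for illustration), SUBSTR on the left of the equals sign overwrites characters in place rather than returning them:

```sas
data _null_;
  s = 'abcdef';
  /* SUBSTR as an assignment target: replace 1 character starting at position 3 */
  substr(s, 3, 1) = '*';
  put s=;   /* log shows: s=ab*def */
run;
```

The length of the variable does not change; only the characters in the addressed positions are replaced.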
Hi Mkeintz,
Thank you very much for your solution
Here is a picklist, depending on what you need.
data have;
infile datalines truncover dlm=' ';
input string:$26. from_string:$26.;
datalines;
abcdefghijklmnopqrstuvwxyz acdefghijklmnopqrstuvwxyz
abcdefghijklmnopqrstuvwxyz aCdefghijklmnopqrstuvwxyz
babdEfghijklmnopqrstuvwxyz cdefghijklMnopqrstuvwxyz
;
/* case insensitive processing */
%let ra=%sysfunc(rank(a));
%let rz=%sysfunc(rank(z));
data want;
set have;
length extract_letters extract_letters_unique $26;
/* get all letters in string that don't exist in from_string - case insensitive */
extract_letters=compress(string,strip(from_string),'i');
/* only list delta letters once - case insensitive */
length letter $1;
array charlist{&ra:&rz} $1 _temporary_;
do i=1 to length(extract_letters);
letter=substr(extract_letters,i,1);
charlist[rank(lowcase(letter))]=letter;
end;
extract_letters_unique=cats(of charlist[*]);
/* temporary arrays are retained across rows, so reset before the next observation */
call missing(of charlist[*]);
drop letter i;
run;
proc print data=want;
run;
/* case sensitive processing */
%let ra=%sysfunc(rank(a));
%let rz=%sysfunc(rank(z));
data want2;
set have;
length extract_letters extract_letters_unique $26;
/* get all letters in string that don't exist in from_string - case sensitive */
extract_letters=compress(string,strip(from_string));
/* only list delta letters once - case sensitive */
length letter $1;
array charlist{&ra:&rz,0:1} $1 _temporary_;
do i=1 to length(extract_letters);
letter=substr(extract_letters,i,1);
l_num =rank(letter);
l_case= &ra<=l_num<=&rz;
if l_case=0 then l_num= rank(lowcase(letter));
charlist[l_num,l_case]=letter;
end;
extract_letters_unique=cats(of charlist[*]);
call missing(of charlist[*]);
drop letter i l_num l_case;
run;
proc print data=want2;
run;