pavank
Quartz | Level 8

Hi Experts,

 

data dsn;
    length string $26;
    string = 'abcdefghijklmnopqrstuvwxyz';
    do i = 1 to length(string);
        if i = 1 then do; from_string = substr(string, i-1); end;
        from_string = cat(substr(string, 1, i-1), substr(string, i+1));
        extract_letter = substr(string, i, 1);
        output;
    end;
    drop i;
proc print noobs;
run;

I want output like the image below: on each row, extract one letter from the string variable, and that letter should not appear in the from_string variable.

 

[Image: pavank_0-1711627911973.png (desired output)]

 

 

 

Accepted Solutions
ballardw
Super User

Does the case of the letter matter? "A" is not the same as "a", for example.

Can the "letter" you want to remove appear more than once in your actual values? If so, should the process remove just one instance of the letter, all instances, or something else?

I would probably use the COMPRESS function, but that depends on the answers above. COMPRESS removes every instance of the letter, and its 'i' modifier ignores case, so "a" and "A" could both be removed.

data dsn;
    length string from_string $26 extract_letter $1;
    string = 'abcdefghijklmnopqrstuvwxyz';
    do i = 1 to length(string);
        /* the letter at position i */
        extract_letter = substr(string, i, 1);
        /* string with every occurrence of that letter removed */
        from_string = compress(string, extract_letter);
        output;
    end;
    drop i;
run;
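
To make the two questions above concrete, here is a minimal sketch of how COMPRESS treats repeated letters and case. The value 'Banana' and the variable names all_a and no_case are made up for illustration; they are not from the original post.

```sas
data _null_;
    length string $6;
    string = 'Banana';
    /* every 'a' is removed, not just the first one */
    all_a = compress(string, 'a');
    /* the 'i' modifier ignores case, so 'b' also removes 'B' */
    no_case = compress(string, 'b', 'i');
    put all_a= no_case=;   /* all_a=Bnn no_case=anana */
run;
```

If only one occurrence should go, COMPRESS alone is not enough and the position of the letter has to be used instead.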


7 REPLIES
PaigeMiller
Diamond | Level 26

So what is your question? What is wrong with the code you show?

--
Paige Miller
Patrick
Opal | Level 21

Does the code below return what you're after?

data demo;
  length string from_string extract_letters $26;
  string = 'abcdefghijklmnopqrstuvwxyz';
  from_string='acdefghijklmnopqrstuvwxyz';
  extract_letters=compress(string,strip(from_string));
run;
mkeintz
PROC Star

I guess what you want to do is march through STRING one character at a time.  For each iteration, extract the corresponding character, and generate FROM_STRING as STRING minus that character. 

 

If each character in STRING is unique, then the COMPRESS function is all you need:

 

data want (drop=_:);
  string='abcdefghijklmnopqrstuvwxyz';
  do _i=1 to length(string);
    extract_letter=char(string,_i);
    from_string=compress(string,extract_letter);
    output;
  end;
run;

Now if some characters appear more than once in STRING, but you still want to remove characters only one at a time, then:

 

data want (drop=_:);
  string='abcdefghijklmnopqrstuvwxyz';

  do _i=1 to length(string);
    extract_letter=char(string,_i);
    substr(string,_i,1)='*';
    from_string=compress(string,'*');
    substr(string,_i,1)=extract_letter;
    output;
  end;
run;

The latter code inserts an asterisk in the location of the extracted letter, followed by compressing out the asterisk in generating FROM_STRING.  Then the extract_letter is restored to its original position in STRING.  Of course, this assumes that an asterisk never appears in the original string.
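
To see what the placeholder step does when a letter repeats, here is a single-iteration sketch. The value 'banana' and position 2 are chosen only for illustration; the original code uses the 26-letter alphabet.

```sas
data _null_;
    length string from_string $6;
    string = 'banana';
    extract_letter = char(string, 2);       /* 'a' */
    substr(string, 2, 1) = '*';             /* string is now 'b*nana' */
    from_string = compress(string, '*');    /* only the marked 'a' is gone */
    substr(string, 2, 1) = extract_letter;  /* string is 'banana' again */
    put extract_letter= from_string= string=;
    /* extract_letter=a from_string=bnana string=banana */
run;
```

A plain compress(string, 'a') would have returned 'bnn' here, which is why the placeholder is needed when letters repeat.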

 

The "trick" here is the use of the SUBSTR function on the left side (i.e. the result side) of the equals sign, where it replaces characters in place instead of extracting them. It's not a usage one would assume exists.
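
If an asterisk (or any other placeholder) could occur in the real data, one possible placeholder-free variant is to glue together the text before and after position _i. This is only a sketch and not part of the solution above: it swaps in the SUBSTRN function, which, unlike SUBSTR, accepts a zero length and a start position past the end of the string without raising a note. The data set name want_substrn and the value 'banana' are hypothetical.

```sas
data want_substrn (drop=_:);
    length string from_string $26 extract_letter $1;
    string = 'banana';
    do _i = 1 to length(string);
        extract_letter = char(string, _i);
        /* text before position _i, followed by text after it */
        from_string = cat(substrn(string, 1, _i - 1),
                          substrn(string, _i + 1));
        output;
    end;
run;
```

Each output row should then hold STRING with exactly one character dropped, even when that character also occurs elsewhere in the value.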

pavank
Quartz | Level 8

Hi Mkeintz,

Thank you very much for your solution 

Patrick
Opal | Level 21

Here is a pick list, depending on what you need.

data have;
  infile datalines truncover dlm=' ';
  input string:$26. from_string:$26.;
  datalines;
abcdefghijklmnopqrstuvwxyz acdefghijklmnopqrstuvwxyz
abcdefghijklmnopqrstuvwxyz aCdefghijklmnopqrstuvwxyz
babdEfghijklmnopqrstuvwxyz cdefghijklMnopqrstuvwxyz
;


/* case insensitive processing */
%let ra=%sysfunc(rank(a));
%let rz=%sysfunc(rank(z));

data want;
  set have;
  length extract_letters extract_letters_unique $26;

  /* get all letters in string that don't exist in from_string - case insensitive */
  extract_letters=compress(string,strip(from_string),'i');

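  /* only list delta letters once - case insensitive */
  length letter $1;
  array charlist{&ra:&rz} $1 _temporary_;
  /* LENGTHN returns 0 for a blank value, so the loop is skipped */
  do i=1 to lengthn(extract_letters);
    letter=substr(extract_letters,i,1);
    charlist[rank(lowcase(letter))]=letter;
  end;
  extract_letters_unique=cats(of charlist[*]);
  /* temporary arrays are retained across rows, so reset for the next one */
  call missing(of charlist[*]);
  drop letter i;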
run;

proc print data=want;
run;


/* case sensitive processing */
%let ra=%sysfunc(rank(a));
%let rz=%sysfunc(rank(z));

data want2;
  set have;
  length extract_letters extract_letters_unique $26;

  /* get all letters in string that don't exist in from_string - case sensitive */
  extract_letters=compress(string,strip(from_string));

  /* only list delta letters once - case sensitive */
  length letter $1;
  array charlist{&ra:&rz,0:1} $1 _temporary_;
  do i=1 to lengthn(extract_letters);
    letter=substr(extract_letters,i,1);
    l_num =rank(letter);
    l_case= &ra<=l_num<=&rz;
    if l_case=0 then l_num= rank(lowcase(letter));
    charlist[l_num,l_case]=letter;
  end;
  extract_letters_unique=cats(of charlist[*]);
  call missing(of charlist[*]);
  drop letter i l_num l_case;
run;

proc print data=want2;
run;

Patrick_0-1711689135985.png

 

 

pavank
Quartz | Level 8
Hi Patrick,
Thank you very much for your detailed solution.
