I am trying to add a leading and a trailing space to a character variable using the following statements.
caps=' '||caps;
caps=caps||' ';
It is adding a leading space but not the trailing space.
Depending on the defined length (e.g. with a LENGTH statement) and the actual content, the variable will always have trailing blanks; strings in the variable are padded to the defined length with blanks.
Trailing blanks are usually not displayed, unless you force their display (e.g. by using the $CHAR format).
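The padding behaviour described above can be checked with a small DATA step. This is a minimal sketch: the length of 10 and the value 'ABC' are assumptions chosen for illustration, not taken from the original data.

```sas
data _null_;
  length caps $10;       /* assumed length, for illustration only */
  caps = 'ABC';          /* stored as 'ABC' followed by 7 padding blanks */
  caps = caps || ' ';    /* right-hand side is 11 characters; truncated to 10 on assignment */
  len  = length(caps);   /* position of the last non-blank character */
  vlen = vlength(caps);  /* defined length of the variable */
  trailing = vlen - len; /* number of trailing blanks */
  /* $CHAR10. writes all 10 bytes, so the padding shows up between the brackets */
  put '[' caps $char10. ']' len= vlen= trailing=;
run;
```

LENGTH should report 3 and VLENGTH 10 here, so appending one more blank changes nothing that can be observed in the variable.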
I have enough length in the variable. The source variable is 100 characters long, and I am using the following:
caps=put(GenericName,$upcase200.); /*Capitalise all letters and setting the length to 200*/
caps = compbl(tranwrd(caps, ",", " ,")); /* Adding a space before a comma */
caps = compbl(tranwrd(caps, ".", " ."));
caps = compbl(tranwrd(caps, ";", " ;"));
caps = compbl(tranwrd(caps, "&", " + "));
*caps = compbl(tranwrd(caps, "/", " + "));
caps=' '||caps;
caps=caps||" ";
So (unless you defined the length elsewhere) your new variable caps will be padded with blanks up to the 200th byte.
What is the solution, then? I need to add a trailing space for text mining.
Since there are already trailing spaces, there is no need to add any. The number of trailing spaces is always (defined length of variable) minus (position of last non-blank character).
But my following text mining command is not considering the trailing space.
pattern_id=prxparse('/( DONEPEZIL | GALANTAMINE | RIVASTIGMINE | MEMANTINE )/i');
start=1;
stop=length(caps);
call prxnext(pattern_id, start, stop, caps, position, length);
if (position>0)then
cm_dementia=substr(caps, position, length);
You don't need the blanks in the pattern IMO. See a similar solution here: https://communities.sas.com/t5/SAS-Programming/Searching-a-string-for-a-list-of-words/td-p/212899
I assume it is not matching because you told PRXNEXT() to NOT use any of the trailing spaces by setting STOP to the last non-space character in the variable.
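A small sketch of that effect, using an assumed 20-byte variable and a shortened keyword list. It runs the same pattern twice, once with STOP set by LENGTH() and once by VLENGTH(); the variable names pos_short, pos_full and so on are made up for this example.

```sas
data _null_;
  length caps $20;                 /* assumed length, for illustration only */
  caps = ' ' || 'MEMANTINE';       /* keyword is the last word; the rest is padding */
  pattern_id = prxparse('/ (DONEPEZIL|MEMANTINE) /i');

  /* STOP = LENGTH(caps): the search ends at the last non-blank character */
  start = 1;
  stop  = length(caps);
  call prxnext(pattern_id, start, stop, caps, pos_short, len_short);

  /* STOP = VLENGTH(caps): the padding blanks are part of the searched text */
  start = 1;
  stop  = vlength(caps);
  call prxnext(pattern_id, start, stop, caps, pos_full, len_full);

  put pos_short= len_short= pos_full= len_full=;
run;
```

If the explanation above holds, the first call should report no match (position 0), because the blank required after the keyword lies beyond STOP, while the second call should find the keyword.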
Everything works when I add a dot after the trailing space.
caps1=' '||caps1;
caps=caps1||" .";
I think you should drop the leading and trailing blanks, and look for word boundaries instead - beginning of string or end of string is also a word boundary:
pattern_id=prxparse('/\b(DONEPEZIL|GALANTAMINE|RIVASTIGMINE|MEMANTINE)\b/i');
That will pick up the keyword both on its own and when it is joined to another keyword, for example "DONEPEZIL" and "DONEPEZIL/OTHER". But I do not want "DONEPEZIL/OTHER" to be picked up.
Do not add the spaces to the VARIABLE. Add them to the string you pass to the FUNCTION. Remember that the position and length returned then refer to the padded string: the match includes the two delimiting spaces, so the keyword itself starts at POSITION in caps and is LENGTH-2 characters long.
pattern_id=prxparse('/ (DONEPEZIL|GALANTAMINE|RIVASTIGMINE|MEMANTINE) /i');
start=1;
call prxnext(pattern_id, start, vlength(caps)+2 , ' '||caps||' ', position, length);
if (position>0) then cm_dementia=substr(caps, position, length-2); /* drop the two delimiting spaces */
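A worked sketch of this padded-string approach on made-up input lines (the data and the variable name len are assumptions for illustration). The match found in ' '||caps||' ' begins at the delimiting blank, one position before the keyword in the padded string, which is the keyword's own position in caps. The keyword is therefore read from caps at position, with the two blanks taken off the length.

```sas
data demo;
  infile datalines truncover;
  input caps $100.;
  length cm_dementia $40;
  pattern_id = prxparse('/ (DONEPEZIL|GALANTAMINE|RIVASTIGMINE|MEMANTINE) /i');
  start = 1;
  /* search the padded expression, not the variable itself */
  call prxnext(pattern_id, start, vlength(caps)+2, ' '||caps||' ', position, len);
  /* the match includes one blank on each side of the keyword */
  if position > 0 then cm_dementia = substr(caps, position, len-2);
  datalines;
DONEPEZIL/OTHER AND MEMANTINE
DONEPEZIL
GALANTAMINE 8MG
RIVASTIGMINE/OTHER
;

proc print data=demo;
  var caps cm_dementia position len;
run;
```

In the first line the pattern should skip "DONEPEZIL/OTHER", because the keyword is followed by a slash rather than a blank, and pick up "MEMANTINE" instead. The last line should produce no match.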
It could also be done with a RegEx, but in your case maybe just use the scan() function with a blank defined as the word delimiter.
data demo;
infile datalines truncover;
input str $100.;
length cm_dementia $40;
do _i=1 to countw(str,' ');
cm_dementia=scan(str,_i,' ');
if upcase(cm_dementia) in ('DONEPEZIL', 'GALANTAMINE', 'RIVASTIGMINE', 'MEMANTINE') then
output;
end;
datalines;
donepezil galantamine donepezil/other rivastigmine memantine
;
proc print data=demo;
run;