I am trying to add a leading and a trailing space to a character variable using the following statements.
caps=' '||caps;
caps=caps||' ';
It is adding a leading space but not the trailing space.
A character variable has a fixed defined length (set, for example, with a LENGTH statement). Any value shorter than that length is padded with blanks, so the variable always has trailing blanks unless its content fills the whole defined length.
Trailing blanks are usually not displayed, unless you force their display (e.g. by using the $CHAR format).
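A minimal sketch to make the padding visible. The variable name caps comes from the question; the length of 10 and the value 'ABC' are made up for the illustration. LENGTH() ignores trailing blanks, while LENGTHC() returns the full defined length:

```sas
data _null_;
  length caps $10;              /* hypothetical length, for illustration only */
  caps = 'ABC';                 /* stored as 'ABC' followed by 7 blanks */
  caps = caps || ' ';           /* 11 characters, cut back to 10 on assignment */
  n_nonblank = length(caps);    /* position of the last non-blank character */
  n_total    = lengthc(caps);   /* defined length, padding included */
  put n_nonblank= n_total=;
  put '[' caps $char10. ']';    /* $CHAR keeps the blanks visible in the log */
run;
```

The appended blank is not lost because concatenation fails; it simply falls outside the 10 bytes the variable can hold, and the value already ends in padding blanks.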
I have enough length in the variable. The source variable is 100 characters long, and I am using the following:
caps=put(GenericName,$upcase200.); /* Capitalise all letters and set the length to 200 */
caps = compbl(tranwrd(caps, ",", " ,")); /* Adding a space before a comma */
caps = compbl(tranwrd(caps, ".", " ."));
caps = compbl(tranwrd(caps, ";", " ;"));
caps = compbl(tranwrd(caps, "&", " + "));
*caps = compbl(tranwrd(caps, "/", " + "));
caps=' '||caps;
caps=caps||" ";
So (unless you defined the length elsewhere) your new variable caps will be padded with blanks up to the 200th byte.
What is the solution, then? I need to add a trailing space for text mining.
Since there are already trailing spaces, there is no need to add any. The number of trailing spaces is always (defined length of variable) minus (position of last non-blank character).
But my following text mining command is not considering the trailing space.
pattern_id=prxparse('/( DONEPEZIL | GALANTAMINE | RIVASTIGMINE | MEMANTINE )/i');
start=1;
stop=length(caps);
call prxnext(pattern_id, start, stop, caps, position, length);
if (position>0)then
cm_dementia=substr(caps, position, length);
You don't need the blanks in the pattern IMO. See a similar solution here: https://communities.sas.com/t5/SAS-Programming/Searching-a-string-for-a-list-of-words/td-p/212899
@bayzid wrote:
But my following text mining command is not considering the trailing space.
pattern_id=prxparse('/( DONEPEZIL | GALANTAMINE | RIVASTIGMINE | MEMANTINE )/i');
start=1;
stop=length(caps);
call prxnext(pattern_id, start, stop, caps, position, length);
if (position>0)then
cm_dementia=substr(caps, position, length);
I assume it is not matching because you told PRXNEXT() to NOT use any of the trailing spaces by setting STOP to the last non-space character in the variable.
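A small sketch of that effect, assuming PRXNEXT treats STOP as the last character it may use. The sample value and the variable name len are made up for the illustration; LENGTHC() is used to include the padding blanks in the search range:

```sas
data _null_;
  length caps $200;
  caps = ' TAKES MEMANTINE';    /* hypothetical value: keyword is the last word */
  pattern_id = prxparse('/( DONEPEZIL | GALANTAMINE | RIVASTIGMINE | MEMANTINE )/i');

  /* Search only up to the last non-blank character, as in the question */
  start = 1;
  stop  = length(caps);
  call prxnext(pattern_id, start, stop, caps, position, len);
  put 'stop=length(caps):  ' position= len=;

  /* Search the whole variable, padding blanks included */
  start = 1;
  stop  = lengthc(caps);
  call prxnext(pattern_id, start, stop, caps, position, len);
  put 'stop=lengthc(caps): ' position= len=;
run;
```

With STOP set from LENGTH(), the blank that the pattern requires after MEMANTINE lies beyond the search range, so only the second call should be able to match.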
Everything works when I add a dot after the trailing space.
caps1=' '||caps1;
caps=caps1||" .";
I think you should drop the leading and trailing blanks, and look for word boundaries instead - beginning of string or end of string is also a word boundary:
pattern_id=prxparse('/\b(DONEPEZIL|GALANTAMINE|RIVASTIGMINE|MEMANTINE)\b/i');
That will pick up the keyword both by itself and when it is joined to other keywords. For example, "DONEPEZIL" and "DONEPEZIL/OTHER". But I do not want "DONEPEZIL/OTHER" to be picked up.
@BayzidurRahman wrote:
That will pick up the keyword both by itself and when it is joined to other keywords. For example, "DONEPEZIL" and "DONEPEZIL/OTHER". But I do not want "DONEPEZIL/OTHER" to be picked up.
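Do not add the spaces to the VARIABLE. Add them to the string you pass to the FUNCTION. Remember that the position of the match is then off by one because of the extra leading space: the reported position is that of the blank before the keyword in the padded string, which is the position of the keyword's first letter in caps, and the reported length includes both blanks.
pattern_id=prxparse('/ (DONEPEZIL|GALANTAMINE|RIVASTIGMINE|MEMANTINE) /i');
start=1;
call prxnext(pattern_id, start, vlength(caps)+2, ' '||caps||' ', position, length);
/* position-1 would be 0 (invalid for SUBSTR) when the keyword starts the string */
if (position>0) then cm_dementia=substr(caps, position, length-2);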
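As a different technique from the padding approach above, the same "whole token only" requirement can be sketched with Perl lookarounds, which avoids touching the string at all. This is an illustration under assumptions, not the poster's method: the sample value is made up, and it relies on the PRX functions accepting `(?<!\S)` and `(?!\S)` (no non-blank character immediately before or after the keyword).

```sas
data _null_;
  length caps $200 cm_dementia $40;
  caps = 'DONEPEZIL/OTHER MEMANTINE';   /* hypothetical sample value */

  /* Keyword must not touch any non-blank character on either side;
     start and end of the string also satisfy the lookarounds */
  pattern_id = prxparse('/(?<!\S)(DONEPEZIL|GALANTAMINE|RIVASTIGMINE|MEMANTINE)(?!\S)/i');

  if prxmatch(pattern_id, caps) then
    cm_dementia = prxposn(pattern_id, 1, caps);   /* text of capture group 1 */

  put cm_dementia=;
run;
```

Here DONEPEZIL is followed directly by a slash, so it should be skipped, while MEMANTINE stands alone and should be captured. A keyword directly followed by punctuation would also be skipped, which fits the earlier preprocessing step that inserts a blank before commas, periods, and semicolons.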
It could also be done with a RegEx, but in your case maybe just use the scan() function with a blank defined as the word delimiter.
data demo;
  infile datalines truncover;
  input str $100.;
  length cm_dementia $40;
  /* Walk through the blank-delimited words one at a time */
  do _i=1 to countw(str,' ');
    cm_dementia=scan(str,_i,' ');
    /* Output one row per word that is exactly one of the keywords */
    if upcase(cm_dementia) in ('DONEPEZIL', 'GALANTAMINE', 'RIVASTIGMINE', 'MEMANTINE') then
      output;
  end;
datalines;
donepezil galantamine donepezil/other rivastigmine memantine
;
proc print data=demo;
run;