- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello, I need to create a new variable based on an existing one that contains a value such as: "SV01 – Study visit out of window"
I need to keep everything after the – and remove the leading space so that the new variable contains "Study visit out of window"
the lengths of the input variable are not the same.
Thanks for your help!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data test;
have="SV01 – Study visit out of window";
want=substr(have,find(have,"–")+4);
run;
data test;
have="SV01 – Study visit out of window";
want=strip(scan(have,-1,"–"));
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
2. substr everything from after that position to the end of the string.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data test;
have="SV01 – Study visit out of window";
want=substr(have,find(have,"–")+4);
run;
data test;
have="SV01 – Study visit out of window";
want=strip(scan(have,-1,"–"));
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Kindly test thoroughly and come back to us if you jump into some intricacies. Thanks
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Because SCAN uses any of the single characters in the third parameter position it may not be the best solution if either the †or " might appear anyplace other than as a pair. Example:
data test; have="SV01 – Study visit out “of window"; want=strip(scan(have,-1,"–")); run;
Which returns "of window" as the value for want since the Scan function is using a single delimiting character and reading from right to left (the -1 parameter) to get the end of the string.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This should address that problem, perhaps?
Convert the – to tab delim and the same approach
data test;
have="SV01 – Study visit out “of window";
temp=tranwrd(have,'–','09'x);
want=strip(scan(temp,-1,'09'x));
drop temp;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@novinosrin wrote:
This should address that problem, perhaps?
Convert the – to tab delim and the same approach
data test; have="SV01 – Study visit out “of window"; temp=tranwrd(have,'–','09'x); want=strip(scan(temp,-1,'09'x)); drop temp; run;
Yes, something like that or other approaches. I mentioned the issue so if the OP had run the code with SCAN and got unexpected results there would be a place to look for the cause. I will not mention how much time I spend running down similar data issues when provided data that some value "always" occurs with another specific value.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have;
var1="SV01 – Study visit out of window"; output;
var1="– SV01 Study visit out of window"; output;
var1=" SV01 Study visit out of – window"; output;
run;
data want;
Length var1 var2 $100.;
set have;
var2=strip(substr(strip(var1),findw(var1,"–")+3));
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@novinosrin way is cleaner and simpler and this is just another way using prxchange. brief explanation of prxchange below
^(.+?)? this is first and starting position and indicate that this position could be there or not
(–) -- this is second patter and has the symbol thing
(.+)$ -- this is third position after second position till the end
/$3/ -- this means replace everything with third position/pattern
data test;
have="SV01 – Study visit out of window";
want=prxchange('s/^(.+?)?(–)(.+)$/$3/', 1, have);
run;