Hello all,
I have a list of artists that is formatted like so:
['Justin Bieber']
['Brockhampton']
etc
and I want to make it so these variables no longer have quotes or brackets and instead look like:
Justin Bieber
Brockhampton
How would I do this?
Additionally, this dataset contains many other variables, but I am fine with how they are formatted.
COMPRESS()
x = compress(originalVar, "[]'");
Note that the double quotes contain both brackets and single quote.
@cfun wrote:
Hello all,
I have a list of artists that is formatted like so:
['Justin Bieber']
['Brockhampton']
etc
and I want to make it so these variables no longer have quotes or brackets and instead look like:
Justin Bieber
Brockhampton
How would I do this?
How to remove brackets and quotes from observations in a dataset.
in my data having values 22.9() and 1.9(mg/dl) like this but i want to remove () without removing this brackets 1.9mg/dl
@Sri4_ Please as this as a NEW question and don't just post into the middle of an "old" discussion that unfortunately never got closed.
You can always reference another discussion in your NEW question.
Using compress() function as suggested by Reeza may have unintended consequences. For example, if you have legitimate single quotes in a name, such as ['O'Connor'], compress will remove ' after O as well. A cleaner way would be
NAME = transtrn(NAME,"['",trimn(''));
NAME = transtrn(NAME,"']",trimn(''));
This will selectively remove substring
['
and substring
']
For more information on this technique, see Deleting a substring from a SAS string.
Hope this helps.
Couldl be simpler.
data have; input str $20.; cards; ['Justin Bieber'] ['Brockhampton'] ['O'Connor'] ; data want; set have; want=prxchange("s/^\['|'\]$//",-1,strip(str)); run;
Depends on whether those characters every appear in the actual data or not.
If not then it is pretty simple. So if your existing datasets is named HAVE and the existing variable is named VAR then you could use COMPRESS() to remove those three characters.
data want;
set have;
var = compress(var,"[']");
run;
Or you could use SCAN() .
var = scan(var,1,"[']");
If they appear in the data and the quotes are properly escaped then you might want to use SCAN() and DEQUOTE().
var = dequote(scan(var,1,"[]",'q'));
If the embedded single quotes are "escaped" with \ instead of being doubled up then use:
var = dequote(scan(tranwrd(var,"\'","''"),1,"[]",'q'));
If they are not protected in anyway then perhaps you just want to remove the first 2 and last 2 characters.
var = substrn(var,3,length(var)-4);
As a variant to the RegEx @Ksharp posted below RegEx will only remove ['....'] if it encloses the string and both parts exist.
data have;
input str $20.;
cards;
['Justin Bieber']
['Brockhampton']
['O'Connor']
['O'['Con']nor']
['O'['Connor']
['O'Con']nor']
['O'Connor
O'Connor']
O'['Con']nor
;
data want;
set have;
length new_str $20;
new_str=prxchange("s/^\['(.*)'\]/$1/oi",1,strip(str));
run;
I am regex-fanboy, but in this case i don't see the need to use a regex.
The first and last two chars should be removed, so why not just substr:
clean = substr(str, 3, lengthn(str) - 4);
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.