I have data in a variable that contains duplicates and sometimes begins or ends in a comma. I would like to clean it up. Any help is truly appreciated!
Have:
dog,dog,cat in the hat |
cat in the hat,dog,mouse,mouse, |
,cat in the hat,cat in the hat,cat in the hat |
Want:
dog,cat in the hat |
cat in the hat,dog,mouse |
cat in the hat |
Thank you in advance for your assistance!
Thank you. but that only deletes the commas in the beginning and at the end. I figured out the deleting duplicates (Here it is in case anyone else needs it):
data want1;
set want;
newstring=scan(newlist, 1, ',');
do i=2 to countw(newlist,',');
word=scan(newlist, i, ',');
found=find(newstring, word, 'it');
if found=0 then newstring=catx(', ', newstring, word);
end;
run;
SCAN() the list and use CATX() to build a NEW list.
So if your dataset is named HAVE the variable is named LIST you can use a step like this to make a new dataset named HAVE with a new variable named NEWLIST. Only add the "word" from the old list that have not already been added to the new list.
data want;
set have;
length newlist $300 word $100 ;
do i=1 to countw(list,',');
word=scan(list,i,',');
if not findw(newlist,list,',','t') then newlist=catx(',',newlist,word);
end;
run;
Thank you. but that only deletes the commas in the beginning and at the end. I figured out the deleting duplicates (Here it is in case anyone else needs it):
data want1;
set want;
newstring=scan(newlist, 1, ',');
do i=2 to countw(newlist,',');
word=scan(newlist, i, ',');
found=find(newstring, word, 'it');
if found=0 then newstring=catx(', ', newstring, word);
end;
run;
Here is the full program if anyone needs it:
data want (drop=word i);
set have;
length newlist $300 word $100 ;
do i=1 to countw(have_,',');
word=scan(have_,i,',');
if not findw(newlist,have_,',','t') then newlist=catx(',',newlist,word);
end;
run;
data want1;
set want;
newstring=scan(newlist, 1, ',');
do i=2 to countw(newlist,',');
word=scan(newlist, i, ',');
found=find(newstring, word, 'it');
if found=0 then newstring=catx(', ', newstring, word);
end;
run;
Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.