BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
arde
Obsidian | Level 7

I have data in a variable that contains duplicates and sometimes begins or ends in a comma.  I would like to clean it up.  Any help is truly appreciated!

 

Have:

dog,dog,cat in the hat
cat in the hat,dog,mouse,mouse,
,cat in the hat,cat in the hat,cat in the hat

 

Want:

dog,cat in the hat
cat in the hat,dog,mouse
cat in the hat

 

Thank you in advance for your assistance!

 

1 ACCEPTED SOLUTION

Accepted Solutions
arde
Obsidian | Level 7

Thank you.  but that only deletes the commas in the beginning and at the end.  I figured out the deleting duplicates (Here it is in case anyone else needs it):

 

data want1;
set want;
newstring=scan(newlist, 1, ',');
do i=2 to countw(newlist,',');
word=scan(newlist, i, ',');
found=find(newstring, word, 'it');
if found=0 then newstring=catx(', ', newstring, word);
end;
run;

View solution in original post

3 REPLIES 3
Tom
Super User Tom
Super User

SCAN() the list and use CATX() to build a NEW list.

So if your dataset is named HAVE the variable is named LIST you can use a step like this to make a new dataset named HAVE with a new variable named NEWLIST.  Only add the "word" from the old list that have not already been added to the new list.

data want;
  set have;
  length newlist $300 word $100 ;
  do i=1 to countw(list,',');
    word=scan(list,i,',');
    if not findw(newlist,list,',','t') then newlist=catx(',',newlist,word);
  end;
run;
arde
Obsidian | Level 7

Thank you.  but that only deletes the commas in the beginning and at the end.  I figured out the deleting duplicates (Here it is in case anyone else needs it):

 

data want1;
set want;
newstring=scan(newlist, 1, ',');
do i=2 to countw(newlist,',');
word=scan(newlist, i, ',');
found=find(newstring, word, 'it');
if found=0 then newstring=catx(', ', newstring, word);
end;
run;

arde
Obsidian | Level 7

Here is the full program if anyone needs it:

 

data want (drop=word i);
set have;
length newlist $300 word $100 ;
do i=1 to countw(have_,',');
word=scan(have_,i,',');
if not findw(newlist,have_,',','t') then newlist=catx(',',newlist,word);
end;
run;

data want1;
set want;
newstring=scan(newlist, 1, ',');
do i=2 to countw(newlist,',');
word=scan(newlist, i, ',');
found=find(newstring, word, 'it');
if found=0 then newstring=catx(', ', newstring, word);
end;
run;

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1627 views
  • 0 likes
  • 2 in conversation