Guys, I have an issue where I have to delete duplicate elements of a character array.
For example:
Array1: [cat dog hen cat dog] ----> [cat dog hen]
or
Array2: [cat cat cat] ----> [cat]
I want to maintain the original order. Any ideas?
Here's a really dumb idea but it should work:
do i = 1 to N;   /* for each element of the array; N is the number of elements */
   if carray[i] ~= '' then do;
      do j = i+1 to N;   /* for all following elements */
         if carray[j] = carray[i] then carray[j] = '';   /* blank out the duplicate */
      end;
   end;
end;
At the end of this loop, you have an array with holes in it, but the duplicate elements are gone.
If you want, you can squish the array so that the elements are contiguous:
pos = 1;
do i = 1 to N;
   if carray[i] ~= '' then do;
      carray[pos] = carray[i];
      if pos < i then carray[i] = '';   /* blank the slot that was just vacated */
      pos = pos + 1;
   end;
end;
Not elegant, but it should work.
Another idea: if the source is one really long string instead of an array, tranwrd() can be used to eliminate words without looping. That makes removing the duplicate words easier, but you pick up other headaches, such as keeping track of what has already been processed. I don't think it will be more efficient to execute.
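For what it's worth, here is a rough sketch of the long-string variant. Note that it does not use tranwrd(); it swaps in scan() and findw() to rebuild the string one word at a time, so it still loops. The variable names (source, result, word) and the lengths are made up for illustration.

```sas
data _null_;
   length source result $200 word $32;
   source = 'cat dog hen cat dog';
   result = ' ';
   do i = 1 to countw(source, ' ');
      word = scan(source, i, ' ');
      /* findw returns 0 when word is not yet present as a whole word in result */
      if findw(result, strip(word), ' ') = 0 then
         result = catx(' ', result, word);   /* append, keeping first-seen order */
   end;
   put result=;
run;
```

Because findw() matches whole words only, a value like cat is not confused with a longer word that merely contains it, which is one of the headaches a plain tranwrd() replacement would have to handle separately.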
What about all the ladies?
Use a loop that searches the rest of the array for the same value; if a later value matches, set it to missing.
Or use whichc to find the index of the matching element and set that element to missing.
In the end you may need to shorten your array by dropping some variables.
A solution was posted recently for a similar problem, involving airplane codes and flight paths, I think.
WhichC sounds like a good idea.
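data test;
   array c[5] $3;
   input c[*];
   put;
   put 'NOTE: Before (' (c[*]) (:$3.) ')';
   do i = 1 to dim(c);
      /* whichC returns the index of the first match, so a larger i means a duplicate */
      if i gt whichC(c[i], of c[*]) then call missing(c[i]);
   end;
   put 'NOTE- After (' (c[*]) (:$3.) ')';
   cards;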
NOTE: Before ( cat dog hen cat dog )
After ( cat dog hen )
NOTE: Before ( cat cat cat )
After ( cat )
Why not use a hash table, which has this ability?
Ksharp
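A minimal sketch of how the hash idea might look, assuming the values sit in five character variables c1-c5 on each row. The data set names have and want, the hash name seen, and the helper variable key are all made up for illustration. The hash remembers which values have already appeared in the current row, and any later repeat is blanked in place, so the original order is kept.

```sas
data have;
   infile datalines missover;
   input (c1-c5) (:$3.);
   datalines;
cat dog hen cat dog
cat cat cat
;

data want;
   set have;
   array c[5] c1-c5;
   length key $3;
   if _n_ = 1 then do;
      declare hash seen();          /* lookup table of values already seen */
      seen.defineKey('key');
      seen.defineDone();
   end;
   seen.clear();                    /* start every row with an empty table */
   do i = 1 to dim(c);
      key = c[i];
      if missing(key) then continue;
      if seen.check() = 0 then call missing(c[i]);   /* seen before: blank it */
      else seen.add();                               /* first time: remember it */
   end;
   drop i key;
run;
```

Like the nested loop, this leaves holes where the duplicates were, so the same squish step applies if contiguous values are needed. The lookup cost per element does not grow with the array size, which only starts to matter for much wider arrays than five elements.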
Thanks, everyone, for your help. I ended up using D-Ling's suggestion of a nested DO loop to check all values on each row and replace duplicates with a missing value. One thing I did notice (probably because I'm new to this) was that I had to use ' ' instead of . to replace a value with a missing value, or else the column width became extremely large.
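On the ' ' versus . point, one way to sidestep the choice is call missing(), which sets a character variable to blank and a numeric variable to . without you having to pick. A small sketch, using a made-up three-element array:

```sas
data _null_;
   array c[3] $3 ('cat' 'cat' 'hen');
   call missing(c[2]);   /* character element becomes blank */
   c[3] = ' ';           /* same effect, written as a blank literal */
   put (c[*]) (=);
run;
```

Either form keeps the variable at its declared length of 3.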