BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Neptun83
Calcite | Level 5

I have a column in which I want to check the similarity of the values. I want to delete those values whose first 10 characters are the same as the first 10 characters of some other value.
I thought I could solve it with two nested loops and a Compare function. But unfortunately it doesn't work.
Could someone help me?

1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14

You should post sample data as a data step, like this:

Data have;
  infile cards truncover;
  input txt $20.;
cards;
1234567890a  
1234567890b
z1234567890a  
z1234567890a432
;run;

One possible solution is to sort and compare:

proc sort data=have;
  by txt;
run;

data want;
  set have;
  if lag(substr(txt,1,10)) ne substr(txt,1,10);
run;

View solution in original post

5 REPLIES 5
ballardw
Super User

Are these "values" on the same observation? Nested do loops imply data step code and such a data step basically only wants to work on one observation at a time.

 

Provide example data in the form of a working data step and the expected result. Paste the code for the data step into a text box opened with the </> icon as the main message window reformats pasted text and the result may not actually run.

Tom
Super User Tom
Super User

This does what you asked for:

I want to delete those values whose first 10 characters are the same as the first 10 characters of some other value.

proc sql ;
create table want as 
  select * from have
  group by substr(COLUMN,1,10)
  having count(*)=1
;
quit;

But I suspect you actually want something different.

 

 

s_lassen
Meteorite | Level 14

You should post sample data as a data step, like this:

Data have;
  infile cards truncover;
  input txt $20.;
cards;
1234567890a  
1234567890b
z1234567890a  
z1234567890a432
;run;

One possible solution is to sort and compare:

proc sort data=have;
  by txt;
run;

data want;
  set have;
  if lag(substr(txt,1,10)) ne substr(txt,1,10);
run;
Neptun83
Calcite | Level 5
Thank you guys. I can solve the problem .
mkeintz
PROC Star

@Neptun83 wrote:
Thank you guys. I can solve the problem .

Great.  But please report your solution, and mark it as the answer to your question (or just mark the "I can solve the problem" as the answer).

 

Then the rest of us will not think this is a question still looking for an answer.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-white.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.

 

Early bird rate extended! Save $200 when you sign up by March 31.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1026 views
  • 3 likes
  • 5 in conversation