SAS Programming

abx · Posted 10-08-2022 07:03 AM

Hi All,

I am currently learning SAS and would like to know if anyone can help with code that produces the expected output below.

Each sample string below contains a name twice, and in some of them the second copy is truncated. I am not sure how to remove the duplicate so that I get the expected output.

Many thanks.

Name:

John Smith John Smith
Jane Foster Jane Foste
Happy Garden Management Corporation Happy Garden Management C
ABC Car Workshop ABC Car Worksho

Expected Output:

John Smith
Jane Foster
Happy Garden Management Corporation
ABC Car Workshop

FreelanceReinh · Posted 10-08-2022 09:39 AM

Hi @abx and welcome to the SAS Support Communities!

@abx wrote:

I am currently learning SAS and would like to know if anyone can help with code ...

Actually, before we get to the code, we would need a hard and fast rule to be applied to the strings. Otherwise, we might suggest something like this ...

data have;
input name $80.;
cards;
John Smith John Smith
Jane Foster Jane Foste
Happy Garden Management Corporation Happy Garden Management C
ABC Car Workshop ABC Car Worksho
;

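data want(keep=name);
set have;
/* Try each word from the second one on as the start of the repeat. */
do i=2 to countw(name, ' ');
  /* POS receives the start position of the i-th word, LEN its length. */
  call scan(name, i, pos, len, ' ');
  /* The =: operator compares only as many characters as the shorter
     operand has, so this tests whether the tail is a prefix of NAME. */
  if trim(substr(name, pos)) =: name then do;
    name=substr(name, 1, pos-1); /* keep only the text before the repeat */
    leave;
  end;
end;
run;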

... and then you come up with example strings like "KSU, Manhattan, KS" where you don't want to cut off the abbreviation at the end.

But maybe there is no such problematic case in your data and the code above works for you.
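To make the caveat concrete, here is a minimal sketch that runs the same logic on the problematic string. The dataset names have_ksu and want_ksu are made up for this illustration. On the third word, "KS" matches the first two characters of the name, so the step should leave "KSU, Manhattan," and drop the abbreviation.

```sas
/* Hypothetical test case taken from the caveat above. */
data have_ksu;
input name $80.;
cards;
KSU, Manhattan, KS
;

/* Same logic as the WANT step above, applied to the test case. */
data want_ksu(keep=name);
set have_ksu;
do i=2 to countw(name, ' ');
  call scan(name, i, pos, len, ' ');
  if trim(substr(name, pos)) =: name then do;
    name=substr(name, 1, pos-1);
    leave;
  end;
end;
run;

proc print data=want_ksu;
run;
```

Whether that result is acceptable depends on the rule you settle on for your data.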

sbxkoenk · Posted 10-08-2022 09:47 AM

Hello @abx,

Leonid Batkhan has written several interesting blog posts about string handling.

Go to https://blogs.sas.com/content/?s=string+leonid

That is: go to https://blogs.sas.com/

and enter "Leonid" and "string" as search terms, then hit ENTER.

I haven't opened it, but this one might be applicable:
Removing repeated characters in SAS strings
By Leonid Batkhan on SAS Users November 4, 2020

https://blogs.sas.com/content/sgf/2020/11/04/removing-repeated-characters-in-sas-strings/

Koen

abx · Posted 10-18-2022 05:18 AM

Thanks

abx · Posted 10-18-2022 05:19 AM

Thank you for the links

abx · Posted 10-18-2022 05:17 AM

Thanks! I will take your advice into consideration.

Patrick · Posted 10-08-2022 09:02 PM

@abx This sort of data cleansing task often becomes rather involved quite quickly, as you also have to deal with valid cases like Johnson & Johnson.

I'd save it as an exercise for when you're solid with the basics.
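As an illustration of how such valid cases push you toward a stricter rule, here is a minimal sketch under one assumed rule that is not stated anywhere in this thread: the repeat has exactly as many words as the original name, and only its last word may be cut short. The dataset names have2 and want2 and the two extra test strings are hypothetical.

```sas
/* Hypothetical test data: the four sample names plus two strings
   that should be left unchanged. */
data have2;
input name $80.;
cards;
John Smith John Smith
Jane Foster Jane Foste
Happy Garden Management Corporation Happy Garden Management C
ABC Car Workshop ABC Car Worksho
KSU, Manhattan, KS
Johnson & Johnson
;

data want2(keep=name);
set have2;
n = countw(name, ' ');
/* Assumed rule: only an even word count can be "name + repeat". */
if n >= 2 and mod(n, 2) = 0 then do;
  /* Locate the first word of the second half. */
  call scan(name, n/2 + 1, pos, len, ' ');
  /* Cut only if the second half is a prefix of the whole string. */
  if trim(substr(name, pos)) =: name then name = substr(name, 1, pos-1);
end;
run;

proc print data=want2;
run;
```

Under this assumption the four sample names are shortened, while the two three-word strings are left alone because their word count is odd. The rule is still not safe in general: it misses repeats where truncation removed whole words, and a genuine name such as "Duran Duran" would still be cut in half.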

abx · Posted 10-18-2022 05:19 AM

Thank you for the tip.

Remove Duplicate Words in a String
