<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Cleansing a Raw Persons Name Column in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Cleansing-a-Raw-Persons-Name-Column/m-p/379649#M91389</link>
    <description>&lt;P&gt;The question is, how do you know whether a word is a first name or a last name? &amp;nbsp;Sinclair, for instance, could be either, just as Lin could be a first name and James a surname. What logic tells you that the correct order is James Lin and not Lin James? &amp;nbsp;So first you need a list of rules, maybe something like:&lt;/P&gt;
&lt;P&gt;strip the titles Prof., Dr., ... out of the string&lt;/P&gt;
&lt;P&gt;if the third space-delimited part of the string is a single character (an initial), then the second part is the first name and the first part is the surname&lt;/P&gt;
&lt;P&gt;else first name = scan(string,1), surname = scan(string,2)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Once you have the rules, the programming is pretty simple.&lt;/P&gt;</description>
    <pubDate>Thu, 27 Jul 2017 08:36:44 GMT</pubDate>
    <dc:creator>RW9</dc:creator>
    <dc:date>2017-07-27T08:36:44Z</dc:date>
    <item>
      <title>Cleansing a Raw Persons Name Column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Cleansing-a-Raw-Persons-Name-Column/m-p/379648#M91388</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a column that contains a person's name. Some records follow the usual convention of a forename, then a space, then the surname. Other records are back to front and start with the surname. Please see the data below:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;Have&lt;/TD&gt;&lt;TD&gt;Want&lt;/TD&gt;&lt;TD&gt;Want&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Raw data name&lt;/TD&gt;&lt;TD&gt;Forename&lt;/TD&gt;&lt;TD&gt;Surname&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;AARON KENAVAN&lt;/TD&gt;&lt;TD&gt;AARON&lt;/TD&gt;&lt;TD&gt;KENAVAN&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;BARRON&amp;nbsp; WILLIAM F&lt;/TD&gt;&lt;TD&gt;WILLIAM&amp;nbsp;&lt;/TD&gt;&lt;TD&gt;BARRON&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;CHAN&amp;nbsp; BONNIE Y&lt;/TD&gt;&lt;TD&gt;BONNIE&lt;/TD&gt;&lt;TD&gt;CHAN&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;LIN&amp;nbsp; JAMES C&lt;/TD&gt;&lt;TD&gt;JAMES&lt;/TD&gt;&lt;TD&gt;LIN&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;SINCLAIR&amp;nbsp; PATRICK S&lt;/TD&gt;&lt;TD&gt;PATRICK&lt;/TD&gt;&lt;TD&gt;SINCLAIR&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;BUTLER&amp;nbsp; JOHN H&lt;/TD&gt;&lt;TD&gt;JOHN&lt;/TD&gt;&lt;TD&gt;BUTLER&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;FARBER&amp;nbsp; MICHAEL&lt;/TD&gt;&lt;TD&gt;MICHAEL&lt;/TD&gt;&lt;TD&gt;FARBER&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;HADLEY&amp;nbsp; JOSEPH P&lt;/TD&gt;&lt;TD&gt;JOSEPH&lt;/TD&gt;&lt;TD&gt;HADLEY&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;SCHELEIN, ROBERT M.&lt;/TD&gt;&lt;TD&gt;ROBERT&lt;/TD&gt;&lt;TD&gt;SCHELEIN&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;SARRO, DOUGLAS A.&lt;/TD&gt;&lt;TD&gt;DOUGLAS&lt;/TD&gt;&lt;TD&gt;SARRO&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;PROF. DR. 
MICHAEL SCHLITT&lt;/TD&gt;&lt;TD&gt;MICHAEL&lt;/TD&gt;&lt;TD&gt;SCHLITT&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;CURTIS.STEFANAK&lt;/TD&gt;&lt;TD&gt;CURTIS&lt;/TD&gt;&lt;TD&gt;STEFANAK&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;DARA D..MANN&lt;/TD&gt;&lt;TD&gt;DARA&lt;/TD&gt;&lt;TD&gt;MANN&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is there a programme available that cleanses the name column so that consistent forenames and surnames can be extracted across all records?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Many thanks&lt;BR /&gt;&lt;BR /&gt;Chris&lt;/P&gt;</description>
      <pubDate>Thu, 27 Jul 2017 08:22:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Cleansing-a-Raw-Persons-Name-Column/m-p/379648#M91388</guid>
      <dc:creator>cmoore</dc:creator>
      <dc:date>2017-07-27T08:22:43Z</dc:date>
    </item>
    <item>
      <title>Re: Cleansing a Raw Persons Name Column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Cleansing-a-Raw-Persons-Name-Column/m-p/379649#M91389</link>
      <description>&lt;P&gt;The question is, how do you know whether a word is a first name or a last name? &amp;nbsp;Sinclair, for instance, could be either, just as Lin could be a first name and James a surname. What logic tells you that the correct order is James Lin and not Lin James? &amp;nbsp;So first you need a list of rules, maybe something like:&lt;/P&gt;
&lt;P&gt;strip the titles Prof., Dr., ... out of the string&lt;/P&gt;
&lt;P&gt;if the third space-delimited part of the string is a single character (an initial), then the second part is the first name and the first part is the surname&lt;/P&gt;
&lt;P&gt;else first name = scan(string,1), surname = scan(string,2)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
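&lt;P&gt;As a rough, untested sketch, the rules above could be coded along these lines. The data set names have and want, the variable name string, the variable lengths and the title list (PROF, DR) are assumptions for illustration only:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  set have;
  length _clean $200 Forename Surname $50;  /* assumed lengths */
  /* Rule 1: remove the titles PROF and DR (optional period), ignoring case */
  _clean = prxchange('s/\b(PROF|DR)\b\.?//i', -1, string);
  /* Rule 2: a one-character third word is taken to be an initial, */
  /* so the order is surname, forename, initial                    */
  if lengthn(scan(_clean, 3, ' ,.')) = 1 then do;
    Forename = scan(_clean, 2, ' ,.');
    Surname  = scan(_clean, 1, ' ,.');
  end;
  /* Rule 3: otherwise assume forename, then surname */
  else do;
    Forename = scan(_clean, 1, ' ,.');
    Surname  = scan(_clean, 2, ' ,.');
  end;
  drop _clean;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Blanks, commas and periods are all treated as delimiters in scan here, which goes slightly beyond the rules as listed. Rows such as FARBER&amp;nbsp; MICHAEL (surname first, no initial) or DARA D..MANN (initial in the middle) would still come out wrong and need further rules.&lt;/P&gt;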
&lt;P&gt;Once you have the rules, the programming is pretty simple.&lt;/P&gt;</description>
      <pubDate>Thu, 27 Jul 2017 08:36:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Cleansing-a-Raw-Persons-Name-Column/m-p/379649#M91389</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2017-07-27T08:36:44Z</dc:date>
    </item>
    <item>
      <title>Re: Cleansing a Raw Persons Name Column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Cleansing-a-Raw-Persons-Name-Column/m-p/379650#M91390</link>
      <description>&lt;P&gt;I don't think it could be done with any certainty - what if you had a name like John James or a non-English language name?&lt;/P&gt;</description>
      <pubDate>Thu, 27 Jul 2017 08:42:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Cleansing-a-Raw-Persons-Name-Column/m-p/379650#M91390</guid>
      <dc:creator>ChrisBrooks</dc:creator>
      <dc:date>2017-07-27T08:42:49Z</dc:date>
    </item>
    <item>
      <title>Re: Cleansing a Raw Persons Name Column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Cleansing-a-Raw-Persons-Name-Column/m-p/379651#M91391</link>
      <description>&lt;P&gt;Originally I had used tranwrd to remove the commas. If I leave them in, the surnames appear like "&lt;SPAN&gt;BUTLER,", so the string would be "BUTLER, &amp;nbsp;JOHN H". If there is a comma, is there a quick way to rearrange the variables? Thanks&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 27 Jul 2017 08:43:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Cleansing-a-Raw-Persons-Name-Column/m-p/379651#M91391</guid>
      <dc:creator>cmoore</dc:creator>
      <dc:date>2017-07-27T08:43:46Z</dc:date>
    </item>
    <item>
      <title>Re: Cleansing a Raw Persons Name Column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Cleansing-a-Raw-Persons-Name-Column/m-p/379657#M91392</link>
      <description>&lt;P&gt;If a comma always means that the word to the left of it is the last name and the word to its right is the first name, then it is easy to extract the first and last names.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;untested:&lt;/P&gt;
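&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* raw holds the unparsed name */
if index(raw, ",") &amp;gt; 0 then do;
  /* "SURNAME, FORENAME M.": keep only the first word after the comma */
  Forename = scan(scan(raw, 2, ","), 1, " ");
  Surname = strip(scan(raw, 1, ","));
end;
else do;
  /* no comma: assume "FORENAME SURNAME" */
  Forename = scan(raw, 1, " ");
  Surname = scan(raw, 2, " ");
end;&lt;/CODE&gt;&lt;/PRE&gt;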
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 27 Jul 2017 09:47:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Cleansing-a-Raw-Persons-Name-Column/m-p/379657#M91392</guid>
      <dc:creator>andreas_lds</dc:creator>
      <dc:date>2017-07-27T09:47:55Z</dc:date>
    </item>
    <item>
      <title>Re: Cleansing a Raw Persons Name Column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Cleansing-a-Raw-Persons-Name-Column/m-p/383956#M91613</link>
      <description>&lt;P&gt;If you can come up with clear rules for parsing names such as:&lt;/P&gt;
&lt;P&gt;Jones Elizabeth Ann Brichoux&lt;/P&gt;
&lt;P&gt;Ross Raven Aurora-moonlight&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'll be impressed. Those are a couple of names I had to work with, also coming in a single field.&lt;/P&gt;
&lt;P&gt;I hate dealing with names.&lt;/P&gt;
&lt;P&gt;And anyone still collecting names into a single field needs to return their Pong game as it is too modern for them.&lt;/P&gt;</description>
      <pubDate>Fri, 28 Jul 2017 23:19:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Cleansing-a-Raw-Persons-Name-Column/m-p/383956#M91613</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2017-07-28T23:19:11Z</dc:date>
    </item>
    <item>
      <title>Re: Cleansing a Raw Persons Name Column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Cleansing-a-Raw-Persons-Name-Column/m-p/383967#M91618</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/136670"&gt;@cmoore&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;If you have SAS Data Quality Server licensed (&lt;A href="http://support.sas.com/software/products/dataqual/" target="_blank"&gt;http://support.sas.com/software/products/dataqual/&lt;/A&gt;), you could try code like the following.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;options dqlocale=(ENUSA);
data want(drop=_:);
  set have;
/*  showTokens=DQPARSEINFOGET('NAME', 'ENUSA');*/
  stdName=dqStandardize(name, 'Name');
  _parsedValue=dqParse(stdName, 'NAME', 'ENUSA');
  nm_prefix=dqParseTokenGet(_parsedValue, 'Name Prefix', 'NAME', 'ENUSA');
  nm_given=dqParseTokenGet(_parsedValue, 'Given Name', 'NAME', 'ENUSA');
  nm_Middle=dqParseTokenGet(_parsedValue, 'Middle Name', 'NAME', 'ENUSA');
  nm_Family=dqParseTokenGet(_parsedValue, 'Family Name', 'NAME', 'ENUSA');
  nm_Suffix=dqParseTokenGet(_parsedValue, 'Name Suffix', 'NAME', 'ENUSA');
  nm_Appendage=dqParseTokenGet(_parsedValue, 'Name Appendage', 'NAME', 'ENUSA');
run;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 29 Jul 2017 02:12:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Cleansing-a-Raw-Persons-Name-Column/m-p/383967#M91618</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2017-07-29T02:12:16Z</dc:date>
    </item>
  </channel>
</rss>

