<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Extracting first and last words in a string in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/469996#M120282</link>
    <pubDate>Wed, 13 Jun 2018 16:51:55 GMT</pubDate>
    <dc:creator>ballardw</dc:creator>
    <dc:date>2018-06-13T16:51:55Z</dc:date>
    <item>
      <title>Extracting first and last words in a string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/469759#M120204</link>
      <description>&lt;P&gt;I have a string variable holding names in varying formats:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;name&lt;/P&gt;&lt;P&gt;First middle_initial last&lt;/P&gt;&lt;P&gt;first middle last&lt;/P&gt;&lt;P&gt;first middle_initial last&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I need to extract first and last names into separate variables, but the name may have 2 first names, e.g. "mary ann l smith", in which case I need the new variables to read:&lt;/P&gt;&lt;P&gt;first&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; last&lt;/P&gt;&lt;P&gt;mary ann&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; smith&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;or 2 last names, such as "mary l smith jones", which would read:&lt;/P&gt;&lt;P&gt;first&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; last&lt;/P&gt;&lt;P&gt;mary&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; smith jones&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;How can I do this, taking into account that the names may or may not have a middle initial? None of the first or last names has fewer than 2 letters, so I feel like using the letter count to spot the middle initial would be my best bet. Perhaps 2 IF statements, depending on whether there is a middle initial or not.&lt;/P&gt;</description>
      <pubDate>Tue, 12 Jun 2018 20:48:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/469759#M120204</guid>
      <dc:creator>Melk</dc:creator>
      <dc:date>2018-06-12T20:48:45Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting first and last words in a string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/469763#M120205</link>
      <description>&lt;P&gt;Do you have cases where there is no middle initial?&lt;/P&gt;
&lt;P&gt;If yes, then for a name like "&lt;SPAN&gt;mary smith jones", how can you know the first name and the last name? The first name could be "Mary smith" or just "Mary". You can never know unless the name is properly delimited.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;In this kind of situation&amp;nbsp;the name parts may be delimited by tabs, with a single space between two first names (e.g. Mary ann) and a tab before the next part. Consecutive delimiters then represent a missing value in between, and the SCAN() function with the "m" modifier as its 4th argument&amp;nbsp;might work.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test ;
input name $50.;
Datalines;
Marry Ann	I	Jones
Marry		Jones
;
run;
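/*
Each data line above is tab-delimited:
Marry Ann&amp;lt;tab&amp;gt;I&amp;lt;tab&amp;gt;Jones
Marry&amp;lt;tab&amp;gt;&amp;lt;tab&amp;gt;Jones
*/
data want;
set test;
/* '09'x is the tab character; the 'm' modifier makes SCAN treat consecutive tabs as separate delimiters, so a missing middle name comes back as blank */
First=scan(name,1,'09'x,'m');
middle=scan(name,2,'09'x,'m');
last=scan(name,3,'09'x,'m');
run;&lt;/CODE&gt;&lt;/PRE&gt;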
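&lt;P&gt;If the names turn out to be delimited only by blanks, the one-letter rule suggested in the question could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested solution for your data: the data set and variable names (names, split, n, pos, first_end, last_start) are made up here, every one-letter word is assumed to be a middle initial, and treating the first word as the first name when no initial is found is an arbitrary fallback.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data names;
input name $50.;
datalines;
mary ann l smith
mary l smith jones
mary smith
;
run;

data split;
set names;
length first last $50;
n = countw(name, ' ');
/* position of the first one-letter word, assumed to be the middle initial (0 if none) */
pos = 0;
do i = 1 to n while (pos = 0);
  if lengthn(scan(name, i, ' ')) = 1 then pos = i;
end;
/* assumption: without an initial, the first word is the first name and the rest is the last name */
if pos = 0 then do; first_end = 1; last_start = 2; end;
else do; first_end = pos - 1; last_start = pos + 1; end;
/* words before the initial form the first name */
do i = 1 to first_end;
  first = catx(' ', first, scan(name, i, ' '));
end;
/* words after the initial form the last name */
do i = last_start to n;
  last = catx(' ', last, scan(name, i, ' '));
end;
drop i n pos first_end last_start;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;For the three sample rows this should give "mary ann" / "smith", "mary" / "smith jones" and "mary" / "smith". A name with a genuine one-letter word that is not an initial would be split in the wrong place.&lt;/P&gt;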
&lt;P&gt;Check whether your data is delimited by tabs or offers some other way to identify the first, middle and last names. As far as I know, source systems create such strings with proper delimiters to identify the parts. If not, you may need to change the way your source data is sent.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 12 Jun 2018 21:16:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/469763#M120205</guid>
      <dc:creator>SuryaKiran</dc:creator>
      <dc:date>2018-06-12T21:16:26Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting first and last words in a string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/469788#M120215</link>
      <description>&lt;P&gt;Last names from some areas may also contain spaces, such as "Le Blanc" or "Von Braun", and then there are just plain creative people who name their children things like "Moonbeam Glory Morning Sunshine", with 3, 4 or 5 "first" and "middle" names.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A letter count of less than 2 does occur in some Hispanic "last names", such as "y", signifying that the family was from 2 (or more) properties in their history.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Have fun. Names are the second worst data I deal with.&lt;/P&gt;</description>
      <pubDate>Tue, 12 Jun 2018 22:38:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/469788#M120215</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2018-06-12T22:38:26Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting first and last words in a string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/469836#M120223</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I really did not get the scenario where the name may or may not have a middle name.&lt;BR /&gt;I mean, if the name is Mary Le Blanc,&lt;BR /&gt;how can we say whether Mary Le is the first name and Blanc is the last name, or whether Mary is the first name, Le is the middle name and Blanc is the last name?&lt;/P&gt;</description>
      <pubDate>Wed, 13 Jun 2018 06:09:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/469836#M120223</guid>
      <dc:creator>ruchi11dec</dc:creator>
      <dc:date>2018-06-13T06:09:30Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting first and last words in a string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/469851#M120230</link>
      <description>&lt;P&gt;If you can't make up a rule that works when you apply it on paper yourself, then you have nothing that can be translated into code. This is true for any programming language.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Find that rule; once you have it, we can help convert it to SAS code.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is the reason why any decent database for people always has separate fields for surname(s) and given name. There's no other way of keeping names that works.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 13 Jun 2018 08:04:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/469851#M120230</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2018-06-13T08:04:05Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting first and last words in a string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/469985#M120280</link>
      <description>&lt;P&gt;Data is often dirty, and sometimes you just have to deal with it as best as possible. I already developed a strategy for the best possible scenario. It is not perfect, but it is optimal. I don't answer questions on this board to listen to people complain about them. These are real-life situations and datasets from reputable companies.&lt;/P&gt;</description>
      <pubDate>Wed, 13 Jun 2018 16:19:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/469985#M120280</guid>
      <dc:creator>Melk</dc:creator>
      <dc:date>2018-06-13T16:19:29Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting first and last words in a string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/469994#M120281</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/145839"&gt;@ruchi11dec&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I really did not get the scenario where the name may or may not have a middle name.&lt;BR /&gt;I mean, if the name is Mary Le Blanc,&lt;BR /&gt;how can we say whether Mary Le is the first name and Blanc is the last name, or whether Mary is the first name, Le is the middle name and Blanc is the last name?&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;It could just as well be that the first name is Mary and the last name is Le Blanc, with no middle name at all, which was part of the point of bringing it up. Naming conventions vary by culture or country of origin, and data that collects "name" without data entry limits runs into messy processing.&lt;/P&gt;</description>
      <pubDate>Wed, 13 Jun 2018 16:36:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/469994#M120281</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2018-06-13T16:36:09Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting first and last words in a string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/469996#M120282</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/140721"&gt;@Melk&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Data is often dirty, and sometimes you just have to deal with it as best as possible. I already developed a strategy for the best possible scenario. It is not perfect, but it is optimal. I don't answer questions on this board to listen to people complain about them. These are real-life situations and datasets from reputable companies.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I don't think anyone is "complaining"; people are pointing out potential issues in the process.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And those potential issues have come from very real data from equally reputable sources.&lt;/P&gt;
&lt;P&gt;Examples of the name data I have had to process had all "name" information in a single field. Since the data related to children, sometimes both parents' last names were included in the name field, with no regularity in the order of the mother's and father's last names. Data were entered as:&lt;/P&gt;
&lt;P&gt;first name, middle name, last name&lt;/P&gt;
&lt;P&gt;first name,&amp;nbsp;last name&lt;/P&gt;
&lt;P&gt;last name, first name, middle name&lt;/P&gt;
&lt;P&gt;last name, first name, last name, middle initial&lt;/P&gt;
&lt;P&gt;last name-last name (hyphenated), middle initial, first name (and the same very distinctive last names hyphenated in a different order)&lt;/P&gt;
&lt;P&gt;with occasional sprinklings of Junior, second, third, II, III and such after first names or middle names or last names.&lt;/P&gt;
&lt;P&gt;Of the roughly 15,000 names in my data for this project, I was comfortable assuming first name, middle name, last name order for only about 30 percent.&lt;/P&gt;
&lt;P&gt;And for extra joy, the same assigned unique identifier would sometimes carry a somewhat different name: a different spelling, one of the associated "last names" dropped, or a middle name or initial added.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And I had to match this data set to a separate data source&amp;nbsp;on name, date of birth and gender.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Luckily, after I had dealt with that data source for a year or so, they transitioned to a collection system that actually collected the data as separate first name, middle name(s) and last name (singular) entries. This was not in the 1980s but as recently as 2012, when data collection folks should have gotten the word 30 years earlier that a single name field is poor design.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Other "name" data I have had to process actually included data comments like "see grandma in Apt B" or "the green trailer".&lt;/P&gt;</description>
      <pubDate>Wed, 13 Jun 2018 16:51:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/469996#M120282</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2018-06-13T16:51:55Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting first and last words in a string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/607076#M176367</link>
      <description>Anything further on this?</description>
      <pubDate>Mon, 25 Nov 2019 18:09:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extracting-first-and-last-words-in-a-string/m-p/607076#M176367</guid>
      <dc:creator>tomrvincent</dc:creator>
      <dc:date>2019-11-25T18:09:00Z</dc:date>
    </item>
  </channel>
</rss>

