DATA Step, Macro, Functions and more

extracting middle names from a variable?

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 13
Accepted Solution

extracting middle names from a variable?

 I have data like this.

 

 

 

Data ds;
Infile datalines;
Input idno name&$30. Team $ strtwght endwght ;
Datalines;
1331 Jason Schock paul Long blue 187 172
1067 Kanoko john rav Nagasaka green 135 122
1251 Richard roy granny Rose blue 181 166
1192 Charlene tin Armstrong yellow 152 139
1352 Bette Schock green 156 137
1262 Yao Chen Garg blue 196 180
1124 Adrienne Fink green 156 142
;

 

i nedd to get middle names of name variable

 

 the ouput should be like this;

 

middlenames  only

  Jason Schock paul Long ---------->Schock paul(middle name)


Run;


Accepted Solutions
Solution
‎07-29-2016 03:26 AM
Super Contributor
Posts: 254

Re: extracting middle names from a variable?

Will this help you?

 

data _null_;
length middle $20;
   set ds;
   count = countw(name, ' ');
   middle = ' ';
   do i = 2 to count -1;
      word = scan(name, i, ' ', 'm');
      call catx(' ', middle, word);
   end;
   put name = middle =;
run;

View solution in original post


All Replies
Regular Contributor
Posts: 213

Re: extracting middle names from a variable?

I would recommend looking into using the Regular Expression functions in SAS. Here is a paper with some examples to get you started

http://www.lexjansen.com/wuss/2004/data_warehousing/c_dwdb_taming_your_charac_p2.pdf

 

Good luck,

Ahmed

Solution
‎07-29-2016 03:26 AM
Super Contributor
Posts: 254

Re: extracting middle names from a variable?

Will this help you?

 

data _null_;
length middle $20;
   set ds;
   count = countw(name, ' ');
   middle = ' ';
   do i = 2 to count -1;
      word = scan(name, i, ' ', 'm');
      call catx(' ', middle, word);
   end;
   put name = middle =;
run;
Super Contributor
Posts: 254

Re: extracting middle names from a variable?

New Addition to the Earlier Solution

 

data _null_;
length middle $20;
   set ds;
   count = countw(name, ' ');
   middle = ' ';
   word1 = scan(name,1, ' ');
   word2 = scan(name, count, ' ');
   middle = transtrn(name, strip(word1), strip(' '));
   middle = transtrn(middle, strip(word2), strip(' '));

   put name = middle =;
run;

Super User
Posts: 10,500

Re: extracting middle names from a variable?

If your data is regular enough

 

Middlename=scan(Name,3);

 

That pulls the third word from the name variable.

Your output example isn't really clear what your resulting data set would look like. I would recommend adding a new variable for the middle name.

You might also then consider removing the middle name from the name field but it isn't clear if that was what you intend.

Super User
Posts: 9,681

Re: extracting middle names from a variable?

Here are two ways . One is CALL SCAN() , another is prxchange().



Data ds;
Infile datalines;
Input idno name&$30. Team $ strtwght endwght ;
call scan(name,1,p1,l1);
call scan(name,-1,p2,l2);
Middle_Name1=strip(substr(name,l1+1,p2-l1-1));

Middle_Name2=prxchange('s/^\w+|\w+$//',-1,strip(name));

drop p1 p2 l1 l2;
Datalines;
1331 Jason Schock paul Long   blue 187 172
1067 Kanoko john rav Nagasaka   green 135 122
1251 Richard roy granny Rose   blue 181 166
1192 Charlene tin Armstrong   yellow 152 139
1352 Bette Schock   green 156 137
1262 Yao Chen Garg   blue 196 180
1124 Adrienne Fink   green 156 142
;
run;

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 523 views
  • 2 likes
  • 5 in conversation