BookmarkSubscribeRSS Feed
ernie86
Calcite | Level 5

I need to concatenate the first name and the last name whenever the last 3 digits of the first name column and last name column matches. For example, FirstName001 has to concatenate with LastName001 and then create a new column Student001. My actual dataset has more than 100 columns, so I can’t list them out one by one.

 

Have:

Center FirstName001 FirstName002 FirstName003 LastName001 LastName002 LastName003
A Jason Katie Peter Berkowitz. Basel Hirsch
B Dawn Betty Chris Crane Barhaghi Choe
C Andy Ben Donald Nassar Amidi Naomi
D Andrew Ronald Chirsty Lee Hesham JoAnn
E Lee Jennifer Louis Banooni Kimelman Douglas

Want:

Center Student001 Student002 Student003
A Jason Berkowitz. Katie Basel Peter Hirsch
B Dawn Crane Betty Barhaghi Chris Choe
C Andy Nassar Ben Amidi Donald Naomi
D Andrew Lee Ronald Hesham Chirsty JoAnn
E Lee Banooni Jennifer Kimelman Louis Douglas
10 REPLIES 10
rajeshalwayswel
Pyrite | Level 9

data have;
input Center $ FirstName001 $ FirstName002 $ FirstName003 $ LastName001 $ LastName002 $ LastName003 $;
cards;
A Jason Katie Peter Berkowitz. Basel Hirsch
B Dawn Betty Chris Crane Barhaghi Choe
C Andy Ben Donald Nassar Amidi Naomi
D Andrew Ronald Chirsty Lee Hesham JoAnn
E Lee Jennifer Louis Banooni Kimelman Douglas
run;


data want;
set have;
student001=catx("",FirstName001,LastName001);
student002=catx("",FirstName002,LastName002);
student003=catx("",FirstName003,LastName003);

keep center student:;
run;

kiranv_
Rhodochrosite | Level 12

are there only few columns then 

 

 student001=catx(' ', firstname001, lastname001);
student002 =catx(' ',firstname002, lastname002)
student003 = catx(' ', firstname003,lastname003)

ernie86
Calcite | Level 5
There are more than 100 columns.
novinosrin
Tourmaline | Level 20

@ernie86  Are you wanting to make it trickier with the subject of conditional concatenation? Well, if your columns are in pattern of sets as in firstname set of 3 precedes last name set as shown in your example, all you need is number of elements in one set to compute the automation.

So no matter what the total number of columns divided/2 will give you one set 

 

data have;
input Center $ FirstName001 $ FirstName002 $ FirstName003 $ LastName001 $ LastName002 $ LastName003 $;
cards;
A Jason Katie Peter Berkowitz. Basel Hirsch
B Dawn Betty Chris Crane Barhaghi Choe
C Andy Ben Donald Nassar Amidi Naomi
D Andrew Ronald Chirsty Lee Hesham JoAnn
E Lee Jennifer Louis Banooni Kimelman Douglas
run;

data want;
set have;
array t(*) FirstName001--LastName003;
array t1(*)$25 Student001	Student002	Student003;
_n=dim(t)/2;
do _n_=1 to _n;
t1(_n_)=catx(' ',t(_n_),t(_n_+_n));
end;
drop _:;
run;

 

novinosrin
Tourmaline | Level 20

@ernie86 you wrote- "There are more than 100 columns."

 

This makes it 50 students.

 

you can just change the second array subscript to 50 like

 

array student(50) $25; /* this generates your student number list that want to derive*/

kiranv_
Rhodochrosite | Level 12

or something like this

data want;
set have;

array new(*) $30 student001-student003;
array firstname(*) $ FirstName001-FirstName003;
array lastname(*) $ lastName001-lastName003;
do i = 1 to dim(firstname);
if substr(vname(firstname[i]),10,3) = substr(vname(lastname[i]),9,3)
then new{i}=cats(' ',firstname{i},lastname{i});
end;
run;
/*or if you have counterpart for every first and last name then*/


data want;
set have;

array new(*) $20 student001-student003;
array firstname(*) $ FirstName001-FirstName003;
array lastname(*) $ lastName001-lastName003;
do i = 1 to dim(firstname);
 new{i}=cats(' ',firstname{i},lastname{i});
end;
run;

 

novinosrin
Tourmaline | Level 20

@kiranv_  I like the fact your approach is much simpler to understand however I noticed there can be a few redundancies in your code that can be eliminated like this-->

 

data want;
set have;
array firstname(*) $ FirstName:;
array lastname(*) $ lastName:;
array student(*)  $30  student001-student003;
do i = 1 to dim(firstname);
student(i)=catx(' ', firstname(i), lastname(i));
end;
run;
kiranv_
Rhodochrosite | Level 12

you are right @novinosrin, fixed it 

PGStats
Opal | Level 21

Here is a solution that will work for any number of columns

 


proc transpose data=have out=long(where=(_name_ ne "Center"));
by center notsorted;
var _character_;
run;

data temp;
set long;
if _name_ =: "First" then part = 1; 
else if _name_ =: "Last" then part = 2;
id = scan(_name_, 1, "", "ab");
drop _name_;
run;

proc sql;
create table longWant as
select a.center, a.id, catx(' ', a.col1, b.col1) as name
from 
    temp as a inner join 
    temp as b on a.center=b.center and a.id=b.id
where a.part=1 and b.part=2
order by center, id;
quit;

proc transpose data=longWant out=want(drop=_name_) prefix=Student;
by center;
var name;
id id;
run;
PG
Vipul_Sawlani
Calcite | Level 5
data q1;
infile datalines ;
input Center:$ FirstName001:$ FirstName002:$ FirstName003:$ LastName001:$10. LastName002:$ LastName003:$;
datalines;
A  Jason   Katie    Peter   Berkowitz.	Basel	  Hirsch
B  Dawn	   Betty    Chris   Crane	Barhaghi  Choe
C  Andy	   Ben	    Donald  Nassar	Amidi	  Naomi
D  Andrew  Ronald   Chirsty Lee	        Hesham	  JoAnn
E  Lee	   Jennifer Louis   Banooni   	Kimelman  Douglas
run;
proc print data=q1;
run;

data sol (keep=Center Student001 Student002 Student003);
set q1;
Student001=catx("",FirstName001,LastName001);
Student002=catx("",FirstName002,LastName002);
Student003=catx("",FirstName003,LastName003);
run;
proc print data=sol;
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 2735 views
  • 2 likes
  • 6 in conversation