Hi,
I am trying to clean a Customer related data which is in excel format.Basically, what I am trying to achieve is that I am trying to assign multiple contact details from different observations to the same name(the same person has multiple observations with different contact details.
e.g.
Name Contact e-mail
Tim +918223378569 tim.allen@gmail.com
Tim +343434343434 tim123.@yahoo.com
Tim +916544778889 ris_tim@hotmail.com
I want this to be grouped as a single observation.(and all other observations should follow the same rule ofcourse)
So this is the result that I wish to achieve
Name Conatct email
Tim +918223378569,+343434343434,+916544778889 tim.allen@gmail.com,tim123.@yahoo.com,ris_tim@hotmail.com
Thanks in advance.
Try:
Data A;
Input Name $4. Contact $14. e_mail $20.;
Datalines;
Tim +918223378569 tim.allen@gmail.com
Tim +343434343434 tim123.@yahoo.com
Tim +916544778889 ris_tim@hotmail.com
BBB +118223378569 tim.allen@gmail.com
BBB +143434343434 tim123.@yahoo.com
BBB +116544778889 ris_tim@hotmail.com
;
Run;
Proc Sort Data=A;
By Name;
Run;
Data Want;
Length Contact_Agg $100. e_mail_Agg $100.;
Retain Contact_Agg e_mail_Agg;
Do _N_=1 By 1 Until (Last.Name);
Set a;
By Name;
Contact_Agg=IfC(First.Name,Contact,CatX(',',Contact_Agg,Contact));
e_mail_Agg=IfC(First.Name,e_mail,CatX(',',e_mail_Agg,e_mail_Agg));
End;
Run;
Sort by Name.
Then do
data want (
drop=contact email
rename=(new_contact=contact new_email=email)
);
set have;
by Name;
length new_contact new_email $ 500; * or whatever is needed to hold all values;
retain new_contact new_email;
if first.Name then do;
new_contact = '';
new_email = '';
end;
new_contact = catx(',',trim(new_contact),contact);
new_email = catx(',',trim(new_email),email);
if last.Name then output;
run;
Is this a one time shot, or are you preparing a more permanent solution?
If so, your proposed solution might be optimal. To search for a specific phon no needs scanning through a variable length variable, which is not query friendly.
Consider a to normalize the table structure instead, either by having separate phone and a email tables, or a common with contact type and the address (phone no or email).
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.