BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mlogan
Lapis Lazuli | Level 10

Hi All,

I have the following table with people's contact information. for each by group I have their Home/Mobile/Email and Cellular no. I need to find the observation with contact_type = 'Home'. If the by group don't have a 'Home' contact type then the code will bring the contact_type = 'Mobile', if Mobile is missing then it will bring the contact_type = 'Cellular'. 

 

ID     Contact_Type   Contact

101  Home                 4434323454

101  Email                 abc@gmail.com

101  Mobile               8767876546

102  Mobile               8987609878

102  Email                 ghg@gmail.com

103  Cellular             6779998786

103  Email                aaa@gmail.com

 

Output will look like this:

 

ID     Contact_Type   Contact

101  Home                 4434323454

102  Mobile               8987609878

103  Cellular             6769998786

 

 

Can someone help me please. Thanks,

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @mlogan,

 

I can reassure you: "My" code (well, actually it's primarily @Reeza's with valuable input from @LinusH) works 

even if there is no email address available in a particular BY group (but only some of the other three contact types).

 

Okay, so you don't want to see email addresses in your output dataset. This is no problem and you already took the first step in adapting the code correspondingly: You removed 'Email' from the arguments of WHICHC.


As a consequence, however, any existing observation with Contact_Type='Email' (or with any other value of Contact_Type not listed in the arguments of WHICHC, for that matter) will be assigned priority 0. This zero, in turn, will always end up as the minimum priority in the BY group and hence the observation will be selected (even more than one such observation, if any).

 

To prevent this, we can simply amend the WHERE clause to restrict the selection to observations with non-zero (and non-missing) priority:

where not missing(Contact) & calculated priority

With this small change you will get your expected output from the new input data you posted. To read the raw data correctly, we have to correct your suggested INPUT statement, though (I assume the "&" was a typo):

input ID (Contact_type Contact) (:$20.);

 

Please note that values of Contact_Type not listed in WHICHC(...) could still cause unexpected results, as described above. This includes values such as 'HOME', 'mobile', 'Cell' or ' Home' (the latter with leading blank).

 

To make the code more robust, we could modify the function call as follows:

whichc(put(left(Contact_Type),$upcase3.), 'HOM', 'MOB', 'CEL')

This would handle the abovementioned spellings of contact types as well as the ordinary ones.

 

Finally, it should be mentioned that an ID with two or more observations with "top priority" (e.g. two different 'Mobile' numbers and no 'Home') would occur with all these observations in the output dataset.

View solution in original post

7 REPLIES 7
Reeza
Super User

So basically you have a priority. 

 

Code me the phone type to the priority level - you can use if/then, select, format or a merge/join. 

Sort by id and priority and use first.id to obtain the highest priority record. 

 

Untested but should get you started. 

Proc format;
InValue $ phone_type_fmt
'Home' = 1
'Mobile' = 2
....
Rest of your definitions;

Proc sql;
Create table temp as
Select a.*, input(contact_type, $phone_type_fmt.) as priority
From have
Where not missing(phone_number)
Order by id, priority;
Quit;

Then use a data step with first.id to obtain the highest priority number. 

 

mlogan
Lapis Lazuli | Level 10
Hi Reeza, Would you please write the first.id for procedure for me. I want the output like:

101 Home 4434323454
102 Mobile 8987609878
103 Cellular 6779998786

Thanks,


data have;
input ID (Contact_type Contact) ( & $20.);
cards;

101 Home 4434323454
101 Mobile 8767876546
102 Mobile 8987609878
102 Email ghg@gmail.com
103 Email aaa@gmail.com
103 Cellular 6779998786
104 Email sss@gmail.com
;
RUN

Proc format;
InValue $ Contact_type_fmt
'Home' = 1
'Mobile' = 2
'Cellular' = 3
'Email'= 4;

Proc sql;
Create table temp as
Select have.*, input(contact_type, $contact_type_fmt.) as priority
From have
Where not missing(Contact)
Order by id, priority;
Quit;
LinusH
Tourmaline | Level 20
If you would like to do this in one step, just slightly modify @Reeza's SQL to include a group by, and a having priority = min(priority).
Data never sleeps
FreelanceReinh
Jade | Level 19

Hi @mlogan,

 

To simplify the code further, you could use the WHICHC function rather than an informat to define variable PRIORITY:

proc sql;
create table want(drop=priority) as
select *, whichc(Contact_Type, 'Home', 'Mobile', 'Cellular', 'Email') as priority
from have
where not missing(Contact)
group by id
having priority=min(priority);
quit;
mlogan
Lapis Lazuli | Level 10

Thanks Reinhard, your code works, but it doesn't work when 101 by group don't have an email and 104 has only email (see the following below). I want Home, Mobile and Cellular as priority order. If all are missing, then I don't want that observation (104 in this case).  Can you please help further. Thanks,

 

 

data have;
input ID (Contact_type Contact) ( & $20.);
cards;

101 Home 4434323454
101 Mobile 8767876546
102 Mobile 8987609878
102 Email ghg@gmail.com
103 Email aaa@gmail.com
103 Cellular 6779998786
104 Email sss@gmail.com
;
RUN


proc sql;
create table want(drop=priority) as
select *, whichc(Contact_Type, 'Home', 'Mobile', 'Cellular') as priority
from have
where not missing(Contact)
group by id
having priority=min(priority);
quit;

 

Folowing is my expected output:

 

101 Home 4434323454
102 Mobile 8987609878
103 Cellular 6779998786

FreelanceReinh
Jade | Level 19

Hi @mlogan,

 

I can reassure you: "My" code (well, actually it's primarily @Reeza's with valuable input from @LinusH) works 

even if there is no email address available in a particular BY group (but only some of the other three contact types).

 

Okay, so you don't want to see email addresses in your output dataset. This is no problem and you already took the first step in adapting the code correspondingly: You removed 'Email' from the arguments of WHICHC.


As a consequence, however, any existing observation with Contact_Type='Email' (or with any other value of Contact_Type not listed in the arguments of WHICHC, for that matter) will be assigned priority 0. This zero, in turn, will always end up as the minimum priority in the BY group and hence the observation will be selected (even more than one such observation, if any).

 

To prevent this, we can simply amend the WHERE clause to restrict the selection to observations with non-zero (and non-missing) priority:

where not missing(Contact) & calculated priority

With this small change you will get your expected output from the new input data you posted. To read the raw data correctly, we have to correct your suggested INPUT statement, though (I assume the "&" was a typo):

input ID (Contact_type Contact) (:$20.);

 

Please note that values of Contact_Type not listed in WHICHC(...) could still cause unexpected results, as described above. This includes values such as 'HOME', 'mobile', 'Cell' or ' Home' (the latter with leading blank).

 

To make the code more robust, we could modify the function call as follows:

whichc(put(left(Contact_Type),$upcase3.), 'HOM', 'MOB', 'CEL')

This would handle the abovementioned spellings of contact types as well as the ordinary ones.

 

Finally, it should be mentioned that an ID with two or more observations with "top priority" (e.g. two different 'Mobile' numbers and no 'Home') would occur with all these observations in the output dataset.

mlogan
Lapis Lazuli | Level 10
Hi Reinhard, Thanks for your help. it worked for me.
input ID (Contact_type Contact) (& $20.); actually worked for me and I don't know why, but I learned a second thing(:$20.);

whichc(put(left(Contact_Type),$upcase3.), 'HOM', 'MOB', 'CEL') is quite handy.

Thank you,

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1250 views
  • 9 likes
  • 4 in conversation