Hi All,
I need to pick out unique phone numbers among three columns per ID as below:
USERID | Phone1 | Phone2 | Phone3 |
1 | 1234567890 | 1234567890 | 1234567890 |
2 | 2345678901 | 2345678901 | |
3 | 4567890123 | 4567890124 | 4567890125 |
4 | 9876543210 | 9876543211 | |
5 | 8765432109 | 8765432109 | 8765432110 |
6 | 6543210987 | 6543210988 |
Some records actually have same numbers, like USERID 1 and 2 while others have different. I just wanted to keep unique numbers. If there is only one unique number, then phone1_valid takes that value; If there are two unique number, then phone1_valid and phone2_valid take the two values; etc..
Do you have a way to do that?
Thanks!
Lz
Hi @lizzy28 I would recommend to have a FLAG for unique and dups, so that you can filter and summarize however you want-
data have;
input USERID $ (Phone1 Phone2 Phone3) (:$12.);
cards;
1 1234567890 1234567890 1234567890
2 2345678901 2345678901 .
3 4567890123 4567890124 4567890125
4 9876543210 9876543211 .
5 8765432109 8765432109 8765432110
6 6543210987 . 6543210988
;
proc transpose data=have out=temp;
by userid;
var phone1-phone3;
run;
proc sort data=temp;
where col1>' ';
by userid col1;
run;
data want;
set temp;
by userid col1;
length check $12;
if not(first.col1 and last.col1) then check='Dups';
else check='unique';
rename col1=phone_no
_name_=Phone;
run;
USERID | Phone | phone_no | check |
---|---|---|---|
1 | Phone1 | 1234567890 | Dups |
1 | Phone2 | 1234567890 | Dups |
1 | Phone3 | 1234567890 | Dups |
2 | Phone1 | 2345678901 | Dups |
2 | Phone2 | 2345678901 | Dups |
3 | Phone1 | 4567890123 | unique |
3 | Phone2 | 4567890124 | unique |
3 | Phone3 | 4567890125 | unique |
4 | Phone1 | 9876543210 | unique |
4 | Phone2 | 9876543211 | unique |
5 | Phone1 | 8765432109 | Dups |
5 | Phone2 | 8765432109 | Dups |
5 | Phone3 | 8765432110 | unique |
6 | Phone1 | 6543210987 | unique |
6 | Phone3 | 6543210988 | unique |
Hi @lizzy28 I would recommend to have a FLAG for unique and dups, so that you can filter and summarize however you want-
data have;
input USERID $ (Phone1 Phone2 Phone3) (:$12.);
cards;
1 1234567890 1234567890 1234567890
2 2345678901 2345678901 .
3 4567890123 4567890124 4567890125
4 9876543210 9876543211 .
5 8765432109 8765432109 8765432110
6 6543210987 . 6543210988
;
proc transpose data=have out=temp;
by userid;
var phone1-phone3;
run;
proc sort data=temp;
where col1>' ';
by userid col1;
run;
data want;
set temp;
by userid col1;
length check $12;
if not(first.col1 and last.col1) then check='Dups';
else check='unique';
rename col1=phone_no
_name_=Phone;
run;
USERID | Phone | phone_no | check |
---|---|---|---|
1 | Phone1 | 1234567890 | Dups |
1 | Phone2 | 1234567890 | Dups |
1 | Phone3 | 1234567890 | Dups |
2 | Phone1 | 2345678901 | Dups |
2 | Phone2 | 2345678901 | Dups |
3 | Phone1 | 4567890123 | unique |
3 | Phone2 | 4567890124 | unique |
3 | Phone3 | 4567890125 | unique |
4 | Phone1 | 9876543210 | unique |
4 | Phone2 | 9876543211 | unique |
5 | Phone1 | 8765432109 | Dups |
5 | Phone2 | 8765432109 | Dups |
5 | Phone3 | 8765432110 | unique |
6 | Phone1 | 6543210987 | unique |
6 | Phone3 | 6543210988 | unique |
You want to remove duplicates within rows. Do you also want to remove duplicates across rows? Or can USERIDs share a phone number?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.