Solved: looking for uniques among 10 variables

pangea17 · Posted 10-21-2015 01:55 PM

I have a dataset with phone numbers. There are 10 phone number fields for each observation. phone1 to phone10. What I am trying to do is determine if the number in phone1 is unique versus the number in fields phone2 through phone10. Also, is phone2 unique against all the other phone fields, etc. I can do this with a bunch of if statements, but I was thinking there must be a macro way to do this that would be less prone to typing error.

Astounding · Posted 10-21-2015 02:29 PM

This might be the most flexible approach, for future analysis purposes. It produces 10 new variables (count1-count10), containing a count of how many times the phone number appears within the set of 10.

data want;

set have;

array phone {10};

array count {10};

do i=1 to 10;

count{i}=0;

end;

do i=1 to 10;

do j=1 to 10;

if phone{i} = phone{j} then count{i} + 1;

end;

run;

So uniqueness is identified as a COUNT variable being 1. But you can also look for low counts, not just unique vs. not unique.

View solution in original post

PaigeMiller · Posted 10-21-2015 02:04 PM

With respect to a macro solution versus a data step solution, both would require do loops, and I don't see one as being easier or simpler or less coding than the other.

--
Paige Miller

Haikuo · Posted 10-21-2015 02:17 PM

What is your purpose? What would be your final outcome? Are you de-dup the phone numbers or Are you only select those unique? The best way to explain is to offer a set of HAVE/WANT, along with your descriptions.

pangea17 · Posted 10-21-2015 02:23 PM

I want to get a a number of uniques for each phone variable category by a group of other variables. So breaking it out by day of the week, by agency, and other things. So how many uniques are there for each phone1, .... phone 10.

ndp · Posted 10-21-2015 02:20 PM

This will create pairs of values that are same

data test1;
	set test;
	length x $100;
	x="";
	array test{*} var1-var10;
	do i=1 to dim(test)-1;
		do j=1 to dim(test)-i;
			if test{i}=test{i+j} then x=strip(x)||" "||strip(put(i,best.))||"-"||strip(put(i+j,best.));
		end;
	end;
	drop i j;
run;

Astounding · Posted 10-21-2015 02:29 PM

This might be the most flexible approach, for future analysis purposes. It produces 10 new variables (count1-count10), containing a count of how many times the phone number appears within the set of 10.

data want;

set have;

array phone {10};

array count {10};

do i=1 to 10;

count{i}=0;

end;

do i=1 to 10;

do j=1 to 10;

if phone{i} = phone{j} then count{i} + 1;

end;

run;

So uniqueness is identified as a COUNT variable being 1. But you can also look for low counts, not just unique vs. not unique.

MikeZdeb · Posted 10-22-2015 12:08 PM

Hi. If all you want to do is identify the unique phone numbers in each observation, you could try this ...

proc format;
picture phone (default=14) low-high='999)-999-9999' (prefix='(');
run;

data phone;
input id p1-p10 @@;
datalines;
123
9999999999 1234567890 3234567890 4234567890
5234567890 6234567890 7234567890 9234567890
1234567890 9234567890
234
6234567890 1234567890 9234567890 8234567890
5234567890 6234567890 7234567890 1234567890
1234567890 6234567890
;

data u_phone (keep=id phone);
set phone;
array p(10);
all = catx('|',of p(*));
do i=1 to 10;
if lengthn(tranwrd(all,cat(p(i)),'*')) eq 100 then do; phone=p(i); output; end;
end;
format phone phone.;
run;

data set U_PHONE ...

id=123

Obs phone

1    (999)-999-9999
2    (323)-456-7890
3    (423)-456-7890
4    (523)-456-7890
5    (623)-456-7890
6    (723)-456-7890

id=234

Obs phone

7    (923)-456-7890
8    (823)-456-7890
9    (523)-456-7890
10    (723)-456-7890

looking for uniques among 10 variables

Re: looking for uniques among 10 variables

Re: looking for uniques among 10 variables

Re: looking for uniques among 10 variables

Re: looking for uniques among 10 variables

Re: looking for uniques among 10 variables

Re: looking for uniques among 10 variables

Re: looking for uniques among 10 variables

Registration is open

Registration is open

SAS Training: Just a Click Away