Obsidian | Level 7

Identifying duplicates across multiple variables

Hi,

I have 4 variables, var1…var4. I want to identify the observations where the same value occurs in more than one variable, and then keep the value only at its first occurrence (later duplicates become missing).

For example,

The dataset I have:

Var1 var2 var3 var4

1          2          3          4

0          0          4          9

1          1          1          20

0          1          2          2

1          0          2          1

5          9          9          9

3          3          3          3

The dataset I want:

Var1 var2 var3 var4

1          2          3          4

0          .           4          9

1          1          .           20

0          1          2          .

1          0          2          .

5          9          .           .

3          .           .           .

Accepted Solutions
Super User

Re: Identifying duplicates across multiple variables

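```
data have;
  input Var1 var2 var3 var4;
  cards;
1          2          3          4
0          0          4          9
1          1          1          20
0          1          2          2
1          0          2          1
5          9          9          9
3          3          3          3
;

data want;
  set have;
  array v{*} var1-var4;  /* the variables to check */
  array x{*} x1-x4;      /* values already seen in the current row */
  do i=1 to dim(v);
    /* first occurrence: record it in x; repeat: set to missing */
    if v{i} not in x then x{i}=v{i};
    else v{i}=.;
  end;
  keep var1-var4;
run;
```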
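
A note on why the step above works, plus a possible variant. As far as I understand it, x1-x4 are new variables that are not read from `have`, so they are reset to missing at the start of every DATA step iteration; within one row, `x` therefore holds only the values already seen, and `v{i} not in x` is true exactly at a first occurrence. If you would rather not create helper variables, a nested loop that compares each variable with the ones before it should give the same result. This is only a sketch under the assumption that all four variables are numeric; `want2` and the loop counter `j` are names I made up for illustration.

```
data want2;
  set have;
  array v{*} var1-var4;
  do i = 2 to dim(v);
    do j = 1 to i - 1;
      /* blank v{i} if an earlier variable still holds the same value */
      if not missing(v{i}) and v{i} = v{j} then v{i} = .;
    end;
  end;
  drop i j;
run;
```

Because the first occurrence is never set to missing, every later duplicate still finds a match to compare against, so the row 1 1 1 20 comes out as 1 . . 20.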
SAS Super FREQ

Re: Identifying duplicates across multiple variables

The logic you want to implement is not clear since the second observation doesn't follow the guidelines you describe. In that observation you have two '1's! If I understand your guidelines the second observation should be 1 . . 20. Can you explain why that is not the case?
Obsidian | Level 7

Re: Identifying duplicates across multiple variables

Yes, you are right. It is about the 3rd observation.

The dataset I want:

Var1 var2 var3 var4

1          2          3          4

0          .           4          9

1          .          .           20

0          1          2          .

1          0          2          .

5          9          .           .

3          .           .           .

Obsidian | Level 7

Re: Identifying duplicates across multiple variables

Thanks very much.
