
How to compare a set of variables, detect identical values, and retain only one value?

Frequent Contributor
Posts: 75

The current data looks like:

data have;
  input ID var1 var2 var3;
  datalines;
01     1     2     3
02     4     2     4
03     5     5     5
;
run;

I want to compare var1 through var3. If two or more of these variables hold the same value, only one of them should retain it and the others should be set to missing. It doesn't matter which variable keeps the value, but if a rule is needed, let's say the preceding one. E.g., if var1 and var2 have an identical value, var1 keeps the value and var2 becomes missing.

The new data should look like:

01     1     2     3
02     4     2     .
03     5     .     .

EDIT: I think a two-phase data reshape can do the job, but I'm not sure whether there's a more efficient way.

First reshape from wide to long format:

data wantlong;
  set have;
  array avarlist var1 var2 var3;
  do i = 1 to 3;
    varlong = avarlist(i);
    output;
  end;
run;

/* Keep one row per distinct value within each ID */
proc sort data=wantlong nodupkey;
  by ID varlong;
run;

Then bring it back to wide format.

proc transpose data=wantlong out= wantwide prefix=var;
  by ID;
  id i;
  var varlong;
run;


Accepted Solutions
Solution
‎09-25-2015 06:23 AM
Super User
Posts: 7,942

Re: How to compare a set of variables, detect identical values, and retain only one value?

Posted in reply to NonSleeper

Hi,

You were right about the array. I would just take each variable in reverse order and check whether its value already exists in an earlier variable; if so, set the current variable to missing:

data want (drop=i j);
  set have;
  array var{3};  /* refers to var1-var3 */
  /* Work from the last variable back, comparing with every earlier one */
  do i=3 to 1 by -1;
    do j=i-1 to 1 by -1;
      if var{i}=var{j} then var{i}=.;
    end;
  end;
run;
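For what it's worth, the inner loop above can probably be folded into a single function call. The following is only a sketch, not the accepted answer's method: it assumes the WHICHN function behaves as I understand it (it returns the position of the first argument in the list that equals the search value), and the names want_whichn and v are made up for illustration.

```sas
/* Sample data from the question */
data have;
  input ID var1 var2 var3;
  datalines;
01 1 2 3
02 4 2 4
03 5 5 5
;
run;

/* want_whichn and v are illustrative names, not from the thread */
data want_whichn (drop=i);
  set have;
  array v{*} var1-var3;
  do i = 2 to dim(v);
    /* If the first match for v{i} sits at an earlier position,   */
    /* v{i} duplicates an earlier variable, so set it to missing. */
    /* A missing v{i} is simply left missing.                     */
    if whichn(v{i}, of v{*}) < i then v{i} = .;
  end;
run;
```

Because the first occurrence of a value is never blanked, the first match always points at the variable that keeps the value. Using dim(v) instead of a hard-coded 3 means only the ARRAY statement needs to change if more variables are added.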



All Replies
Occasional Contributor
Posts: 11

Re: How to compare a set of variables, detect identical values, and retain only one value?

Posted in reply to NonSleeper

I think your solution is a logical and reliable one. Your variables are in fact observations, so why not transpose them...

Another approach might be to put your variables into an array, then sort and deduplicate the values within the array. There are papers (google "sas array sort") about implementing the quicksort algorithm on arrays.
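A rough sketch of the sort-then-deduplicate idea, using the built-in CALL SORTN routine rather than a hand-written quicksort (that substitution is my assumption, as are the names want_sorted and x). Note that sorting reorders the values within each row, so this only fits if it does not matter which variable ends up holding which value.

```sas
/* Sample data from the question */
data have;
  input ID var1 var2 var3;
  datalines;
01 1 2 3
02 4 2 4
03 5 5 5
;
run;

/* want_sorted and x are illustrative names, not from the thread */
data want_sorted (drop=i);
  set have;
  array x{*} var1-var3;
  /* Sort the row's values ascending so equal values become adjacent */
  call sortn(of x{*});
  /* Walk backwards so each value is compared with an untouched neighbour */
  do i = dim(x) to 2 by -1;
    if x{i} = x{i-1} then x{i} = .;
  end;
run;
```

With the sample data, row 02 comes out as 2 4 . rather than 4 2 . because of the sort, so it is not a drop-in replacement for the other answers.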

Cheers,

Eric

Super User
Posts: 10,018

Re: How to compare a set of variables, detect identical values, and retain only one value?

Posted in reply to NonSleeper

Do you want to compare the values within each row, not across the whole dataset?

data have;
input ID var1 var2 var3;
datalines;
01 1 2 3
02 4 2 4
03 5 5 5
;
run;
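data want;
  /* Create the hash object once; it holds the values already seen in the current row */
  if _n_ eq 1 then do;
    length k 8;
    declare hash ha();
    ha.definekey('k');
    ha.definedone();
  end;
  set have;
  array x{*} var:;
  ha.clear();  /* start each row with an empty hash */
  do i=1 to dim(x);
    k=x{i};
    /* check() returns 0 when the key is already in the hash */
    if ha.check()=0 then call missing(x{i});
    else ha.add();
  end;
  drop k i;
run;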

Xia Keshan

Occasional Contributor
Posts: 19

Re: How to compare a set of variables, detect identical values, and retain only one value?

Posted in reply to NonSleeper

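data b(drop=i j);
  set have;  /* the original post read from a dataset named A2 */
  array number{3} var1-var3;
  /* Compare each variable with every earlier one, not only with var1, */
  /* so that a duplicate shared by var2 and var3 is also caught        */
  do i = 2 to dim(number);
    do j = 1 to i - 1;
      if number{i} = number{j} then number{i} = .;
    end;
  end;
run;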
