NonSleeper
Quartz | Level 8

The current data looks like:

data have;
input ID var1 var2 var3;
datalines;
01     1     2     3
02     4     2     4
03     5     5     5
;

I want to compare var1 through var3 and, if two or more of these variables have the same value, keep that value in only one of them and set the others to missing. Any variable is OK, but if it matters, let's say the preceding variable. E.g., if var1 and var2 have an identical value, the value is retained in var1.

The new data should look like:

01     1     2     3
02     4     2     .
03     5     .     .

EDIT: I think a two-phase data reshape can do the job, but not sure if there's a more efficient way.

First reshape from wide to long format:

data wantlong;
  set have;
  array avarlist var1 var2 var3;
  do i = 1 to 3;
    varlong = avarlist(i);
    output;
  end;
run;

proc sort data=wantlong nodupkey;
  by ID varlong;
run;

Then bring it back to wide format.

proc transpose data=wantlong out= wantwide prefix=var;
  by ID;
  id i;
  var varlong;
run;

1 ACCEPTED SOLUTION
RW9
Diamond | Level 26

Hi,

You were right to use an array. I would just walk through the values in reverse order, check whether each value already exists in an earlier variable, and if so set the current variable to missing:

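data want (drop=i j);
  set have;
  array var{3};  /* refers to var1-var3 */
  /* Work backwards so each variable is compared with the ones before it */
  do i=3 to 1 by -1;
    do j=i-1 to 1 by -1;
      /* An earlier variable holds the same value: blank the later one */
      if var{i}=var{j} then var{i}=.;
    end;
  end;
run;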

4 REPLIES
EricHoogenboom
Fluorite | Level 6

I think your solution is a logical and reliable one. Your variables are in fact observations, so why not transpose them...

Another approach may be to put your variables into an array, then sort and deduplicate within the array. There are papers (google "sas array sort") about implementing the quicksort algorithm with arrays.

Cheers,

Eric
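
The sort-and-deduplicate idea above can be sketched with CALL SORTN, which sorts the values of its arguments in place, so a hand-written quicksort is not required. This is only a sketch and not Eric's code: the dataset name want_sorted and the array name v are made up for illustration. Because sorting moves values between variables, the surviving values do not stay in their original positions (row 02 would come out as 2 4 . rather than 4 2 .), which is acceptable only if any variable may hold the retained value.

```sas
/* Sketch only: want_sorted and v are hypothetical names */
data want_sorted (drop=i);
  set have;
  array v{3} var1-var3;
  /* Sort the three values in place, ascending (missing values sort first) */
  call sortn(of v{*});
  /* After sorting, equal values are adjacent. Walk from the end so each
     value is compared with a predecessor that has not been blanked yet */
  do i = dim(v) to 2 by -1;
    if v{i} = v{i-1} then v{i} = .;
  end;
run;
```

If the original positions must be preserved, the pairwise comparison in the accepted solution is the safer choice.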

Ksharp
Super User

Do you want to compare the values within each row, not across the whole dataset?

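data have;
input ID var1 var2 var3;
datalines;
01 1 2 3
02 4 2 4
03 5 5 5
;
run;

data want;
  /* Create the hash object once, keyed on k */
  if _n_ eq 1 then do;
    length k 8;
    declare hash ha();
    ha.definekey('k');
    ha.definedone();
  end;
  set have;
  array x{*} var:;
  /* Empty the hash for every row so values are compared within the row only */
  ha.clear();
  do i=1 to dim(x);
    k=x{i};
    /* check() returns 0 when k is already in the hash, i.e. a duplicate */
    if ha.check()=0 then call missing(x{i});
    else ha.add();
  end;
  drop k i;
run;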

Xia Keshan

KrisNori
Obsidian | Level 7

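data b(drop= value i j);
  SET A2;
  RETAIN value;
  /* 1 x 3 array over var1-var3 */
  ARRAY number{1,3} var1-var3;
  DO I = 1 to DIM1(number);
    DO J = 2 to DIM2(number);
      /* value holds var1; var2 and var3 are each compared against it */
      /* Note: var2 and var3 are not compared with each other */
      IF J = 2 THEN value = var1;
      IF number{i,j} = value THEN number{i,j} = .;
    END;
  END;
RUN;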
