The current data looks like:
data have;
input ID var1 var2 var3;
datalines;
01 1 2 3
02 4 2 4
03 5 5 5
;
I want to compare var1 through var3 within each row and, if two or more of these variables hold the same value, keep that value in only one of them and set the others to missing. Any variable is OK, but if it matters, let's say the earliest one is kept. E.g., if var1 and var2 have an identical value, the value in var1 is retained.
The new data should look like:
01 1 2 3
02 4 2 .
03 5 . .
EDIT: I think a two-phase data reshape can do the job, but I'm not sure if there's a more efficient way.
First reshape from wide to long format:
data wantlong;
  set have;
  array avarlist var1 var2 var3;
  do i = 1 to 3;
    varlong = avarlist(i);
    output;
  end;
run;
proc sort data=wantlong nodupkey;
  by ID varlong;
run;
Then bring it back to wide format.
proc transpose data=wantlong out= wantwide prefix=var;
by ID;
id i;
var varlong;
run;
Hi,
You were right about the array. I would just take each value in reverse order and check whether the same value exists in an earlier variable; if so, set the current variable to missing:
data want (drop=i j);
set have;
array var{3};
do i=3 to 1 by -1;
do j=i-1 to 1 by -1;
if var{i}=var{j} then var{i}=.;
end;
end;
run;
I think your solution is a logical and reliable one. Your variables are in fact observations, so why not transpose them...
Another approach may be to put your variables into an array, then sort and deduplicate within the array. There are papers (google "sas array sort") on implementing the quicksort algorithm with arrays.
Cheers,
Eric
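The sort-and-deduplicate idea above could be sketched as follows. This is only a rough illustration, not Eric's code: it swaps the hand-written quicksort for the built-in CALL SORTN routine, and the output data set name want_sorted and the array name v are made up for the example. Note that sorting moves values between variables, so the result will not keep each value in its original position the way the other approaches in this thread do.

```sas
data want_sorted (drop=i);            /* want_sorted is a hypothetical name */
  set have;
  array v{3} var1-var3;
  call sortn(of v{*});                /* sort this row's values ascending, in place */
  do i = dim(v) to 2 by -1;           /* walk backward so the neighbor being compared is still untouched */
    if v{i} = v{i-1} then v{i} = .;   /* after sorting, duplicates are adjacent */
  end;
run;
```

Because equal values sit next to each other after the sort, a single pass is enough; the trade-off is the lost column order.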
Do you want to compare within each row, not across the whole dataset?
data have;
input ID var1 var2 var3;
datalines;
01 1 2 3
02 4 2 4
03 5 5 5
;
run;
data want;
if _n_ eq 1 then do;
length k 8;
declare hash ha();
ha.definekey('k');
ha.definedone();
end;
set have;
array x{*} var:;
ha.clear();
do i=1 to dim(x);
k=x{i};
if ha.check()=0 then call missing(x{i});
else ha.add();
end;
drop k i;
run;
Xia Keshan
data b(drop=i j k);
  set A2;
  array number{1,3} var1-var3;
  do i = 1 to dim1(number);
    do j = 2 to dim2(number);
      /* compare with every preceding variable, not only var1, */
      /* so that var2 = var3 is caught as well                 */
      do k = 1 to j - 1;
        if number{i,j} = number{i,k} then number{i,j} = .;
      end;
    end;
  end;
run;