The current data looks like:
data have;
input ID var1 var2 var3;
datalines;
01 1 2 3
02 4 2 4
03 5 5 5
;
I want to compare var1 through var3 within each row and, if two or more of these variables hold the same value, keep that value in only one of them and set the others to missing. Any variable will do, but if it matters, keep the earliest one. E.g., if var1 and var2 have an identical value, the value is kept in var1 and var2 becomes missing.
The new data should look like:
01 1 2 3
02 4 2 .
03 5 . .
EDIT: I think a two-phase data reshape can do the job, but I'm not sure whether there is a more efficient way.
First reshape from wide to long format:
data wantlong;
  set have;
  array avarlist var1 var2 var3;
  do i = 1 to 3;
    varlong = avarlist(i);
    output;
  end;
run;
/* drop repeated values within each ID */
proc sort data=wantlong nodupkey;
  by ID varlong;
run;
Then bring it back to wide format.
proc transpose data=wantlong out= wantwide prefix=var;
by ID;
id i;
var varlong;
run;
Hi,
You were right about the array. I would just take each variable in reverse order, check whether its value already appears in an earlier variable, and if so set the current variable to missing:
data want (drop=i j);
  set have;
  array var{3};
  /* work from the last variable back, comparing with every earlier one */
  do i=3 to 1 by -1;
    do j=i-1 to 1 by -1;
      if var{i}=var{j} then var{i}=.;
    end;
  end;
run;
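The inner loop can also be replaced by a single function call. The following is only a sketch, not part of the original answer: it assumes the same `have` dataset, and the names `want_whichn` and `v` are made up for illustration. It relies on the WHICHN function, which returns the position of the first argument in the list matching the search value.

```sas
data want_whichn (drop=i);
  set have;
  array v{3} var1-var3;
  do i = 2 to dim(v);
    /* WHICHN gives the index of the first element equal to v{i};      */
    /* an index below i means the value already occurs to the left.    */
    /* If v{i} is missing, WHICHN returns missing and v{i} stays blank. */
    if whichn(v{i}, of v{*}) < i then v{i} = .;
  end;
run;
```

Because the first occurrence of a value is never blanked, walking left to right is safe here.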
I think your solution is a logical and reliable one. Your variables are in fact observations, so why not transpose them...
Another approach may be to put your variables into an array, then sort and deduplicate the values within the array. There are papers (google "sas array sort") about the quicksort algorithm implemented in arrays.
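As a rough sketch of the sort-and-deduplicate idea, assuming the same `have` dataset (the names `want_sorted` and `v` are hypothetical): instead of a hand-written quicksort it uses the built-in CALL SORTN routine, which is a substitution on my part, and then blanks any value equal to its left neighbour.

```sas
data want_sorted (drop=i);
  set have;
  array v{3} var1-var3;
  /* sort the three values in ascending order within the row */
  call sortn(of v{*});
  /* walk from the right so each comparison sees an unblanked neighbour */
  do i = dim(v) to 2 by -1;
    if v{i} = v{i-1} then v{i} = .;
  end;
run;
```

Note that sorting moves values between variables, so a row such as 4 2 4 ends up as 2 4 . rather than 4 2 . as in the requested output. That only fits if it really does not matter which variable keeps the value.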
Cheers,
Eric
Do you want to compare the values within each row, not across the whole dataset?
data have;
input ID var1 var2 var3;
datalines;
01 1 2 3
02 4 2 4
03 5 5 5
;
run;
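data want;
  /* create the hash object once, keyed on the value being checked */
  if _n_ eq 1 then do;
    length k 8;
    declare hash ha();
    ha.definekey('k');
    ha.definedone();
  end;
  set have;
  array x{*} var:;
  ha.clear();  /* start each row with an empty hash */
  do i=1 to dim(x);
    k=x{i};
    /* value already seen in this row: blank it; otherwise remember it */
    if ha.check()=0 then call missing(x{i});
    else ha.add();
  end;
  drop k i;
run;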
Xia Keshan
data b(drop=i j);
  set have;
  array number{3} var1-var3;
  /* compare each variable with every earlier one, not only with var1 */
  do i = 2 to dim(number);
    do j = 1 to i-1;
      if number{i} = number{j} then number{i} = .;
    end;
  end;
run;