Hi @Emma8
You can use PROC COMPARE to achieve this, along with the NOMISS option.
- It can only compare pairs of variables with the same type. To select them, you can run the code twice, first with 'num' in PROC SQL, and then with 'char' as the type.
- A PROC SQL and a MACRO are used to store all comparisons pairwise in macrovariables. Typically,
&var = var1 var1 var1 var1 var2 var2 var2 var4 var4 var5
&with = var2 var4 var5 var6 var4 var5 var6 var5 var6 var6
Output:
Best,
data have;
infile datalines dlm=" " dsd missover;
input Var1 $ var2 $ var3 $ var4 $ var5 $ var6 $;
datalines;
1 a 4 6 a 1
9 l 6 g l 5
8 d 5 y d
7 & 7 y 6
;
proc sql noprint;
select name
into:var1 -
from dictionary.columns
where libname = 'WORK' and memname = 'HAVE' and type = "char"; /*put either num or char*/
select count(name)
into:nb_var
from dictionary.columns
where libname = 'WORK' and memname = 'HAVE' and type = "char"; /*put either num or char*/
quit;
%global var with;
%let var=;
%let with=;
%macro pairs;
%local i j;
%do i = 1 %to &nb_var;
%do j = %eval(&i+1) %to &nb_var;
%let var=&var &&var&i;
%let with=&with &&var&j;
%end;
%end;
%mend pairs;
%pairs;
/* Report */
proc compare base=have listequalvar novalues nomiss;
var &var;
with &with;
run;
NB: you can leverage the following material to go one step further: https://support.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2018/1654-2018.pdf
Hi @Emma8
You can use PROC COMPARE to achieve this, along with the NOMISS option.
- It can only compare pairs of variables with the same type. To select them, you can run the code twice, first with 'num' in PROC SQL, and then with 'char' as the type.
- A PROC SQL and a MACRO are used to store all comparisons pairwise in macrovariables. Typically,
&var = var1 var1 var1 var1 var2 var2 var2 var4 var4 var5
&with = var2 var4 var5 var6 var4 var5 var6 var5 var6 var6
Output:
Best,
data have;
infile datalines dlm=" " dsd missover;
input Var1 $ var2 $ var3 $ var4 $ var5 $ var6 $;
datalines;
1 a 4 6 a 1
9 l 6 g l 5
8 d 5 y d
7 & 7 y 6
;
proc sql noprint;
select name
into:var1 -
from dictionary.columns
where libname = 'WORK' and memname = 'HAVE' and type = "char"; /*put either num or char*/
select count(name)
into:nb_var
from dictionary.columns
where libname = 'WORK' and memname = 'HAVE' and type = "char"; /*put either num or char*/
quit;
%global var with;
%let var=;
%let with=;
%macro pairs;
%local i j;
%do i = 1 %to &nb_var;
%do j = %eval(&i+1) %to &nb_var;
%let var=&var &&var&i;
%let with=&with &&var&j;
%end;
%end;
%mend pairs;
%pairs;
/* Report */
proc compare base=have listequalvar novalues nomiss;
var &var;
with &with;
run;
NB: you can leverage the following material to go one step further: https://support.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2018/1654-2018.pdf
@Emma8, after hard work and discussion made in the forum, next code uses call sortC with vname fuction
to bring wanted results.
The code is tested:
data have;
infile datalines dlm=" " dsd missover;
input Var1 $ var2 $ var3 $ var4 $ var5 $ var6 $;
datalines;;
1 a 4 6 a 1
9 l 6 g l 5
8 d 5 y d
7 & 7 y 6
;;;
run;
data _null_;
set have;
array _c {*} _character_;
array _n {*} _numeric_ ;
call symput('nvarC', left(dim(_c)));
call symput('nvarN', left(dim(_n)));
run;
%put nvarC = &nvarC;
%put nvarN = &nvarN;
data want(drop=i );
set have;
dlm = '10'x; /* any none expected character in values may be used as delimiter */
fmt = '$char2.'; /* force keep leading spaces */
array _BEFORE {*} _character_;
nvarc = dim(_BEFORE);
array _NAMES {&nvarC} $10. _vnm1 - _vnm&nvarC;
array _temp {&nvarC} $20. _tmp1 - _tmp&nvarC;
do i=1 to &nvarC;
_names(i) = vname(_BEFORE(i));
_temp(i) = catx(dlm, putc(_BEFORE(i), fmt), i );
end;
call sortC(of _temp(*));
do i=1 to dim(_temp)-1;
j=i+1;
value1 = scan(_temp(i),1, dlm);
value2 = scan(_temp(j),1, dlm);
if value1 = value2 then do;
ix1 = scan(_temp(i),2, dlm);
v1 = _names(ix1);
ix2 = scan(_temp(j),2, dlm);
v2 = _names(ix2);
put _N_= v1= v2= value1= value2= ;
end;
end;
run;
output in log shows:
Would you please post the full log.
I shall check and repair where need.
@Emma8 , are you comparing the char type variables only or both numeric and character?
I'll be glad to help you fix the code but I need see the full log:
I need see all adaptations done to the code, warning and error messages,
especially messages concerning "out of range at line ##### column ##" - because
line in log is not the same line in code and it changes per run. Usually such message
displays all variables values.
copy the full log into the </> icon-window and post it.
Without seeing the log I can only guess.
1) Pay attention that macro variable &nvarN is the number of all numeric variables in your data.
When you omit some variables - depending on which line code or which data step - you may cause
the index i point over the end of the array, thus resulting in "... array subscript out of range at line ..." error.
2) When you get such error message, the given line number points to the log line where the error occurred.
It is important to check the code and the do loop in the program, containing the same code as in the log.
3) Sometimes such error may be the result of overriding the index i inside the loop.
If you want me to continue help with that code, supply the full log in the next post.
If you prefer to struggle the code by yourself, I wish you success and appreciate it vary much.
BTW, if you compare only numeric variables, why have you post a test sample of character variables ?!
data have;
infile datalines dlm=" " dsd missover;
input Var1 $ var2 $ var3 $ var4 $ var5 $ var6 $;
length similar_char_vars similar_num_vars $32767.;
array _c{*} _character_;
array _n{*} _numeric_;
do i = 1 to dim(_c);
similar_char_vars='';
do j= i+1 to dim(_c);
if _c[i] = _c[j] and not missing(_c[i])then similar_char_vars = strip(similar_char_vars)||' '||vname(_c[i])||' '||vname(_c[j])||' '||strip(_c[j]);
end;
if ^missing(similar_char_vars) then output;
end;
do i = 1 to dim(_c);
similar_num_vars='';
do j= i+1 to dim(_c);
if _c[i] = _c[j] and not missing(_c[i])then similar_num_vars = strip(similar_num_vars)||' '||vname(_c[i])||' '||vname(_c[j])||' '||strip(_c[j]);
end;
if ^missing(similar_char_vars) then output;
end;
drop i j;
put similar_char_vars=;
datalines;;
1 a 4 6 a 1
9 l 6 g l 5
8 d 5 y d
7 & 7 y 6
;;;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.