Your post
"
In one dataset, there are variables before “=“ and in another dataset, there are variables after “=“ sign: CrAtBat = ab CrBB =fkae CrHits =vrgsd ........ etc. ....
does not answer my question - "
What do you mean by "one may have more ID" - is ID unique at one or both datasets?
Suppose you have unique ID in table1 but more than one observations of same id in table 2 - what do you want to compare? and how to treat the many to one matching ?"
Example 1
suppose table_1 contains:
ID VAR_A
22 333
and table_2 contains:
ID VAR_X
22 340
22 500
22 333
22 999
What output you want? is any other variable or attribute in table_2 that distinguish between
the observations of same ID? - such as serial number or date or ...
in order to report:
var_A = var_X when ID=22 and Table_2 date=...
Example 2:
suppose table_1 contains:
ID VAR_A
22 333
25 444
and table_2 contains:
ID VAR_X
22 333
25 777
Would it be correct to say that:
var_A = var_X ???
I can accept asking to ignore cases of non equal values, either null or not,
but: Saying "Var_x is different than var_a, obviously." - why?
in Example_1 there are ID=22 with VAR_A=VAR_X=333
in Example_2 there are ID=22 with VAR_A=VAR_X=333 (same as in example 1)
Are you asking to report variable names only if there is at least one pair of observations
with equal values on all IDs ?
So, I made 2 datasets:
I want to get just list of variables with the same values except missing:
For example, bothe datasets have the same Id-s (variable "name"),
CrAtBat = ab
CrBB =fkae
CrHits =vrgsd
CrHome =dfwef
CrRbi =vewef
CrRuns =fff
Div =wsfde_d
Division = vdfgew
League =dfgwe_23
Position = fgerg
Salary =vregf
Team =fre
YrMajor = rewa
logSalary = fwef
nAssts =cefwe
nAtBat =efqwef
nBB =qfqewfq
nError =qfefe
nHits =fewfe
Import both tables to sas then try next, not tested, code,
after adapting variable names creating the ID - assumed to be NAME TEAM
and adapt the max length of char type variables (assumed up to $40):
/* import the csv files into table1 and table2 in work library */
/*
proc sql;
select * from dictionary.columns into :list1 SEPARATED BY ' '
where libname='WORK' and memname 'TABLE1';
select * from dictionary.columns into :list2 SEPARATED BY ' '
where libname='WORK' and memname 'TABLE2';
quit;
*/
%let IDS = NAME TEAM; /* suposed that both variables create the ID */
data temp1;
set table1;
length _type $1 varname $32 value_c $40; /* adapt value_c to max length of char type vars */
keep &IDS _type varname value_c value_n;
array _c {*} _character_;
array _n {*} _numeric_ ;
do i=1 to dim(_c);
type = 'C';
varname = vname(_c(i));
value_c = _c(i); value_n=.;
if varname not in "&IDS" then output;
end;
do i=1 to dim(_n);
type = 'N';
varname = vname(_n(i));
value_n = _n(i); value_c=' ';
if varname not in "&IDS" then output;
end;
run;
data temp2;
set table2;
length _type $1 varname $32 value_c $40; /* adapt value_c to max length of char type vars */
keep &IDS _type varname value_c value_n;
array _c {*} _character_;
array _n {*} _numeric_ ;
do i=1 to dim(_c);
type = 'C';
varname = vname(_c(i));
value_c = _c(i); value_n=.;
if varname not in "&IDS" then output;
end;
do i=1 to dim(_n);
type = 'N';
varname = vname(_n(i));
value_n = _n(i); value_c=' ';
if varname not in "&IDS" then output;
end;
run;
proc sort data=temp1; by &IDS _type value_c value_n; run;
proc sort data=temp2; by &IDS _type value_c value_n; run;
%let IDC = NAME, TEAM; /* same as above but separated by comma (',') */
proc sql;
select a.varname, b.varname
from temp1 as a
inner join temp2 as both
on a.NAME = b.NAME and
a.TEAM = b.TEAM and
a._type = b._tyep and
a.value_c = b._value_c and
a.value_n = b.value_n;
quit;
My typo - code for &ID with more than one varaible should be:
if varname not in ("&IDS") then output;
As you defined only one variable variable: subid the code need be
changed into:
if varname not = "&IDS" then output;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.