11-28-2016 05:11 PM
Please, I need a simple method to compare two datasets.
The idea is to loop over the rows of the first table: if a row in table 1 also appears in table 2, we keep its observation number in table 2; if not, we keep 0.
I know we can use PROC COMPARE, but it compares the observations in order (first with first, and so on), and I do not want that.
proc compare base=t1 comp=t2;
I know we can use PROC SORT to sort the tables by _all_ and then use PROC COMPARE, but I do not want that either.
proc sort data=t1 out=t1_sorted; by _all_; run;
11-28-2016 06:08 PM
data T1;
  input Id Name $;
  cards;
1 Bob
2 sausan
3 Petter
4 bonde
;
data T2;
  input Id Name $;
  row=_n_;
  cards;
4 bonde
2 sausan
1 Bob
;
proc sort data=t1; by id; run;
proc sort data=t2; by id; run;
option missing=0;
data t1t2;
  merge t1 t2;
  by id;
run;
11-28-2016 10:45 PM
You can do it, but you might need a lot of code to determine when you have a match. To illustrate, I'll use the name variable that you provided.
data want (drop=name2);
  set t1;
  do _n_=1 to _nobs_;
    set t2 (keep=name rename=(name=name2)) point=_n_ nobs=_nobs_;
    if name=name2 then obs_in_T2=_n_;
  end;
  if obs_in_T2=. then obs_in_T2=0;
run;
This is feasible with a small number of variables, but very clumsy with a lot of variables. But it can be done.
11-29-2016 04:29 AM
How about HASH?
What is the key - ID or Name or both?
Here is a solution where you can define the key however you want. You said you want to use several variables; do you mean a key made up of a combination of variables? In that case the hash solution is easy. If T2 is a very large file, include only the variables of interest in the hash table, using KEEP/DROP, to minimize its size.
data T1;
  input id Name $;
  cards;
1 Bob
2 sausan
3 Petter
4 bonde
;
run;

data T2;
  input id Name $;
  cards;
4 bonde
2 sausan
1 Bob
;
run;

data want;
  if _n_ = 1 then do;
    if 0 then set t2;
    declare hash h();
    h.definekey('id');
    h.definedata('rowid');
    h.definedone();
    do rowid = 1 by 1 until(last);
      set t2 end = last;
      h.add();
    end;
  end;
  set t1;
  if h.find() ^= 0 then rowid = 0;
run;

proc print data = want;
run;

If the key is NAME, replace h.definekey('id') with h.definekey('Name'). If ID and NAME form a composite key, use h.definekey('id','Name'). All other statements remain the same.
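Since the question asks about matching on several variables, here is a minimal sketch of the composite-key variant, using the same sample rows as above. The output dataset name want_both is only an illustrative choice, and the sketch assumes T2 is non-empty and has no duplicate (id, Name) pairs.

```sas
/* Sample data, same rows as in the post above */
data T1;
  input id Name $;
  cards;
1 Bob
2 sausan
3 Petter
4 bonde
;
run;

data T2;
  input id Name $;
  cards;
4 bonde
2 sausan
1 Bob
;
run;

/* want_both is a hypothetical name for the result */
data want_both;
  if _n_ = 1 then do;
    if 0 then set t2;              /* put T2's variables in the PDV without reading a row */
    declare hash h();
    h.definekey('id', 'Name');     /* both variables must match */
    h.definedata('rowid');         /* stored value: the row number in T2 */
    h.definedone();
    do rowid = 1 by 1 until (last);
      set t2 end = last;
      h.add();                     /* assumes each (id, Name) pair is unique in T2 */
    end;
  end;
  set t1;
  if h.find() ^= 0 then rowid = 0; /* no matching row in T2 */
run;

proc print data = want_both;
run;
```

With these sample rows, rowid should come out as 3 for Bob, 2 for sausan, 0 for Petter and 1 for bonde. To match on more variables, list them all in h.definekey(); the rest of the step stays the same.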