Simple Match-NoMatch Counts for Character Variables Using PROC COMPARE?

Occasional Contributor
Posts: 14

I have two datasets with over 1,000 character variables in common between them.  I would like to get a simple count for each of these character variables showing the number of observations that match and the number of observations that don't match.  Something like this:

 

Variable  Match  NoMatch
   v1      100      10
   v2       50      60
   v3        0     110
    .
    .
    .

 

If I could output the match-nomatch counts to a SAS dataset that would be even better.

 

The datasets also have a lot of numeric variables in common between them.  I already was able to do a suitable comparison using PROC COMPARE for the numeric variables as follows:

 

proc compare base=ds_old compare=ds_new outstats=ds_stats nosummary allstats novalues nomiss;
   id idnum;
run;

 

I want to do something similar for all the character variables.

 

Thanks much,

 

Dave


Accepted Solutions
Solution
‎02-21-2017 01:06 PM
PROC Star
Posts: 7,363

Re: Simple Match-NoMatch Counts for Character Variables Using PROC COMPARE?

You could do it all with one PROC COMPARE, combined with a PROC FREQ to analyze the results. For example:

 

data one;
  set sashelp.class;
  idnum=_n_;
run;

data two;
  retain height sex weight name age;
  set sashelp.class;
  idnum=_n_;
  if mod(_n_,2) then do;
    name='Ralph';
    sex='N';
    age=6;
    height=12;
    weight=74;
  end;
run;

proc compare base=one compare=two out=ds nosummary outdif stats nomiss;
   id idnum;
run;

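data ds;
  set ds;
  /* In the OUTDIF rows, character variables show 'X' at each position
     that differs; numeric variables hold the difference (0 = match). */
  array chars _character_;
  array nums _numeric_;
  do over chars;
    if index(chars,'X') then chars='0';  /* 0 = no match */
    else chars='1';                      /* 1 = match; a character literal avoids numeric-to-character conversion */
  end;
  do over nums;
    if nums=0 then nums=1;               /* 1 = match */
    else nums=0;                         /* 0 = no match */
  end;
run;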

proc freq data=ds;
  tables name--weight;
run;

Of course, you could add two formats if you want the output to be more descriptive.
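
As a rough sketch of what those two formats might look like: the format names matchn and $matchc are made up for illustration, and this assumes the flag step above has stored character flags as '0'/'1' and numeric flags as 0/1.

```sas
/* Hypothetical format names; the labels are only a suggestion */
proc format;
  value  matchn  0  = 'NoMatch'
                 1  = 'Match';
  value $matchc '0' = 'NoMatch'
                '1' = 'Match';
run;

proc freq data=ds;
  tables name--weight;
  /* character flags get the character format, numeric flags the numeric one */
  format name sex $matchc. age height weight matchn.;
run;
```

With the formats applied, each one-way table shows Match and NoMatch rows instead of 1 and 0.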

 

Art, CEO, AnalystFinder.com

 



All Replies
Super User
Posts: 17,840

Re: Simple Match-NoMatch Counts for Character Variables Using PROC COMPARE?

PROC COMPARE isn't limited to numeric variables.

 

I think it's best if you post sample data that we can work with and expected output that matches your sample data. 

 

 

Super User
Posts: 9,682

Re: Simple Match-NoMatch Counts for Character Variables Using PROC COMPARE?

Can you post some sample data to describe your problem?
This is easy to do with IML code.


data old;
 set sashelp.class;
run;
data new;
 set sashelp.class;
 if _n_=1 then sex='X';
 if _n_=4 then name='KSharp';
run;
proc iml;
use old nobs nobs;
read all var _char_ into old[c=vname];
close;

use new;
read all var _char_ into new;
close;
var=t(vname);
match=t((old=new)[+,]);
not_match=nobs-match;

create want var {var match not_match};
append;
close;
quit;

Occasional Contributor
Posts: 14

Re: Simple Match-NoMatch Counts for Character Variables Using PROC COMPARE?

Thanks to Reeza, Ksharp, and Art297 for responding.  With regard to Ksharp's suggestion about using IML: I wish we had it, but we don't.  Perhaps IML will someday be included with SAS Foundation?  Here's hoping...

 

In the end, I went with a variant of what Art297 suggested.  Here's what I did.  Not elegant, but it does get the job done.

 

Thanks to all,

 

Dave

 

proc compare base=ds_old compare=ds_new out=ds_stats2 outdif noprint;
   id idnum;
   var _character_;
run;

data ds_stats2b;
   set ds_stats2;
   array chars _character_;
   do over chars;
      if index(chars,'X') then chars = '0';
      else                     chars = '1';
   end;
run;

ods output OneWayFreqs = ds_stats2c;
proc freq data=ds_stats2b;
  tables _character_;
run;
ods output close;

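data ds_stats2d;
   set ds_stats2c;
   /* Keeps the first level of each table: the '0' (no match) row when a
      variable has both matches and mismatches. Variables with a single
      level (all match or all mismatch) have cumpercent = 100 and are
      dropped here. */
   where round(cumpercent,.01) < 100.00;
   element = substr(table,7);   /* strip the leading "Table " prefix */
   pctdiff = percent;           /* percent of observations that differ */
   keep element pctdiff;
run;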

proc sort data=ds_stats2d;
   by descending pctdiff;
run;

proc print data=ds_stats2d (obs=2000);
   title 'ds_stats2d';
run;
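
If the Match/NoMatch layout from the original question is what's needed, one possible alternative is to count the flags directly rather than filtering the PROC FREQ output. This is only a sketch: it assumes the same ds_old, ds_new, and idnum as above, and that idnum is numeric. The names ds_diff, ds_long, ds_counts, variable, and nomatch are invented for the example.

```sas
proc compare base=ds_old compare=ds_new out=ds_diff outdif noprint;
   id idnum;
   var _character_;
run;

/* One row per observation and character variable, flagged 1 when any
   character position differs ('X' in the OUTDIF row). */
data ds_long (keep=variable nomatch);
   set ds_diff (drop=_type_ _obs_);
   array chars {*} _character_;   /* defined before VARIABLE, so it is not in the array */
   length variable $32;
   do i = 1 to dim(chars);
      variable = vname(chars{i});
      nomatch  = (index(chars{i}, 'X') > 0);
      output;
   end;
run;

/* Collapse to one row per variable with Match and NoMatch counts */
proc sql;
   create table ds_counts as
   select variable,
          sum(nomatch = 0) as Match,
          sum(nomatch = 1) as NoMatch
   from ds_long
   group by variable;
quit;
```

Variables that differ on every observation still show up here, with Match = 0. The intermediate ds_long dataset has one row per observation per variable, so with 1,000+ variables it can get large.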