BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SWEETSAS
Obsidian | Level 7

I have two directories. mylibA and mylibB in which some datasets names are the same and some are close but not exactly (have v_ before the dataset name). The names of the dataset looks like following:

------------------

mylibA

-----------------
want1
want2
want3
want4

-------------------

mylibB

------------------
want1
v_want2
want3
v_want4

 

Is there a way to use PROC COMPARE to loop through and automatically compare all these datasets both libraries: i.e, mylibA.want1 and mylibB.want1, mylibA.want2 and mylibB.v_want2, mylibA.want3 and mylibB.want3, mylibA.want4 and mylib.v_want4.

 

That is, v_want2 and v_want4 should actually by want2 and want4, respectively.

 

Thanks!

 

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

Something like this?

proc sql;
  create table TABLES as
  select LIBNAME, MEMNAME, left(tranwrd(MEMNAME,'V_',' ')) as KEY 
  from DICTIONARY.TABLES
  where LIBNAME in ('MYLIB1','MYLIB2')
    and MEMNAME like '%WANT%' ;
quit;

data _null_;
  merge TABLES (in=A where=(LIBNAME='MYLIB1') rename=(MEMNAME=M1))
        TABLES (in=B where=(LIBNAME='MYLIB2') rename=(MEMNAME=M2));
  by KEY;
  if A and B;
  call execute('proc compare base=MYLIB1.'||M1||' data=MYLIB2.'||M2||';run;');
run

 

View solution in original post

12 REPLIES 12
ballardw
Super User

What do you really want to know from the comparison?

What is the final result of your process? "That is, v_want2 and v_want4 should actually by want2 and want4, respectively." isn't very clear. Do you intend to Rename the data sets to standard version? If that is the case then you may be contemplating a lot of extra work if the pattern is actual V_ prefixing a name. Or do you have a bunch of different prefixes that vary in no regular pattern? (Hint: regular patterns you program for relatively easily).

 

You can get the names of the data sets by selecting records from the view SASHELP.Vtable, or using Proc Sql Dictionary.tables. An example of getting all the table names for one library.

Proc sql;
   create table Libone as
   select libname, memname 
   from dictionary.tables
   where libname='MYLIB1'
   ;
run;

I'm not going to go any further with code as there a number of ways to interpret your request and some of them might require more thinking than I'm willing to do on a Saturday.

 

Once you have the text involved there are several functions to compare text variables such as SOUNDEX, COMPLEV and COMPGED

SWEETSAS
Obsidian | Level 7

Thanks!@ballardw.

The goal is to compare datasets in both directories.

A good program should look like the following:

 

proc compare

base=MYLIB1.want1

compare=MYLIB2.want1

out=MYLIB3.want1

run;

 

proc compare

base=MYLIB1.want2

compare=MYLIB2.v_want2

out=MYLIB3.want2

run;

etc

 

If the dataset were the same name in both libraries, this can be done with something like this:

do i=1 to &n;

proc compare

base=MYLIB1.&ds&i

compare=MYLIB2.&ds&i

out=MYLIB3.&ds&i

run;

end;

where &ds and &n are macro variables from DATA  _NULL_  that reference the different datasets.

But the challenge here is that for some datasets in the second directory, the dataset names begins with the prefix v_  (no other prefix) and some are the same name in both directories. 

Patrick
Opal | Level 21

@SWEETSAS wrote:

Thanks!@ballardw.

 

If the dataset were the same name in both libraries, this can be done with something like this:

do i=1 to &n;

proc compare

base=MYLIB1.&ds&i

compare=MYLIB2.&ds&i

out=MYLIB3.&ds&i

run;

end;

 


You can't run a Procedure within a SAS datastep so your code would not work. You could though achieve what you want to do by having the Proc Compare in a macro and then use call execute() within a do loop.

 

If the potential difference between DS names is only the V_ then it wouldn't be hard to create a list of the DS to be compared by querying dictionary.tables BUT: The V_... is often a naming convention for Views. Are you 100% sure these v_have1 "tables" are not just views of the have1 tables? If so a Proc Compare wouldn't make any sense for such cases.

 

Here some sample code how this could look like. Amend as required for your actual use case.

options dlcreatedir;

libname lib1 "%sysfunc(pathname(work))\lib1";
libname lib2 "%sysfunc(pathname(work))\lib2";

data lib1.want1 lib1.want2 lib1.want3;
  set sashelp.class;
run;
data lib2.want1;
  set lib1.want1;
run;
data lib2.v_want2;
  set lib1.want1;
  if _n_=5 then call missing(sex);
run;

proc sql;
  create view lib2.v_want3 as
  select * from lib1.want3
  ;
quit;

proc sql;
  create table comp_tables as
  select 
    l.libname as lib1,
    l.memname as ds1,
    r.libname as lib2,
    r.memname as ds2
  from 
    dictionary.tables l
    inner join
    dictionary.tables r
    on
      l.libname='LIB1' and r.libname='LIB2'
      and l.memtype='DATA' and r.memtype='DATA'
      and 
        (
          l.memname=r.memname
          or 
          substrn(r.memname,1,2)='V_' and l.memname=substrn(r.memname,3)
        )
  ;
quit;

%macro compare(lib1,ds1,lib2,ds2);
  proc compare data=&lib1..&ds1 comp=&lib2..&ds2;
  run;
%mend;

data _null_;
  set comp_tables;
  length cmd $100;
  cmd=cats( '%compare(',catx(',',lib1,ds1,lib2,ds2),')' );
  call execute(cmd);
run;
SWEETSAS
Obsidian | Level 7

Thanks @Patrick !

Yes! The difference between DS names is only the V_. 

Patrick
Opal | Level 21

@SWEETSAS  Then the code I've posted should work. ...and it will exclude any views.

SWEETSAS
Obsidian | Level 7

Thanks for this program. I have some few questions.

 

libname lib1 "%sysfunc(pathname(work))\lib1";
libname lib2 "%sysfunc(pathname(work))\lib2";

What is the above program doing? Creating two libraries under WORK? 

 

What is the last SQL procedure meant to do? Place all datasets in LIB1 into ds1 and all datasets in LIB2 into a ds2? And this is performed for only datasets that exit in both libraries?

I ask because I did not see a counter that loops through the datasets in the COMPARE procedure.

 

Thanks!

Patrick
Opal | Level 21

You didn't provide sample data so I had to create it. 

Yes, the libname statements are just creating two sub-folders under work. This to create a fully working self-contained sample program that mimics what you describe.

 

Everything up to the last SQL proc is just for creation of sample data.

 

The last SQL is then creating a table with a row per DS you want to compare. And the data _null_ step is then calling macro %compare() once per row in this table and though generating the code required to compare your tables.

call execute() is "stacking" up code - so first the data _null_ step processes the source table (created in the SQL, have a look how this table looks like), call execute creates one macro call per row in this source table (and there we pass in the variable in this source table as the parameters to the macro) ...and once the data step finished iterating through the source table all the generated code (the macro calls) get executed.

If you run the code I've posted you can see in the SAS log how these call execute() generated macro call syntax looks like.

ChrisNZ
Tourmaline | Level 20

Something like this?

proc sql;
  create table TABLES as
  select LIBNAME, MEMNAME, left(tranwrd(MEMNAME,'V_',' ')) as KEY 
  from DICTIONARY.TABLES
  where LIBNAME in ('MYLIB1','MYLIB2')
    and MEMNAME like '%WANT%' ;
quit;

data _null_;
  merge TABLES (in=A where=(LIBNAME='MYLIB1') rename=(MEMNAME=M1))
        TABLES (in=B where=(LIBNAME='MYLIB2') rename=(MEMNAME=M2));
  by KEY;
  if A and B;
  call execute('proc compare base=MYLIB1.'||M1||' data=MYLIB2.'||M2||';run;');
run

 

SWEETSAS
Obsidian | Level 7

thanks @ChrisNZ  

 

Thanks! thanks !! Thanks!!!. The code works. The code is able to elegantly sieve through the two libraries and smartly pick up the correct datasets to compare in both libraries and does the desired comparison, the V_ prefix not withstanding. Amazing!!!!

 

Please, I want to output the results of the comparison into corresponding datasets in another library (LIB3) using OUT= in PROC COMAPRE. The dataset name in the OUT= should be the same as captured in variable KEY (i.e., the dataset names  without the V_). That is, for MYLIB1.want1, the results of the comparison will be sent to MYLIB3.want1. For MYLIB1.want2, the results of comparison will be outputted into MLIB3.want2, etc.    I also want to include ID statement in PROC COMPARE. Below is a macro that shows what I am trying to achieve. I sincerely thank you for your help!! 

 

/*Data steps to assign counter "num" and data set names "d"*/

%macro compare;
do i=1 to #
proc compare base=LIB1..&&d&i compare=LIB2..&d&i out=LIB3..&&d&i;
id id;
run;
%mend compare;

ChrisNZ
Tourmaline | Level 20

Just add the required changes to the call execute. Like:

call execute('proc compare base=MYLIB1.'||M1||' data=MYLIB2.'||M2||' out=LIB3.'||KEY||'; id ID; run;');
ChrisNZ
Tourmaline | Level 20

Or

call execute(catt('proc compare base=MYLIB1.', M1, ' data=MYLIB2.', M2, ' out=LIB3.', KEY, '; id ID; run;'));
SWEETSAS
Obsidian | Level 7

Many thanks @ChrisNZ

 

It worked!! 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1768 views
  • 0 likes
  • 4 in conversation