BookmarkSubscribeRSS Feed
abhityagi
Obsidian | Level 7

Hi,

 

I am looking for a logic where I can compare one source sas dataset with multiple destination sas datasets without proc compare. Also column names are different in the destination datasets. 

 

Thanks!

6 REPLIES 6
abhityagi
Obsidian | Level 7

For Example:

I have data set 

 

Dataset1:

Columns: A B C D E F;

 

Dataset2:

Columns: A B;

 

Dataset 3:

Columns: C D;

 

Dataset $:

Columns: E F;

 

Here I want to data compare dataset1 with dataset2, 3 and dataset4.

 

SuryaKiran
Meteorite | Level 14

Are you trying to find all tables that have columns that exist in one main table. Try joining dictionary.columns to find the tables that have similar columns. 

data table_abc;
a='';
b='';
c='';
run;

data table_a;
a='';
run;
data table_b;
b='';
run;
data table_c;
c='';
run;

proc sql;
create table test as 
select a.memname,a.name,b.memname as memname_c,b.name as name_c
from dictionary.columns as a
inner join dictionary.columns as b on a.name=b.name and a.memname<>b.memname
where a.libname='WORK' and a.memname='TABLE_ABC';
quit;
Thanks,
Suryakiran
Reeza
Super User

Here's one way:

https://gist.github.com/statgeek/3b57ae085d9f7a36a2d95c15f04e72e6

*Here's a bit of a long way to generate the list of datasets with variables present.
This program looks at ALL variables, to restrict it to a select list, see the line of 
code to be modified via the comments.;


    /*Generate fake data to work with*/
	data class1;
		set sashelp.class;
		drop age sex;
	run;
	
	data class2;
		set sashelp.class;
		drop weight height;
	run;
	
	data class3;
		set sashelp.class;
		Order=1;
	run;
	
	data class4;
		set sashelp.class;
		keep name;
	run;
	
	data class5;
		set sashelp.class;
	run;
	
    /*Extract metadata from dictionary tables*/
	proc sql noprint;
		create table column_list as select memname, libname, name, type, 1 as count 
			from dictionary.columns where libname='WORK' and memname like 'CLASS%'
			/*ADD THE VARIABLE LIST HERE*/
			order by memname, name;
	quit;
	
    /*Transpose results to a more user friendly format*/
	proc transpose data=column_list out=flipped;
		by memname;
		id name;
		idlabel name;
		var count;
	run;
	
    /*Format output*/
	data want;
		retain memname;
		set flipped;
		array class(*) _NUMERIC_;
	
		do i=1 to dim(class);
	
			if class(i)=. then
				class(i)=0;
		end;
		All_Variables=sum(of _numeric_)-I;
		DROP I _NAME_;
	run;

@abhityagi wrote:

For Example:

I have data set 

 

Dataset1:

Columns: A B C D E F;

 

Dataset2:

Columns: A B;

 

Dataset 3:

Columns: C D;

 

Dataset $:

Columns: E F;

 

Here I want to data compare dataset1 with dataset2, 3 and dataset4.

 


 

ScottBass
Rhodochrosite | Level 12

@abhityagi wrote:

For Example:

I have data set 

 

Dataset1:

Columns: A B C D E F;

 

Dataset2:

Columns: A B;

 

Dataset 3:

Columns: C D;

 

Dataset $:

Columns: E F;

 

Here I want to data compare dataset1 with dataset2, 3 and dataset4.

 


 

PROC COMPARE will handle this:

 

data
   base
   comp1 (drop=name age)
   comp2 (drop=age sex)
   comp3 (drop=height weight)
   comp4 (drop=age sex weight)
   ;
   set sashelp.class;
run;

%macro comp(base=,comp=);
proc compare base=&base comp=&comp;
run;
%mend;
%comp(base=base,comp=comp1); 
%comp(base=base,comp=comp2); 
%comp(base=base,comp=comp3); 
%comp(base=base,comp=comp4); 

If this does not summarize your data requirements please provide more explanation.  Otherwise why reinvent the wheel?


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
SuryaKiran
Meteorite | Level 14

Hello,

 

What is that you want to compare between both the dataset? Why not proc compare, doesn't it enough for your requirement. In that case you may have to write explicit code based on the data you have. Like SQL set operation with EXCEPT to find if same data exists in both the tables or other joins based on what you want to try. 

 

If the datasets are in different location, with dataset name being same, then you may first have to find the locations where dataset exist and then need to assign a LIBNAME. If you have only few locations then you can manually assign the libnames. If you have to automate instead of manually assigning then there are several scripts for searching the file name in a folder or sub-folders. This might help http://support.sas.com/kb/45/805.html

Thanks,
Suryakiran

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 573 views
  • 3 likes
  • 4 in conversation