BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Adnan_Razaq
Calcite | Level 5

Hi,

 

I have similar datasets.

 

Dataset A has 376 variables and

Dataset B has 428 variables.

 

How do I get a list of the variables which are not in both datasets?

 

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Proc Compare is one tool that provides that information (as well as if the same-named variables are of different types).

 

Proc compare base=datasetA compare=datasetb novalues listvar;

run;

 

View solution in original post

3 REPLIES 3
Reeza
Super User

1. Get a list of variables from each data set

Use either SASHELP.VCOLUMN or PROC CONTENTS

 

2A. Use SQL to compare them

 

2B. Use PROC COMPARE to compare the lists

 

2C. Use a data step to do the comparison.

 

I have an example of the 2C approach here - note this is a fully worked example so the first few steps are generating sample data. Please make sure to read the comments.

 

*Here's a bit of a long way to generate the list of datasets with variables present.
This program looks at ALL variables, to restrict it to a select list, see the line of 
code to be modified via the comments.;


    /*Generate fake data to work with*/
	data class1;
		set sashelp.class;
		drop age sex;
	run;
	
	data class2;
		set sashelp.class;
		drop weight height;
	run;
	
	data class3;
		set sashelp.class;
		Order=1;
	run;
	
	data class4;
		set sashelp.class;
		keep name;
	run;
	
	data class5;
		set sashelp.class;
	run;
	
    /*Extract metadata from dictionary tables*/
	proc sql noprint;
		create table column_list as select memname, libname, name, type, 1 as count 
			from dictionary.columns where libname='WORK' and memname like 'CLASS%'
			/*ADD THE VARIABLE LIST HERE*/
			order by memname, name;
	quit;
	
    /*Transpose results to a more user friendly format*/
	proc transpose data=column_list out=flipped;
		by memname;
		id name;
		idlabel name;
		var count;
	run;
	
    /*Format output*/
	data want;
		retain memname;
		set flipped;
		array class(*) _NUMERIC_;
	
		do i=1 to dim(class);
	
			if class(i)=. then
				class(i)=0;
		end;
		All_Variables=sum(of _numeric_)-I;
		DROP I _NAME_;
	run;

https://gist.github.com/statgeek/3b57ae085d9f7a36a2d95c15f04e72e6

ballardw
Super User

Proc Compare is one tool that provides that information (as well as if the same-named variables are of different types).

 

Proc compare base=datasetA compare=datasetb novalues listvar;

run;

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1426 views
  • 2 likes
  • 3 in conversation