BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Adnan_Razaq
Calcite | Level 5

Hi,

 

I have similar datasets.

 

Dataset A has 376 variables and

Dataset B has 428 variables.

 

How do I get a list of the variables which are not in both datasets?

 

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Proc Compare is one tool that provides that information (as well as if the same-named variables are of different types).

 

Proc compare base=datasetA compare=datasetb novalues listvar;

run;

 

View solution in original post

3 REPLIES 3
Reeza
Super User

1. Get a list of variables from each data set

Use either SASHELP.VCOLUMN or PROC CONTENTS

 

2A. Use SQL to compare them

 

2B. Use PROC COMPARE to compare the lists

 

2C. Use a data step to do the comparison.

 

I have an example of the 2C approach here - note this is a fully worked example so the first few steps are generating sample data. Please make sure to read the comments.

 

*Here's a bit of a long way to generate the list of datasets with variables present.
This program looks at ALL variables, to restrict it to a select list, see the line of 
code to be modified via the comments.;


    /*Generate fake data to work with*/
	data class1;
		set sashelp.class;
		drop age sex;
	run;
	
	data class2;
		set sashelp.class;
		drop weight height;
	run;
	
	data class3;
		set sashelp.class;
		Order=1;
	run;
	
	data class4;
		set sashelp.class;
		keep name;
	run;
	
	data class5;
		set sashelp.class;
	run;
	
    /*Extract metadata from dictionary tables*/
	proc sql noprint;
		create table column_list as select memname, libname, name, type, 1 as count 
			from dictionary.columns where libname='WORK' and memname like 'CLASS%'
			/*ADD THE VARIABLE LIST HERE*/
			order by memname, name;
	quit;
	
    /*Transpose results to a more user friendly format*/
	proc transpose data=column_list out=flipped;
		by memname;
		id name;
		idlabel name;
		var count;
	run;
	
    /*Format output*/
	data want;
		retain memname;
		set flipped;
		array class(*) _NUMERIC_;
	
		do i=1 to dim(class);
	
			if class(i)=. then
				class(i)=0;
		end;
		All_Variables=sum(of _numeric_)-I;
		DROP I _NAME_;
	run;

https://gist.github.com/statgeek/3b57ae085d9f7a36a2d95c15f04e72e6

ballardw
Super User

Proc Compare is one tool that provides that information (as well as if the same-named variables are of different types).

 

Proc compare base=datasetA compare=datasetb novalues listvar;

run;

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 748 views
  • 2 likes
  • 3 in conversation