BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
chipp635
Fluorite | Level 6

Hi, I am trying to identify the specific variables that are found in dataset1 but not in dataset2. Proc compare can find the common ones but does not list the ones that are distinct:

proc compare data=data1 (obs=0) compare=data2(obs=0);

run;

How can I find the unique variables?

 

 

Thanks.

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@chipp635 wrote:

Hi, I am trying to identify the specific variables that are found in dataset1 but not in dataset2. Proc compare can find the common ones but does not list the ones that are distinct:

proc compare data=data1 (obs=0) compare=data2(obs=0);

run;

How can I find the unique variables?

 


I think you may not be researching the appropriate syntax:

When I run this on my data sets:

proc compare data=ipp.hd1_2014(obs=0) 
             compare=ipp.hd457_2014 (obs=0)
           listbasevar listcompvar  
;
run;

 

I get (as expected) in the output

 

Listing of Variables in IPP.HD1_2014 but not in IPP.HD457_2014                                    
                                                                                                  
Variable  Type  Length  Label                                                                     
                                                                                                  
GSpec     Char       1  GC Specimen Site                                                          
                                                                                                  
                                                                                                  
Listing of Variables in IPP.HD457_2014 but not in IPP.HD1_2014                                    
                                                                                                  
Variable          Type  Length  Informat     Label                                                
                                                                                                  
ACCESSION_NUMBER  Char      16  $16.         Record Identifier                                    
ORACE             Char      12  $STD_ORACE.                                                       
                                                

View solution in original post

6 REPLIES 6
Reeza
Super User

I have some example code on that here. I don't find PROC COMPARE particularly useful - they could do a lot to improve it.

I have two demo codes that are useful to me:

 

create a table that shows which variables are in which datasets:

https://gist.github.com/statgeek/3b57ae085d9f7a36a2d95c15f04e72e6

 

Identifiers variables in one data set and not in another:

https://gist.github.com/statgeek/e0b98c4627aa31a567e5

 


@chipp635 wrote:

Hi, I am trying to identify the specific variables that are found in dataset1 but not in dataset2. Proc compare can find the common ones but does not list the ones that are distinct:

proc compare data=data1 (obs=0) compare=data2(obs=0);

run;

How can I find the unique variables?

 

 

Thanks.

 

 

 

 


 

chipp635
Fluorite | Level 6

Thanks. Would it be possible to do it across multiple datasets and form a table of lets say:

Data1 Data2 Data3 Data4 

var1    var1    var1   var1 

var2    var2    var2   var2

           var3    var3   var3

                                var4

 

where matching variables are listed in the same order but distinct ones are listed additionally as well?

 

.

Reeza
Super User

That's almost exactly what the first program does. 

 


@chipp635 wrote:

Thanks. Would it be possible to do it across multiple datasets and form a table of lets say:

Data1 Data2 Data3 Data4 

var1    var1    var1   var1 

var2    var2    var2   var2

           var3    var3   var3

                                var4

 

where matching variables are listed in the same order but distinct ones are listed additionally as well?

 

.




novinosrin
Tourmaline | Level 20
/*create a uncommon variable for test purpose*/
data w;
set sashelp.class;
_name=name;
run;



data uncommon;
if _n_=1 then do;
 dcl hash H (dataset:"sashelp.vcolumn(where=(libname='SASHELP' and memname='CLASS' ))") ;
   h.definekey  ("name") ;
    h.definedone () ;
 end;
set sashelp.vcolumn(where=(libname='WORK' and memname='W'));
if h.check() ne 0;
keep name;
run;
ballardw
Super User

@chipp635 wrote:

Hi, I am trying to identify the specific variables that are found in dataset1 but not in dataset2. Proc compare can find the common ones but does not list the ones that are distinct:

proc compare data=data1 (obs=0) compare=data2(obs=0);

run;

How can I find the unique variables?

 


I think you may not be researching the appropriate syntax:

When I run this on my data sets:

proc compare data=ipp.hd1_2014(obs=0) 
             compare=ipp.hd457_2014 (obs=0)
           listbasevar listcompvar  
;
run;

 

I get (as expected) in the output

 

Listing of Variables in IPP.HD1_2014 but not in IPP.HD457_2014                                    
                                                                                                  
Variable  Type  Length  Label                                                                     
                                                                                                  
GSpec     Char       1  GC Specimen Site                                                          
                                                                                                  
                                                                                                  
Listing of Variables in IPP.HD457_2014 but not in IPP.HD1_2014                                    
                                                                                                  
Variable          Type  Length  Informat     Label                                                
                                                                                                  
ACCESSION_NUMBER  Char      16  $16.         Record Identifier                                    
ORACE             Char      12  $STD_ORACE.                                                       
                                                
chipp635
Fluorite | Level 6

Thank you!

I was missing the listbasevar listcompvar code.

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2979 views
  • 5 likes
  • 4 in conversation