Help using Base SAS procedures

Finding Uncommon variables across two datasets

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

Finding Uncommon variables across two datasets

Hi, I am trying to identify the specific variables that are found in dataset1 but not in dataset2. Proc compare can find the common ones but does not list the ones that are distinct:

proc compare data=data1 (obs=0) compare=data2(obs=0);

run;

How can I find the unique variables?

 

 

Thanks.

 

 

 

 


Accepted Solutions
Solution
‎05-03-2018 02:12 PM
Super User
Posts: 13,498

Re: Finding Uncommon variables across two datasets


@chipp635 wrote:

Hi, I am trying to identify the specific variables that are found in dataset1 but not in dataset2. Proc compare can find the common ones but does not list the ones that are distinct:

proc compare data=data1 (obs=0) compare=data2(obs=0);

run;

How can I find the unique variables?

 


I think you may not be researching the appropriate syntax:

When I run this on my data sets:

proc compare data=ipp.hd1_2014(obs=0) 
             compare=ipp.hd457_2014 (obs=0)
           listbasevar listcompvar  
;
run;

 

I get (as expected) in the output

 

Listing of Variables in IPP.HD1_2014 but not in IPP.HD457_2014                                    
                                                                                                  
Variable  Type  Length  Label                                                                     
                                                                                                  
GSpec     Char       1  GC Specimen Site                                                          
                                                                                                  
                                                                                                  
Listing of Variables in IPP.HD457_2014 but not in IPP.HD1_2014                                    
                                                                                                  
Variable          Type  Length  Informat     Label                                                
                                                                                                  
ACCESSION_NUMBER  Char      16  $16.         Record Identifier                                    
ORACE             Char      12  $STD_ORACE.                                                       
                                                

View solution in original post


All Replies
Super User
Posts: 23,663

Re: Finding Uncommon variables across two datasets

I have some example code on that here. I don't find PROC COMPARE particularly useful - they could do a lot to improve it.

I have two demo codes that are useful to me:

 

create a table that shows which variables are in which datasets:

https://gist.github.com/statgeek/3b57ae085d9f7a36a2d95c15f04e72e6

 

Identifiers variables in one data set and not in another:

https://gist.github.com/statgeek/e0b98c4627aa31a567e5

 


@chipp635 wrote:

Hi, I am trying to identify the specific variables that are found in dataset1 but not in dataset2. Proc compare can find the common ones but does not list the ones that are distinct:

proc compare data=data1 (obs=0) compare=data2(obs=0);

run;

How can I find the unique variables?

 

 

Thanks.

 

 

 

 


 

New Contributor
Posts: 3

Re: Finding Uncommon variables across two datasets

Thanks. Would it be possible to do it across multiple datasets and form a table of lets say:

Data1 Data2 Data3 Data4 

var1    var1    var1   var1 

var2    var2    var2   var2

           var3    var3   var3

                                var4

 

where matching variables are listed in the same order but distinct ones are listed additionally as well?

 

.

Super User
Posts: 23,663

Re: Finding Uncommon variables across two datasets

That's almost exactly what the first program does. 

 


@chipp635 wrote:

Thanks. Would it be possible to do it across multiple datasets and form a table of lets say:

Data1 Data2 Data3 Data4 

var1    var1    var1   var1 

var2    var2    var2   var2

           var3    var3   var3

                                var4

 

where matching variables are listed in the same order but distinct ones are listed additionally as well?

 

.




PROC Star
Posts: 1,767

Re: Finding Uncommon variables across two datasets

/*create a uncommon variable for test purpose*/
data w;
set sashelp.class;
_name=name;
run;



data uncommon;
if _n_=1 then do;
 dcl hash H (dataset:"sashelp.vcolumn(where=(libname='SASHELP' and memname='CLASS' ))") ;
   h.definekey  ("name") ;
    h.definedone () ;
 end;
set sashelp.vcolumn(where=(libname='WORK' and memname='W'));
if h.check() ne 0;
keep name;
run;
Solution
‎05-03-2018 02:12 PM
Super User
Posts: 13,498

Re: Finding Uncommon variables across two datasets


@chipp635 wrote:

Hi, I am trying to identify the specific variables that are found in dataset1 but not in dataset2. Proc compare can find the common ones but does not list the ones that are distinct:

proc compare data=data1 (obs=0) compare=data2(obs=0);

run;

How can I find the unique variables?

 


I think you may not be researching the appropriate syntax:

When I run this on my data sets:

proc compare data=ipp.hd1_2014(obs=0) 
             compare=ipp.hd457_2014 (obs=0)
           listbasevar listcompvar  
;
run;

 

I get (as expected) in the output

 

Listing of Variables in IPP.HD1_2014 but not in IPP.HD457_2014                                    
                                                                                                  
Variable  Type  Length  Label                                                                     
                                                                                                  
GSpec     Char       1  GC Specimen Site                                                          
                                                                                                  
                                                                                                  
Listing of Variables in IPP.HD457_2014 but not in IPP.HD1_2014                                    
                                                                                                  
Variable          Type  Length  Informat     Label                                                
                                                                                                  
ACCESSION_NUMBER  Char      16  $16.         Record Identifier                                    
ORACE             Char      12  $STD_ORACE.                                                       
                                                
New Contributor
Posts: 3

Re: Finding Uncommon variables across two datasets

Thank you!

I was missing the listbasevar listcompvar code.

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 300 views
  • 5 likes
  • 4 in conversation