BookmarkSubscribeRSS Feed

Comparing Attributes of Common Variables

Started ‎08-16-2020 by
Modified ‎09-18-2020 by
Views 3,740

When working with multiple datasets, there may be a need to compare attributes of common variables.  This is generally important to confirm attributes are consistent across common variables in multiple datasets. For example, a difference in length can cause data truncation when merging datasets or worse a difference in type can cause programming errors.  Smarter programmers are proactive to identify all common variables and then confirm all attributes, label, length, type and format are consistent.  

 

In this tip, I show how easy it is to compare common variable attributes using dictionary tables.  In the SAS example below, I create two temporary datasets, CLASS and CLASS2.  CLASS2 dataset is exactly as CLASS dataset except that I converted AGE from numeric to character and also renamed AGE to AGE_NUM.  The objective is to programmatically identify AGE being numeric in one dataset and character in another dataset.  This informs us to take action to make AGE type consistent.

 

 

* create working class dataset;
data class;
set sashelp.class;
run;

* create class2 with age char;
data class2 (rename = (age2 = age));
 length age2 $2.;
 set sashelp.class;
 age2 = left(put(age, best.));
 rename age = age_num;
run;

 

Proc SQL data dictionary table, SASHELP.VCOLUMN, is used with WORK libname to review CLASS and CLASS2 variable attributes.  Upcase(NAME) is used since NAME can be case sensitive.  Name variable is the key variable for CLASS.  GROUP BY is important to GROUP BY key variable and HAVING COUNT(NAME) > 1 identifies multiple records for the same variable name or the common variables.

 

 

proc sql;
create table comvar as
select upcase(name) as name2, memname, type, length, label, format
 from sashelp.vcolumn
 where libname='WORK'
 group by upcase(name)
 having count(name) > 1;
quit;

 

Next,  the dataset COMVAR is sorted by variable name,  dataset and type.  This is important to sort order the records for comparison.

 

 

proc sort data=comvar;
 by name2 memname type;
run;

 

The next data step identifies differences in variable type.  It does this by retaining two variables NAMEHLD and TYPEHLD.  When the retained variable name is the same but the retained type is not the same, then we keep that record.  These are the common variables with different types.

 

 

data comtype;
 retain namehld typehld;
 set comvar;
 by name2 memname type;

 * for each name retain name and type;
 if first.name2 then do;
  namehld = name2; typehld = type;
 end;

* output if name is same and type is different;
 if name2 = namehld and type ^= typehld;
run;

 

Next, we merge this dataset back to display all variables for these different type records.  We do this with Proc SQL right join. This gives us the big picture from both datasets so that we can then take the next step to convert variable type.

 

 

proc sql;
create table comtype2 as
 select upcase(name) as name2, memname, type, length, label, format
 from sashelp.vcolumn as a
 right join (
 select unique name2 as b_name from comtype) as b on 
   upcase(a.name)=b.b_name and libname='WORK' 
   group by  upcase(a.name);
quit;

proc print data=comtype2;
run;

 

 

common vars tip.png

 

Version history
Last update:
‎09-18-2020 03:12 PM
Updated by:

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Tags