Efficient way to verify a file structure and type

Super User
Posts: 17,749
I'm looking for recommendations on a quick way to verify that a file has the variables and appropriate variable types (character/numeric) that I expect.

Any ideas?


Accepted Solutions
Solution
‎05-16-2014 12:57 PM
PROC Star
Posts: 1,227

Re: Efficient way to verify a file structure and type

Hi,

You can make a shell dataset that has the variables you expect.

Then run PROC CONTENTS on both your shell and the dataset you have, and compare the output.

Pseudo code:

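/* Variable names in the expected structure (the shell dataset) */
proc contents data=shell out=__list1 (keep=name) noprint;
run;

/* Variable names in the dataset being checked */
proc contents data=have out=__list2 (keep=name) noprint;
run;

/* ERROR writes an error message to the log if any difference is found */
proc compare base=__list1 compare=__list2 error;
  id name;
run;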

That makes it easy to control what you want to consider a difference. You can decide to ignore case in names, or you can add type, label, and other attributes to the output dataset from PROC CONTENTS. With the ERROR option, PROC COMPARE writes an error message to the log if it finds any difference, which is a great option I only discovered in the past couple of years.
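If you also want the result in a form a program can test, one possibility is the automatic macro variable SYSINFO, which PROC COMPARE sets to 0 when it finds no differences and to a nonzero code otherwise. The following is only a sketch under assumptions: the dataset names shell and have come from the example above, while compare_rc and check_structure are hypothetical names, and TYPE is added to the KEEP= list so that a character/numeric mismatch is also detected.

```sas
/* Collect NAME and TYPE (1=numeric, 2=character) for both datasets */
proc contents data=shell out=__list1 (keep=name type) noprint;
run;

proc contents data=have out=__list2 (keep=name type) noprint;
run;

/* Sort explicitly so the ID statement does not depend on output order */
proc sort data=__list1; by name; run;
proc sort data=__list2; by name; run;

proc compare base=__list1 compare=__list2 noprint;
  id name;
run;

/* Save SYSINFO immediately: later steps can overwrite it */
%let compare_rc = &sysinfo;

/* Hypothetical helper macro that reports the outcome in the log */
%macro check_structure;
  %if &compare_rc ne 0 %then
    %put ERROR: Structure of HAVE differs from SHELL (PROC COMPARE return code &compare_rc).;
  %else
    %put NOTE: Structure of HAVE matches SHELL.;
%mend check_structure;
%check_structure
```

Matching on name here is case sensitive; applying UPCASE to name in both lists before sorting would relax that.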

Instead of PROC CONTENTS you could use dictionary tables, but PROC CONTENTS often turns out to be faster if you have a lot of libraries defined and dictionary.columns is huge.
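For reference, a minimal sketch of the dictionary-table route, assuming the dataset to check is WORK.HAVE (the output table name __cols is hypothetical). Note that in dictionary.columns the type column is character ('num' or 'char'), whereas the TYPE variable in a PROC CONTENTS OUT= dataset is numeric (1 or 2), so the two sources should not be mixed in one comparison without recoding.

```sas
proc sql;
  /* LIBNAME and MEMNAME are stored in upper case in dictionary.columns */
  create table __cols as
  select upcase(name) as name,
         type
  from dictionary.columns
  where libname = 'WORK'
    and memname = 'HAVE'
  order by 1;
quit;
```

Filtering on libname and memname in the WHERE clause, as above, keeps the query from scanning every assigned library.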

HTH,

--Q.



All Replies
Valued Guide
Posts: 3,208

Re: Efficient way to verify a file structure and type

Use PROC DATASETS or PROC CONTENTS to retrieve the information about a dataset, then compare it with a predefined dataset that describes what you expect.
If you need to retrieve that information from the SAS metadata server, some interfaces exist.
It is like working with research data, except that the data is now metadata (data that describes the data).

---->-- ja karman --<-----
Trusted Advisor
Posts: 1,204

Re: Efficient way to verify a file structure and type

Hi Reeza,

This is what I usually do:

proc contents data=have out=want;

run;

Then I can explore the dataset want to see the data structure: variable types, lengths, etc.

Super User
Posts: 17,749

Re: Efficient way to verify a file structure and type

How do you capture the output from the proc compare to verify file structure? I want this to run automatically with no intervention from me.

So if a variable is missing, or a variable is numeric when it should be character, I want to print an error to that effect.

Trusted Advisor
Posts: 1,204

Re: Efficient way to verify a file structure and type

I am not sure if this answers your question. Please see below for a way to compare the structures of two datasets: have (before processing) and want (after processing).

proc contents data=have out=one;

run;

proc contents data=want out=two;

run;

data one;

set one;

flag=1;

run;

data two;

set two;

flag=2;

run;

proc sql;

create table all as

select * from one

union all

select * from two;

quit;

proc tabulate data=all;

class name type flag;

table name*type,flag;

run;

Super User
Posts: 17,749

Re: Efficient way to verify a file structure and type

I'd still have to read the tabulate output :)

I ended up using a SQL Full Join.

If the name was missing in one file then I print an error to the log using a data _null_ step. 
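The exact code was not posted, so the following is only a sketch of what a full join followed by a data _null_ step could look like. It assumes a shell dataset with the expected structure (as in the accepted solution); the names __expected, __actual, and __check are hypothetical.

```sas
proc contents data=shell out=__expected (keep=name type) noprint;
run;

proc contents data=have out=__actual (keep=name type) noprint;
run;

proc sql;
  /* Full join keeps variables that exist on only one side */
  create table __check as
  select coalesce(e.name, a.name) as name,
         e.type as expected_type,
         a.type as actual_type
  from __expected as e
       full join __actual as a
       on upcase(e.name) = upcase(a.name);
quit;

data _null_;
  set __check;
  /* TYPE from PROC CONTENTS: 1=numeric, 2=character */
  if missing(expected_type) then
    put 'ERROR: Unexpected variable ' name 'found in HAVE.';
  else if missing(actual_type) then
    put 'ERROR: Variable ' name 'is missing from HAVE.';
  else if expected_type ne actual_type then
    put 'ERROR: Variable ' name 'has type ' actual_type 'but expected ' expected_type;
run;
```

Because the messages start with "ERROR:", they are highlighted in the log like errors raised by SAS itself, so no printed output has to be read.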

Super User
Posts: 17,749

Re: Efficient way to verify a file structure and type

Thanks Quentin, the error is what I was looking for.

Valued Guide
Posts: 3,208

Re: Efficient way to verify a file structure and type

Little to add. All approaches have been mentioned except a SAS DATA step merge for the comparison.

You now have two options for getting the information and three for comparing it.
Choose according to your additional requirements.
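A DATA step merge for the comparison might look like the sketch below. It assumes the shell and have datasets used earlier in the thread; __expected, __actual, and the renamed type variables are hypothetical names, and matching on name is case sensitive here.

```sas
proc contents data=shell noprint
              out=__expected (keep=name type rename=(type=expected_type));
run;

proc contents data=have noprint
              out=__actual (keep=name type rename=(type=actual_type));
run;

proc sort data=__expected; by name; run;
proc sort data=__actual;   by name; run;

data _null_;
  /* IN= flags show which side contributed each variable name */
  merge __expected (in=in_expected)
        __actual   (in=in_actual);
  by name;
  if in_expected and not in_actual then
    put 'ERROR: Variable ' name 'is missing from HAVE.';
  else if in_actual and not in_expected then
    put 'ERROR: Unexpected variable ' name 'found in HAVE.';
  else if expected_type ne actual_type then
    put 'ERROR: Variable ' name 'has the wrong type.';
run;
```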

---->-- ja karman --<-----
☑ This topic is SOLVED.
