BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

I'm looking for recommendations on a quick way to verify that a file has the variables and appropriate variable types (character/numeric) that I expect.

Any ideas?

1 ACCEPTED SOLUTION

Accepted Solutions
Quentin
Super User

Hi,

You can make a shell dataset  that has the variables you expect.

Then run proc contents on your shell and what you have and compare the output.

Pseudo code:

proc contents data=shell out=__list1 (keep=name);

run;

proc contents data=have out=__list2 (keep=name);

run;

proc compare base=__list1 compare=__list2 error;

  id name ;

run;

That makes it pretty easy to control what you want to consider a difference.  So you can decide to ignore case in names, or you can add type and label and other attributes to the output dataset from proc contents, or whatever.  With the  error option, PROC COMPARE will throw an error if it finds any difference, which is a great option I only discovered in the past couple years.

Instead of proc contents you could use dictionary tables, but often proc contents turns out to be faster if you have a lot of libraries defined and dictionary.columns is huge.

HTH,

--Q.


BASUG is hosting free webinars Next up: Mike Sale presenting Data Warehousing with SAS April 10 at noon ET. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.

View solution in original post

8 REPLIES 8
jakarman
Barite | Level 11

proc dataset/contents for retrieving the information in a datasets and a compare with a predefined  datasets with what you expected.
If you need retrieve that from SAS metadata-server some interfaces are existing.
It is like working with research-data now it is metadata (metadata is describing the data).

---->-- ja karman --<-----
stat_sas
Ammonite | Level 13

Hi Reeza,

This is what I do usually

proc contents data=have out=want;

run;

Then I can explore dataset want to see data structures/variable types lengths etc.

Reeza
Super User

How do you capture the output from the proc compare to verify file structure? I want this to run automatically with no intervention from me.

So if the variable is missing or a variable is numeric when it should be character I want to print an error to that effect.

stat_sas
Ammonite | Level 13

I am not sure if this answers your questions. Please see below to compare two datasets' structures. There are two datasets (have before processing) and (want after processing).

proc contents data=have out=one;

run;

proc contents data=want out=two;

run;

data one;

set one;

flag=1;

run;

data two;

set two;

flag=2;

run;

proc sql;

create table all as

select * from one

union all

select * from two;

quit;

proc tabulate data=all;

class name type flag;

table name*type,flag;

run;

Reeza
Super User

I'd still have to read the tabulate output Smiley Happy

I ended up using a SQL Full Join.

If the name was missing in one file then I print an error to the log using a data _null_ step. 

Quentin
Super User

Hi,

You can make a shell dataset  that has the variables you expect.

Then run proc contents on your shell and what you have and compare the output.

Pseudo code:

proc contents data=shell out=__list1 (keep=name);

run;

proc contents data=have out=__list2 (keep=name);

run;

proc compare base=__list1 compare=__list2 error;

  id name ;

run;

That makes it pretty easy to control what you want to consider a difference.  So you can decide to ignore case in names, or you can add type and label and other attributes to the output dataset from proc contents, or whatever.  With the  error option, PROC COMPARE will throw an error if it finds any difference, which is a great option I only discovered in the past couple years.

Instead of proc contents you could use dictionary tables, but often proc contents turns out to be faster if you have a lot of libraries defined and dictionary.columns is huge.

HTH,

--Q.


BASUG is hosting free webinars Next up: Mike Sale presenting Data Warehousing with SAS April 10 at noon ET. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
Reeza
Super User

Thanks Quentin, the error is what I was looking for.

jakarman
Barite | Level 11

little to be added. All approaches being mentioned except the SAS datastep merge for comparing

You have now:  2 options to get the info, 3 for comparing.
choose to your additional requirements

---->-- ja karman --<-----

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1542 views
  • 2 likes
  • 4 in conversation