The SAS Output Delivery System and reporting techniques

Manipulate proc contents

Reply
Occasional Contributor
Posts: 16

Manipulate proc contents

Hello,

I have around 1000 data sets and I am trying to write a code to create a proc contents for all these data sets and once I have them my second step would be to check for specific fields like account_number,DOB,sex etc and if they are available in the proc contents my second step should filter those datasets and write them to output.

Example :- I have datasets A,B,C,D,E and now I write a macro to do proc contents on each of A,B,C,D,E datasets and I create a separate output for each of them. Now my second step of code need to see if each of the new datasets created (i.e the output of the proc contents) have a set of fields in them and if they have the field then the second step should write them out.

Do any one have thoughts on the best way of doing it?

Valued Guide
Posts: 3,208

Re: Manipulate proc contents

Use the sashelp.v- members as view to them all. But be aware the processing can be slow.

---->-- ja karman --<-----
Trusted Advisor
Posts: 1,795

Re: Manipulate proc contents

Don't you mean SASHELP.VCOLUMN instead of SASHELP.VMEMBER?

Valued Guide
Posts: 3,208

Re: Manipulate proc contents

And sashelp.vtable  .... I yes but was too lazy

---->-- ja karman --<-----
Respected Advisor
Posts: 3,788

Re: Manipulate proc contents

If the data sets are all in the same data libraray you can get PROC CONTENTS on all of them with DATA=libname._ALL_;  You will want NOPRINT and OUT= option with KEEP=data set option to keep only the variables in the OUT= data that you are interested in like MEMNAME, NAME, TYPE, LENGTH etc. 

You may need other or perhaps all of them depending on what you are doing.

Trusted Advisor
Posts: 1,228

Re: Manipulate proc contents

Hi,

Suppose your all datasets are in the same folder then following may generate the desired results.

proc contents data=SASHELP._all_ out=vars(keep=memname name); /* used SASHELP as an example */
run;

proc sql;
select distinct memname from vars
where name in ('A','B','C'); /* Fields to be searched */
quit;

Super User
Super User
Posts: 6,845

Re: Manipulate proc contents

So if you have the list of variable names then a single query against DICTIONARY.COLUMNS should get the list of datasets that contain all of the variables.


%let varlist='NAME','AGE';

%let nvars=2 ;

proc sql  ;

  select distinct catx('.',libname,memname) as dsname

    from dictionary.columns

    where libname='SASHELP'

      and upcase(name) in (&varlist)

    group by 1

    having count(*) = &nvars

  ;

quit;

Occasional Contributor
Posts: 16

Re: Manipulate proc contents

I wrote the below basic sas code to start with (as given below). This filters out the variables names and then it checks if specific variables are available and if they are available if writes the output. So from the title name I can identify the file name and that in turn help me to identify which dataset has these variables

CHALLENGE :- The datasets that I need to scan are spread across multiple folders and across multiple years. For example as below

Dataset :- /june/data/<port>/<rep>/<year>/<month>/<sasfile name>

here port,rep,year,month,sasfile name all are different. I could go define each of the files but that is going to take me a long time. Is there some technique that would allow me to penetrate inside the folders?

--------

SAS CODE USED

--------

libname rpl "/june/dev/adhoc/test/SASEG" ;

libname check "/june/dev/adhoc/test/SASEG" ;

%macro x(port);

libname bbr "/june/data/&port./repor1/2014/06/";

proc contents data=bbr.file1 noprint out=rpl.&port. (keep=name);

run ;

data check.&port. ;

set rpl.&port. ;

if NAME in ('ACCOUNT','CCSTAT','ZIPCODE') ;

run ;

proc print data= check.&port. ;

title check.&port. ;

run ;

%mend ;

%x(zone1);

%x(zone2);

%x(zone3);

Super User
Posts: 19,167

Re: Manipulate proc contents

Yes, you can use standard OS commands.

Google Recursively list all files in a directory in (YOUR OS)

Here's a link for Unix:

https://kb.iu.edu/d/abef

Trusted Advisor
Posts: 1,795

Re: Manipulate proc contents

You don't need PROC CONTENTS or what follows. You simply need to issue the appropriate LIBNAME statements. Then use the solution by Tom in message 6 of this thread, probably without

libname='SASHELP'.


Valued Guide
Posts: 3,208

Re: Manipulate proc contents

Several ways::

1/ You can use OS commands when the installation/configuration has been set to allow you.
In this case you can use an Unix "ls -latr" command with redirecting to analyse that file building up a dataset of folders.
When you have access to the system doing that "ls"" command but not being allowed to run that from SAS you split that up in several steps

2/ Build your SAS dataset with OS folder using SAS functions. It will be a lot of work

1 and 2/ Proceed with that SAS dataset creating the associated libnamess Put some counter in the libname def (max string length 8)

3/ Build all libnames using some macro-looping. An optimistic approach is allowed ignoring the failures of combinations that do not exist.

Having alle libnames defined use the SAShelp.Vcolumns dataset with your selection on column names.

All contents information will be automatic generated by SAS at the moment of accessing that view. 

---->-- ja karman --<-----
Occasional Contributor
Posts: 16

Re: Manipulate proc contents

Sorry guys, have been trying to swim against the wave (don't hate me for not using sashelp yet !!)

let me tell you what I did so far...since the list of sas files were around 60,000 spread across multiple directories/filenames etc I decided to do a basic listing (LS) and created that as a .txt output file. Once I did that I read the .txt file at a particular position and hence my new output file just had the file names

the LS output

-rwxr-x---   1 bag23  too8      56K Jul  1 11:40 /july/first/ny/fq/2014/06/jjssn.sas7bndx

-rwxr-x---   1 bag23  too8      32K Jul  1 11:40 /july/first/ny/fq/2014/06/jjssn.sas7bdat

-rwxr-x---   1 bag23  too8      56K Jul  1 11:40 /july/first/ny/fq/2014/06/jjpii.sas7bndx

-rwxr-x---   1 bag23  too8      32K Jul  1 11:40 /july/first/ny/fq/2014/06/jjpii.sas7bdat

-rwxr-x---   1 bag23  too8      56K Jul  1 11:40 /july/first/ny/fq/2014/06/jj.sas7bndx

NEW FILE output after selecting using the new code

/july/first/ny/fq/2014/06/jjssn.sas7bndx

/july/first/ny/fq/2014/06/jjssn.sas7bdat

/july/first/ny/fq/2014/06/jjpii.sas7bndx

/july/first/ny/fq/2014/06/jjpii.sas7bdat

/july/first/ny/fq/2014/06/jj.sas7bndx 

now I want to read this NEW FILE such that I could get the proc print of each of these records..i.e read a record from the NEW FILE and then PRINT the proc content of first record and then read next record print the PROC CONTENT of that etc..

Valued Guide
Posts: 2,177

Re: Manipulate proc contents

I would use a data step to parse out the paths which hold sas datasets into a libs dataset. Then

data ;

Set libs;

By path;

If last.path ;

CALL EXECUTE( 'libname L ' !! Quote(trim( path)) !!' ;' ) ;

CALL EXECUTE( 'PROC CONTENTS DATA=L._ALL_ NOPRINT OUT=A; ' ) ;

CALL EXECUTE( data b; set a; ');

call execute(' length path $300 ; retain path ' !! quote( trim( path )) );

call execute(' ; run; proc append base= collection data= b ; run ; ' ) ;

Run ;

That provides a work.collection detailling all columns along with the path.

you might want to drop columns you don't need.

Super User
Posts: 19,167

Re: Manipulate proc contents

There's a libname function in SAS, as well as the fact that the libname statement will take multiple arguments.

SAS(R) 9.2 Language Reference: Dictionary, Fourth Edition

libname  test ('C:\Data\StatCan' ,

  'C:\Data\StatCan\ShapeToTab');

Super User
Super User
Posts: 6,845

Re: Manipulate proc contents

I have totally lost the big picture of what you want to do.  Originally it had something to do with testing which datasets had particular variables.

If you are talking about looking at the contents of tens of thousands of SAS datasets and then printing the data in those files?  That does not make much sense as who is going to read all of that output.

Now if you are talking about making a report that summarizes each dataset then it might make more sense.  Say something like

FILENAME                                   NOBS   VAR1  VAR2  VAR3

/july/first/ny/fq/2014/06/jjssn.sas7bdat  1,212    YES    NO    NO

/july/first/ny/fq/2014/06/jjpii.sas7bdat    512    YES    YES   YES



Ask a Question
Discussion stats
  • 28 replies
  • 1054 views
  • 29 likes
  • 8 in conversation