Help using Base SAS procedures

Reading from sashelp.vcolumn is taking too long

Accepted Solution Solved
Reply
Contributor
Posts: 44
Accepted Solution

Reading from sashelp.vcolumn is taking too long

I have a program that reads regularly from this table. The time taken to read data from it tends to vary, but it should only take a few seconds. When I proc print it to the screen, it's 7,000 observations or so. Proc printing it takes 55 seconds to run. Any proc or data step that accesses it takes 55 seconds or so on average, and just now it took 2½ minutes. Any idea why this might be? The PC is a slow one, but even 1GB of RAM and a 2.2GhZ processor should have no problems with this. There are other processes running in the background, but the only place in the program that hangs is where it accesses this table. I can't access my C drive on Windows Explorer, where the file is, so I can't tell if my C drive is short of space. 20434  proc print data = sashelp.vcolumn; 20435      where libname = 'WORK' and memtype = 'DATA' and substr(name,1,3) = 'VAR'; 20436  run; NOTE: No observations were selected from data set SASHELP.VCOLUMN. NOTE: There were 0 observations read from the data set SASHELP.VCOLUMN.       WHERE (libname='WORK') and (memtype='DATA') and (SUBSTR(name, 1, 3)='VAR'); NOTE: PROCEDURE PRINT used (Total process time):       real time          2:30.42       cpu time            7.56 seconds


Accepted Solutions
Solution
‎07-22-2014 10:46 AM
PROC Star
Posts: 1,324

Re: Reading from sashelp.vcolumn is taking too long

Posted in reply to mediaeval

There are lots of ways to make a CSV file, but using your current macro as a starting point, below changes to use PROC CONTENTS to get the var list from the dataset, and PROC SQL to make a macro variable holding that list:

options mprint;

%Macro CreateRPTFile(OutputFile, Dataset, RecLength, Library=work);
%local varlist;

Filename Fname "&OutputFile"; 

proc contents data=&library..&dataset out=__varlist (keep=name) noprint;
run;

proc sql noprint;
  select distinct name into : varlist  separated by ',' 
  from __varlist 
  ;
quit;

**This datastep outputs all of the observations in the specified dataset to the external file; 
Data _Null_; 
  File Fname DLM = "," LRECL = &RecLength; 
  Set &Library..&Dataset;   
  If _n_ = 1 Then Put "!,&varlist";
  put "*," (_all_) (:);
run;

proc datasets library=work memtype=data nolist;
  delete __varlist;
run;
%Mend;


data test;
  array x{5};
  do id=1 to 5;
    do i=1 to dim(x);
      x{i}=ranuni(0);
    end;
    output;
  end;
  drop i;
run;

%CreateRptFile(d:/junk/test2.txt,test,100)

HTH,

--Q.

View solution in original post


All Replies
Super User
Super User
Posts: 7,979

Re: Reading from sashelp.vcolumn is taking too long

Posted in reply to mediaeval

The SASHELP.VCOLUMN is actually an SQL view, so you need to factor that in.  If you use it regularly then datastep to another area restricting on libname ne sashelp and then reference that dataset, save re-running each time.

Super User
Posts: 5,516

Re: Reading from sashelp.vcolumn is taking too long

Posted in reply to mediaeval

A couple of alternative approaches that might run faster ...

Try using PROC SQL on dictionary.columns (instead of using sashelp.vcolumn).

Try running PROC CONTENTS using the options NOPRINT and OUT= to capture the contents of a particular SAS data set.  Also note that you can capture information about all data sets in a library by specifying DATA=libname._all_.

Trusted Advisor
Posts: 3,215

Re: Reading from sashelp.vcolumn is taking too long

Posted in reply to mediaeval

Accessing the sashelp.vmember or dictionary.columns will result at runtime:

- query-ing of all active allocates libraries for existing tables

- for each table of that list query-ing on each table for the existing columns and their attributes.

All this queries are a lot of small pieces of information the system is not tuned and designed for. This same kind of performance burden can be seen opening a lot files with small data to the same size of data collected to be in some bigger files. There is a lot overhead accessing headers/metadata.

When a RDBMS is involved (libname) or a RLS (sas/connect) it go everytime round the circle as the information is not seen as a static table.

It is good example of very easy way of getting of information can cause expected bad performance.   
Copying the sashelp.vcolums first to a work.tabel and than go for selecting can help improving your performance without losing the ease on usage.

Advice: only once using the sashelp.vcloumn further processing on a static copy.

---->-- ja karman --<-----
Contributor
Posts: 44

Re: Reading from sashelp.vcolumn is taking too long

Posted in reply to mediaeval

Here is how I am using sashelp.vcolumn. I'm unsure how to replace the references to it, such that it will work. Also, can you guys see the following code properly? It's all run together in my browser. I'm using Internet Explorer 7.

%Macro CreateRPTFile(OutputFile, Dataset, RecLength, Library=work); **This data step writes the variable names contained in the SAS dataset that is specified in   the &Dataset macro variable, to the file specified in the OutputFile parameter; Filename Fname "&OutputFile"; Data _Null_; File Fname DLM = "," LRECL = &RecLength; Set sashelp.vcolumn; If _n_ = 1 Then Put "!," @; **For libname and memname to work, the names that feed into them must be in uppercase ; Where libname = Put("&Library",$upcase50.) and memname = Put("&Dataset",$upcase50.); Put name @; Run; **This datastep outputs all of the observations in the specified dataset to the external file; Data _Null_; File Fname DLM = "," LRECL = &RecLength Mod; Set &Library..&Dataset;     put "*," (_all_) (Smiley Happy; run; %Mend;

Super User
Super User
Posts: 7,979

Re: Reading from sashelp.vcolumn is taking too long

Posted in reply to mediaeval

Sorrry, I find it very difficult to read that code.  Do you just want to print a dataset to a text file with a delimiter?  Try proc export?  If not maybe clarify what you want out as I don't see what you are trying to achieve, is it perhaps to check the existence before outputting - then use the exist() function.

Super User
Super User
Posts: 7,074

Re: Reading from sashelp.vcolumn is taking too long

Posted in reply to mediaeval

Here is your problem.

Where libname = Put("&Library",$upcase50.)

SAS will not be able to optimize this since it involves the use of a run-time function call to the PUT() function.  So it will need to attempt to open every table in every active library.

Try changing it to:

Where libname = %upcase("&Library")

Now to regular SAS the where clause will contain a constant and SAS should be able to limit itself to just querying the specified library.



Contributor
Posts: 44

Re: Reading from sashelp.vcolumn is taking too long

Tom, I changed my code to use the %ucase function but the times doubled, approximately. This probably isn't anything to do with the change of function, it's probably other factors. When I logged on this morning, a proc print statement on sashelp.vcolumn took 3:45.

Solution
‎07-22-2014 10:46 AM
PROC Star
Posts: 1,324

Re: Reading from sashelp.vcolumn is taking too long

Posted in reply to mediaeval

There are lots of ways to make a CSV file, but using your current macro as a starting point, below changes to use PROC CONTENTS to get the var list from the dataset, and PROC SQL to make a macro variable holding that list:

options mprint;

%Macro CreateRPTFile(OutputFile, Dataset, RecLength, Library=work);
%local varlist;

Filename Fname "&OutputFile"; 

proc contents data=&library..&dataset out=__varlist (keep=name) noprint;
run;

proc sql noprint;
  select distinct name into : varlist  separated by ',' 
  from __varlist 
  ;
quit;

**This datastep outputs all of the observations in the specified dataset to the external file; 
Data _Null_; 
  File Fname DLM = "," LRECL = &RecLength; 
  Set &Library..&Dataset;   
  If _n_ = 1 Then Put "!,&varlist";
  put "*," (_all_) (:);
run;

proc datasets library=work memtype=data nolist;
  delete __varlist;
run;
%Mend;


data test;
  array x{5};
  do id=1 to 5;
    do i=1 to dim(x);
      x{i}=ranuni(0);
    end;
    output;
  end;
  drop i;
run;

%CreateRptFile(d:/junk/test2.txt,test,100)

HTH,

--Q.

Contributor
Posts: 44

Re: Reading from sashelp.vcolumn is taking too long

Many thanks! This runs much more quickly. However, the variables are written to the output text file in alphabetical order. Is there an option on proc contents that will write the variables to __varlist in the order in which they were created in the input dataset?

Super User
Super User
Posts: 7,074

Re: Reading from sashelp.vcolumn is taking too long

Posted in reply to mediaeval

ORDER BY VARNUM

proc contents data=&library..&dataset out=__varlist (keep=name varnum) noprint;

run;

proc sql noprint;

  select distinct name into : varlist  separated by ','

  from __varlist

  order by varnum

  ;

quit;

PROC Star
Posts: 1,324

Re: Reading from sashelp.vcolumn is taking too long

Posted in reply to mediaeval

You can add varnum to the KEEP= option when PROC CONTENTS outputs the dataset.

Then add ORDER BY VARNUM to the SQL step.

You can also remove DISTINCT from the SQL step, since you can't have two vars with same name it isn't needed.

PROC Star
Posts: 1,324

Re: Reading from sashelp.vcolumn is taking too long

Posted in reply to mediaeval

Agree with Astounding's suggestion that PROC CONTENTS is often faster than hitting sashelp.vcolumn (or dictionary.columns).

Also, are you sure there are only 7,000 obs in sashelp.vcolumn?  When I start PC SAS, I have more records than that, just from MAPS libraries and others.   Remember, it has information on the variables for every dataset in every library.... it gets big quick.  In BI setting with pre-assigned libraries, it can be enormous.

Another point is that using the SUBSTR() function on the where statement may mean that SAS can't use indexes to find the records more quickly.

Contributor
Posts: 44

Re: Reading from sashelp.vcolumn is taking too long

Quentin, on login this morning, I'm seeing 5,200 observations when I run proc print on sashelp.vcolumn. Yesterday evening I saw something over 7,000. Running this morning's proc print took 3:45!

Respected Advisor
Posts: 3,799

Re: Reading from sashelp.vcolumn is taking too long

Posted in reply to mediaeval

Do you see CEDA messages? 

15379 - Cross-Environment Data Access (CEDA) utilization message

NOTE: Data file libref.member.DATA is in a format native to

  another host or the file encoding does not match the

  session encoding. Cross Environment Data Access will

  be used, which may require additional CPU resources

  and reduce performance.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 20 replies
  • 891 views
  • 8 likes
  • 7 in conversation