Hi Folks, you've been very helpful with past questions, so going back to the community well again...
Many articles point out several ways to efficiently get the number of observations (nobs, and other metadata such as number of variables) in a dataset, such as https://communities.sas.com/t5/SAS-Communities-Library/Determining-the-number-of-observations-in-a-S.... In our case, the datasets are data objects, tables or views, in Oracle and Amazon Redshift data warehouses. I've tried several different methods, and they don't work. (They do for the built-in dataset libraries, such as sashelp.) Brute force does work, such as proc sql's select count(*) from data.object. But some of these are large objects, and there are hundreds of interest in our warehouses, so efficiency matters. The results from proc datasets, and querying dictionary.tables in proc sql (which seems ideal to me), return missing values for the nobs. The data step descriptor object returns an impossibly large number, even for a simple (dimension) table of a dozen rows.
Has anyone seen this before, and have any suggestions for me? Given SAS is a sophisticated environment, I imagine this could be caused by a number of possible things: SAS EG tool that we use, SAS/Access software, engines, ODBC drivers, configurations of any of those, type of data object, etc. I'd appreciate any suggestions in determining if (a) it should work in our environment, and (b) if it should work, isolating and addressing what might be causing the issue. Of course, if running snippets of code and posting them and the results here would be useful, I'd be more than happy to do so.
Thanks, Peter
For an EXTERNAL data system use the external systems metadata programs/tables using pass through code to generate the results entirely in the external database. Then pull that result into SAS. The SAS dictionary tables are an implementation of what external DBMS have done for years but their tools are their tools. You may have to work with your DB administrator(s) for the most efficient way to get what you want, or possibly set up some views that are maintained just for your unit to pull into SAS.
Or only use the SAS methods on datasets that have been pulled into SAS, not on the remote files.
Thank you for that response, and that certainly makes sense. I'll discuss this subject with our DBAs and SAS environment admins.
Thanks, Peter
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.