turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- Why Proc contents does not show the number of obse...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-15-2013 03:41 PM

Hi SAS Forum;

Could you please help me on this layman question.

I ran the below code for my large data set to see how many records are there.

Proc contents data=huge_data_set varnum;

run;

The output table generated has a dot in front of the "number of observations"

and it does not show the real number of observations in the data set.

Q:Why Proc contents doesn't provide the "number of observations" ?

Thanks for your time

Mirisage

Accepted Solutions

Solution

08-15-2013
05:50 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Astounding

08-15-2013 05:50 PM

And apparently from another thread, if the number records exceeds the maximum value that SAS will store in a numeric variable then Contets reports missing. Depending on how precise you need the record count, you can get an estimate from the the Engine/host dependent information. Use ("Number of data set pages" - "First Data Page")* "Max obs per page".

The first "pages" store the header information and varies depending on numbers of variables and lengths of labels and such.

You could try looking up the number of observations in the Dictionary table Members, but will probably find the number of records there missing as well.

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Mirisage

08-15-2013 04:26 PM

There are several possibilities as to why the number of observations would be unknown.

1. The data is stored on tape.

2. The data is stored on disk, but in tape format.

3. The data set is actually a view rather than a data set.

You probably already know which of these applies ... but if you don't then it's likely to be #2.

Good luck.

Solution

08-15-2013
05:50 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Astounding

08-15-2013 05:50 PM

And apparently from another thread, if the number records exceeds the maximum value that SAS will store in a numeric variable then Contets reports missing. Depending on how precise you need the record count, you can get an estimate from the the Engine/host dependent information. Use ("Number of data set pages" - "First Data Page")* "Max obs per page".

The first "pages" store the header information and varies depending on numbers of variables and lengths of labels and such.

You could try looking up the number of observations in the Dictionary table Members, but will probably find the number of records there missing as well.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ballardw

08-16-2013 08:20 PM

Hi **Astounding **and **ballardw,**

Many thanks to both of you for providing me with this knowledge.

Best regards

Mirisage

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Mirisage

08-17-2013 02:07 AM

What happens if you use the following?

%LET DSID = %SYSFUNC(OPEN(WORK.NEWFILES,I));

%LET TOTAL = %SYSFUNC(ATTRN(&DSID.,NOBS));

Does the macro variable store the correct number of observations?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Scott_Mitchell

08-17-2013 05:38 PM

Hi Scott_Mitchell,

I am very curious to run your code and see.

My very large data set is stored in a warehouse.

Lets say I assign the library name to the warehouse as "location".

Then the proc contenets code that has not provided us with the actual number of observations is like below.

Proc contents data=location.huge_data_set varnum;

run;

I can understand only that much.

Question:

Then what should I do to apply your code?

Is that just copy your below code into my sas editor and running?

**%LET DSID = %SYSFUNC(OPEN(WORK.NEWFILES,I));**

**%LET TOTAL = %SYSFUNC(ATTRN(&DSID.,NOBS));**

Please let me know.

Thanks

Mirisage

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Mirisage

08-17-2013 05:55 PM

Take Scott's code, paste it into a SAS program and submit it. Add %put statements to output the results to the LOG. Something like this . . .

**%LET DSID = %SYSFUNC(OPEN(WORK.NEWFILES,I));**

**%LET TOTAL = %SYSFUNC(ATTRN(&DSID.,NOBS));**

**%PUT the number of observations is &TOTAL;**

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Fugue

08-17-2013 10:06 PM

You also need to change WORK.NEWFILES to the name of your dataset LOCATION.HUGE_DATA_SET. Probably self evident, but just in case.

%LET DSID = %SYSFUNC(OPEN(WORK.NEWFILES,I));

%LET TOTAL = %SYSFUNC(ATTRN(&DSID.,NOBS));

%PUT the number of observations is &TOTAL;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Scott_Mitchell

08-19-2013 01:57 PM

Hi Fugue and Scott_Mitchell,

Thanks. Then I ran the follwing SAS code.

%LET DSID = %SYSFUNC(OPEN(LOCATION.HUGE_DATA_SET,I));

%LET TOTAL = %SYSFUNC(ATTRN(&DSID.,NOBS));

%PUT the number of observations is &TOTAL;

It produced the following log note.

%LET DSID = %SYSFUNC(OPEN(pcfwh.U_PCF_PROPENSITY,I));

16

17 %LET TOTAL = %SYSFUNC(ATTRN(&DSID.,NOBS));

SYMBOLGEN: Macro variable DSID resolves to 2

18

19 %PUT the number of observations is &TOTAL;

SYMBOLGEN: Macro variable TOTAL resolves to -1

the number of observations is -1

It seems that the number of observations is not given correctly.

Thanks anyway to both of you.

Mirisage

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Mirisage

08-19-2013 02:06 PM

Which isn't really surprising. The ATTRN function is going to have the same numeric storage/display limitations as any other SAS numeric value.

I would wonder if the data warehouse has an equivalent of the SAS dictionary tables that could be queried.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ballardw

08-19-2013 07:59 PM

I was hoping for an error or a missing value, as opposed to -1 for number of observations not available.

I can't believe there isn't an accurate way to identify the number of observations in a dataset that size.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Mirisage

08-20-2013 04:58 PM

What version of SAS are you running?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Fugue

08-22-2013 08:04 PM

The reason I asked about the version of SAS you are running is that there is an EXTENDOBSCOUNT data set option beginning in SAS 9.3 that is designed for use with large datasets. I haven't thoroughly read the SAS documentation and we don't have datasets anywhere near the necessary size limits, so I'm not sure whether this option will have any effect on observation counts produced by PROC CONTENTS, etc.

Anyone know?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Fugue

08-23-2013 08:24 AM

Hi Tom and Fugue,

Sorry I could not respond you.

For Tom's question, What do you mean by a 'warehouse'? It is a Oracle data base. May be I cannot give the name of the data base for confidentiality

For Fugue's question: I am using SAS Ver 9.2 in Windows. It is emebedded in SAS EG ver 4.3. (My IT knowledge is not good, but I think we can say like ....."embedded in SAS EG").

Thanks every one for this discussion and great knowledge.

Regards

Mirisage

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Mirisage

08-23-2013 08:58 AM

Could he use sql pass thru and run a query to count the number of rows and convert the results to character before they are passed back to SAS?

You could also try the system table all_tables column num_rows however this isn't always up to date.