Hi SAS Forum;
Could you please help me on this layman question.
I ran the below code for my large data set to see how many records are there.
Proc contents data=huge_data_set varnum;
run;
The output table generated has a dot in front of the "number of observations"
and it does not show the real number of observations in the data set.
Q:Why Proc contents doesn't provide the "number of observations" ?
Thanks for your time
Mirisage
And apparently from another thread, if the number records exceeds the maximum value that SAS will store in a numeric variable then Contets reports missing. Depending on how precise you need the record count, you can get an estimate from the the Engine/host dependent information. Use ("Number of data set pages" - "First Data Page")* "Max obs per page".
The first "pages" store the header information and varies depending on numbers of variables and lengths of labels and such.
You could try looking up the number of observations in the Dictionary table Members, but will probably find the number of records there missing as well.
There are several possibilities as to why the number of observations would be unknown.
1. The data is stored on tape.
2. The data is stored on disk, but in tape format.
3. The data set is actually a view rather than a data set.
You probably already know which of these applies ... but if you don't then it's likely to be #2.
Good luck.
And apparently from another thread, if the number records exceeds the maximum value that SAS will store in a numeric variable then Contets reports missing. Depending on how precise you need the record count, you can get an estimate from the the Engine/host dependent information. Use ("Number of data set pages" - "First Data Page")* "Max obs per page".
The first "pages" store the header information and varies depending on numbers of variables and lengths of labels and such.
You could try looking up the number of observations in the Dictionary table Members, but will probably find the number of records there missing as well.
Hi Astounding and ballardw,
Many thanks to both of you for providing me with this knowledge.
Best regards
Mirisage
What happens if you use the following?
%LET DSID = %SYSFUNC(OPEN(WORK.NEWFILES,I));
%LET TOTAL = %SYSFUNC(ATTRN(&DSID.,NOBS));
Does the macro variable store the correct number of observations?
Hi Scott_Mitchell,
I am very curious to run your code and see.
My very large data set is stored in a warehouse.
Lets say I assign the library name to the warehouse as "location".
Then the proc contenets code that has not provided us with the actual number of observations is like below.
Proc contents data=location.huge_data_set varnum;
run;
I can understand only that much.
Question:
Then what should I do to apply your code?
Is that just copy your below code into my sas editor and running?
%LET DSID = %SYSFUNC(OPEN(WORK.NEWFILES,I));
%LET TOTAL = %SYSFUNC(ATTRN(&DSID.,NOBS));
Please let me know.
Thanks
Mirisage
Take Scott's code, paste it into a SAS program and submit it. Add %put statements to output the results to the LOG. Something like this . . .
%LET DSID = %SYSFUNC(OPEN(WORK.NEWFILES,I));
%LET TOTAL = %SYSFUNC(ATTRN(&DSID.,NOBS));
%PUT the number of observations is &TOTAL;
You also need to change WORK.NEWFILES to the name of your dataset LOCATION.HUGE_DATA_SET. Probably self evident, but just in case.
%LET DSID = %SYSFUNC(OPEN(WORK.NEWFILES,I));
%LET TOTAL = %SYSFUNC(ATTRN(&DSID.,NOBS));
%PUT the number of observations is &TOTAL;
Hi Fugue and Scott_Mitchell,
Thanks. Then I ran the follwing SAS code.
%LET DSID = %SYSFUNC(OPEN(LOCATION.HUGE_DATA_SET,I));
%LET TOTAL = %SYSFUNC(ATTRN(&DSID.,NOBS));
%PUT the number of observations is &TOTAL;
It produced the following log note.
%LET DSID = %SYSFUNC(OPEN(pcfwh.U_PCF_PROPENSITY,I));
16
17 %LET TOTAL = %SYSFUNC(ATTRN(&DSID.,NOBS));
SYMBOLGEN: Macro variable DSID resolves to 2
18
19 %PUT the number of observations is &TOTAL;
SYMBOLGEN: Macro variable TOTAL resolves to -1
the number of observations is -1
It seems that the number of observations is not given correctly.
Thanks anyway to both of you.
Mirisage
Which isn't really surprising. The ATTRN function is going to have the same numeric storage/display limitations as any other SAS numeric value.
I would wonder if the data warehouse has an equivalent of the SAS dictionary tables that could be queried.
I was hoping for an error or a missing value, as opposed to -1 for number of observations not available.
I can't believe there isn't an accurate way to identify the number of observations in a dataset that size.
What version of SAS are you running?
The reason I asked about the version of SAS you are running is that there is an EXTENDOBSCOUNT data set option beginning in SAS 9.3 that is designed for use with large datasets. I haven't thoroughly read the SAS documentation and we don't have datasets anywhere near the necessary size limits, so I'm not sure whether this option will have any effect on observation counts produced by PROC CONTENTS, etc.
Anyone know?
Hi Tom and Fugue,
Sorry I could not respond you.
For Tom's question, What do you mean by a 'warehouse'? It is a Oracle data base. May be I cannot give the name of the data base for confidentiality
For Fugue's question: I am using SAS Ver 9.2 in Windows. It is emebedded in SAS EG ver 4.3. (My IT knowledge is not good, but I think we can say like ....."embedded in SAS EG").
Thanks every one for this discussion and great knowledge.
Regards
Mirisage
Could he use sql pass thru and run a query to count the number of rows and convert the results to character before they are passed back to SAS?
You could also try the system table all_tables column num_rows however this isn't always up to date.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.