DATA Step, Macro, Functions and more

Count number of records in list of files?

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 90
Accepted Solution

Count number of records in list of files?

Hello everyone,

I have a need to count the number of records in a list of dynamic files and create a new small table with the name of the file as the key and the total number of records as one data field.   (Well in next version[s] I will be adding a per term [strm] record count field/term, and snapshot, etc.)   Anyway I am not sure how to go about this,  I can easily count a file or two but I think I need an array or a do loop and am not sure what is the best way given my long term goal.  ...and being a bit of a hack at SAS with just one year of experience have as of yet to try a single array. I am hoping to get some advice. And maybe a bit of starter code.  Anyway here is some of my starter code (SAS 9.4):

*be sure to update the snapshot and term in set_variable.sas program for each run...;
%include "&pname.set_variables.sas";
libname test odbc dsn=test schema = dbo;

%let Current_DB=test;
%put &Current_DB;

%put &snapshot;
%put &strm;
options nocenter;
options ls=256;

proc sql;
create table mytables as
select distinct TABLE_CATALOG length=8,
TABLE_NAME length=32,
'9999999999' as file_obs_ct length 10
from &Current_DB..my_metadata;
quit;

data mytables2;
format TABLE_CATALOG $8.;
format TABLE_NAME $32.;
set mytables;
run;

data mytables;
set mytables2;
run;

proc sql noprint;
select TABLE_NAME into: TABLE_NAME_list
separated by '","'
from mytables
where lower(TABLE_NAME) like "xw_%";
quit;
%put "&TABLE_NAME_list.";

proc sql noprint;
select distinct count(TABLE_NAME) into: TABLE_NAME_ct
separated by '","'
from mytables
/*where lower(TABLE_NAME) like "xw_%";*/
quit;
%put &TABLE_NAME_ct.;

TIA  -Keith

 


Accepted Solutions
Solution
‎09-06-2016 07:41 PM
Frequent Contributor
Posts: 90

Re: Count number of records in list of files?

I ended up using the SQL query, with some SAS selects to build my lists* of tables based on does the file contain a strm or not, if a strm does it also have a snapshot.  A strm can be thought of as a student term.  Anyway from the point of already having the three lists of files here is the most complicated sample for anyone who might like it macro and the automated way to call it.   I am sure some out there can make it ever shorter but this is a template so I can via meta query call any db I have and built grand totals, strm totals, and strm/snapshot totals for any given file type.

 

*those selects to build my lists of files are not shown here.

 

%macro get_counts_strm_snapshot(My_file);
data &My_file.;
set &Current_DB..&My_file.;
run;
proc sort data = &My_file.;by strm snapshot;run;
data &My_file.1;
  set &My_file.;
  count + 1;
  by strm snapshot;
  if first.snapshot then count = 1;
run;
/*proc print data = &My_file.1 noobs n;*/
/*var strm snapshot count;*/
/*run;*/
proc sql;
create table &My_file.2 as
select distinct 
	strm,
	snapshot,
	max(count) as strm_snapshot_count
from &My_file.1
group by strm;
quit;
title "&Current_DB. : &My_file.";
proc print data=&My_file.2 noobs n;run;
title;
/* deleting temp dataset(s), keeping dataset name -> My_file.2 for later use */
proc datasets library=WORK;
   delete &My_file.1 &My_file.;
run;
%mend get_counts_strm_snapshot;


/* and my call to execute them */

data _NULL_;
/* current list of files that have strm plus snapshots */ set mytable_strm_snapshot; code = catt('%nrstr(', '%get_counts_strm_snapshot(',TABLE_NAME,')', ')'); call execute(code); run; quit;

I am sure there are a lot of ways to do this, however in my case I can change my db name and cross check are my tables from test to production in sync like they should be at any given time, and also not care about if I have developed a new table or changed a table from no strm to strm based and still check if everything count is equal.

View solution in original post


All Replies
Frequent Contributor
Posts: 90

Re: Count number of records in list of files?

My actual first attempt was this, but I don't know how to keep track of the results from file to file:
/*temp trial with just one file */
%let file=xw_term;
%macro find_ct(file);
proc sql noprint;
select count(*) into: file_ct
from &Current_DB..&file.;
quit;
%GLOBAL &file._file_ct;
%let &file._file_ct=&file_ct.;
%put &file._file_ct;
%mend find_ct;
%find_ct(&file.);

%put &file._file_ct;
%put &xw_term_file_ct.;
Super User
Posts: 10,516

Re: Count number of records in list of files?

You may be leaving out part of the description of the problem if you are reading from an external database.

I don't work in such environment so I am not sure if this will work in that case.

SAS maintains some metadata about library members in dictionary.tables (and others) for use with SQL or SASHELP.VTABLE.

 

So it may be easier to query those tables (if the code below works)

proc sql;
   create table want as
   select memname,nobs
   from dictionary.tables
   where libname='TEST';
quit;

You may need to examine the contents of the dictionary table as some things are stored in all caps and some mixed case. So either use upcase (or lowcase) on the values when using equal comparisons.

 

Frequent Contributor
Posts: 90

Re: Count number of records in list of files?

[ Edited ]

Actually you are right, I am reading against a MS SQL db.   I was trying to keep it simple and solely in SAS terms.  Sorry if I made it worse or left out critical details.

 

pre-  MS SQL code run:

 

DROP TABLE my_metadata;

SELECT distinct *
INTO my_metadata
FROM INFORMATION_SCHEMA.columns
where TABLE_NAME<>'my_metadata';

 

 On the one hand if my tables were few enough and small enough I could just read the full tables into a data set and use your idea locally but they are too large for my run time needs... Thanks for the reply -KJ

Super User
Posts: 10,516

Re: Count number of records in list of files?

I would look to an MS SQL equivalent of the SAS dictionary table. It is very likely that something similar exists in the DB metadata. It would be much more efficient. A quick google search of "MS SQL DICTIONARY TAble" turned up a fair number of results, so it may not be necessary to reinvent a wheel. Hopefully the data is exposed in such a way that you could read it using SAS.

 

Though you may get to modify the code when SQL gets 'updated';

Frequent Contributor
Posts: 90

Re: Count number of records in list of files?

I might take your idea and run with it, but first I found another resource, I hope cross listing is okay here? UCLA has some info on "enumerate" that sounds promising.

http://www.ats.ucla.edu/stat/sas/faq/enumerate.htm

…based on their simple example I could for a next step Proc SQL another table dropping in their shown case “score” field (in other words all fields I do not wish to summarize), with the count field I could use max(), use the ‘distinct’ key word, and last I would group by my remaining fields.   I think that is in line with my needs.   Then only issue is changing from file to file …    Have I forgotten anything?   Macro-ing across files/tables is my bain  -KJ

Solution
‎09-06-2016 07:41 PM
Frequent Contributor
Posts: 90

Re: Count number of records in list of files?

I ended up using the SQL query, with some SAS selects to build my lists* of tables based on does the file contain a strm or not, if a strm does it also have a snapshot.  A strm can be thought of as a student term.  Anyway from the point of already having the three lists of files here is the most complicated sample for anyone who might like it macro and the automated way to call it.   I am sure some out there can make it ever shorter but this is a template so I can via meta query call any db I have and built grand totals, strm totals, and strm/snapshot totals for any given file type.

 

*those selects to build my lists of files are not shown here.

 

%macro get_counts_strm_snapshot(My_file);
data &My_file.;
set &Current_DB..&My_file.;
run;
proc sort data = &My_file.;by strm snapshot;run;
data &My_file.1;
  set &My_file.;
  count + 1;
  by strm snapshot;
  if first.snapshot then count = 1;
run;
/*proc print data = &My_file.1 noobs n;*/
/*var strm snapshot count;*/
/*run;*/
proc sql;
create table &My_file.2 as
select distinct 
	strm,
	snapshot,
	max(count) as strm_snapshot_count
from &My_file.1
group by strm;
quit;
title "&Current_DB. : &My_file.";
proc print data=&My_file.2 noobs n;run;
title;
/* deleting temp dataset(s), keeping dataset name -> My_file.2 for later use */
proc datasets library=WORK;
   delete &My_file.1 &My_file.;
run;
%mend get_counts_strm_snapshot;


/* and my call to execute them */

data _NULL_;
/* current list of files that have strm plus snapshots */ set mytable_strm_snapshot; code = catt('%nrstr(', '%get_counts_strm_snapshot(',TABLE_NAME,')', ')'); call execute(code); run; quit;

I am sure there are a lot of ways to do this, however in my case I can change my db name and cross check are my tables from test to production in sync like they should be at any given time, and also not care about if I have developed a new table or changed a table from no strm to strm based and still check if everything count is equal.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 407 views
  • 1 like
  • 2 in conversation