BookmarkSubscribeRSS Feed
Sean_OConnor
Fluorite | Level 6

Folks,

 

I've a query which perhaps people could shed some light on. 

 

I'm utilsing the following piece of code to to get a count of the number of observations from each of my datasets, for a number of different libs.

 

proc sql; 
create table data as
	select libname, memname, nvar from dictionary.tables
where libname in ('BUS2011','BUS2012','BUS2013','BUS2014','BUS2015');quit;

However, within these datasets there is a particular variable I would like to get a count of. Let us call it 'ID'. ID is common in all datasets, across all years. 

 

Within ID there are two indicators to whether the variable is blank. This is either 'XXXX' or 'NONE'. 

 

What I would like to do is tell SAS if ID has either value to delete it. 

 

From here I would only be left with values with an actual ID. I would then like a count of all of these observations.

 

I understand how to this in a datastep, but the fact I'm dealing with 5 different libs, with 20 datasets in each one, would mean it would take some time to carry this out. 

 

Thus, I've turned to SQL, which I'm sure could speed up the process, but I'm just not too sure how to carry this out. 

 

Any help would be much appreciated. 

7 REPLIES 7
RW9
Diamond | Level 26 RW9
Diamond | Level 26

SQL is unlikely to speed things up, if anything on large datasets, it may well slow things down.  As for your question, the simplest method is to just generate the required code:

data _null_;
/* Expand this as you need with libnames, it gets a list of all datasets in */
/* the given libnames */

  set sashelp.vtable (where=(libname in "BUS2011","BUS2012"));

/* Using this list create a datastep for each dataset in those libnames */
/* Note I am updating them to v_<libname> - as I assume you don't want */
/* to overwrite */

  call execute('data v_'||catx('.',libname,memname)||'; set '||catx('.',libname,memname),'; where id not in ('XXXX','NONE'); run;');
run;   

So for every dataset in th egiven libnames this datastep will generate a datastep which looks like:

data v_<libname>.<dataset>;

  set <libname>.<dataset>;

  where id not in ('XXXX','NONE');

run;

 

So in effect copy the dataset and dropping those records.  You can then pull out the num obs from the created datasets.  

Sean_OConnor
Fluorite | Level 6

RW9,

 

Thank you for this, but I'm a bit confused by the required code.

proc sql; 
create table vtable as
	select libname, memname, nvar from dictionary.tables
	
where libname in ('BUS2011');quit;


data _null_;
/* Expand this as you need with libnames, it gets a list of all datasets in */
/* the given libnames */

  set sashelp.vtable (where=(libname in ('BUS2011')));

/* Using this list create a datastep for each dataset in those libnames */
/* Note I am updating them to v_<libname> - as I assume you don't want */
/* to overwrite */

 call execute('data v_'||catx('.',libname,memname)||'; set '||catx('.',libname,memname),'; where id not in ('XXXX','NONE'); run;');
run; 

I'm wondering is there a need for the quottation marks in the line with call excute.?

 

When I attempt to run it I get the following errors;

 

ERROR: Invalid hexadecimal constant string '; where id not in ('X.
ERROR 388-185: Expecting an arithmetic operator.

ERROR 76-322: Syntax error, statement will be ignored.

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Firstly, you do not need the proc sql part.  Next, yes quotes are needed.  Call execute() function requires a string, which is injected into the Base SAS compiler after the datastep stops executing.  This string therefore needs to resolve to valid SAS sysntax.

data _null_;
  set sashelp.vtable (where=(libname in ('BUS2011')));
  call execute('data v_'||catx('.',libname,memname)||'; 
set '
||catx('.',libname,memname),';
where id not in ("
XXXX","NONE");
run;'
); run;
 

You will note I put double quotes around the in() now, that is why you were getting the error (if I had some test data as a datastep, required output, then I would be able to test it directly, but I don't).  

Astounding
PROC Star

Looking at these features:

 

  • You already have a table with all the data sets that need to be processed
  • You will need to go through each data set to count the valid ID values

Why not get a little extra information?  Construct a program that looks like this:

 

proc format;

value $mygroup ' '='Blank  'XXXX', 'NONE' = 'Invalid' other='Valid';

run;

 

proc freq data=memname.libname;

title "memname.libname";

tables id / missing;

format id $mygroup.;

run;

 

The trick is getting SAS to construct the dozens of PROC FREQ steps.  So hard-code the PROC FORMAT, then:

 

data _null_;

set data;

call execute(cats('proc freq data=', libname, '.', memname, ';'));

call execute(cats('title "', libname, '.', memname, '";'));

call execute ('tables id / missing; format id $mygroup.; run;');

run;

Sean_OConnor
Fluorite | Level 6

Folks, 

 

Could I extend said code to only look at certain datasets within the quoted libs?

 

Something such as;

 

data _null_;

set sashelp.vtable 
(where=(libname in ('P2005','P2006','P2007','P2008','P2009','P2010','P2011','P2013','P2014','P2015')));
(where=(memname in ('ACCOUNTS','PERSONAL','TRADE'))); 

call execute(cats('proc freq data=', libname, '.', memname, ';'));

call execute(cats('title "', libname, '.', memname, '";'));

call execute ('tables id / missing; format id $mygroup.; run;');

run;
Astounding
PROC Star

Yes, but there would still be only one WHERE clause used.  One possibility:

 

set sashelp.vtable 
(where=(libname in ('P2005','P2006','P2007','P2008','P2009','P2010','P2011','P2013','P2014','P2015')
 and memname in ('ACCOUNTS','PERSONAL','TRADE'))); 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Yes, as you have done, but your where is wrong:

(where=(libname in ('P2005','P2006','P2007','P2008','P2009','P2010','P2011','P2013','P2014','P2015') and memname in ('ACCOUNTS','PERSONAL','TRADE'))); 

If you want only certain datasets within the libraries, then you would need both levels:

(where=(catx('.',libname,memname) in ('P2005.ACCOUNTS')));

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 947 views
  • 0 likes
  • 3 in conversation