DATA Step, Macro, Functions and more

Creating a Dataset only if Observation Count > 0

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 17
Accepted Solution

Creating a Dataset only if Observation Count > 0

Hello,

 

I am looking to create the following logic:

 

1. Pull data from database

2. If the dataset is not empty, then run a data step to create a final dataset.

 

In the flow below, curpd_dataset is the data from database. however I cannot figure out a way to get the syntax correct:

 

data _NULL_;

if 0 then set curpd_dataset nobs=num_obs;

if num_obs = 0 then call execute ('%put Using Existing Data;');

if num_obs > 0 then 'DO DATA STEP HERE';

                   data libname.FINAL_DATASET;

                   set curpd_dataset ;

stop;

run;

 

Anyone have any tips on a way to make this work? I would eventually want to integrate this into a Macro.

 

Thanks

 

 


Accepted Solutions
Solution
‎03-17-2016 09:44 AM
Super User
Posts: 5,498

Re: Creating a Dataset only if Observation Count > 0

Here's the approach I would use.  You have already identified CALL EXECUTE as a viable tool here.

 

data _null_;

if done then put 'There were no observations.  Use the existing data.';

else call execute('data libname.final_dataset; set curpd_dataset; run;');

stop;

set curpd_dataset end=done;

run;

 

I'm not sure if you intended more code to go into that final DATA step, but you could always add more.  As you mentioned, if the DATA step is actually more complex, you might want to convert the code to a macro instead.  But so far, there isn't a need to do that.

View solution in original post


All Replies
Super User
Posts: 5,424

Re: Creating a Dataset only if Observation Count > 0

Try to use proc contents or data step functions for SAS data sets or dictionary.tables from PROC SQL or sashelp.vstable.vstable.
Data never sleeps
SAS Super FREQ
Posts: 708

Re: Creating a Dataset only if Observation Count > 0

Hi

 

Please note, that the NOBS= option of the SET statement can only be used with SAS data sets. If your data is coming from a DBMS like Oracle, DB2 etc, then this approach can not be used. Instead you have to count the number of rows.

 

To run code contionally, one usually uses the SAS Macro Language.

 

Find below an example that shows how to count the rows, and then run some DATA Step.

 


%macro createData(dsn=);
  %local rowCount;

  proc sql noprint;
    select
      count(*)
    into  
      :rowCount trimmed
    from
      &dsn
    ;
  quit;

  %put NOTE: &=rowCount;

  %if &rowCount > 0 %then %do;

    data newTable;
      set &dsn;
    run;

  %end;
  %else %do;
    %put NOTE: &sysmacroname &dsn is empty;
  %end;
%mend;

%createData(dsn=xora.tempcars)

Bruno

Super Contributor
Posts: 438

Re: Creating a Dataset only if Observation Count > 0

[ Edited ]
Posted in reply to Bruno_SAS

Bruno,
Be very careful to count the number of rows in a table that is not acessed by the base engine (eg. a RDBMS). You may hit a full table scan and if there are gazillions of rows you will regret the approach. Trying to read one row and determining success or failure is usually better in that case.

Regards,
- Jan.

SAS Super FREQ
Posts: 708

Re: Creating a Dataset only if Observation Count > 0

Posted in reply to jklaverstijn

Hi Jan

 

Thanks for pointing this out. Some SAS/ACCESS to Interfaces will pass down the OBS= data set option as part of the generated SQL Query. There might be various ways to optimize this, depending on the DBMS used.

 

Bruno

Super Contributor
Posts: 438

Re: Creating a Dataset only if Observation Count > 0

Posted in reply to Bruno_SAS
Hi Bruno,

You're welcome. Problem with many (R)DBMS's is that the row count is not Always accurate and up-to-date. Oracle for instance keeps a count in the table statistics but these are updated explicitly and not automatically. So the count may be off. Even considerably. Or not be available in the first place. I must assume that the performance penalty in transactional systems is prohibitive. As long-time SAS users we have become accustomed to NOBS and even NLOBS, but it is a luxury that other vendors not always provide.
Solution
‎03-17-2016 09:44 AM
Super User
Posts: 5,498

Re: Creating a Dataset only if Observation Count > 0

Here's the approach I would use.  You have already identified CALL EXECUTE as a viable tool here.

 

data _null_;

if done then put 'There were no observations.  Use the existing data.';

else call execute('data libname.final_dataset; set curpd_dataset; run;');

stop;

set curpd_dataset end=done;

run;

 

I'm not sure if you intended more code to go into that final DATA step, but you could always add more.  As you mentioned, if the DATA step is actually more complex, you might want to convert the code to a macro instead.  But so far, there isn't a need to do that.

Occasional Contributor
Posts: 17

Re: Creating a Dataset only if Observation Count > 0

Posted in reply to Astounding

Thanks all for the replies! I don't do too much in the final data step, so this will fit my needs for now!

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 560 views
  • 4 likes
  • 5 in conversation