BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

I noticed that the EXIST function returns 0 if the data set is specified with parenthetical options such as a keep option:

1    data _null_ ;
2      x1=exist("sashelp.class") ;
3      x2=exist("sashelp.class(keep=name)") ;
4      put x1= x2=;
5    run ;

x1=1 x2=0

6
7    %put %sysfunc(exist(sashelp.class)) ;
1
8    %put %sysfunc(exist(sashelp.class(keep=name))) ;
0

That seems surprising to me.  Is that behavior others would expect?

 

This seems troubling because if I have  macro:

 

%macro foo(data=);

  %if %sysfunc(exist(&data)) %then %do;
    %* [do something] ;   
  %end;

%mend;

I typically want the macro to work if a user decides to pass the data set name with options, e.g.: %foo(data=mydata(keep=_numeric_))  or %foo(data=mydata(obs=5))

 

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Yes. So don't give the function the dataset options, just the dataset name.

%if %sysfunc(exist(%scan(&data,1,()))) %then %do;

View solution in original post

14 REPLIES 14
Tom
Super User Tom
Super User

Yes. So don't give the function the dataset options, just the dataset name.

%if %sysfunc(exist(%scan(&data,1,()))) %then %do;
Quentin
Super User

Indeed but there are thousands (millions?) of examples of %sysfunc(exist(&data)) out there in the world, all of which break if passed a data set with options.

 

If I suggested this behavior of the EXIST() function is a bug, would you disagree?

 

If it's not a bug, I'd say it's at least a limitation worth documenting.

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
Tom
Super User Tom
Super User

@Quentin wrote:

Indeed but there are thousands (millions?) of examples of %sysfunc(exist(&data)) out there in the world, all of which break if passed a data set with options.

 

If I suggested this behavior of the EXIST() function is a bug, would you disagree?

 

If it's not a bug, I'd say it's at least a limitation worth documenting.


Not sure it is a bug.  It might help to make it clearer in the documentation what it accepts. 

 

Tom
Super User Tom
Super User

Definitely NOT a bug.  It is working as intended and changing it would cause much more trouble that leaving it as it is.

 

Try this program:

data class; set sashelp.class; run;
%let dsn=class ;
%put &=dsn Exists=%sysfunc(exist(&dsn));
%let dsn=class(keep=name) ;
%put &=dsn Exists=%sysfunc(exist(&dsn));

options validmemname=extend;
data "&dsn"n ;
 set sashelp.class;
run;
%put &=dsn Exists=%sysfunc(exist(&dsn));
90   data class; set sashelp.class; run;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.CLASS has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.08 seconds
      cpu time            0.01 seconds


91   %let dsn=class ;
92   %put &=dsn Exists=%sysfunc(exist(&dsn));
DSN=class Exists=1
93   %let dsn=class(keep=name) ;
94   %put &=dsn Exists=%sysfunc(exist(&dsn));
DSN=class(keep=name) Exists=0
95
96   options validmemname=extend;
97   data "&dsn"n ;
98    set sashelp.class;
99   run;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.'CLASS(KEEP=NAME)'n has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.10 seconds
      cpu time            0.00 seconds


100  %put &=dsn Exists=%sysfunc(exist(&dsn));
DSN=class(keep=name) Exists=1
Quentin
Super User

I don't think validmemname=extend really helps.  I don't want the user to have to create a dataset where SAS sees the () as part of the name of the data set.

 

I'm struggling to think of a scenario where I would want exist() wiith data set argument passed with an option to return 0, so can't imagine what it would break if it was changed to return 1 if the data set exists.

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
Tom
Super User Tom
Super User

@Quentin wrote:

I don't think validmemname=extend really helps.  I don't want the user to have to create a dataset where SAS sees the () as part of the name of the data set.

 

I'm struggling to think of a scenario where I would want exist() wiith data set argument passed with an option to return 0, so can't imagine what it would break if it was changed to return 1 if the data set exists.


I don't understand your response. The function checks if the named dataset exists or not. It does not need or want dataset options. So don't pass it dataset options.  If you somehow were able to modify the function so that it did accept dataset options but ignored them then you will have no way to test for the existence of dataset names that contain parentheses in them. Which SAS currently allows when the VALIDMEMNAME option is set to EXTEND.

 

So we are back to my original answer to the question. Just don't pass in the dataset options. It is trivial to remove them using the %SCAN() function (or SCAN() function).  Of course that logic will also have to deal with any goofy names if the VALIDMEMNAME=EXTEND option is set.

Quentin
Super User

@Tom wrote:

@Quentin wrote:

I don't think validmemname=extend really helps.  I don't want the user to have to create a dataset where SAS sees the () as part of the name of the data set.

 

I'm struggling to think of a scenario where I would want exist() wiith data set argument passed with an option to return 0, so can't imagine what it would break if it was changed to return 1 if the data set exists.


I don't understand your response. The function checks if the named dataset exists or not. It does not need or want dataset options. So don't pass it dataset options.  If you somehow were able to modify the function so that it did accept dataset options but ignored them then you will have no way to test for the existence of dataset names that contain parentheses in them. Which SAS currently allows when the VALIDMEMNAME option is set to EXTEND.

 

So we are back to my original answer to the question. Just don't pass in the dataset options. It is trivial to remove them using the %SCAN() function (or SCAN() function).  Of course that logic will also have to deal with any goofy names if the VALIDMEMNAME=EXTEND option is set.




 

When a user passes a data set name to a macro, I want the user to be able to choose whether or not to pass data set options.  If someone  writes a %Print macro and it breaks if I call %Print(data=mydata(obs=5)), I don't like that macro.  Which means I don't like a LOT of macros out there that do something like:

 

%if NOT %sysfunc(exist(&dsn)) %then %do;
  %put ERROR: specified data does not exist;
%return; %end;

I think macro users should expect to be able to pass any valid data set reference to a DATA parameter.

 

So agree, I think I'm stuck having the macro strip off any existing options before passing the value to the EXIST() function.

 

I hadn't considered VALIDMEMNAME=EXTEND (since I avoid it).  I can imagine exist could decide whether to ignore options or not based on the setting of VALIDMEMNAME, but yeah that gets clunky.

 

 

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
gamotte
Rhodochrosite | Level 12

Hello,

 

Sorry but that doesn't make much sense to me. What would

 

exist("sashelp.class(keep=name)")

 

mean ?

 

sashelp.class(keep=name) describes a dataset that would be obtained by keeping only the name column in

the dataset sashelp.class. You can create as much such datasets as you want :

 

data class1 class2 class3;
  set sashelp.class(keep=name);
run;

It does not designate a specific dataset but is more a recipe to build a new dataset.

 

This behaviour thus seems normal to me.

Quentin
Super User

In my head, exist("sashelp.class(keep=name)") would return the same as exist("sashelp.class").

 

But I guess the issue is that while I think of the argument to exist as being a reference to a SAS data set (which I would hope to allow parenthetical options), per the docs it's actually the name of a SAS library member:

 

member-name

is a character constant, variable, or expression that specifies the SAS library member. Ifmember-nameis blank or a null string, then EXIST uses the value of the _LAST_ system variable as the member name.

 

So I guess it's fair that it treats the whole string (including parentheses) as part of the member-name it's checking to see if exists.

 

Guess I'll have to make a mental note to strip off the options if I ever using EXIST(&data) in a macro.  Luckily I usually don't do this check myself, preferring to let users get the usual "data doesn't exist" error rather than writing my own.

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
gamotte
Rhodochrosite | Level 12

From a maintenance point of view, if i want my macro to be flexible i would rather use additionnal parameters to the macro :

 

%macro mymacro(memname, libname=work, keep=);

    %if %sysfunc(exist(&libname..&memname.)) %then %do;

        data tmp;
            set &libname..&memname.

            %if %length(&keep.)>0 %then %do;
                (keep=&keep.)
            %end;
            ;

        run;

    %end;


%mend;

 

than having a single parameter that can accept values of different natures.

 

Quentin
Super User

Yes, @gamotte , that's the more common approach I see.  But I feel like that puts the burden on the user to remember/pass more parameters, and the burden on the macro author to add a LOT of parameters (keep, drop, where, rename ...) .  

 

I like allowing 

%print(data=foo(obs=5))

Because I like allowing the user to do whatever they want, e.g.:

%print(

            data=foo(keep=ID x1-_numeric_-x5 rename=(x1=Myx1 x2=MyX2) obs=5 where=(ID<10))

           )

 

The user of a macro already knows how to write a data set specification with options, because they know the SAS language.  So I like  letting them pass data set options in the familiar manner.

 

But I may be in the minority in that philosophy.

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
gamotte
Rhodochrosite | Level 12

A macro user uses it as a black box and is not supposed to care about its exact implementation.

Having separate parameters for each characteristic of the dataset you want to pass as argument allows to define precisely what usages are allowed for the macro. Having a single parameter that can take values :

- class

- sashelp.class

- sashelp.class(keep=name)

- class(where=(age>12))

makes it difficult for the macro user to anticipate if a particular call of the macro will work or result in an error.

 

That said, i agree with you that it can results in losses in conciseness and bloated programs.

I would use your approach for short programs that are for my own usage but if i want to provide a

macro to other users i would try to make the contract as clear as possible.

 

Another option is to create the transformed dataset prior to the macro call.

Quentin
Super User

@gamotte wrote:

A macro user uses it as a black box and is not supposed to care about its exact implementation.

Having separate parameters for each characteristic of the dataset you want to pass as argument allows to define precisely what usages are allowed for the macro. Having a single parameter that can take values :

- class

- sashelp.class

- sashelp.class(keep=name)

- class(where=(age>12))

makes it difficult for the macro user to anticipate if a particular call of the macro will work or result in an error.

 

Most of my macros have a DATA parameter where the user passes the data set to be read.  The SAS language uses a DATA parameter (though it's not called that) on DATA steps and PROC steps.  So my bias is to say that users should anticipate that a data set reference which would be valid on a PROC step or DATA step should be work when passed to a macro with a DATA parameter.  Similarly if a macro has a VAR parameter, i like it to support all the different kinds of variable lists that SAS allows.

 

Totally agree the macro should be black box. 

 

Suppose someone has written a lovely macro to do something fancy.  And suppose I choose to have my local SAS datasets password protected.  Seems fair that I would want to do:

%lovelymacro(data=mydata(read=MyPassword))

and expect it to work, rather than have to muck with the macro definition to extend it so that it honors a password option.

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
gamotte
Rhodochrosite | Level 12

OK, i get your point and it seems valid to me now that understand it better.

 

I think the name of the parameter, "data", can be confusing though because it doesn't tell much about what type of values are allowed.

 

It would maybe gain in readability to have a separate argument for dataset options :

 

%macro mymacro(memname, libname=work, dataset_options=);

	proc print data=&libname..&memname.(&dataset_options.);
	run;

%mend;

%mymacro(class, libname=sashelp, dataset_options=where=(name="Alice" or age gt 15) keep=name age);

 

But the most important is that you use a consistent rule for all your macros.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 1686 views
  • 4 likes
  • 3 in conversation