BookmarkSubscribeRSS Feed
PaigeMiller
Diamond | Level 26

I have seen (and used) DOSUBL in code to create a list of variable names in a SAS data step. And it seems as if DOSUBL provides a lot of power that can be used effectively, but I am unable to think of places where it might be used, other than the case where you want that list of variable names. So, where else could you use it? Give me some examples! (I'm not really asking for code, I'm asking for opportunities or uses of DOSUBL. Most likely, I can write my own code)

--
Paige Miller
15 REPLIES 15
ChrisHemedinger
Community Manager

You probably saw the paper by Rick Langston, published back when he added DOSUBL to the SAS language? A few patterns in there, including an improvement where DOSUBL is more elegant than the RUN_MACRO function for some purposes.

 

I've seen some impressive (and kind of crazy) examples from SAS-L members with ideas for how to employ DOSUBL. Sometimes these uses border on "abuses" and the results aren't what was expected. That said, I've found it useful as a means to an end for some tasks, like this one.

Check out SAS Innovate on-demand content! Watch the main stage sessions, keynotes, and over 20 technical breakout sessions!
PaigeMiller
Diamond | Level 26

Thanks, @ChrisHemedinger for your example using PROC IOMOPERATE.

 

I think I grasp what you are doing there, you get a data set called SPAWNED, then for each row of the data set you run a PROC using DOSUBL, that creates an output data set for each row, then you combine all the output data sets.

 

Since I am not a SAS administrator, I'm still struggling to think of cases where a statistician such as myself (who has to do lots data manipulation) would want to run a PROC on each row of a data set. Since I have had a whole 10 minutes to think about it, I'm not surprised that I'm coming up blank so far, and it may be that on June 23, 2022, I will have a project that needs this capability and then I'll remember ... um ... what was it I was supposed to remember?

--
Paige Miller
Quentin
Super User

How about an example where you want to calculate the cardinality of a bunch of variables, perhaps from different datasets?

 

Below I write a macro function that uses PROC SQL to calculate cardinality, and then call the macro in a DATA step.  It will run slowly (much slower than Chris's https://blogs.sas.com/content/sasdummy/2013/10/16/about-cardinality/), but still seems helpful that it works.

 

%macro cardinality(data=,var=);
  %local rc emit; 

  %let rc=%sysfunc(dosubl(%nrstr(
     proc sql noprint;
       select 
         count(distinct &var) into :emit
       from 
         &data
     quit; 

  )));

  &emit 
%mend ;


data testing (keep = libname memname name distinct_count); 

  set sashelp.vcolumn; 
  where libname = 'SASHELP' and memname = 'CARS'; 

  distinct_count=resolve('%cardinality('
                                || ' data= ' || cats(libname,'.',memname)
                                || ',var= '  || name
                                || ')'
                      ) ;
  put (libname memname name distinct_count)(=) ; 
run;

So any case where you want to run a model on multiple datasets (or multiple models) and collect a result (fit statistic or whatever), it might be handy.

 

But of course you don't need DOSUBL for that.  You could use a macro, or maybe just a by-statement, or CALL EXECUTE, or %include....  Probably just about anything that can be solved with DOSUBL could also be solved without DOSUBL.  SAS was Turing-complete before DOSUBL. : )

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
PaigeMiller
Diamond | Level 26

Or use PROC FREQ with NLEVELS option.

 

My thoughts so far ... suppose I had a data set where I wanted to run some PROC on each row. What would I do without DOSUBL? I would transpose all the rows, then run the PROC on the resulting transposed data set with a BY variable. So far, I haven't found an example like from Chris Hemedinger that DOSUBL adds functionality that I don't already have (or even where DOSUBL makes the programming easier). I get the feeling I'm still missing something, however. Maybe I need to be thinking of non-statistical PROCs, as was the case for Chris.

--
Paige Miller
Quentin
Super User

One place I think it's really handy is when writing function-style macros (like %Varlist, as you mention).

 

Before DOSUBL, writing function-style macros was often cumbersome because the macro couldn't have any SAS statements in them.  

 

With DOSUBL you can write a function-style macro that executes a DATA step, or multiple DATA steps / PROC steps.

 

So if you take any function-style macro that in the past I would would have written using %sysfunc to call cumbersome SCL functions to process data or metadata, today I would consider writing that via DOSUBL.  %Varlist, %CountObs, %AnyObs, etc.

 

Rick's original paper (https://support.sas.com/resources/papers/proceedings13/032-2013.pdf) also describes a use-case for PROC STREAM, which I haven't tried.

 

That said, I too would be interested in more use-case for DOSUBL.  It's almost 10 years old now, and you don't see or hear much about it in user group papers etc.   It seems quite revolutionary in what it allows.

 

@rogerjdeangelis you're a DOSUBL fan/advocate.  Care to jump in with some of your favorite use-cases?  

 

Don't know if @yabwon  or @AllanBowe have found any 'production' uses for it in your work?

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
AllanBowe
Barite | Level 11

Thanks for the callout!

 

The major problem with `dosubl()` is performance - it gets expensive once you call it a few times.  For particular use cases though, it's fantastic!

 

We used it to build an "SCL Transcoding Kit" - an accelerator for projects looking to modernise legacy AF/SCL applications.  You can see it in action here:  https://sasapps.io/modernising-legacy-sas-scl-af-applications

 

Specifically, dosubl() enabled us to handle SLISTs and call modules without breaking out of an SCL method.

/Allan
SAS Challenges - SASensei
MacroCore library for app developers
SAS networking events (BeLux, Germany, UK&I)

Data Workflows, Data Contracts, Data Lineage, Drag & drop excel EUCs to SAS 9 & Viya - Data Controller
DevOps and AppDev on SAS 9 / Viya / Base SAS - SASjs
yabwon
Onyx | Level 15

Hi Team,

 

In my case, I'm using DoSubL() a lot in the macroarray package (https://github.com/yabwon/SAS_PACKAGES/blob/main/packages/macroarray.md).

 

For example the %array() macro uses it in two ways:

1) The syntax for creating an array is basically a data step syntax, which then I embed inside a data step in a %sysfunc(DoSubL(...)) wrapper, so the following:

 

%array(d x1-x5 (5 4 3 2 1)) 

%put _user_;%array(d[-2:2] $ ("a" "b" "c" "d" "e"))

creates 5 macro variables:

 

 

GLOBAL D1 5
GLOBAL D2 4
GLOBAL D3 3
GLOBAL D4 2
GLOBAL D5 1

And since it puts code inside a data step, when used like that, with a function:

 

 

%array(UL[26] $, function = byte(rank("A")+_I_-1) )
  %put &=UL1 &=UL2 ... &=UL25 &=UL26;

we get:

 

UL1=A UL2=B ... UL25=Y UL26=Z

2) The second useful thing is that it helps me to generate macros with dynamic names, more or less like that:

 

 

resetline;
%macro GenerateMacro(name);
%local mtext;
%let mtext = _%sysfunc(datetime(),hex16.)_;
%let rc = %sysfunc(
dosubl(
/*===============================================*/
  options nonotes nosource %str(;)
  DATA _NULL_ %str(;)
    CALL SYMPUTX("&mtext.",
      '%MACRO ' !! "&name.(n,k);" !! 
        " data _&name._;        " !!
        '   do i = 1 to &n.;    ' !!
        '     x = ranuni(&k.);  ' !!
        '     output;           ' !!
        '   end;                ' !!
        ' run;                  ' !!
      '%MEND;', 'G') %str(;)
   STOP %str(;) 
  RUN %str(;) 
/*===============================================*/
)); &&&mtext. %symdel &mtext. / NOWARN ;

%mend GenerateMacro;

%GenerateMacro(ABCDEF)

%abcdef(3,4321)

%GenerateMacro(UVWXYZ)

%UVWXYZ(17,333)

the result in the log is:

 

 

25   %GenerateMacro(ABCDEF)
26
27   %abcdef(3,4321)

NOTE: The data set WORK._ABCDEF_ has 3 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


28
29   %GenerateMacro(UVWXYZ)
30
31   %UVWXYZ(17,333)

NOTE: The data set WORK._UVWXYZ_ has 17 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

So in the macroarray package when I write:

 

 

%array(UL[26] $, function = byte(rank("A")+_I_-1), macarray=Y)

I will not only get  a list of 26 macrovariables: UL1 to UL26 but also a macro %UL() will be generated, so I can write:

 

 

%put %UL(1) %UL(2) %UL(3) ... %UL(25) %UL(26);

to "mimic" the data step array behaviour.

 

 

Also the %deleteMacArray(), %concatArrays(), %appendCell(), and %mcHashTable() macros use this approach.

 

One more thing, DoSubL() performs poorly in the "do-loop(doSubL())" setup, i.e. DoSubL() is called inside a loop. But it works quite well in reversed setup when you call a loop inside DoSubL() [the "doSubL(do-loop())" setup], so for example something like a "wholesale Metadata Acquisition" can be more convenient to do. Like in this example:

resetline;
option dlcreatedir;
libname test "%sysfunc(pathname(work))/test";

data test.A;
  do i = 1 to 17;
    output;
  end;
run;

data test.B;
  do i = 1 to 42;
    output;
  end;
run;

data test.C;
  do i = 1 to 303;
    output;
  end;
run;


%macro wholesaleMetadataAcquisition(lib);

%local rc;
%let rc = %sysfunc(DoSubL(%str(
options nonotes nosource;
proc sql;
  create table _temp_ as
  select memname, nobs
  from dictionary.tables
  where libname = %upcase("&lib.")
  ;
quit;
data _null_;
  set _temp_;
  call symputx("nobs_" !! strip(memname), nobs, "G");
run;
proc delete data = _temp_;
run;
)));

%mend;


%wholesaleMetadataAcquisition(test)

%put &=nobs_A. &=nobs_B. &=nobs_C.;

 

I will scratch my head a bit more later and maybe will find something else.

 

 

All the best

Bart

 

 

 

 

 

 

 

 

 

 

 

 

 

 

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Quentin
Super User

Thanks @yabwon I really like your %array example.  It's a nice case where we know that the DATA step knows how to do something really well (and easily), but it would be hard to do it in the macro language.  So DOSUBL lets you do it in the macro language (via %sysfunc) but just use the familiar DATA  step in the background.

 

It's like the %varlist example.  20 years ago I saw someone do a presentation of a %varlist macro.  And they had done a TON of work to support/expand any sort of valid varlist specification.  So you could do %varlist(foo) or %varlist(foo,keep=_numeric_), or %varlist(foo,keep=bar: ).  It was a TON of pure macro code to cover all the different kinds of variable lists.  But the data step language (and proc language) already knows how to interpret a variable list, so with DOSUBL you can just let it do the work:

 

%macro VarList(data) ;
  %local rc varnames ;
  %let rc = %sysfunc(dosubl(%nrstr(
    proc contents data=&data out=__ExpandVarList(keep=name) noprint ;
    run ;
    proc sql noprint ;
      select name into :varnames separated by ' '
      from __ExpandVarList ;   
      drop table __ExpandVarList ;
    quit ;
  )));
&varnames /*return*/
%mend ;

%put %varlist(sashelp.shoes) ;
%put %varlist(sashelp.shoes(keep=R:)) ;
%put %varlist(sashelp.shoes(keep=_character_)) ;

In your last example, %wholsaleMetadataAcquisition, I'm not sure I see the benefit of wrapping the code in DOSUBL.

 

How is:

%macro wholesaleMetadataAcquisition(lib);

%local rc;
%let rc = %sysfunc(DoSubL(%str(
options nonotes nosource;
proc sql;
  create table _temp_ as
  select memname, nobs
  from dictionary.tables
  where libname = %upcase("&lib.")
  ;
quit;
data _null_;
  set _temp_;
  call symputx("nobs_" !! strip(memname), nobs, "G");
run;
proc delete data = _temp_;
run;
)));

%mend;

better than:

%macro wholesaleMetadataAcquisition(lib);

proc sql;
  create table _temp_ as
  select memname, nobs
  from dictionary.tables
  where libname = %upcase("&lib.")
  ;
quit;
data _null_;
  set _temp_;
  call symputx("nobs_" !! strip(memname), nobs, "G");
run;
proc delete data = _temp_;
run;

%mend;

?

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
yabwon
Onyx | Level 15
%put %wholesaleMetadataAcquisition(test);

With the first above will work 😉

 

B.

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



yabwon
Onyx | Level 15

One more I have (unfortunately I don't know from whom I learned this trick) is use DoSubL() to emulate "try/except" programming without affecting SYSCC macrovariable.

 

filename try TEMP;

data _null_;
file try;
infile cards4;
input;
put _infile_;
cards4;
  data x;
    x = 17 /* <- missing ; */
    y = 42;
  run;
;;;;
run;


data _null_;
  rc = dosubl('
  options nonotes nosource;
  filename _D_ DUMMY;
  proc printto log = _D_;
  run;
  %include try;
  %let try = &SYSCC.;
  ');

  try = symgetn('try');

  put rc= try=;

  if try then put 'Your previous code was bad!';
         else call execute('%include try;');
run;

%put *&try.*;
%put *&SYSCC.*;

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



PaigeMiller
Diamond | Level 26

Great stuff, everyone. Thanks! I am going to think about all of this, and hopefully it will sink in, and maybe even generate some ideas of my own.

--
Paige Miller
FreelanceReinh
Jade | Level 19

Hello Paige,

 

My "biggest" use case of DOSUBL so far was this: For a project I once implemented the Nelder-Mead method in a DATA step (because I don't have a SAS/OR license). This optimization algorithm requires the evaluation of the objective function in several places. This is no problem as long as the value of the function can be computed in terms of common SAS functions, user-defined functions (PROC FCMP) or other calculations in the same DATA step. However, the function evaluation in that use case involved several DATA and PROC steps (combined in a macro). Using rc=dosubl('%macroname(...)'); these steps were called from within the main DATA step. Macro variables were used to transfer function arguments and returned values between the main DATA step and the "side session" invoked by DOSUBL. Most of the total run time was spent in the side sessions because the amount of data processed in each of the function calls was substantial.

Quentin
Super User

@FreelanceReinh , I'm curious in that setting what made you choose DOSUBL instead of CALL EXECUTE?

 

I guess the big difference is that the DOSUBL side session can return values to the step that invoked DOSUBL, which CALL EXECUTE cannot do because of the timing issues.

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
FreelanceReinh
Jade | Level 19

Hi @Quentin,

 

As far as I remember, I hadn't even considered CALL EXECUTE being an option. Wouldn't the CALL EXECUTE code run only after the main DATA step has finished? But the main DATA step had to "respond" to each function value delivered by a DOSUBL/macro call in the sense that various IF conditions used the returned values. For example, if moving in one direction (in the domain of the mathematical function) increased the function value, this would influence the decision where to evaluate the function in the next iteration in order to get closer to the optimum.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 15 replies
  • 1346 views
  • 8 likes
  • 6 in conversation