BookmarkSubscribeRSS Feed
polingjw
Quartz | Level 8
Is there any way to use array variable references in conjunction with first.variable processing? I want to write a datastep that is somewhat analogous to the following:

[pre]
data new;
set old;
by variables:;
array variables{*} variables:;
do i=1 to dim(variables);
if first.variables{i} then do;

end;
end;
run;
[/pre]
[pre]
[/pre]
Of course, this code will produce an error due to the syntax of “first.variables{i}.” Does anyone have any alternatives to make this code work?
24 REPLIES 24
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Consider using SAS macro code with your BY variable list explicitly referenced as a %LET string, parse the list with %SCAN(...), to generate the specific DATA step DO/END code piece using %DO/%END logic.

As mentioned, you cannot reference an array variable when using FIRST. so you will need to generate the SAS code with explicit BY variables, each mentioned individually as above.

Scott Barry
SBBWorks, Inc.
polingjw
Quartz | Level 8
Thanks Scott. I had thought about writing a macro to accomplish this task, but I was hoping that there would be an easier alternative. I appreciate your advice.
jdopdyke
Calcite | Level 5
* J.D. Opdyke, President, DataMineIt;
* Below is an actual solution to your question;

options label symbolgen fullstimer yearcutoff=1950 nocenter ls = 256 ps = 51 msymtabmax=max mprint mlogic minoperator mindelimiter=' ' cleanup;

data old(sortedby=a b c);
do a=1 to 2;
do b=11 to 12;
do c = 21 to 22;
do d = 100 to 104;
output;
end;
end;
end;
end;
run;
proc print data=old;
run;

*** The code below assumes that what you want to do is simply "output"
The output statement is just a placehold for whatever code you
want to put there.

*** If you need to extract all the byvars from the old dataset,
just use a proc contents and a proc sql
***;

%macro firstdotbyvars(indata=, outdata=, byvars=);

%let num_byvars = %sysfunc(countw(&byvars.));
%do i=1 %to &num_byvars.;
%let byvar&i. = %scan(&byvars.,&i.);
%end;

%macro iffirst;
if first.&byvar1. then do;
output;
end;
%if &num_byvars.>1 %then %do;
%do i=2 %to &num_byvars.;
else if first.&&byvar&i. then do;
output;
end;
%end;
%end;
%mend;

data new;
set old;
by &byvars.;
%iffirst
run;

proc print data=new;
run;
%mend firstdotbyvars;

%firstdotbyvars(indata=old, outdata=new, byvars=a b c);
polingjw
Quartz | Level 8
Thanks for posting this solution. I appreciate the help.
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
I doubt the OP is going to want to use CONTENTS with all variables from a given SAS file in a BY list - more likely a subset list and that's why I would consider the %LET definition (explicitly) to be more appropriate.

Scott Barry
SBBWorks, Inc.
FloydNevseta
Pyrite | Level 9
I think you misunderstood. You can use the output data set of PROC CONTENTS to get the by variables. The sortedby variable contains the numerical order of the by variables, if any. So you can programatically build the list of by variables in the correct order for any dataset without any prior knowledge of it. Quick example that anyone can run:

proc sort data=sashelp.cars out=cars;
by origin make drivetrain;
run;

proc contents noprint data=cars out=cars_contents;
run;

proc sql noprint;
select name
into :byvars separated by ' '
from cars_contents
where sortedby is not null
order by sortedby
;
quit;

%put byvars=&byvars;


> I doubt the OP is going to want to use CONTENTS with
> all variables from a given SAS file in a BY list -
> more likely a subset list and that's why I would
> consider the %LET definition (explicitly) to be more
> appropriate.
>
> Scott Barry
> SBBWorks, Inc.
jdopdyke
Calcite | Level 5
FYI - Using a proc contents, outputting a dataset, and using a proc sql to get all the variables into a macro string will get around Scott's need to explicitly type out all the by-variable names.
jdopdyke
Calcite | Level 5
My actual solution allows the user to do it either way.

The macro allows the user to type them out Scott.

OR if serious automation is required, then if the user has thought ahead and named the key byvariables similarly, say, with a common prefix, then it is a trivial matter to drop those outputted by the proc contents that don’t matter Scott.

And then it is fully automated, which is typically much more appropriate, unless the task is a small one-off little piece of code that can be banged out in 6 minutes.

J.D. Opdyke, President, DataMineIt
data_null__
Jade | Level 19
You can't use the exact syntax you suggest because you can't create an array of mixed data types, or reference FIRST.array[index] as you have already determined. You CAN however create an array of FIRST or LAST variables.

The example doesn't really do any thing by illustrate the syntax.

[pre]
proc plan ordered;
factors a=3 first=2 c=3 i=4 / noprint;
output out=test first cvals=('a' 'b');
run;
quit;

options validvarname=any;
data test2;
set test;
by a first c;
array f
  • 'first.'n:;
    put 'NOTE: ' _n_= @;
    do j = 1 to dim(f);
    if f then put f= @;
    end;
    put;
    run;
    options validvarname=v7;
    [/pre]

    Fixed OUTPUT statement in PROC PLAN.


    Message was edited by: data _null_;
  • polingjw
    Quartz | Level 8
    data _null_,

    Wow! I thought that the validvarname option was only for compatibility with DBMS variable names. This is exactly what I was hoping to find!
    data_null__
    Jade | Level 19
    Without validvarname=any you can still create an array of first.var

    array f
  • first:;

    But there is more possibility of problems from user variables that begin with FIRST. I "never" use validvarname=any except for very special situations so the technique should be mostly safe.
  • chang_y_chung_hotmail_com
    Obsidian | Level 7
    I think data _null_'s solution naturally leads to a simpler solution below. It makes a number of assumptions, of course, but may work out when you have a reasonable idea about what the variable names are in the input dataset.



       /* test data */


       proc plan ordered;


         factors var1=2 var2=2 var3=3 / noprint;


         output out=old;


       run;


       quit;


     


       /* do something when we hit any first.var */


       data new;


         set old;


         by var:;


         array firsts(*) first:;


     


         drop i;


         do i = 1 to dim(firsts);


           put firsts(i) 2. @;


         end;


         put;


       run;


       /* on log


       1 1 1


       0 0 1


       0 0 1


       0 1 1


       0 0 1


       0 0 1


       1 1 1


       0 0 1


       0 0 1


       0 1 1


       0 0 1


       0 0 1


       */




    p.s. Well, ... did not know that data _null_ has already posted a follow up with exactly the same idea moments earlier... 😞

    Message was edited by: chang_y_chung@hotmail.com
    data_null__
    Jade | Level 19
    I had written it your way first, as that was what I remembered from our conversations on SAS-L on this subject. The problem with the simplier approach is user define variables that begin with FIRST using 'FIRST.'n is a bit safer.
    jdopdyke
    Calcite | Level 5
    In fairness, polingjw did state, “Of course, this code will produce an error due to the syntax of “first.variables{i}.” ”

    J.D.’s code above works, as does data _null_’s (although purists (I’d say prudes) might complain about the WARNING: note in data _null_’s SAS .log).

    hackathon24-white-horiz.png

    The 2025 SAS Hackathon has begun!

    It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

    Latest Updates

    How to Concatenate Values

    Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

    Find more tutorials on the SAS Users YouTube channel.

    SAS Training: Just a Click Away

     Ready to level-up your skills? Choose your own adventure.

    Browse our catalog!

    Discussion stats
    • 24 replies
    • 4926 views
    • 0 likes
    • 6 in conversation