BookmarkSubscribeRSS Feed
cieffegi
Calcite | Level 5

Hallo everyboy, i'm relative novice at sas.
I need to create a macro that performs some task until a value in my dataset changes, and then starts again on the subsequent entry.
I try to be more clear:
i have a table made like this:

NAME |     DATE    |    WAGE

henry      202012       100

henry      202011        110

henry      202010        90

mary       202012        110

mary       202011         95

mary       202010        105

 

and so on.

I need the macro to perform a set of task on henry, then on mary and so on. 
How can tell it when the name changes and it has to start from the beginning with the calculations?

I have tryed creating a macrovariable with a list of name separated by a simbol and then scan it, but i have literally hundred thousand names, so it is longer than 65000 char and it is not feasible.

Any other method?

Thank you in advance 

5 REPLIES 5
PaigeMiller
Diamond | Level 26

This is not a task for a macro.

 

PROC SUMMARY, for example, can compute the sums or means or many other statistics on Henry than Mary and so on for all of the values in variable NAME. No macro needed. For example:

 

proc summary data=have nway;
    class name;
    var wage;
    output out=sums sum=;
run;

 Many other SAS procedures can handle this type of grouping by NAME as well.

--
Paige Miller
ballardw
Super User

Macro coding is often not what is needed when using values of variables in a data set.

 

What you describe is BY group processing and in a data step the First and Last variables that come with it.

data have;
input name $;
datalines;
henry
henry
henry
henry
henry
henry
mary
mary
mary
mary
alice
alice
alice
alice
alice
;

data example;
   set have;
   by name notsorted;
   length comment $ 50;
   if first.name then comment=catx(' ','This is the first record for',name);
   if last.name then comment=catx(' ','This is the last record for',name);

run;

The above example has the records grouped by name but not sorted by name as alice is listed after mary. So the option on the BY statement  of NOTSORTED is needed to prevent a BY order error.

When BY is used in a data step SAS creates two automatic variables for each variable on the by statement that are referenced with First.<variablename> and Last.<variablename>. Note the dot between the keyword and the variable name. These variables are valued as 1 for true and 0 for false so can be used in the IF as above.

If a record is the only one for a value of the by variable then both First and Last can be true for the same record (which makes a lot of sense).

 

 

Simple sums and statistics of other variables likely belong in a procedure like Summary or possibly a report. If you need to do accumulation of your own then you would use the First. to reset your values and likely a RETAIN to create variables to hold the accumulated values across the data step boundary.

Quentin
Super User

Can you describe more about the task your macro will perform?   Can you show sample code you have tried, and the desired output?

 

In general a macro won't recognize when a data step variable changes, because the macro language doesn't know about datasets.

 

But it is a common need to use a SAS dataset to drive macro calls.  So, for example, if you have a dataset that list names, as shown, you can use CALL EXECUTE to call a macro.  Since CALL EXECUTE is a data step call routine, you can use data step code: if first.name then CALL EXECUTE a macro call, passing the name as a parameter.

s_lassen
Meteorite | Level 14

You have had a couple of suggestions to change your program to use BY group processing. If that is not possible, I would suggest that you try writing a program to execute the macro calls that you need, e.g.:

%macro DoSomething(name,firstobs,obs);
  data out_&name;
    set have(firstobs=&firstobs obs=&obs);
  run;
%mend;

filename tempsas temp;
data _null_;
  file tempsas;
  set have(keep=name);
  by name;
  if first.name then 
    put '%DoSomething(' name ',' _N_ @;
  if last.name then
    put ',' _N_ ');';
run;

%include tempsas;

The reason I put the parameters FIRSTOBS and OBS in (instead of using a WHERE clause) is that it will make the macros run a lot faster when you are working with large tables.

 

Before submitting the %INCLUDE line you may want to take a look at the generated code (open the TEMPSAS file in an editor), and try submitting one line at a time to see if it works OK.

Kurt_Bremser
Super User

What task do you perform for each name?

Here's a simple example of calculating a running total:

data have;
input name $ date :yymmn6. wage;
format date yymmn6.;
datalines;
henry      202012       100
henry      202011        110
henry      202010        90
mary       202012        110
mary       202011         95
mary       202010        105
;

data want;
set have;
by name;
if first.name
then running_total = wage;
else running_total + wage;
run;

As you can see, no macro processing is needed.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1369 views
  • 0 likes
  • 6 in conversation