Hallo everyboy, i'm relative novice at sas.
I need to create a macro that performs some task until a value in my dataset changes, and then starts again on the subsequent entry.
I try to be more clear:
i have a table made like this:
NAME | DATE | WAGE
henry 202012 100
henry 202011 110
henry 202010 90
mary 202012 110
mary 202011 95
mary 202010 105
and so on.
I need the macro to perform a set of task on henry, then on mary and so on.
How can tell it when the name changes and it has to start from the beginning with the calculations?
I have tryed creating a macrovariable with a list of name separated by a simbol and then scan it, but i have literally hundred thousand names, so it is longer than 65000 char and it is not feasible.
Any other method?
Thank you in advance
This is not a task for a macro.
PROC SUMMARY, for example, can compute the sums or means or many other statistics on Henry than Mary and so on for all of the values in variable NAME. No macro needed. For example:
proc summary data=have nway;
class name;
var wage;
output out=sums sum=;
run;
Many other SAS procedures can handle this type of grouping by NAME as well.
Macro coding is often not what is needed when using values of variables in a data set.
What you describe is BY group processing and in a data step the First and Last variables that come with it.
data have; input name $; datalines; henry henry henry henry henry henry mary mary mary mary alice alice alice alice alice ; data example; set have; by name notsorted; length comment $ 50; if first.name then comment=catx(' ','This is the first record for',name); if last.name then comment=catx(' ','This is the last record for',name); run;
The above example has the records grouped by name but not sorted by name as alice is listed after mary. So the option on the BY statement of NOTSORTED is needed to prevent a BY order error.
When BY is used in a data step SAS creates two automatic variables for each variable on the by statement that are referenced with First.<variablename> and Last.<variablename>. Note the dot between the keyword and the variable name. These variables are valued as 1 for true and 0 for false so can be used in the IF as above.
If a record is the only one for a value of the by variable then both First and Last can be true for the same record (which makes a lot of sense).
Simple sums and statistics of other variables likely belong in a procedure like Summary or possibly a report. If you need to do accumulation of your own then you would use the First. to reset your values and likely a RETAIN to create variables to hold the accumulated values across the data step boundary.
Can you describe more about the task your macro will perform? Can you show sample code you have tried, and the desired output?
In general a macro won't recognize when a data step variable changes, because the macro language doesn't know about datasets.
But it is a common need to use a SAS dataset to drive macro calls. So, for example, if you have a dataset that list names, as shown, you can use CALL EXECUTE to call a macro. Since CALL EXECUTE is a data step call routine, you can use data step code: if first.name then CALL EXECUTE a macro call, passing the name as a parameter.
You have had a couple of suggestions to change your program to use BY group processing. If that is not possible, I would suggest that you try writing a program to execute the macro calls that you need, e.g.:
%macro DoSomething(name,firstobs,obs);
data out_&name;
set have(firstobs=&firstobs obs=&obs);
run;
%mend;
filename tempsas temp;
data _null_;
file tempsas;
set have(keep=name);
by name;
if first.name then
put '%DoSomething(' name ',' _N_ @;
if last.name then
put ',' _N_ ');';
run;
%include tempsas;
The reason I put the parameters FIRSTOBS and OBS in (instead of using a WHERE clause) is that it will make the macros run a lot faster when you are working with large tables.
Before submitting the %INCLUDE line you may want to take a look at the generated code (open the TEMPSAS file in an editor), and try submitting one line at a time to see if it works OK.
What task do you perform for each name?
Here's a simple example of calculating a running total:
data have;
input name $ date :yymmn6. wage;
format date yymmn6.;
datalines;
henry 202012 100
henry 202011 110
henry 202010 90
mary 202012 110
mary 202011 95
mary 202010 105
;
data want;
set have;
by name;
if first.name
then running_total = wage;
else running_total + wage;
run;
As you can see, no macro processing is needed.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.