Hi guys, My challenge is to find a way of enabling SAS to output the number of datasets determined by the number of month macro variables. The source datasets are large and therefore created for each separate month. However, I need to compare summary results of specific variables across the months. The code I've written partially automates the data gathering step. It's the second part which I've not yet found an elegant way of enabling SAS to create the relevant number of datasets, hence my question. First part/section of code declares: %LET CurrentMonth = '30Jun2024'D %LET INTNXIncrement = 6; Two DATA steps follow creating the "MonthKeys" needed because the source dataset names end with the suffix yyyymm, e.g. SourceDS_202405, SourceDS_202404, etc. These two DATA steps create the relevant MonthKeys via an INTNX statement in which the increment macro is called, in this case 6 for six (prior) months, i.e.: DATA DS_1; FORMAT MonthEndDates DATE9.; DO i=1 TO &INTNXIncrement; MonthEndDates = INTNX('MONTH',&CurrentMonth,-i,'E'); OUTPUT; END; DATA DS_2; SET DS_1; IF LENGTH(STRIP(MONTH(MonthEndDates ))) = 1 THEN MonthKeys=YEAR(MonthEndDates )||"0"||STRIP(MONTH(MonthEndDates )); ELSE MonthKeys=YEAR(MonthEndDates )||STRIP(MONTH(MonthEndDates )); RUN; The DO loop outputs the end dates of the 6 months preceding the current month, the second DATA step simply converts these month-end dates to the yyymm format. Finally, those MonthKeys are then converted into macro variables via a PROC SQL step: PROC SQL; SELECT COUNT(*) INTO :NObs FROM DS_2; SELECT MonthKeys INTO :Month_Key1-:Month_Key%LEFT(&NObs) FROM DS_2; QUIT; We now have an array of 6 macro variables in the yyymm format going back 6 months before the current month, i.e.: Month_Key1, Month_Key2... MonthKey6 or 202405, 202404... 202312. So the last step needed is one capable of outputting only the relevant number of datasets (in this case 6). It would know to create datasets until the last macro variable is called and processed. Each step or iteration would refer to a different source dataset distinguished by the suffix and those suffixes are already stored in memory as macro variables. This would avoid writing the code as shown below: PROC SQL; CREATE TABLE NewDS_&Month_Key1 AS SELECT Month_Key, SUM(Var1), SUM(Var2), SUM(Var3) FROM SourceDS_&MonthKey1; GROUP BY 1; . . . CREATE TABLE NewDS_&Month_Key6 AS SELECT Month_Key, SUM(Var1), SUM(Var2), SUM(Var3) FROM SourceDS_&MonthKey6; GROUP BY 1; QUIT; I.e. the total PROC SQL hard-coded steps being dependent on the number of months being analysed, which is no where near ideal. Ultimately, all these grouped/summarised datasets need to be concatenated so that a monthly comparison can be made of each variable. Hence, this last DATA step also needs refinement, instead of the hard-code approach currently used - DATA NewDS_Combined; SET NewDS_&Month_Key1... NewDS_&Month_Key6; RUN; My guess is either a DO loop or a macro DATA step is required to create the "n" number of datasets and then to concatenate/set them. I would be very grateful for your advice. Dan
... View more