Here I have code for 2 variables I am using, one for "complete" and one for "active" for the month of July 2023. I would like to know if there is a way to create a complete and active variable for all months in the dataset using the same variables I used for July. I would like to use this variable to determine how many completes and actives there were for all previous months, by month. The dataset has dates looking back to 2017. Thanks for your help!
Data active;
set work.active;
If A_Date > '31JUL2023'd then active=0;
If End_Date = ' ' then active=1;
If End_Date < '31JUL2023'd then active=0;
If A_Date < '31JUL2023'd AND End_Date >= '31JUL2023'd then active=1;
Else active=0;
Run;
Data closed;
set work.closed;
If D_Date > '31JUL2023'd then complete=1;
else complete=0;
run;
If you have a date variable you can get rows or columns by any interval that is represented by SAS formats (and some others if you make your own format) by associating the grouping format with the date variable.
Example:
data have; /* code just to generate some values to demonstrate*/ do date = '01Jan2020'd to '15Feb2021'd; category= rand('integer',4); value = 10*rand('uniform'); output; end; run; proc tabulate data=have; class category date; format date monyy7.; var value; table category, date*value*(n min mean max) ; run;
After looking at that you might decide that was too many columns and want a calendar quarter instead of month. so change the format.
proc tabulate data=have; class category date; format date yyq.; var value; table category, date*value*(n min mean max) ; run;
For something like this your Proc summary shouldn't be needed. But I am not sure what your "active" and "closed" actually represent in terms of the "month" as I don't really understand what your starting code is designed to do. I follow what it does, it is the WHY. Such as even why you have two data sets to begin with. Likely appropriate to add them much earlier and add an indicator which set something comes from. Then use that info as needed.
I doubt you need a macro variable here. A simple DATA step do-loop will do.
However, I do not understand your logic in the IF statements, and so I am reluctant to advise further without greater explanation about what these IF statements are trying to do, and where the looping fits into this IF statement logic. Does this IF statement logic actually work and provide the right answer for the date 31JUL2023?
Side issue: normally, we want people to just explain the problem they are trying to solve, without specifying that you need a macro variable, or that you need to use function XYZ.
The if statement does give me the result I want for that variable. Here is some more of my code which might help put this into context. I did remove some of it for privacy (I do have a state variable included in my actual code which you won't see here). I created 2 data sets from 2 different files, then summarized and merged the data sets. The proc tabulate gives me exactly what I want for the month of July but now I would like the same results for all of my months in my data set and would like to have all months in the table generated by the Proc Tabulate. Side note- I am aware that I did not do this in the most sophisticated way.
Proc summary data= active;
class Name ;
var active ;
output out=active1 (drop=_freq_ _type_ rename= (sum=Active)) sum=sum ;
run;
Proc summary data= closed ;
class Name ;
var Complete ;
output out=complete1 (drop=_freq_ _type_ rename= (sum=Completes))
sum=sum ;
run;
data Report_data;
merge complete1 active1;
by Name;
if Name =' ' then Name="Total";
Client_Total=(completes+active);
Run;
Proc format;
value missing
.='0'
other=[best12.]
;
run;
proc print data=Report_data noobs;
var Name completes active Client_Total;
where Client_Total >0;
format completes missing. format active missing. Client_Total missing.;
run;
PRoc tabulate data=Report_data F=9.;
class State Name / ;
var Client_Total;
tables (State='Location')*Name, Client_Total / rts=32 ;
run;
If you have a date variable you can get rows or columns by any interval that is represented by SAS formats (and some others if you make your own format) by associating the grouping format with the date variable.
Example:
data have; /* code just to generate some values to demonstrate*/ do date = '01Jan2020'd to '15Feb2021'd; category= rand('integer',4); value = 10*rand('uniform'); output; end; run; proc tabulate data=have; class category date; format date monyy7.; var value; table category, date*value*(n min mean max) ; run;
After looking at that you might decide that was too many columns and want a calendar quarter instead of month. so change the format.
proc tabulate data=have; class category date; format date yyq.; var value; table category, date*value*(n min mean max) ; run;
For something like this your Proc summary shouldn't be needed. But I am not sure what your "active" and "closed" actually represent in terms of the "month" as I don't really understand what your starting code is designed to do. I follow what it does, it is the WHY. Such as even why you have two data sets to begin with. Likely appropriate to add them much earlier and add an indicator which set something comes from. Then use that info as needed.
@Whitlea wrote:
The if statement does give me the result I want for that variable. Here is some more of my code which might help put this into context. I did remove some of it for privacy (I do have a state variable included in my actual code which you won't see here). I created 2 data sets from 2 different files, then summarized and merged the data sets. The proc tabulate gives me exactly what I want for the month of July but now I would like the same results for all of my months in my data set and would like to have all months in the table generated by the Proc Tabulate. Side note- I am aware that I did not do this in the most sophisticated way.
Proc summary data= active; class Name ; var active ; output out=active1 (drop=_freq_ _type_ rename= (sum=Active)) sum=sum ; run; Proc summary data= closed ; class Name ; var Complete ; output out=complete1 (drop=_freq_ _type_ rename= (sum=Completes)) sum=sum ; run; data Report_data; merge complete1 active1; by Name; if Name =' ' then Name="Total"; Client_Total=(completes+active); Run; Proc format; value missing .='0' other=[best12.] ; run; proc print data=Report_data noobs; var Name completes active Client_Total; where Client_Total >0; format completes missing. format active missing. Client_Total missing.; run; PRoc tabulate data=Report_data F=9.; class State Name / ; var Client_Total; tables (State='Location')*Name, Client_Total / rts=32 ; run;
The whole thing can be done without macros. You could write data step do loops to loop through all possible months, and then with all possible dates in the single data set that you call REPORT_DATA, you would then run PROC TABULATE with a BY statement which would do the PROC TABULATE for each month.
However, you need to provide sample data, which can be fake data if necessary, before I would try to write such code.
Please provide some sample data that's representative for your real data via a fully working data step that creates such data. Also show us the desired result based on this sample data and explain the logic to get from Have to Want.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.