Hello, I need to set up a macro to process the following few steps all together.
As you can see the following snippets only process data when FY=FY16/17, I have a couple other FYs: FY17/18, FY18/19, FY19/20, FY20/21 and I need to process this same chunk of code for each FY. Is there a way to set up a macro to do it? I don't want to copy and paste a million rows... please help!
PROC SORT DATA=FINAL(WHERE=(FY='FY16/17')) OUT=SUB1; BY CASELOAD RC2; RUN; DATA SUB1; SET SUB1; BY CASELOAD; IF FIRST.RC2 THEN DCOUNTER=0; IF CASELOAD LT 11 AND CASELOAD NE 0 THEN DO; DELETE=1; DCOUNTER+1; END; LAST_DELETE=IFN(FIRST.RC2,.,LAG(DELETE)); IF LAST_DELETE AND DCOUNTER LT 2 THEN DELETE=1; RUN; PROC SQL; CREATE TABLE WANT1 AS SELECT *, MAX(DCOUNTER) AS MAXD FROM SUB1 ; QUIT; DATA WANT1; SET WANT1; IF MAXD=1 AND DELETE=1 AND LAST_DELETE=1 THEN CASELOAD=CASELOAD*-1; KEEP FY RC2 POS CASELOAD; RUN;
Try just adding FY as the FIRST by variable.
Try running this code to do two fiscal years.
proc sort data=final out=sub1;
by fy caseload rc2;
where fy in ('FY16/17' 'FY17/18') ;
run;
data sub1;
set sub1;
by fy caseload rc2;
if first.rc2 then dcounter=0;
if caseload lt 11 and caseload ne 0 then do;
delete=1;
dcounter+1;
end;
last_delete=ifn(first.rc2,.,lag(delete));
if last_delete and dcounter lt 2 then delete=1;
run;
proc sql;
create table want1 as
select *, max(dcounter) as maxd
from sub1
group by fy
;
quit;
data want1; set want1;
if maxd=1 and delete=1 and last_delete=1 then caseload=caseload*-1;
keep fy rc2 pos caseload;
run;
If it works then just update the WHERE statement to select all of the years you want.
where 'FY16' <=: fy <: 'FY20' ;
The posted code cannot work. You are referencing FIRST. variables that will not exist.
But if you did have working code then why not just add FY as the first BY variable?
Hi @Tom the code works because FY and RC2 are the two grouping variables in the final dataset, and I did try to add FY as an additional sort by variable, but that will change the way the caseload (numeric) variable was ordered. Essentially, I need to order caseload from lowest to highest and then do suppression for privacy purposes. The suppression has to take place in each FY independently.
RC2 is NOT included in the BY statement of the data step that is using FIRST.RC2. You cannot use FIRST.RC2 if RC2 is not on the BY statement.
Example:
204 data want;
205 set sashelp.class;
206 by name ;
207 if first.age;
208 run;
NOTE: Variable first.age is uninitialized.
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.WANT has 0 observations and 5 variables.
Adding FY before CASELOAD in the BY statement will not change how CASELOAD values sort.
When someone says that they "need to set up a macro process", I am very skeptical. I have not gone through you code line by line in detail, but I think what you really need is to create a BY variable which is FY and then let SAS do all the BY processing; no macros needed.
thanks @PaigeMiller I did add FY as the BY variable but given the structure of the dataset, caseload is the numeric value that I need to sort from lowest to highest, and two grouping variables: FY and RC2. It looks like once I add FY as BY variable, the order of the values changed as well. Maybe I messed up somewhere when sorting the data, but essentially what I am trying to achieve is to rank caseload from lowest to highest by RC2 in one FY first, then do a bunch of stuff from there and then repeat the same step for other FYs.
Try just adding FY as the FIRST by variable.
Try running this code to do two fiscal years.
proc sort data=final out=sub1;
by fy caseload rc2;
where fy in ('FY16/17' 'FY17/18') ;
run;
data sub1;
set sub1;
by fy caseload rc2;
if first.rc2 then dcounter=0;
if caseload lt 11 and caseload ne 0 then do;
delete=1;
dcounter+1;
end;
last_delete=ifn(first.rc2,.,lag(delete));
if last_delete and dcounter lt 2 then delete=1;
run;
proc sql;
create table want1 as
select *, max(dcounter) as maxd
from sub1
group by fy
;
quit;
data want1; set want1;
if maxd=1 and delete=1 and last_delete=1 then caseload=caseload*-1;
keep fy rc2 pos caseload;
run;
If it works then just update the WHERE statement to select all of the years you want.
where 'FY16' <=: fy <: 'FY20' ;
@Tom Thanks Tom! You were right, I added FY as a BY variable and figured out a code (without doing macro) that works.
PROC SORT DATA=FINAL OUT=SUB1; BY FY CASELOAD RC2; RUN; DATA SUB1; SET SUB1; BY FY CASELOAD; IF FIRST.RC2 OR FIRST.FY THEN DCOUNTER=0; IF FIRST.FY THEN FYCNT=0; IF CASELOAD LT 11 AND CASELOAD NE 0 THEN DO; DELETE=1; DCOUNTER+1; FYCNT+1; END; LAST_DELETE=IFN(FIRST.RC2,.,LAG(DELETE)); *IF LAST_DELETE AND DCOUNTER LT 2 THEN DELETE=1; RUN; PROC SQL; CREATE TABLE WANT1 AS SELECT *, SUM(DELETE) AS MAXD FROM SUB1 GROUP BY FY ; QUIT; PROC SORT DATA=WANT1; BY FY CASELOAD RC2; RUN; DATA WANT1; SET WANT1; IF MAXD=1 AND LAST_DELETE=1 THEN CASELOAD=CASELOAD*-1; KEEP FY RC2 POS CASELOAD; RUN;
Your code also works fine!
@Tom wrote:
Try just adding FY as the FIRST by variable.
Try running this code to do two fiscal years.
proc sort data=final out=sub1; by fy caseload rc2; where fy in ('FY16/17' 'FY17/18') ; run; data sub1; set sub1; by fy caseload rc2; if first.rc2 then dcounter=0; if caseload lt 11 and caseload ne 0 then do; delete=1; dcounter+1; end; last_delete=ifn(first.rc2,.,lag(delete)); if last_delete and dcounter lt 2 then delete=1; run; proc sql; create table want1 as select *, max(dcounter) as maxd from sub1 group by fy ; quit; data want1; set want1; if maxd=1 and delete=1 and last_delete=1 then caseload=caseload*-1; keep fy rc2 pos caseload; run;
If it works then just update the WHERE statement to select all of the years you want.
where 'FY16' <=: fy <: 'FY20' ;
@Tom wrote:
Try just adding FY as the FIRST by variable.
Try running this code to do two fiscal years.
proc sort data=final out=sub1; by fy caseload rc2; where fy in ('FY16/17' 'FY17/18') ; run; data sub1; set sub1; by fy caseload rc2; if first.rc2 then dcounter=0; if caseload lt 11 and caseload ne 0 then do; delete=1; dcounter+1; end; last_delete=ifn(first.rc2,.,lag(delete)); if last_delete and dcounter lt 2 then delete=1; run; proc sql; create table want1 as select *, max(dcounter) as maxd from sub1 group by fy ; quit; data want1; set want1; if maxd=1 and delete=1 and last_delete=1 then caseload=caseload*-1; keep fy rc2 pos caseload; run;
If it works then just update the WHERE statement to select all of the years you want.
where 'FY16' <=: fy <: 'FY20' ;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.