Hi,
I am trying to clean my data based on below items. Could you help me to write code for this?
| quarter | account | status | DEFault | |
| Q1 | 230 | AC | 0 | |
| Q2 | 230 | CL | 0 | Keep | 
| Q3 | 230 | CL | 0 | Keep | 
| Q4 | 230 | CL | 0 | Remove | 
| Q1 | 230 | CL | 0 | Remove | 
| Q2 | 230 | CL | 0 | Remove | 
| Q3 | 230 | CL | 0 | Remove | 
| Q4 | 230 | CL | 0 | Remove | 
| quarter | account | status | DEFault | ||
| Q1 | 230 | AC | 0 | ||
| Q2 | 230 | CL | 0 | Keep | |
| Q2 | 230 | CL | 0 | Keep | |
| Q3 | 230 | CL | 0 | Keep | |
| Q1 | 230 | CL | 0 | Keep | |
| Q2 | 230 | CL | 1 | Keep | |
| Q3 | 230 | CL | 0 | Remve | Already default | 
| Q4 | 230 | CL | 0 | Remove | Already default | 
I suggest two data steps in succession, the first one looks for any default value, the second one creates the final result:
data have;
input year quarter $ account $ status $ default;
cards;
1 Q1 230 AC 0
1 Q2 230 CL 0
1 Q3 230 CL 0
1 Q4 230 CL 0
2 Q1 230 CL 0
2 Q2 230 CL 0
2 Q3 230 CL 0
2 Q4 230 CL 0
1 Q1 231 AC 0
1 Q2 231 CL 0
1 Q3 231 CL 0
1 Q4 231 CL 0
2 Q1 231 CL 0
2 Q2 231 CL 1
2 Q3 231 CL 0
2 Q4 231 CL 0
;
run;
data
  intermediate
  default (keep=account)
;
set have;
by account year quarter; * year and quarter only to make the step fail if dataset is not sorted correctly;
retain def_flag;
if first.account then def_flag = 0;
if default
then do;
  output intermediate; * we want to keep that one as the last;
  if not def_flag then output default;
  def_flag = 1;
end;
if not def_flag then output intermediate;
drop def_flag;
run;
data want;
merge
  intermediate (in=a)
  default (in=b)
;
by account;
retain counter;
if a;
oldstat = lag(status);
if first.account then counter = .;
else if oldstat = 'AC' and status = 'CL'
then counter = 1;
else if status = 'CL'
then counter + 1;
if counter < 3 or b;
drop counter oldstat;
run;
proc print data=want noobs;
run;Result:
year quarter account status default 1 Q1 230 AC 0 1 Q2 230 CL 0 1 Q3 230 CL 0 1 Q1 231 AC 0 1 Q2 231 CL 0 1 Q3 231 CL 0 1 Q4 231 CL 0 2 Q1 231 CL 0 2 Q2 231 CL 1
Take note:
Thank you very much, this is very helpful. I notice that I have this below conversion, so I have duplicate account, quarters now. How I can deal with you code for making data quarterly?
transform the data into quarterly end in month 2,5,8,11
12,1,2-Q1
3,4,5 -Q2
6,7,8 -Q3
9,10,11 -Q4
year    quarter    account    status    default
  1       Q1         230        AC         0  
  1       Q2         230        CL         0  
  1       Q3         230        CL         0  
  1       Q1         231        AC         0  
  1       Q2         231        CL         0  
  1       Q3         231        CL         0  
  1       Q4         231        CL         0  
  2       Q1         231        CL         0  
  2       Q2         231        CL         1  
Please provide example data in a data step, like I showed you.
You have to divide your code to 2 parts.
data yor_data;
set yor_data;
if _n_=1 then flagship='Keep'; 
retain flagship;
by quarter account;
if first.account then flagship='Keep'; 
if flagship ne 'Remove' then do
  if DEFault=1 then 
    flag= 'keep'
    flagship='Remove' 
end
else
  flag= 'Remove'
  flagship='Remove'
end
run;This is a tricky code. Put your mind for 1 hour, it will be fixed. I do not have the SAS Engine with me to compile the code that's why you are getting this code skeleton. Sorry for the issues. Hope it helps.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
