BookmarkSubscribeRSS Feed
skipjimroo
Calcite | Level 5

Hi there,

I I have output from proc UCM with the additive outliers (spikes) and level shifts labelled. (I am running in9.4.) I want to: 

 

  • Extend these labels to the start and end of the increase period. The start period is the month before the increase. The end period is the last month of the increase.
  • Then summarise the data so that only the start and end dates and values are shown.
  • I also want level shifts to take precedence, so that if a level shift and spike both occur during this increase period, then the increase is classed as a level shift.

The input data looks like this

 

data got;
    infile datalines dsd delimiter='|';
format ID $1. Date date9. value 8. valuediff 8. breaktype $12.;
informat date ddmmyy10.; 
   input ID Date Value ValueDiff Breaktype;
    datalines;
A|01/01/2022|1602040||NoBreak
A|01/02/2022|1997341|395301|NoBreak
A|01/03/2022|45544|-1951797|NoBreak
A|01/04/2022|138421|92877|LevelShift
A|01/05/2022|999286|860865|NoBreak
A|01/06/2022|1178399|179113|Spike
A|01/07/2022|881297|-297102|NoBreak
A|01/08/2022|949765|68468|LevelShift
A|01/09/2022|1308423|358658|NoBreak
A|01/10/2022|1004914|-303509|NoBreak
A|01/11/2022|1191019|186105|NoBreak
A|01/12/2022|803331|-387688|NoBreak
B|01/01/2022|921410||NoBreak
B|01/02/2022|1068787|147377|NoBreak
B|01/03/2022|297284|771503|NoBreak
B|01/04/2022|338337|41053|LevelShift
B|01/05/2022|487583|149246|NoBreak
B|01/06/2022|696031|208448|Spike
B|01/07/2022|689942|6089|NoBreak
B|01/08/2022|763417|73475|LevelShift
B|01/09/2022|983677|220260|NoBreak
B|01/10/2022|545737|437940|NoBreak
B|01/11/2022|740081|194344|NoBreak
B|01/12/2022|533809|206272|NoBreak
;
run;

My target data should look like this:

IDbreak_startdatebreak_enddatebreaktypestartbreak_valueendbreak_value
A01/03/202201/06/2022LevelShift455441178399
A01/07/202201/09/2022LevelShift8812971308423
B01/03/202201/06/2022LevelShift297284696031
B01/07/202201/09/2022LevelShift689942983677


I have created some code that works in some circumstances, but not all.

  • It cant handle multiple breaks in the same id.
  • It doesnt let level shifts take precedence over spikes.
  • And it doesnt work if the break label isnt at the start of the increase. I have tried using lag functions, but the issue is that you dont know how many rows to lag by.

data need
(keep=id

startbreakvalue endbreakvalue newbreak
structbreak_startdate structbreak_enddate rename=(newbreak=breaktype));
set got;
by id date breaktype;
retain structbreak_startdate structbreak_enddate
startbreakvalue endbreakvalue newbreak;
format structbreak_startdate structbreak_enddate date9.;

if first.id =1   then do;
startbreakvalue = 99999;
endbreakvalue = 0;
structbreak_startdate = '31dec3500'd;
structbreak_enddate = '1jan1960'd;
newbreak =breaktype;
end;
else do;
if breaktype ne "NoBreak" then do;
newbreak =breaktype;
structbreak_enddate ='31dec3500'd;
structbreak_startdate =intnx('month',date,-1);
startbreakvalue = value-valuediff;
end;

if breaktype = "NoBreak" and valuediff<0 then do;
x=1;
structbreak_enddate =min(structbreak_enddate,intnx('month',date,-1));
endbreakvalue = max(endbreakvalue,value-valuediff);
end;
end;
if last.id = 1;

;
run;

Any help would be greatly appreciated!

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 0 replies
  • 106 views
  • 0 likes
  • 1 in conversation