nash_sas
Fluorite | Level 6

I have a daily time series and want to find the step changes that occurred over a full year of data. An outlier may occur on a single day, whereas a step change persists for at least a week.

Reeza
Super User

I don't believe there's a proc for finding this, though it can be done in various ways.

You'll need to explain more about your data and the issue, though. You may want to post this in the Statistical forum instead.

nash_sas
Fluorite | Level 6

My daily data looks like this:

Date response
1/1/2010 5
1/2/2010 5
1/3/2010 9
1/4/2010 5
1/5/2010 5
1/6/2010 9
1/7/2010 9
1/8/2010 9
1/9/2010 9.5
1/10/2010 9.5
1/11/2010 9
1/12/2010 9
1/13/2010 7
1/14/2010 6
1/15/2010 5

I want to flag the dates with positive or negative step changes in response (say, above the 90th percentile). In the 15 days of data above, I want to flag the start date of the step change, 1/6/2010, because the shift in the response variable continued for a week. I do not want to flag 1/3/2010, because the shift occurred only on that date and did not continue to the next date. I tried PROC CAPABILITY to find the 95th and 5th percentiles, but it flagged all the dates from 1/6/2010 to 1/12/2010 and also 1/3/2010. So it did not isolate the beginning date of the step change.

If I could pull the start date of the week where the 95th and 5th percentiles occurred, that would be great.

Haikuo
Onyx | Level 15

Although I feel this can be done in one step with a hash object, the easier way for me is to generate an intermediate table that first categorizes the data, and then apply a data step DOW loop.

data have;
   input date :mmddyy10. response;
   format date mmddyy10.;
   cards;
1/1/2010 5
1/2/2010 5
1/3/2010 9
1/4/2010 5
1/5/2010 5
1/6/2010 9
1/7/2010 9
1/8/2010 9
1/9/2010 9.5
1/10/2010 9.5
1/11/2010 9
1/12/2010 9
1/13/2010 7
1/14/2010 6
1/15/2010 5
;
run;

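/* Categorize response: below 9 is 'low', 9 and above is 'high' */
proc format;
   value res
      low -< 9 = 'low'
      9 - high = 'high'
   ;
run;

/* Intermediate table with the category attached to each row */
data want1;
   set have;
   _cat=put(response, res4.);
run;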

      

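data want;
   /* First DOW loop: count the rows in the current run of equal _cat values */
   do _n_=1 by 1 until (last._cat);
      set want1;
      by _cat notsorted;
   end;
   /* Second DOW loop: re-read the same run and flag its first row
      if the run is 'high' and lasts at least 7 rows */
   do _i=1 by 1 until (last._cat);
      set want1;
      by _cat notsorted;
      if _cat='high' and _n_>=7 and _i=1 then flag=1; else flag=.;
      output;
   end;
   drop _:;
run;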
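
As a quick check of the step above, the flagged rows can be listed from want. This is only a usage sketch and assumes the want table was created exactly as shown, with flag=1 marking the first day of a 'high' run of at least 7 rows.

```sas
/* List only the rows flagged as the start of a sustained shift */
proc print data=want noobs;
   where flag=1;
   var date response;
run;
```

With the 15 sample rows, the only 'high' run of 7 or more days is 1/6/2010 through 1/12/2010, so a single row dated 01/06/2010 should be listed. The one-day spike on 1/3/2010 forms a run of length 1 and is not flagged.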

Haikuo

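Update: FWIW, here is a hash solution:

data want_hash;
   /* Create the hash and its iterator once; the hash buffers the
      current run of high rows in date order */
   if _n_=1 then do;
      declare hash h(ordered:'y');
      h.definekey('date');
      h.definedata('date','response');
      h.definedone();
      declare hiter hi('h');
   end;
   set have;
   if response<9 then do;
      /* A low row ends the run. Save its values first, because the
         iterator overwrites date and response */
      _d=date; _r=response;
      rc=hi.first();
      do _i=1 by 1 while (rc=0);
         /* Flag only the first buffered row, and only for runs of 7+ rows */
         if h.num_items>=7 and _i=1 then flag=1; else flag=.;
         output;
         rc=hi.next();
      end;
      h.clear();
      /* Restore and write the low row itself */
      date=_d; response=_r; output;
   end;
   else h.replace();   /* high row: add it to the buffer */
   keep date response flag;
run;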
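
Another way to sketch the same idea, not part of the solutions above, is to number each run of consecutive high or low days and then summarize the runs with PROC SQL. The names runs, run_id, prev_high, step_starts, start_date and run_length are made up for illustration, and the cutoff of 9 and minimum length of 7 are carried over from the code above as assumptions.

```sas
/* Assign a run number that increases whenever the high/low state changes */
data runs;
   set have;
   high = (response >= 9);        /* 1 = high, 0 = low (assumed cutoff of 9) */
   prev_high = lag(high);         /* call LAG unconditionally on every row */
   if high ne prev_high then run_id + 1;
   drop prev_high;
run;

/* Keep only high runs of at least 7 rows and report their first date */
proc sql;
   create table step_starts as
   select min(date) as start_date format=mmddyy10.,
          count(*)  as run_length
   from runs
   where high = 1
   group by run_id
   having count(*) >= 7;
quit;
```

On the sample data this should leave one row in step_starts, with start_date 01/06/2010 and run_length 7. Unlike the hash version, which writes buffered high rows only when a later low row arrives, this sketch also picks up a high run that ends at the last row of the data.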
