BookmarkSubscribeRSS Feed
Amarpal
Calcite | Level 5

I have a dataset with Dates, a variable that defines what the value measures, and the value.  I want to get the year-to-date standard deviation for each date.  So i have something like this, and want to create this Std column...

Date           Measure                    value           std

jan01          Batch Payment          93                 .

jan08          Batch Payment          128             24.64

jan15          Batch Payment          126             19.58

jan01          Cost per Claim           311            

jan08          Cost per Claim           385             18.38

jan15          Cost per Claim           281             16.29

jan01          Pended Claims          22372

jan08          Pended Claims          18389          2816.41

jan15          Pended Claims          15869          3278.81

I tried using this...

data testing;
set test2;

retain count;
by measure;

if first.measure then count=1;
else count=1+count;

array std_weeks[46] _temporary_;

std_weeks[count]=value;

if first.measure then std = .;
else std = std(of std_weeks[*]);

run;

Right now I have 46 weeks of data, I plan to read the max of count into a variable and use that variable in place of the 46.  What I have there works for the first group of measure, but the next measure's std includes values from the previous measure (since the array is set to 46).  Can I make this work with arrays?  Is there another way to achieve the desired result?

 

Thanks,

-Amarpal

3 REPLIES 3
Reeza
Super User
You need to reset your array when the measure changes.

See code here that works for other moving metrics:

https://communities.sas.com/message/244232
Astounding
PROC Star

Here's another approach, just in case this appeals to you more.

 

If you wanted the mean instead of the standard deviation, you could just keep running SUM and running COUNT.  Then on each observation divide to get the mean.  Of course, you would have to re-set the running SUM and running COUNT each time you hit a new MEASURE ... but you're already doing that for the COUNT.

 

Similarly, standard deviation can be expressed as a function of the running SUM, running COUNT, and running SUM OF SQUARED VALUES.  By applying that sort of a formula, you would never have to worry about how many array elements you need.

 

Good luck.

PGStats
Opal | Level 21

The sort of formula @Astounding is referring to would work like this:

 

data testing;
set test2; by measure;
if first.measure then do; n=0; sx=0; sx2=0; end;
if not missing(value) then do;
    sx + value;
    sx2 + value*value;
    n + 1;
    if n>1 then std = sqrt((sx2-sx*sx/n)/(n-1));
    end;
drop n sx sx2;
run;
PG

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 520 views
  • 0 likes
  • 4 in conversation