finding the midpoint of observations in a three year period

Contributor
Posts: 65

I have a dataset which represents the volume of sales over three years:

data test;
input one two three average;
datalines;
10 20 30 .
20 30 40 .
10 30 50 .
10 10 10 .
;
run;

I'm looking for a way to find the midpoint of the three years: the point in time by which half of the total sales had occurred.

The updated dataset would read:

data test;
input one two three average;
datalines;
10 20 30 2
20 30 40 1.5
10 30 50 2.1
10 10 10 1.5
;
run;

So essentially I'm looking for how far into the three years the halfway point of the sales occurred.

Appreciate.

EDIT: what I've been trying with PROC MEANS and a weight

I've been trying to use PROC MEANS with a WEIGHT statement, but it doesn't give me the average point of the three years:

proc means data=test noprint;
var one two three;
var one+two+three=total;
var (one+two+three)/3=Average; 
var Average/weight=Average_Year;

output out=testa2
    sum(Total) = 
    mean(Total) = ;
run;

All Replies
Respected Advisor
Posts: 3,251

Re: finding the midpoint of observations in a three year period


@89974114 wrote:

I'm looking for a way to find the middle point of the three years, the average sale point

the updated dataset would read

data test;
input one two three average;
datalines;
10 20 30 2
20 30 40 1.5
10 30 50 2.1
10 10 10 1.5
;
run;

 


I have to admit I am not following this; I don't see how you have computed the average value.

--
Paige Miller
Contributor
Posts: 65

Re: finding the midpoint of observations in a three year period

Posted in reply to PaigeMiller

For the first row, 10+20+30 = 60, so midway is 30, which is 10+20, so 2 years.

I'm looking for the average point throughout the three years based on the volume of sales.

For the second row, 20+30+40 = 90, and 90/2 = 45.

45-20 = 25, then 25/30 = 5/6 of a year, so, my mistake, the second line should be 1 + 5/6 years.
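
Worked through for all four rows of the sample data, the arithmetic described above looks like this. This is only a sketch, and it assumes each year's sales are spread evenly across that year, which the thread does not state explicitly:

```latex
\begin{align*}
(10, 20, 30):&\quad \tfrac{60}{2} = 30 = 10 + 20 &&\Rightarrow\ 2 \text{ years} \\
(20, 30, 40):&\quad \tfrac{90}{2} = 45,\ \ 45 - 20 = 25 &&\Rightarrow\ 1 + \tfrac{25}{30} = 1 + \tfrac{5}{6} \approx 1.83 \text{ years} \\
(10, 30, 50):&\quad \tfrac{90}{2} = 45,\ \ 45 - 10 - 30 = 5 &&\Rightarrow\ 2 + \tfrac{5}{50} = 2.1 \text{ years} \\
(10, 10, 10):&\quad \tfrac{30}{2} = 15,\ \ 15 - 10 = 5 &&\Rightarrow\ 1 + \tfrac{5}{10} = 1.5 \text{ years}
\end{align*}
```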

Contributor
Posts: 65

Re: finding the midpoint of observations in a three year period

So I'm thinking: if each year's volume is given a weight based on the total volume, how far through the three years is the middle point?

Respected Advisor
Posts: 3,251

Re: finding the midpoint of observations in a three year period

so the first value would be 10+20+30=60, then midway is 30 which is 10+20 so 2 years

 

Still not following this at all.

 

the second would be 20+30+40=90 / 2 = 45 ,

45-20 = 25 then 25/30 = 5/6th

 

So in the first example, there is no subtraction happening, but in this example there is a subtraction in the math?

--
Paige Miller
Contributor
Posts: 65

Re: finding the midpoint of observations in a three year period

Posted in reply to PaigeMiller

Maybe think of it like a cumulative frequency curve. As you add the three observations together, you move from 0 to 100% of the total. Think of the x-axis as the three years and the y-axis as 0 to 100%; you are looking for the point on the x-axis at which the curve reaches 50% on the y-axis.
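
A small sketch of that cumulative view, using the sample data from the question. It expresses the running total at the end of each year as a percentage of the three-year total, so you can see in which year the 50% mark is crossed. The dataset name cumulative and the variables total, pct1, pct2 and pct3 are made up for illustration.

```sas
/* Sketch only: cumulative share of sales at the end of each year. */
data cumulative;
  input one two three;
  total = sum(one, two, three);
  pct1 = 100 * one / total;                  /* share reached by end of year 1 */
  pct2 = 100 * sum(one, two) / total;        /* share reached by end of year 2 */
  pct3 = 100 * sum(one, two, three) / total; /* end of year 3 is the full total */
datalines;
10 20 30
20 30 40
10 30 50
10 10 10
;

proc print data=cumulative; run;
```

For the second row (20, 30, 40) this gives roughly 22.2%, 55.6% and 100%, so the 50% mark falls inside the second year.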

Solution
‎03-20-2018 02:38 AM
Esteemed Advisor
Posts: 5,616

Re: finding the midpoint of observations in a three year period

So, I guess you want something like

data test;
input one two three;
/* half of the three-year sales total */
midpoint = sum(one, two, three) / 2;
/* find the year in which cumulative sales reach the midpoint,
   then interpolate linearly within that year */
if midpoint < one then halfSalesPoint = midpoint / one;
else if midpoint < one + two then halfSalesPoint = 1 + (midpoint - one) / two;
else halfSalesPoint = 2 + (midpoint - one - two) / three;
drop midpoint;
datalines;
10 20 30
20 30 40
10 30 50
10 10 10
;

proc print data=test; run;

 

PG
Contributor
Posts: 65

Re: finding the midpoint of observations in a three year period

I can't fault it: it gives the right answers, much appreciated. I am wondering whether there is a more efficient solution, though, as I'll be running this DATA step on millions of entries.

Super User
Posts: 13,889

Re: finding the midpoint of observations in a three year period


@89974114 wrote:

I can't fault it: it gives the right answers, much appreciated. I am wondering whether there is a more efficient solution, though, as I'll be running this DATA step on millions of entries.


What is inefficient about the accepted solution?

Contributor
Posts: 65

Re: finding the midpoint of observations in a three year period

There must be an option in PROC MEANS, or some other procedure, that lets you find the weighted average with respect to the observation length.

Super User
Posts: 13,889

Re: finding the midpoint of observations in a three year period


@89974114 wrote:

There must be an option in PROC MEANS, or some other procedure, that lets you find the weighted average with respect to the observation length.


If the real concern is that your actual problem involves more than 3 variables, whose names are really ordinal data values, then you might consider an array to hold the listed variables and compare the midpoint value to a cumulative total built up iteratively.
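
A minimal sketch of that array idea, assuming the variables are listed in time order and each period's sales are spread evenly within the period. The names test_array, sales, half, cum and i are illustrative; halfSalesPoint is borrowed from the accepted solution.

```sas
/* Sketch only: the same midpoint search written with an array, so that
   more period variables can be added to the INPUT and ARRAY statements. */
data test_array;
  input one two three;
  array sales{*} one two three;
  half = sum(of sales{*}) / 2;   /* half of the row total */
  cum = 0;                       /* running cumulative total */
  halfSalesPoint = .;            /* stays missing if no period has positive sales */
  do i = 1 to dim(sales);
    /* first period whose cumulative total reaches the halfway value */
    if sales{i} > 0 and sum(cum, sales{i}) >= half then do;
      halfSalesPoint = (i - 1) + (half - cum) / sales{i};
      leave;
    end;
    cum = sum(cum, sales{i});
  end;
  drop i half cum;
datalines;
10 20 30
20 30 40
10 30 50
10 10 10
;

proc print data=test_array; run;
```

On the four sample rows this should reproduce the values from the accepted solution (2, 1.83, 2.1 and 1.5). The loop is still a single pass over each row, so it is unlikely to be noticeably slower than the hard-coded IF/ELSE version.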

 

 

Esteemed Advisor
Posts: 5,616

Re: finding the midpoint of observations in a three year period

I just don't see how this problem can be expressed as a weighted average. To me, the solution to your problem amounts to finding the inverse of a piecewise linear function.

 

This little datastep should run very fast. I doubt that any proc can do much better.
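
Written out, the piecewise linear view looks roughly like this. The notation (x_k for the sales in year k, S_k for the cumulative total, F for the cumulative sales curve, t* for the halfway point) is introduced here for illustration and assumes sales accrue evenly within each year:

```latex
\begin{align*}
S_k &= \sum_{i=1}^{k} x_i, \qquad S_0 = 0, \\
F(t) &= S_{k-1} + \bigl(t - (k-1)\bigr)\, x_k \quad \text{for } k-1 \le t \le k, \\
t^{*} &= (k-1) + \frac{S_3/2 - S_{k-1}}{x_k},
\quad \text{where } k \text{ is the first year with } x_k > 0 \text{ and } S_k \ge S_3/2 .
\end{align*}
```

Taking k = 1, 2 and 3 in the last line corresponds to the three branches of the IF/ELSE chain in the DATA step.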

PG
Contributor
Posts: 65

Re: finding the midpoint of observations in a three year period

I was being unreasonably pedantic yesterday, my apologies.

☑ This topic is solved.
