- finding the midpoint of observations in a three ye...

03-19-2018 01:30 PM

I have a dataset which represents the volume of sales over three years:

```
data test;
input one two three average;
datalines;
10 20 30 .
20 30 40 .
10 30 50 .
10 10 10 .
;
run;
```

I'm looking for a way to find the middle point of the three years, i.e. the average sale point.

The updated dataset would read:

```
data test;
input one two three average;
datalines;
10 20 30 2
20 30 40 1.5
10 30 50 2.1
10 10 10 1.5
;
run;
```

So essentially I'm looking for the point within the three years at which half of the total sales had occurred.

Appreciate.

EDIT: what I've been trying with proc means and weighting

I've been trying to use proc means with a weight, but it doesn't give me the average point of the three years:

```
proc means data=test noprint;
var one two three;
var one+two+three=total;
var (one+two+three)/3=Average;
var Average/weight=Average_Year;
output out=testa2
sum(Total) =
mean(Total) = ;
run;
```

Accepted Solutions

Solution

03-20-2018
02:38 AM

Posted in reply to 89974114

03-19-2018 06:24 PM - edited 03-20-2018 02:07 PM

So, I guess you want something like

```
data test;
input one two three;
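/* midpoint = half of the three-year total */
midpoint = sum(one, two, three) / 2;
/* find the year in which cumulative sales reach the midpoint, */
/* then interpolate linearly within that year */
if midpoint < one then halfSalesPoint = midpoint / one;
else if midpoint < one + two then halfSalesPoint = 1 + (midpoint - one) / two;
else halfSalesPoint = 2 + (midpoint - one - two) / three;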
drop midpoint;
datalines;
10 20 30
20 30 40
10 30 50
10 10 10
;
proc print data=test; run;
```

PG

All Replies

Posted in reply to 89974114

03-19-2018 01:34 PM

@89974114 wrote:

I'm looking for a way to find the middle point of the three years, the average sale point

the updated dataset would read

`data test; input one two three average; datalines; 10 20 30 2 20 30 40 1.5 10 30 50 2.1 10 10 10 1.5 ; run;`

I have to admit I am not following this. I don't see how you have computed the average value.

--

Paige Miller

Posted in reply to PaigeMiller

03-19-2018 01:37 PM

So for the first row, 10+20+30 = 60, and the midway point is 30, which is 10+20, so 2 years.

I'm looking for the average point throughout the three years, based on the volume of sales.

For the second row, (20+30+40) / 2 = 90 / 2 = 45.

45 - 20 = 25, then 25/30 = 5/6 of a year, so (my mistake) the second line should be 1 + 5/6 years.

Posted in reply to 89974114

03-19-2018 01:38 PM

So I'm thinking: if each volume is given a weight based on the total volume, how far through the three years is the middle point?

Posted in reply to 89974114

03-19-2018 03:49 PM

so the first value would be 10+20+30=60, then midway is 30 which is 10+20 so 2 years

Still not following this at all.

the second would be 20+30+40=90 / 2 = 45 ,

45-20 = 25 then 25/30 = 5/6th

So in the first example, there is no subtraction happening, but in this example there is a subtraction in the math?

--

Paige Miller

Posted in reply to PaigeMiller

03-20-2018 02:28 AM

Maybe think of it like a cumulative frequency curve: as you add the three observations together, you move from 0 to 100% of the total. Think of the x-axis as years 1 to 3 and the y-axis as 0 to 100%. You are looking for the point on the x-axis that lines up with 50% on the y-axis.

Posted in reply to PGStats

03-20-2018 02:38 AM

I can't fault it; it gives the right answers, much appreciated. I am wondering if there is a more efficient solution, though, as I'll be running this data step on millions of entries.

Posted in reply to 89974114

03-20-2018 11:39 AM

What is inefficient about the accepted solution?

Posted in reply to ballardw

03-20-2018 11:42 AM

There must be an option in proc means, or some other proc, that lets you find the weighted average with respect to the observation length.

Posted in reply to 89974114

03-20-2018 01:03 PM

If the real concern is that your actual problem involves more than 3 variables, whose names are really ordinal data values, then you might look at an array to hold the listed variables and compare the mean to an iteratively built cumulative total.
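
The array idea above could look roughly like the sketch below. It is not from the thread: the array name `sales`, the helper variables `cumulative` and `i`, and the handling of an all-zero row are assumptions made for illustration, and it assumes the period variables are non-missing, non-negative and listed in time order. The comparison value is half of the row total, as in the accepted solution.

```sas
/* Sketch only: generalises the three-branch IF to any number of periods. */
/* Assumes non-missing, non-negative values listed in time order.         */
data test;
    input one two three;
    array sales{*} one two three;      /* hypothetical list; extend as needed */
    midpoint = sum(of sales{*}) / 2;   /* half of the row total */
    cumulative = 0;                    /* running total of earlier periods */
    halfSalesPoint = .;
    do i = 1 to dim(sales);
        /* stop at the first period whose cumulative total exceeds the     */
        /* midpoint; the last period is a fallback for an all-zero row     */
        if midpoint < cumulative + sales{i} or i = dim(sales) then do;
            /* interpolate within period i; stays missing if the period is 0 */
            if sales{i} > 0 then
                halfSalesPoint = (i - 1) + (midpoint - cumulative) / sales{i};
            leave;
        end;
        cumulative = cumulative + sales{i};
    end;
    drop i midpoint cumulative;
    datalines;
10 20 30
20 30 40
10 30 50
10 10 10
;
run;

proc print data=test; run;
```

For the four sample rows this should give the same values as the three-branch version. It is still a single pass over the data, so it is unlikely to be slower in any way that matters.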

Posted in reply to 89974114

03-20-2018 12:31 PM

I just don't see how this problem can be expressed as a weighted average. To me, the solution to your problem amounts to finding the inverse of a piecewise linear function.

This little datastep should run very fast. I doubt that any proc can do much better.

PG
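
One way to spell out the "inverse of a piecewise linear function" remark is sketched below. It assumes that sales accrue uniformly within each year, which the thread implies but never states. The symbols $s_1, s_2, s_3$ (the columns `one`, `two`, `three`), $C_k$, $F$ and $t^*$ are notation introduced here.

$$
C_0 = 0, \qquad C_k = \sum_{j=1}^{k} s_j, \qquad F(t) = C_{k-1} + \bigl(t - (k-1)\bigr)\, s_k \quad \text{for } k-1 \le t \le k
$$

$$
F(t^*) = \tfrac{1}{2} C_3 \;\Longrightarrow\; t^* = (k-1) + \frac{\tfrac{1}{2} C_3 - C_{k-1}}{s_k}, \qquad k = \min\{\, j : C_j > \tfrac{1}{2} C_3 \,\}
$$

For the row 20 30 40, $\tfrac{1}{2} C_3 = 45$ and $k = 2$, so $t^* = 1 + (45 - 20)/30 = 1 + 5/6$, which matches the hand calculation earlier in the thread.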

Posted in reply to PGStats

03-21-2018 03:01 AM

I was being unreasonably pedantic yesterday, my apologies.