Calculating funding differences among the years by a factor that changes by year

Contributor
Posts: 41

Hello,

I am totally new to statistics and I am currently playing with SAS and I am a bit overwhelmed by some of the theory.

I have been presented with a data set containing the following columns concerning investments for institutions:

Type of Institution | Region | [etc...] | Funding Year 1 | Funding Year 2 | Funding Year 3

Each row would contain a nameless record, specifying the type of institution and the funding for each year.

I have been requested to calculate the size of the institution according to the amount of funding. For example

  • Funding equal to or greater than 2,000,000 should be categorised as Size A
  • Funding lower than 2,000,000 and equal to or greater than 1,000,000 should be categorised as Size B
  • Funding lower than 1,000,000 should be categorised as Size C
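
The size rule above can be sketched as a small function. This is only an illustration of the logic, written in Python rather than SAS; the name `size_category` is hypothetical, and returning `None` for a missing funding value is an assumption, since the rule does not say how a missing year should be sized.

```python
def size_category(funding):
    """Map one year's funding to Size A, B or C using the thresholds listed above."""
    if funding is None:
        # Assumption: a year with missing funding gets no size at all.
        return None
    if funding >= 2_000_000:
        return "A"
    if funding >= 1_000_000:
        return "B"
    return "C"


# Made-up funding values, chosen to sit on or around the cutoffs.
for value in (2_000_000, 1_999_999, 1_000_000, 999_999, None):
    print(value, size_category(value))
```

Checking the highest threshold first means each later branch only has to test its lower bound.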

My first tests involved ANOVA and t-tests to check whether funding in each individual year differs by factors such as region, institution type and SIZE.

So when testing the size factor, I created a calculated column to hold the size value (A, B or C), based solely on the funding for the year I was testing.

I then noticed that the same institution could fall into different sizes across the years, because its funding can vary from year to year. That is no big deal for this first test: since I am testing each year individually, what counts is the size in that given year.

Now comes the tricky part, for which I desperately need some help. I need to investigate how the funding has changed across years (Y1-Y2, Y1-Y3, Y2-Y3) and how this change depends on the same factors as before (e.g. region, type and size).

Very important and mandatory rule: I can't use two-way ANOVA! This exercise is not based on two or more factors, but on two or more groups.

To get the difference between years, for each pair of years I want to test, I created a new calculated column holding the value of year A minus year B. So far so good -- this will be my dependent variable for each test.
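
As a rough sketch of those calculated columns (in Python rather than SAS, purely to show the logic), the pairwise differences for one record could be built as below. The function name `year_differences`, the dictionary layout and the sample values are all hypothetical, and treating a difference as missing whenever either year is missing is an assumption.

```python
from itertools import combinations


def year_differences(funding):
    """Return earlier-year minus later-year funding for every pair of years.

    `funding` maps a year label to a number, or to None when it is missing.
    """
    diffs = {}
    for a, b in combinations(sorted(funding), 2):
        fa, fb = funding[a], funding[b]
        # Assumption: a missing year makes the whole difference missing.
        diffs[f"{a}-{b}"] = None if fa is None or fb is None else fa - fb
    return diffs


record = {"Y1": 1_500_000, "Y2": 1_200_000, "Y3": None}  # made-up values
print(year_differences(record))
```

Each resulting value would then play the role of the dependent variable for the matching pair of years.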

This all works fine except for one bit: the SIZE. Because the size can differ between the years, I am not sure what the best way would be to come up with a single size. First I thought, "let me sum the funding of both years and divide by two". But when it comes to the Y1-Y3 pairing, shouldn't I do Y1+Y2+Y3 and then divide by 3?

Another tricky bit is that for some years, the value of the funding is missing. And that would mean that any calculation involving a null value will result in null as well. I am not sure if this would be the right way to go.
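
To make the two options concrete, here is a minimal sketch (Python rather than SAS, for illustration only) contrasting a strict average, which becomes missing as soon as one year is missing, with an average over the years that are present. Both function names are hypothetical, and neither is claimed to be the statistically right choice; which one suits the analysis is exactly the open question.

```python
def mean_strict(values):
    """Average that is missing (None) if any input is missing."""
    if any(v is None for v in values):
        return None
    return sum(values) / len(values)


def mean_available(values):
    """Average over the non-missing inputs only; None if every input is missing."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


funding = [1_000_000, None, 3_000_000]  # made-up values with Y2 missing
print(mean_strict(funding))     # None
print(mean_available(funding))  # 2000000.0
```

The second version divides by the number of years actually present, so an institution with one missing year is averaged over two years instead of being dropped.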

Well to sum up I am a total NOOB in stats and I really need some help.

Please please advise.

Cheers everyone,

P.

Frequent Contributor
Posts: 97

Re: Calculating funding differences among the years by a factor that changes by year

Hi,

Could you please post a sample of the input line(s) and a sample of the required output?

ALLU

Contributor
Posts: 41

Re: Calculating funding differences among the years by a factor that changes by year

Hello,

Thanks for that. I can upload the dataset as well if you wish. But here is an example of the observations in a neat table:

Type | Region | Total Investment Y1 | Students Y1 | Total Investment Y2 | Students Y2 | Total Investment Y3 | Students Y3 | Funding per Student Y1 | Funding per Student Y2 | Funding per Student Y3 | Size Y1 | Size Y2 | Size Y3 | Funding per Student (Y3-Y1)
Type A | Asia | $1,593,307 | 14,459 | $2,701,354 | 16,526 | $2,543,711 | 14,930 | $110.19 | $163.46 | $180.93 | 3 | 2 | 2 | $70.74
Type A | Europe | $1,006,989 | 4,042 | $1,013,632 | 3,584 | $1,010,010 | 3,725 | $249.13 | $282.82 | $272.12 | 3 | 3 | 3 | $22.98
Type B | Europe | $1,008,615 | 5,253 | $958,223 | 3,770 | $789,400 | 2,874 | $192.01 | $254.17 | $333.41 | 3 | 4 | 4 | $141.40
Type B | North America | $2,092,497 | 11,884 | $2,864,282 | 13,438 | $2,286,248 | 14,095 | $176.08 | $213.15 | $203.21 | 2 | 2 | 2 | $27.14

For all of my previous tests, I have submitted the funding per student for each year, and used the Size for that given year as the CLASS (the factor variable).

Just to recap, the size of the college is defined by the range of investment for that given year (e.g. an investment of 3,000,000 or more equals size 1).

However, for this next test I had to create a new variable with the difference between Year 3 and Year 1, which is the last column of the table. I also have to analyse this new column by size, and the question is which year's size to use.

  • I have the feeling that I should use the size of the most recent year (year 3) because this is where we are at the moment. But I am not sure about it.
  • For this test I could also create a new variable for size, calculated from the sum of total investments in Y1, Y2 and Y3 divided by three, or the sum of Y1 and Y2 divided by two. But I am unsure about that too, as it could give the observation a fictitious size that it never actually had.
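
The two options can be compared side by side on the four sample rows. The sketch below is in Python rather than SAS and is only meant to show the comparison; `size_from_investment` is a hypothetical helper. Only the 3,000,000 cutoff for size 1 is stated in this thread, so the 2,000,000 and 1,000,000 cutoffs are assumptions inferred from the sizes shown in the sample table.

```python
# Total investment for Y1, Y2 and Y3, copied from the four sample rows.
rows = [
    ("Type A", "Asia", [1_593_307, 2_701_354, 2_543_711]),
    ("Type A", "Europe", [1_006_989, 1_013_632, 1_010_010]),
    ("Type B", "Europe", [1_008_615, 958_223, 789_400]),
    ("Type B", "North America", [2_092_497, 2_864_282, 2_286_248]),
]


def size_from_investment(investment):
    """Return size 1 to 4; cutoffs below 3,000,000 are inferred, not stated."""
    if investment >= 3_000_000:
        return 1
    if investment >= 2_000_000:  # assumed cutoff
        return 2
    if investment >= 1_000_000:  # assumed cutoff
        return 3
    return 4


comparison = []
for inst_type, region, (y1, y2, y3) in rows:
    latest = size_from_investment(y3)                    # option 1: most recent year
    averaged = size_from_investment((y1 + y2 + y3) / 3)  # option 2: three-year mean
    comparison.append((inst_type, region, latest, averaged))
    print(inst_type, region, "size from Y3:", latest, "size from mean:", averaged)
```

For these four rows the two definitions give the same size, so the choice would only matter for institutions whose average investment lands on the other side of a cutoff from their Y3 investment.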

Thanks for trying to help me with this one.

All the best,

P.
