BookmarkSubscribeRSS Feed
Quannguyen
Calcite | Level 5

Hi. I'm very new to SAS.

I have an excel file that I need help with. So I need to add a new column called rate for each deal_id. This rate is equal to the average rate (second column) 125 days prior to the announcement date. The date of the rate is the first column and the announcement date is the third column.

Quannguyen_0-1613054234164.png

 

5 REPLIES 5
Kurt_Bremser
Super User

First of all, provide usable data.

This means SAS data, unless you have a problem getting external data into SAS. If you have trouble importing Excel data into SAS, attaching the Excel file is a good idea, but in that case we mostly suggest to save spreadsheets to csv files and read those with a data step.

 

If you already have SAS data, post that by using a data step with datalines, so we can easily recreate your dataset for developing and testing with a simple copy/paste and submit. This will also remove any doubts about variable types, sizes, formats and so on.

 

From your description, it seems that you want to create a mean of a value, grouped by an identification, where a date column is used for selection against a cutoff value.

Is that cutoff value (announcement_date) unique for the identification value (deal_id)?

Quannguyen
Calcite | Level 5

I attached the excel file. Thank you so much.

Kurt_Bremser
Super User

Apart from a lot of missing values (which constitute the majority of rows in the spreadsheets), the deal_id values are unique, so it makes no sense calculating a mean in any way for them (as there's always only one value).

 

Please rephrase your question in clear and unambiguous manner, as I have no idea what you want to do.

ballardw
Super User

@Quannguyen wrote:

Hi. I'm very new to SAS.

I have an excel file that I need help with. So I need to add a new column called rate for each deal_id. This rate is equal to the average rate (second column) 125 days prior to the announcement date. The date of the rate is the first column and the announcement date is the third column.

 


A generic reminder about "rates": Typically a rate is created by dividing one value by another such as CountX/TotalCount.

If the rate is calculated in such a manner and the denominator is not the same then taking a mean directly without the actual counts is very problematic.

Consider: If I drive a car for 1 mile and get 100 miles per gallon (rate of fuel consumption) and drive another 1000 miles and get 10 miles per gallon the "average" miles per gallon is NOT (100+10)/2 = 55 miles per gallon.

I would need to have total actual fuel consumed / total distance to get accurate average miles per gallon.

 

So, what type of "rate" is involved here?

Quannguyen
Calcite | Level 5

The rate is the data that is collected from the internet, I don't have any specific calculation for it

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 600 views
  • 0 likes
  • 3 in conversation