turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- General Programming
- /
- Find maximum value from year 0 to year t continuou...

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-10-2017 06:43 AM - edited 12-11-2017 08:26 AM

Hi everyone,

I have a company ID - year panel data of various variables. Now I have to find the maximum (minimum) values of variable ATQ (total asset) of a company from year 0 (year 0 is firm's beginning year in dataset) to year t (current year).

For example: Let say that I created a new variable MaxATQ. For company ID 001001, the value of MaxATQ in 30/6/1983 is equal to the maximum value of ATQ (of that company) from 31/3/1983 (year 0) to 30/6/1983 (year t). And so on, value of MaxATQ in 30/9/1983 is equal to the maximum value of ATQ from 31/3/1983 to 30/9/1983.

I have been searching the Internet whole day and can not find anything. So could anyone please give me some clue on this? Thank you very much

gvkey | datadate | Company | COGSQ | STD |

1001 | 19830331 | A | 1.258 | NA |

1001 | 19830630 | A | 1.4 | std from QI/1983 to QII/1983 |

1001 | 19830930 | A | 1.5 | std from QI/1983 to QIII/1983 |

1001 | 19831231 | A | 1.6 | |

…….. | …… | ……. | ……. | |

1001 | 19860331 | A | 2.3 | |

1002 | 19840430 | B | 3.5 | |

1002 | 19840731 | B | 6.5 | |

….. | …… | ….. | ….. |

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to trungcva112

12-10-2017 06:59 AM

You can use PROC MEANS with option: ** max = maxATQ **and

**CLASS company ID;**

**VAR ATq;**

**WHERE** datadate between <date created> and <current date>;

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Shmuel

12-10-2017 07:33 AM

Thank you, I tried but it didn't work

My code is:

proc means data=practice;

max(atq)= max_atq;

class GVKEY;

var atq;

where datadate between <date created> and <current date>;

run;

I dont understand the part <date created> and <current date>. What am I supposed to put into <date created> and <current date>? Because each company has different beginning date. Company 1 data begins at 31/3/1983, company 2 begins at 31/3/1982

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to trungcva112

12-10-2017 07:51 AM

Assuming company created date is the lowest value of datadate and current date is the latest one - then you can

just ignore and cancel the where clause.

Did the code worked after removing the where statement ?

if negative - explain what happend and post the log.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Shmuel

12-10-2017 08:24 AM - edited 12-10-2017 08:25 AM

It still doesnt work when I remove the where statement

The log file below:

data practice;

set 'c:\mysaslib\firms.sas7bdat';

NOTE: There were 1522473 observations read from the data set c:\mysaslib\firms.sas7bdat.

NOTE: The data set WORK.PRACTICE has 1522473 observations and 25 variables.

NOTE: DATA statement used (Total process time):

real time 0.35 seconds

cpu time 0.35 seconds

138 proc means data=practice;

139 max(atq)=max_atq;

---

180**ERROR 180-322: Statement is not valid or it is used out of proper order.**

140 class GVKEY;

141 var atq;

142 run;

NOTE: The SAS System stopped processing this step because of errors.

NOTE: PROCEDURE MEANS used (Total process time):

real time 0.00 seconds

cpu time 0.00 seconds

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to trungcva112

12-10-2017 08:54 AM

Try next code with the right syntax:

```
proc means data=practice ;
output out=want max(atq)=max_atq;
class GVKEY;
var atq;
run;
```

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Shmuel

12-10-2017 09:13 AM - edited 12-10-2017 09:17 AM

Hi. It works but it produces the maximum value of each company for the entire date range, which is not what I need to do

So the problem is: for each company i:

at t=1: I need to find the maximum value of ATQ from t=0 to t=1

at t=2: Max ATQ from t=0 to t=2

at t=3: Max ATQ from t=0 to t=3

and so on .....

at t=T: Max ATQ from t=0 to t=T

So basically the date range (to find max value) expands when time progress. And I need to find different maximum values of ATQ corresponding with different time range (0 to 1, 0 to 2, 0 to 3, ...... , 0 to T)

I think I need to do some kind of loop?. But I can not find any information on how to do it

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to trungcva112

12-10-2017 09:32 AM

What do you mean by t= ?

Is it - first year, first 2 years, first 3 years etc ?

If positive then the use of a data step will be more usefull then proc means:

```
proc sort data=practice; by GVKEY datadate; run;
data temp;
year = year(datadate);
run;
data want;
set temp;
by gvkey year;
retain maxATQ;
if first.gvkey then maxATQ = ATQ;
maxATQ = max(maxATQ, ATQ);
if last.year then output;
run;
```

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to trungcva112

12-11-2017 11:24 AM

you might adapt a solution I offered earlier https://communities.sas.com/t5/General-SAS-Programming/Estimate-standard-deviation-quarter-by-quarte...

That delivered progressively, the STD() for a number stream within each company, over time

Looks like your task is quite similar

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Peter_C

12-11-2017 11:26 AM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to trungcva112

12-10-2017 09:48 AM

Yes, that's exacly what I mean.

I need to find maximum value in the first year, first 2 years, first 3 years .... and finally the entire time range.

Thank you for giving me some clue. I will try it and give you some feedback as soon as possible

I need to find maximum value in the first year, first 2 years, first 3 years .... and finally the entire time range.

Thank you for giving me some clue. I will try it and give you some feedback as soon as possible

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to trungcva112

12-10-2017 01:20 PM

It's a little puzzling what you are looking for, since you don't actually illustrate what the outcome should be. You describe the problem in terms of years. But the date values you use to illustrate are quarters.

Since your data set is in order by GVKEY DATADATE, here is a program that may help.

data want;

set have;

by gvkey;

if first.gvkey then max_atq = atq;

else max_atq = max(max_atq, atq);

retain max_atq;

run;

Note that you can have many observations per DATADATE. But the MAX_ATQ value is the max found so far. Therefore, it can be different for two observations having the same DATADATE.

This program will move you in the right direction. To make the requirements clearer, you will need to provide easier to read data, and a sample of what you picture the result to be (especially when you have two identical DATADATE values for the same GVKEY). Then you can get a better suggested solution.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Astounding

12-10-2017 10:55 PM - edited 12-11-2017 12:00 AM

Hi astounding and shmuel. Your code works.

But Now I have to calculate standard deviation of variable X from quarter I/1983 to quarter II/1983, and then quarter I/1983 to quarter III/1983, and so on.....

It just like before:

at t=1, I need to find standard deviation of X from t=0 to t=1

at t=2, std of X from t=0 to t=2

and so on,

Do you have any idea how to do this? Because the previous code could not applied in this case

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to trungcva112

12-11-2017 08:27 AM

I just posted a sample of my dataset. So could anyone help me please?

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to trungcva112

12-11-2017 09:21 AM

Maybe there are some economical or statistical procedures that I'm not familiar with.

You can use the same logic as in computing the maxATQ in order to calculate the standard deviation

using its formula which I dont remeber exactly, that is something like:

- retain **sum_x** - sum of variable x

**sum_sqr_x **- sum the squares of variable x

- then compute the STD using those sums retained and _N_ as cumulative number of observations

If you have one observation per qurter then just output each with the last computed values.

if you want comulative per whole years - output it on** last.year** - as in my post for maxATQ.