Thank you team,
Great. So the next step is to take that text and convert it into a SAS program that you can share so anyone that wants to help you can have your data.
Not sure what your variable names are but perhaps something like this:
data have;
input firm $ date :yymmdd. price volume a b c d ;
format date yymmdd10.;
cards;
10000 1986-01-07 -2.56250 3680 1 1 2.5625 2.5625
10000 1986-01-08 -2.50000 3680 1 1 2.5 2.5
10000 1986-01-09 -2.50000 3680 1 1 2.5 2.5
10000 1986-01-10 -2.50000 3680 1 1 2.5 2.5
10000 1986-01-13 -2.62500 3680 1 1 2.625 2.625
10000 1986-01-14 -2.75000 3680 1 1 2.75 2.75
10000 1986-01-15 -2.87500 3680 1 1 2.875 2.875
10000 1986-01-16 -3.00000 3680 1 1 3 3
10000 1986-01-17 -3.00000 3680 1 1 3 3
10000 1986-01-20 -3.00000 3680 1 1 3 3
;
Make sure to use the menu items that give up pop-up windows to enter/edit the text for your program (Insert SAS Program) or other text (Insert Code). Otherwise the forum will think you meant that as paragraphs of text to be reflowed to fit the viewers browser window size.
DATA HHLL;
SET HH;
BY PERMNO DATE;
retain OHH HH;
OHH = max(ABS( divide (PRC, CFACPR)));
HH=LAG(OHH);
IF FIRST.PERMNO THEN OHH = ABS( divide (PRC, CFACPR));
else if ABS( divide (PRC, CFACPR)) > OHH then OHH = ABS( divide (PRC, CFACPR);
RUN;
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.
Provide an example of your data and what you expect the result for that sample to look like.
If I want the maximum value of something across all the data the easiest is Proc Means/summary.
Proc summary data=hh ; by permno; var <name of variable with price>;
output out=temp max= H min=L ; run;
Merge that back onto the data.
I cannot tell what that second bit means. If you mean a "rolling" maximum, meaning each week you need the maximum for the previous 52 weeks that is a bit complicated. If you mean the Max (or min or whatever) for a calendar year that may be relatively easy but your bit of "52 weeks" is a tad awkward to deal with from daily data without a concrete example. One year and 52 weeks are not exactly the same as there are 52.143 or 52.286 7 day periods within a calendar year. "Week" definitions are also a bit tricky because they depend on which day to start and how to count things at calendar year boundaries. So you need to work through with some actual example what you expect.
And your code example makes no sense in either context plus the examples in your other posts with division by zero that you seem not to understand that you need to provide what to do when such would occur. Something like
If <variable> ne 0 then <the code that uses the division>; else <some code that makes sense in terms of the problem when the variable is zero>;
Thx!
sample data is as the following
https://drive.google.com/file/d/1DAPXaUmE5xKvJmnfV7vz-MVF9MGVfSDk/view
expect the result for that sample to look like the following
@Irenelee wrote:
Thx!
sample data is as the followinghttps://drive.google.com/file/d/1DAPXaUmE5xKvJmnfV7vz-MVF9MGVfSDk/view
expect the result for that sample to look like the following
That file on google drive is over 200 megabytes. Google even says it is too large to scan for viruses. So why should I download something that is that big.
Subset your data to maybe 2 values of Permno and associated stuff. Then post that.
But I still have no idea what actual rules are involved. I do see a magic word "Rolling". Which means as a minimum you need to very clearly describe what period is involved with "rolling" what value is returned.
I am starting to think that picture of what appears to be a report template of some sort needs some serious expansion as to what MAX means. I see text like "MAX(if prc>HH)". What does that mean in some details. It most programming environments "if prc>hh" is a boolean (logical) expression and returns true/false or numerically 1/0. So "MAX" of an expression that is true/false looks weird.
Pick a specified set of observations. Calculate the results. Describe the calculations you have done to get the desired result and show that desired result, as in actual numbers, given that example data.
what i try is as below
DATA HH;
SET HIGH.HH;
IF CFACPR=0 THEN DELETE;RUN;
PROC SORT
DATA=HH (KEEP =SHRCD EXCHCD PERMNO DATE PRC CFACPR SHROUT CFACSHR
WHERE =( SHRCD IN (10,11) AND EXCHCD IN (1,2,3,31,32,33) ) )
OUT=HL(DROP=SHRCD EXCHCD );
BY PERMNO DATE;
RUN;
DATA HL;
SET HL;
BY PERMNO DATE;
retain H L;
H = max(ABS( divide (PRC, CFACPR)));
L = min(ABS( divide (PRC, CFACPR)));
RUN;
DATA FLAG;
SET HL;
BY PERMNO DATE;
retain FLAG;
if first.permno then do;
counter = 0;
first_date = date;
end;
day_of_year = intnx('day', date, first_date);
if day_of_year <= 365 and not missing(day_of_year) then FLAG = 'UNQUALIFIED';
if PRC > LAG(H) AND date >= '01JAN1963'd then FLAG = MAX;
if PRC < LAG(L) AND date >= '01JAN1963'd then FLAG = MIN;
else FLAG = COMPARISON;
end;run;
How may SAS code create a new Column named FLAG.
Because we will watch the monthly return after the stock is classified into Max or Min portfolio, so the same stock can not be reclassified into Max or Min within one month after it was classified into Max or Min. for example, if a firm is classified into “Max” on 1963/1/1, then this firm can only classified into “Max” again after 1963/2/1.
Thx!
The data I have looks like the following
Photographs of data are impossible to program from. And they hide all kinds of relevant information about the data (is the variable numeric or character? Does it have a display format attached? Which format?)
You appear to have data in SAS already. So it is TRIVIAL to copy it out as TEXT. For example to get the first 10 observations you might do something like:
data _null_;
set have(obs=10);
put (_all_) (+0);
run;
And then just copy the space delimited text from the SAS log.
10000 1986-01-07 -2.56250 3680 1 1 2.5625 2.5625
10000 1986-01-08 -2.50000 3680 1 1 2.5 2.5
10000 1986-01-09 -2.50000 3680 1 1 2.5 2.5
10000 1986-01-10 -2.50000 3680 1 1 2.5 2.5
10000 1986-01-13 -2.62500 3680 1 1 2.625 2.625
10000 1986-01-14 -2.75000 3680 1 1 2.75 2.75
10000 1986-01-15 -2.87500 3680 1 1 2.875 2.875
10000 1986-01-16 -3.00000 3680 1 1 3 3
10000 1986-01-17 -3.00000 3680 1 1 3 3
10000 1986-01-20 -3.00000 3680 1 1 3 3
@Irenelee wrote:
- How may SAS code create a new Column named “H” which is a firm’s historical high price? (every line we have a firm’s id PERMNO, date, close price) vice versa “L”
- How may SAS code create a new Column named “HH” which HH=LAG(H), BUT FIRST YEAR'S PERMNO’s daily HH=INFINITE. (the price have to exist more than one year, for example 1963/1/1’s historical high price have to be have price exsit from 1962/1/1. Because I want to make sure at least 1 year of available daily price data, which is required to be defined historical high stock price may be compared with the 52-week period range.
To create the highest value every seen for a "firm" just make a NEW variable and RETAIN it. Change the value when the price goes up.
data want;
set have;
by firm;
retain H L ;
h= max(h,price);
l = min(l,price);
if first.firm then do;
h=price; l=price;
end;
run;
The second description is too confusing to program from. If the point is to remember WHEN the H and L price occurred then remember that also. In which case the programming gets a little more complicated.
data want;
set have;
by firm;
retain H L H_date L_date;
format h_date l_date date9.;
if first.firm then do;
h=price; h_date=date;
l=price; l_date=date;
end;
else do ;
if h<price then do; h=price; h_date=date; end;
if l > price then do; l=price; l_date=date; run;
end;
run;
You can now test on any date how long it has been since the most recent high occured.
Whether or not that is what you are looking for I have NO idea.
THX!
format h_date l_date date9.;
May I ignore this format setting? and just use original date ?
@Irenelee wrote:
THX!
format h_date l_date date9.;
May I ignore this format setting? and just use original date ?
If you don't attach a date type display format to the new variables then the raw number of days since 1960 will be printed. That will be hard for humans to understand. You can attach any of the dozens of formats that format date values. If you want the dates to look like 2024-04-27 then use yymmdd10. as the format specification in the FORMAT statement.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.