- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Dear SAS Users,
I am writing to you today because I am struggling to find an answer to my problem.
I am working on an imported table which I inserted in my library.
This table is composed of 12 columns : the first column is the date and the 11 others are the prices of different stocks on the given date.
I want to add 11 columns and compute the returns for each of the 11 stocks that I have in my table but I don't know how to do.
I tried this code but it obvisously doesn't work (and I am not surprised).
data=CAC.DATAFIN;
infile data=CAC.DATAFIN datalines = "," truncover;
input return;
format data;
run;
Please find attached an extract of the table I am using 🙂
Could you guys help me to solve this problem ?
If I am not clear please don't hesitate to ask for more info !
Thank you guys for your help,
Maris 🙂
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you have a SAS data set you do not reference it with an INFILE. Infile is for external files such as text that you intend to read.
To use an existing data set in a data step you use set.
Warnging: when you use code like
data Somedatasetname; set Somedatasetname;
you completely replace the data set. So new SAS users should use a new data set name on the Data statement to avoid destroying data with logic problems.
You will find that working with SAS you do better with data that looks like:
Date StockName Value.
Then the sort of analysis you request can reference a single variable, such as STockname for grouping.
I have no idea how to calculate a return so you need to tell us exactly how you intend to do that.
Your SAS system should have a data set SASHELP.Stocks you can examine, and perhaps reference for how to use as your current data likely is hard to work with and may require transposition (reshaping ) before analysis.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you very much for your answer!
There are some things that you should know for the return :
First there is no return for the first date (here 25 May 2012)
Second the calculation for the return is :
{(Price of the day (t) - Price of the previous day (t-1))/(Price of the previous day (t-1)} * 100
Therefore the return for 28 May 2012 for the CAC will be :
{(3042-3047)/3047} * 100 = -0,16%
I don't know if it would be easier to add a step which would be to calculate the lag of the values, therefore :
(Price of the day (t) - Price of the previous day (t-1))
And add this calculation to a new column for each stock.
Also, I was wondering if maybe if it would be easier to create a new table with the new columns that we want instead of adding new columns so the existing data set would not be destroyed.
I will try to see if SASHELP.Stocks can answer my questions !
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
See if this example makes sense.
data example; input date :date9. stock $ value; format date date9.; datalines; 12Jan2020 IBM 12.34 13Jan2020 IBM 12.56 14Jan2020 IBM 11.34 15Jan2020 IBM 12.00 12Jan2020 ABC 22.34 13Jan2020 ABC 32.56 14Jan2020 ABC 12.34 15Jan2020 ABC 22.00 ; Proc sort data=example; by stock date; run; data want; set example; lv = lag(value); by stock; if first.stock then return=.; else return = 100 * (value -lv)/lv; run;
First, please note that data provided in a data step we can copy and run. So there is something to test code with.
Second this demonstrates how to use data with one record per stock per day. The BY statement, which does require sorting prior to use, has SAS create two automatic variables for each record First. and Last. that indicate whether the present value of the variable is the first record with that value, the last value or neither. These are numeric values 1/0 which SAS uses as True / False.
So we can test if the current record is the first stock and avoid the calculation as we would have incorrect values in most cases or calculate when needed otherwise.
Note that the LAG function typically does not do what you might expect if used in an "if" so is called prior to the "if/then/else" that may attempt to use the value. I leave it to you to experiment and see what happens if you don't follow this pattern.
Your existing data would be much better off TRANSPOSED to the Date Stock Value structure.
One of the main reasons: In the example above if you add 15 new stocks in the Example data set the code in the Want data set does not change. If you have the stock as columns for each one then you need to modify the code to address changing numbers of columns, need to add a bunch of variables and likely much further down the line keep adding complexity. Such as Mean return per calendar month per stock. The example I show would do that calculation with something like:
Proc means data=want min mean max;
by stock date;
format date yymon. ;
var return;
run;
Note that Formats can create groups honored by most of the report, analysis and reporting procedures.
Which could be useful if you have a Stock to Industry format. Then you could modify the Proc Means code above to use that format for the Stock variable and get a summary across the industry. BUT that approach requires a single variable holding the stock name to work.
There is a procedure, Transpose that will rearrange many typical data structures.
Perhaps something like:
Proc transpose data=have out=wanttrans prefix=price name=Stock; by date; var <list of stock name variables goes here>; run;
will get a nicer structure.