09-11-2011 08:35 AM
I was hoping to get some help with code to do the following. There are 3 datasets in the Zip file for reference, which should help make sense of the questions.
1). In file regresstest.dat, what is the code to regress RE_RF = a + B(RM_RF), but only over rows 8 to 20?
data = regresstest.dat;
model RE_RF = RM_RF - (what do I put here so I control which dates I include in the regression?)
2). In file newgg5po (stands for new dataset, government traders, stock 5po)
On rows 5 and 6 you see the same date (20051221); column "shoc" (2nd last column) has 2 corresponding values for 2 government trades on that day for that stock. What is the code so I only have 1 row for each date, so that the corresponding "shoc" value for date (20051221) is the sum of the values in rows 5 and 6?
Also, on some days there are no trades. How do I fill in these rows so there is a date, with 0 for "shoc"?
3). Assume I want to sum row 1 of the "shoc" column from newgg5po with row 20 of the "shoc" column from newgg60n, then row 2 with row 21, row 3 with row 22, and so on, and have the output in a new data file. What is the code for that?
Thank you so much in advance,
09-11-2011 10:33 AM
You did not include any SAS datasets in the Zip file. There is just a DLL file, which I decline to expand as I have no idea what it might do to my computer.
1) To limit the observations (rows) included in an analysis you could use a WHERE statement.
2) You could use PROC SUMMARY to create a new file that has the SUM of the variables. Use a BY or CLASS statement to group by the variables (columns) that uniquely identify the groups. How to fill in the zeros depends on what is missing. If you already have an observation and are just missing a value, then use program logic to set the variable to zero instead: if x=. then x=0; or x=coalesce(x,0);
3) Not sure what you are talking about here. Normally you would match observations (rows) from one dataset with another by referencing the variables (columns) that need to match. So if there are variables that indicate row 1 of dataset A should be matched with row 2 of dataset B, then you can merge by those. But it really sounds like what you want to do is more like your second question. In that case just concatenate the datasets using a DATA step with a SET statement and then use PROC SUMMARY as in the second question.
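A rough, untested sketch of the three suggestions above. It assumes each dataset has a numeric variable called date (values like 20051221) next to shoc; that variable name, the date range in the WHERE statement, and the output dataset names are placeholders, not taken from the actual files.

```sas
/* Sketch only. Assumes a numeric variable named date (e.g. 20051221)
   and a variable shoc in the datasets; the date range is made up. */

/* 1) WHERE filters on variable values, not on row position. */
proc reg data=regresstest;
  where 20051201 <= date <= 20051231;
  model RE_RF = RM_RF;
run;
quit;

/* 2) One row per date, with shoc summed within each date. */
proc summary data=newgg5po nway;
  class date;
  var shoc;
  output out=shoc_by_date(drop=_type_ _freq_) sum=shoc;
run;

/* 3) Stack both datasets, then total shoc per date across them. */
data both;
  set newgg5po newgg60n;
run;

proc summary data=both nway;
  class date;
  var shoc;
  output out=shoc_total(drop=_type_ _freq_) sum=shoc;
run;
```

CLASS is used here rather than BY so the input does not have to be sorted first. NWAY keeps only the per-date rows and drops the overall total.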
09-12-2011 08:38 PM
Yeah sorry, I didn't know how to upload data files in a reply, so I thought I should re-post with the data files uploaded for better reference, as requested by a helper in the previous thread.
I was hoping for specific codes as the data files are now available. Thx
09-14-2011 01:11 AM
1) proc reg data=regresstest(firstobs=8 obs=20);
model RE_RF = RM_RF;
run;
2) Use first.var and last.var.
Use if x=. then x=0; or x=coalesce(x,0); to replace a missing value of x with zero.
3) First subset the dataset, then merge:
merge a b;
The code above is not tested; it is just a reference.
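To make points 2) and 3) a bit more concrete, here is an untested sketch. It assumes both datasets have variables named date and shoc. It also assumes a dataset called all_dates with one row per calendar or trading date, sorted by date. The name all_dates and the output dataset and variable names are made up for illustration.

```sas
/* Sketch only: date, all_dates and the output names are assumptions. */

/* 2) Collapse to one row per date using first./last. processing. */
proc sort data=newgg5po out=sorted;
  by date;
run;

data one_per_date(keep=date shoc_sum);
  set sorted;
  by date;
  if first.date then shoc_sum = 0;  /* reset at the start of each date   */
  shoc_sum + shoc;                  /* sum statement: value is retained  */
  if last.date then output;         /* write one row per date            */
run;

/* Dates with no trades: merge with the full date list and set
   the missing totals to zero. */
data filled;
  merge all_dates one_per_date;
  by date;
  shoc_sum = coalesce(shoc_sum, 0);
run;

/* 3) Row 1 of newgg5po with row 20 of newgg60n, row 2 with row 21, ...
   A MERGE without a BY statement pairs observations by position. */
data combined;
  merge newgg5po(keep=shoc rename=(shoc=shoc_5po))
        newgg60n(firstobs=20 keep=shoc rename=(shoc=shoc_60n));
  total = sum(shoc_5po, shoc_60n);  /* sum() ignores missing arguments */
run;
```

The RENAME= options matter in the last step: without them both datasets would contribute a variable called shoc, and the value from the second dataset would overwrite the first.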