Hello everyone. My name is Nancy, and I am a computer and data scientist here at SAS. I did my university thesis on solar energy forecasting and am looking forward to assisting on this project. Here are some findings from that research, and I am happy to send along the paper. I have also provided some additional references that you might find helpful.
- One important and surprising predictor is air quality. I found up to a 10% reduction in energy output at certain times of the year due to poor air quality. Our area here on the East Coast of the United States is bathed in particles from coal-fired power plants. In summer months the atmosphere mixes more readily, so the impact of these particles is not as pronounced. During winter months, however, the atmosphere is more stratified, and the particles do not mix or dissipate as readily, particularly in early winter. During this time, the solar irradiance reaching the panels is reduced by these particles. Note that this is very much a locality-based predictor. Air quality data can generally be collected from government sources; in our country, daily numbers for most localities are freely available from the US Environmental Protection Agency.
- You have already mentioned other useful predictors above, including irradiance, date/time, temperature, solar azimuth, and wind speed. Wind was particularly interesting as a predictor; in our area, days of high wind produced less output, due to the turbulence and cooling effect associated with wind flowing over the panels. These numbers are available from government weather data sites. Note that we found it important to use meteorological data collected as close as possible to the installation site where the panels are located.
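To make the predictor list above a bit more concrete, here is a minimal sketch of how daily air quality values could be joined onto hourly weather readings to build one feature table. It uses pandas; all column names and numbers are made up for illustration and are not from my thesis data, so treat it as a starting point only.

```python
import pandas as pd

# Hypothetical hourly observations for one site (values are made up).
hourly = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2019-01-05 10:00", "2019-01-05 11:00",
        "2019-01-06 10:00", "2019-01-06 11:00",
    ]),
    "irradiance_w_m2": [410.0, 520.0, 380.0, 495.0],
    "temperature_c": [4.0, 6.5, 2.0, 3.5],
    "wind_speed_m_s": [2.1, 3.4, 7.8, 8.2],
})

# Hypothetical daily air quality index values for the same locality.
daily_aqi = pd.DataFrame({
    "date": pd.to_datetime(["2019-01-05", "2019-01-06"]),
    "aqi": [42, 88],
})

# Derive a calendar-date join key and simple date/time features.
hourly["date"] = hourly["timestamp"].dt.normalize()
hourly["hour"] = hourly["timestamp"].dt.hour
hourly["day_of_year"] = hourly["timestamp"].dt.dayofyear

# Left join: every hourly row keeps its readings and gains that day's AQI.
features = hourly.merge(daily_aqi, on="date", how="left")
print(features)
```

Because the air quality numbers are daily while the weather readings are hourly, each day's value is simply repeated across that day's rows. Whether that is fine-grained enough will depend on your site.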
Another finding from our research concerns the modeling techniques that produce better accuracy with this type of data. We found that Random Forest is a good method for this type of forecasting because it supports highly non-linear data that exhibits multicollinearity. Deep learning methods that incorporate time series, such as recurrent neural networks (RNNs), are also quite effective as predictive models, as is gradient boosting (a tree ensemble method rather than deep learning), at the expense of more complexity and greater processing time. Since solar installations produce an abundance of data, the problem lends itself very well to deep learning techniques.
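As a rough illustration of the Random Forest approach, here is a minimal sketch using scikit-learn's RandomForestRegressor. The data is synthetic, and the formula relating the predictors to output is invented purely so the example runs; it is not the model or the data from my thesis.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 2000

# Synthetic predictors (ranges are assumptions for illustration only).
irradiance = rng.uniform(0, 1000, n)   # W/m^2
temperature = rng.uniform(-5, 35, n)   # deg C
wind_speed = rng.uniform(0, 12, n)     # m/s
aqi = rng.uniform(10, 150, n)          # air quality index

# Invented non-linear relationship plus noise; NOT a physical model.
output_kw = (
    0.005 * irradiance
    * (1 - 0.001 * aqi)
    * (1 - 0.004 * np.maximum(temperature - 25, 0))
    - 0.02 * wind_speed
    + rng.normal(0, 0.1, n)
)
output_kw = np.clip(output_kw, 0, None)

X = np.column_stack([irradiance, temperature, wind_speed, aqi])
feature_names = ["irradiance", "temperature", "wind_speed", "aqi"]

# Hold out the last 20% of rows, as one would with time-ordered data.
split = int(0.8 * n)
X_train, X_test = X[:split], X[split:]
y_train, y_test = output_kw[:split], output_kw[split:]

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

mae = mean_absolute_error(y_test, model.predict(X_test))
# Naive baseline: always predict the mean of the training output.
baseline_mae = mean_absolute_error(
    y_test, np.full_like(y_test, y_train.mean())
)

print(f"Random Forest MAE: {mae:.3f} kW (baseline {baseline_mae:.3f} kW)")
for name, importance in zip(feature_names, model.feature_importances_):
    print(f"{name}: {importance:.3f}")
```

The feature importances printed at the end are a quick way to check whether a predictor such as air quality is contributing anything at your site. With real data you would also want to validate over several seasons, since the air quality effect I described is seasonal.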
Here are a few references you might find helpful:
A review of deep learning for renewable energy forecasting. Retrieved from ResearchGate: https://www.researchgate.net/profile/Zhenxing-Lei/publication/336184094_A_review_of_deep_learning_for_renewable_energy_forecasting/links/60014485299bf14088975ffd/A-review-of-deep-learning-for-renewable-energy-forecasting.pdf
Abdel-Nasser, M., & Mahmoud, K. (2017). Accurate photovoltaic power forecasting models using deep LSTM-RNN. Neural Computing and Applications, 1-21.
Chandler, D. (2018, August 18). Air pollution can put a dent in solar power. Retrieved from MIT news: http://news.mit.edu/2018/air-pollution-can-put-dent-solar-power-0829
Hickey, H. (2018, July 24). Why winter air in the eastern U.S. is still so dirty. Retrieved from Futurity.org: https://www.futurity.org/winter-air-pollution-emissions-1819872/
NCSU. (2019, May 8). Weather and Climate Database. Retrieved from North Carolina Climate Office: https://climate.ncsu.edu/cronos/?station=REED
NOAA. (2019, May 10). Solar Geometry Calculator. Retrieved from NOAA.gov: https://www.esrl.noaa.gov/gmd/grad/antuv/SolarCalc.jsp
US Environmental Protection Agency. (2019). Air Quality Daily Values Report. Retrieved from US EPA: https://www.epa.gov/outdoor-air-quality-data/air-quality-index-daily-values-report
Zulkifli, H. (2019, Mar 12). Multivariate Time Series Forecasting Using Random Forest. Retrieved from Towards Data Science: https://towardsdatascience.com/multivariate-time-series-forecasting-using-random-forest-2372f3ecbad1