Clean electricity systems based on renewable generation, such as wind power, hydropower, and solar power, are growing rapidly around the world. According to a report by the International Energy Agency (IEA) [1], cumulative solar photovoltaic (PV) capacity alone reached almost 400 gigawatts (GW) and generated over 460 terawatt-hours (TWh) in 2017, which represents around 2% of global power output. By 2023, the world is projected to have 1 trillion watts (1 terawatt) of installed solar PV capacity. Solar power penetration is also growing in the major renewable countries: electricity generated from PV systems accounted for 5.9% of national electricity production in Australia, 6.5% in Japan, and 7.1% in Italy in 2018 [2], and 8.2% in Germany in 2019 [3]. Despite this rapid increase, there is still substantial room for growth, considering that these countries are advanced economies with high energy consumption needs.
Relying on renewable generation such as solar-generated electricity to promote sustainability while meeting future energy needs is not always straightforward. One concern is the intermittency of solar power. Intermittency has two components: “variability”, the fluctuation in generation caused by changes in solar radiation as cloud conditions shift during the day, and “uncertainty”, the fact that generation cannot be known with perfect accuracy at multiple timescales, from seconds to minutes to hours. Intermittency seriously affects the system operator's ability to balance supply and demand in the electricity grid. It is therefore of special interest to investigate ways to predict the electricity generation of solar PV systems more accurately. This work aims to develop more sophisticated prediction methods and to demonstrate the feasibility of using machine learning techniques to solve energy-related problems.
References:
[1] https://www.iea.org/topics/renewables/solar/
[2] https://ec.europa.eu/jrc/sites/jrcsh/files/kjna29938enn_1.pdf
Team Name | R.Energetics (means data analytics in renewable energy) |
Track | Energy |
Use Case | Prediction of solar irradiance for solar PV system |
Team Lead | Yong Wee Foo @YWF |
Member | Ese Omats @EseOmats1425 |
Dataset | 2013 to 2016, 2018 to 2020 |
Predictor variables | Ambient Temperature, Air quality data |
Response variable | Solar irradiance |
Modeling Technique | Artificial Neural Networks, Forest, Gradient Boosting, Ensemble |
Short Video | |
Final video for submission | |
Fresh video |
Nicely said
Nancy Rausch would be fantastic to work with this team given her work looking at the impact of air pollution on solar farm output.
Yes, Nancy is welcome to join if she is interested in collaborating. 🙂
Hi, my name is Ese Omatsone and I'd like to join your team to understand clean energy generation and distribution. I live in Calgary, AB, Canada and my background is in petroleum engineering, but I have been working for the past couple of years as an Analyst and learning a lot about renewable energy. I have a very sharp and analytical mind and I'm a very good user of JMP software by SAS for statistical analysis. I'm also very good at creating visualizations that help tell a story or illustrate a path of action.
Hi Ese, you are most welcome to join! I am from Singapore and I am an academic staff member at the School of Engineering in Nanyang Polytechnic. I train students and adult learners in Cloud and Big Data. I use tools like R, Matlab, and Azure ML Studio for modelling and SPSS for statistical analysis. I am new to SAS and hope to learn how to use the tool in this Hackathon.
Thanks, Yong! I'm reading up on all the references that you have provided so as to become more familiar with the problem of uncertainty and variability in solar power, as you described. What are the other sets of data/information that are pertinent to this problem - that we will use in the modelling exercise?
Hi Ese, may I have your email? I can send you the dataset. I shall include more info in our use case described above. Basically, the dataset consists of the following predictor variables:
- Ambient Temperature
- Relative Humidity
- Rain Gauge measurement
- Wind Speed
- Wind Direction
- Atmospheric Pressure
- PV panel surface Temperature
The response variable is Solar irradiance.
I have 4 years of data from 2013 to 2016, but I think we can sample a subset to work on the modeling.
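As a rough sketch of how records like these might be split into predictors and response and then subsampled, something along the following lines could work. The column names, the `split_xy` and `sample_subset` helpers, and the placeholder rows are all hypothetical; the real dataset's headers and values will differ.

```python
import random

# Hypothetical column names; the real dataset's headers may differ.
PREDICTORS = [
    "ambient_temp", "relative_humidity", "rain_gauge", "wind_speed",
    "wind_direction", "atm_pressure", "panel_temp",
]
RESPONSE = "solar_irradiance"


def split_xy(record):
    """Separate one record (a dict) into predictor values and the response."""
    x = [record[name] for name in PREDICTORS]
    y = record[RESPONSE]
    return x, y


def sample_subset(records, fraction, seed=0):
    """Draw a reproducible random subset, returned in original (time) order."""
    rng = random.Random(seed)
    k = max(1, round(len(records) * fraction))
    chosen = sorted(rng.sample(range(len(records)), k))
    return [records[i] for i in chosen]


# Placeholder rows with made-up values, only to show the shapes involved.
records = [
    {**{name: float(i) for name in PREDICTORS}, RESPONSE: 100.0 + i}
    for i in range(20)
]
subset = sample_subset(records, fraction=0.25)
x0, y0 = split_xy(subset[0])
print(len(subset), len(x0))
```

Fixing the seed keeps the subset reproducible between runs, and sorting the chosen indices preserves the time order of the readings, which matters if lagged values are used later.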
I have sent you an email, Yong, as well as my mobile phone number. I also responded to the query from David Pope.
Thank you.
Hi, I worked on a similar project studying this problem and think it’s great you all are looking into this as well, given the intermittency of solar. I would just like to offer some advice, as I noticed “cloud coverage” is not one of the variables included in the model. I found from work I’ve done on this problem that cloud coverage is very strongly correlated with solar irradiance. I know it may not be easy given you have data from 2013-2016; however, if there is any way you can get even an estimate of cloud coverage included in your data, by looking through the historical data on wunderground or other weather websites, I would highly suggest doing so. As I said, I know this may be quite difficult, and may be impossible if the data is simply not out there for the locations in your dataset, but I nonetheless wanted to put the suggestion out there because I do think building better PV forecasting models is really important for adoption. I’m sure you will have a good solution regardless; I just want to offer some advice from past experience that I hope will be of help to the team. Best of luck!
Yes, several AI modeling techniques have included cloud coverage as one of the input predictors for PV forecasting. These techniques have produced quite good results. I shall see if I can obtain the data from the relevant weather/meteorological station. Thanks for the suggestion!
Hello Everyone. My name is Nancy, and I am a computer and data scientist here at SAS. I did my University thesis on solar energy forecasting and am looking forward to assisting on this project. Here are some findings from that research that you might find helpful, and I am happy to send along the paper. I have also provided some additional references that you might find helpful.
- One important surprise predictor is air quality. I found up to a 10% reduction in energy output at certain times of the year based on poor air quality. Our area here on the East Coast of the United States is bathed in particles from coal-fired power plants. In summer months the atmosphere mixes more readily, so the impact of these particles is not as pronounced. However, during winter months, the atmosphere is more stratified, and the particles do not mix or dissipate as readily, particularly in early winter. During this time, solar irradiation is decreased due to these particles. Finally, this is very much a locality-based predictor. Information related to air quality can generally be collected from government sources; in our country, daily numbers for most localities are freely available from the US Environmental Protection Agency.
- You have already mentioned other useful predictors above, including irradiance, date/time, temperature, solar azimuth, and wind speed. Wind was particularly interesting as a predictor; in our area, days of high wind produced less output, due to the turbulence and cooling effect associated with wind flowing over the panels. These numbers are available from government weather data sites. Note that we found it important to use meteorological data collected as close as possible to the installation site where the panels are located.
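For the solar geometry side of these predictors, a feature such as the solar elevation angle can be approximated directly from the date and time. The sketch below uses Cooper's equation for the declination and the standard elevation formula; it is a simplified approximation, not the method behind the NOAA calculator referenced below, and the function names are my own. It assumes the time is already expressed as local solar time.

```python
import math


def solar_declination_deg(day_of_year):
    """Approximate solar declination (degrees) using Cooper's equation."""
    return 23.45 * math.sin(math.radians(360.0 * (284 + day_of_year) / 365.0))


def solar_elevation_deg(latitude_deg, day_of_year, solar_hour):
    """Approximate solar elevation angle (degrees above the horizon).

    solar_hour is local solar time in hours (12.0 = solar noon). Converting
    clock time to solar time (longitude, equation of time) is not handled.
    """
    lat = math.radians(latitude_deg)
    decl = math.radians(solar_declination_deg(day_of_year))
    hour_angle = math.radians(15.0 * (solar_hour - 12.0))
    sin_elev = (math.sin(lat) * math.sin(decl)
                + math.cos(lat) * math.cos(decl) * math.cos(hour_angle))
    # Clamp to guard against tiny floating-point overshoot before asin.
    return math.degrees(math.asin(max(-1.0, min(1.0, sin_elev))))


# Example: elevation through one day at a hypothetical site at 35 degrees north.
for hour in (6.0, 9.0, 12.0, 15.0, 18.0):
    print(hour, round(solar_elevation_deg(35.0, 172, hour), 1))
```

Negative values mean the sun is below the horizon, so such rows can be filtered out or flagged before modeling.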
Another finding from our research in this area concerns the modeling techniques that can produce better accuracy with this type of data. We found that Random Forest is a good method for this type of forecasting because it supports highly non-linear data that exhibits multicollinearity. Deep learning methods that incorporate time series, such as RNNs, and boosting methods such as Gradient Boosting are also quite effective as predictive models, at the expense of more complexity and greater processing time. Since solar energy output produces an abundance of data, it lends itself very well to deep learning techniques.
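One common way to let a tabular model such as Random Forest or Gradient Boosting use time-series structure is to reframe the series as a supervised problem with lagged values as inputs. The sketch below illustrates that general framing only; it is not necessarily what our research used, and the `make_lagged_rows` helper and the readings are made up for illustration.

```python
def make_lagged_rows(series, n_lags):
    """Turn a 1-D series into (lagged inputs, next value) training pairs.

    The pair for index t uses series[t - n_lags:t] as inputs and series[t]
    as the target, so no future values leak into the inputs.
    """
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    rows = []
    for t in range(n_lags, len(series)):
        rows.append((list(series[t - n_lags:t]), series[t]))
    return rows


# Made-up irradiance-like readings, for illustration only.
readings = [0.0, 120.0, 340.0, 560.0, 610.0, 480.0]
pairs = make_lagged_rows(readings, n_lags=3)
for inputs, target in pairs:
    print(inputs, "->", target)
```

The same windowing can be applied to each weather variable, with the lagged columns placed side by side to form a multivariate input row.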
Here are a few references you might find helpful:
Abdel-Nasser, M., & Mahmoud, K. (2017). Accurate photovoltaic power forecasting models using deep LSTM-RNN. Neural Computing and Applications, 1-21.
Chandler, D. (2018, August 18). Air pollution can put a dent in solar power. Retrieved from MIT News: http://news.mit.edu/2018/air-pollution-can-put-dent-solar-power-0829
Hickey, H. (2018, July 24). Why winter air in the Eastern U.S. is still so dirty. Retrieved from Futurity.org: https://www.futurity.org/winter-air-pollution-emissions-1819872/
NCSU. (2019, May 8). Weather and Climate Database. Retrieved from North Carolina Climate Office: https://climate.ncsu.edu/cronos/?station=REED
NOAA. (2019, May 10). Solar Geometry Calculator. Retrieved from NOAA.gov: https://www.esrl.noaa.gov/gmd/grad/antuv/SolarCalc.jsp
US Environmental Protection Agency. (2019). Air Quality Daily Values Report. Retrieved from US EPA: https://www.epa.gov/outdoor-air-quality-data/air-quality-index-daily-values-report
Zulkifli, H. (2019, Mar 12). Multivariate Time Series Forecasting Using Random Forest. Retrieved from Towards Data Science: https://towardsdatascience.com/multivariate-time-series-forecasting-using-random-forest-2372f3ecbad1
Thanks @nar_sas for your support and the references. I'm glad you are available to help mentor team R.Energetics @YWF @EseOmats1425
@nar_sas Wow, thanks for the helpful advice and tips! I shall spend some time looking through the references.
Looks great. Nice music.
Looks so good! Nice job team!
Great work!