Published online by Cambridge University Press: 24 July 2018
We developed a flare prediction model based on supervised machine learning of solar observation data from 2010-2015. We used vector magnetograms, lower chromospheric brightening, and soft X-ray data taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite. We detected active regions and extracted 60 solar features, such as magnetic neutral lines, current helicity, chromospheric brightening, and flare history. We fully shuffled the database and randomly divided it into two parts for training and testing. To predict the maximum size of flares occurring in the following 24 hours, we used three machine-learning algorithms independently: the support vector machine, k-nearest neighbors (kNN), and extremely randomized trees. We achieved a true skill statistic (TSS) of greater than 0.9 for kNN. Furthermore, we compared the prediction results in a more operational setting by shuffling and dividing the database in units of one week. We found that the prediction score depends on the way the database is prepared.
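The two ways of preparing the database, and the skill score used to compare them, can be illustrated with a minimal Python sketch. It is not the authors' code. The TSS formula is the standard definition, TP/(TP+FN) - FP/(FP+TN). Everything else is an assumption made for illustration: the binary flare/no-flare labels, the function names, the use of ISO calendar weeks as the "week unit", and the synthetic samples. The point of the week-wise split is presumably that samples taken a few hours apart are highly similar, so a full shuffle can place near-duplicates on both sides of the train/test boundary.

```python
import random
from collections import defaultdict
from datetime import datetime, timedelta


def true_skill_statistic(y_true, y_pred):
    """TSS = TP/(TP+FN) - FP/(FP+TN) for binary labels (1 = flare, 0 = no flare)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp / (tp + fn) - fp / (fp + tn)


def week_key(timestamp):
    """Assumed week unit: (ISO year, ISO week number)."""
    iso = timestamp.isocalendar()
    return (iso[0], iso[1])


def split_fully_shuffled(samples, test_fraction=0.5, seed=0):
    """Shuffle individual samples, then cut the list into train and test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def split_by_week(samples, test_fraction=0.5, seed=0):
    """Shuffle whole weeks, so no week contributes to both train and test."""
    by_week = defaultdict(list)
    for sample in samples:
        by_week[week_key(sample[0])].append(sample)
    weeks = sorted(by_week)
    random.Random(seed).shuffle(weeks)
    n_test = int(len(weeks) * test_fraction)
    train = [s for w in weeks[n_test:] for s in by_week[w]]
    test = [s for w in weeks[:n_test] for s in by_week[w]]
    return train, test


# Hypothetical samples: (timestamp, label) every 6 hours for 28 days.
start = datetime(2010, 5, 1)
samples = [(start + timedelta(hours=6 * i), i % 2) for i in range(112)]

train, test = split_by_week(samples)
shared = {week_key(s[0]) for s in train} & {week_key(s[0]) for s in test}
print(len(shared))  # 0: train and test never share a week

print(true_skill_statistic([1, 1, 0, 0], [1, 0, 0, 0]))  # 0.5
```

With `split_fully_shuffled`, samples from the same week normally land in both sets, which is the setting in which the high kNN score was reported; `split_by_week` corresponds to the more operational comparison.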