Volume 8, Issue 11, November – 2023 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165

K-Means Clustering and Artificial Neural Network in Weather Forecasting
Shubham Choramle; Jaydeep Lokhande; Vidwesh Choudhary; Pruthviraj Mohalkar; Dr. Vitthal Gutte
Department of Computer Engineering and Technology, Dr. Vishwanath Karad MIT WPU University
Address: MIT World Peace University, Sr No 124, Ex Serviceman Colony, Paud Road, Kothrud, Pune-411038
1032210941@mitwpu.edu.in TO jaydeeps.lokhande@gmail.com
1032210967@mitwpu.edu.in TO vidweshchoudhary@gmail.com
1032210657@mitwpu.edu.in TO shubham.choramle1330@gmail.com
1032210772@mitwpu.edu.in TO pruthvirajmoholkar3@gmail.com
vitthalgutte2014@mitwpu.edu.in TO vitthal.gutte@mitwpu.edu.in

Abstract- This paper gives a brief introduction to weather forecasting, discusses the applications and limitations of K-Means clustering and artificial neural networks (ANNs) in weather forecasting, and presents a comparison of these two approaches as the primary goal of the study. It finishes by highlighting the significance of integrating different methodologies for precise weather forecasting and by briefly reviewing related work.

Keywords:- Weather Forecasting, K-Means Clustering, Artificial Neural Networks, ML, Comparative Analysis, Meteorological Data.

I. INTRODUCTION

Classical weather forecasting relies on a mix of meteorological measurements, historical weather information, and meteorologists' expertise. The process includes gathering and evaluating data from several meteorological sources, running computerized weather models, and applying meteorological expertise to assess the models' outputs.

Weather forecasting powered by machine learning, on the other hand, makes use of advanced statistical and computational techniques. It uses machine learning algorithms that have been trained on historical weather data. Like traditional forecasting methods, machine learning-based systems incorporate meteorological data from many sources such as weather stations, satellites, and radar. They can, however, also make use of unconventional data sources, such as social networking sites or Internet of Things (IoT) devices.

[1] This paper covers the use of predictive machine learning techniques for weather forecasting. In particular, it investigates how linear and functional regression can be used to forecast the maximum and minimum temperature for seven consecutive days, given meteorological information for the previous two days. The study's dataset covers Stanford, California, for the years 2011 through 2015. The article concludes that although professional weather forecasting services outperform the models, the models may do better over longer time frames. A discussion of the study's shortcomings and recommendations for further investigation follow.

Two algorithms are studied in our paper for weather prediction: 1. K-Means clustering and 2. Artificial Neural Network (ANN).

• K-means clustering is a popular unsupervised machine learning approach. It groups data points that are alike without using predetermined labels. In summary, K-means operates as follows:
1. Initialize the centroids of the K clusters, either randomly or by a predetermined method.
2. Assign each data point to the nearest centroid based on distance, most often the Euclidean distance.
3. Recalculate each centroid by averaging all the data points assigned to its cluster.
4. Repeat steps 2 and 3 until the chosen number of iterations is reached or the centroids hardly move at all.
The basic objective of this technique is to produce K distinct clusters of data points by minimising the variance within clusters and maximising the variance between clusters. K-means is used in image segmentation, data analysis, and other settings where grouping related data is important.

• An Artificial Neural Network (ANN) is a computational model inspired by how the human brain works. In short, an ANN is built from layers of interconnected nodes, or "neurons". These layers commonly include an input layer, one or more hidden layers, and an output layer.

An ANN's core components are as follows:
• Neurons: These act as computing units that take inputs, compute a weighted sum, and pass the result through an activation function.
• Weights: A weight is assigned to each connection between neurons, indicating the strength of the relationship. These weights are adjusted during training.

IJISRT23NOV669 www.ijisrt.com 459


• Activation function: A network can model complex relationships within data thanks to the non-linearity introduced by an activation function.

An ANN's main goal is to learn from data: by adjusting the weights during training, it minimizes prediction errors in classification or regression tasks. ANNs are used in a variety of artificial intelligence and machine learning fields, such as image recognition, natural language processing, autonomous driving, and many more.

II. OBJECTIVE

To carry out a comparative analysis of the K-Means algorithm and ANN in weather forecasting.

III. RELATED WORK

Other machine learning algorithms are also used for weather prediction:
• Regression Models: Linear and polynomial regression are effective methods for identifying relationships between meteorological variables. By examining historical data together with variables like humidity, air pressure, and time of day, they help predict temperature, for example.
• Time Series Analysis: Time series analysis methods, such as ARIMA (AutoRegressive Integrated Moving Average), are used in weather forecasting to model and predict trends, seasonality, and irregular variations in weather-related data.
• Random Forest: Random Forest is an ensemble learning approach that handles both regression and classification problems well when used for weather forecasting. It may be used to forecast various meteorological conditions, such as rainfall or the possibility of storms.
• Support Vector Machines (SVM): Based on historical data and input features, SVMs are used for classification tasks such as predicting the likelihood of various weather events (like rain or snow).
• Decision Trees: Decision trees can be used to categorize weather events and make decisions related to them, such as determining whether inclement weather at an airport would cause aircraft delays.
• Ensemble Methods: To increase forecasting accuracy, multiple machine learning models are combined using techniques like AdaBoost and Gradient Boosting.
• Numerical Weather Prediction (NWP) Models: NWP models are not conventional machine learning techniques; they use complex mathematical and physical models to simulate atmospheric phenomena. They play a significant role in weather forecasting.
• Deep Learning for Image Analysis: Convolutional Neural Networks are used for image analysis in weather forecasting, particularly for recognizing cloud cover in satellite photos.

IV. METHODS

It is uncommon to use K-Means clustering directly to improve weather forecasting accuracy. Instead, it is frequently used as part of a larger data analysis or preprocessing pipeline, which can indirectly enhance weather forecasting models. K-Means clustering is applied to weather forecasting in the following ways:
• Data Preprocessing: K-Means clustering is used in data preprocessing to find patterns or clusters in historical meteorological data. By grouping together comparable weather patterns, the data can be reduced in dimension or translated into categorical variables that indicate different weather regimes or conditions. These clusters can then act as features for more traditional weather forecasting models.
• Pattern Identification: K-Means clustering is a useful approach for locating recurring weather patterns or anomalies. It can be used, for instance, to find trends in historical data on temperature, humidity, or air pressure. Recognizing these patterns improves meteorologists' decision-making and deepens their understanding of current meteorological conditions.
• Data Compression: K-Means clustering can sometimes be used to compress large amounts of weather data into a more manageable size. This is very useful for efficient data transmission and storage, especially in remote areas or with limited resources.
• Feature Engineering: Clustering outcomes can be used as features in more intricate machine learning models, such as random forests, decision trees, or neural networks. Hidden links or trends can be hard to spot in the raw data, and cluster-based features may help expose them.
• Data Quality Control: K-Means clustering can be used to identify irregularities or anomalies in meteorological data, assisting in the detection of measurement errors or inconsistencies in the dataset. This procedure is important for improving the precision of the data used as input for prediction models.

While numerical weather prediction (NWP) models, data assimilation techniques, and the incorporation of additional data sources such as satellites, weather observatories, and radar are the main ways to improve forecast accuracy, K-Means clustering can nonetheless improve weather forecasting indirectly.

[2] The simulations presented in this research are built using data from 2009 and 2010. The simulation is carried out to determine the accuracy of the approach proposed in the study. The experiment is carried out using Java and the Weka software on a PC with a 2.26 GHz Core i3 processor and 4 GB of memory running Windows 7 Home Basic. Any method's accuracy can be evaluated by contrasting its predicted value with the actual value. The simulation outcomes demonstrate a high accuracy rate for the proposed incremental K-means clustering method for weather forecasting. By contrasting the actual weather with the forecasted weather, the accuracy of the
strategy is evaluated. From the findings, the authors conclude that the methodology for forecasting weather is highly accurate. According to the paper, the approach can be strengthened further by using more sophisticated methodologies and algorithms; developing such algorithms and methods to increase the methodology's accuracy is part of the planned future work.

The accuracy of the proposed approach for forecasting the weather using incremental K-means clustering was calculated to be around 83.3%. The accuracy was obtained by comparing the actual values with the values predicted by the new approach. The simulation results demonstrate the high accuracy rate of the proposed methodology, which makes it a trustworthy tool for weather forecasting.

Fig 1 K-Means Flow Chart

Artificial Neural Networks (ANNs) have been applied to weather prediction tasks, although they are only one component of complex weather forecasting systems. The following is an outline of how ANNs are used in weather prediction:
• Data Input: The historical weather data used to train ANNs includes elements like temperature, humidity, wind speed, atmospheric pressure, and more. These elements act as the neural network's input features.
• Output: Depending on the task at hand, the ANN's output in weather prediction may vary. It could entail predicting future values for variables like temperature, precipitation, or wind patterns, or categorizing weather as "rainy" or "sunny."
• Training: The ANN is trained using historical data as well as observed weather conditions. The main goal is to optimize the weights and biases of the network so that it can produce accurate predictions from the input data.
• Model Complexity: Weather forecasting is fundamentally complex since it takes into account a wide range of variables, including atmospheric dynamics, geographic characteristics, and other elements. To handle this complexity, ANNs can be incorporated into an ensemble of models or act as a component inside a broader weather forecasting system.
• Integration: By combining several models and data inputs, a comprehensive weather forecasting system can be created using the forecasts generated by an ANN. Using ANNs as one tool among many in a varied toolkit is a common ensemble method in modern weather forecasting.
• Real-time Data: Current information from weather stations, satellites, and other sources is essential for accurate weather forecasting. ANNs are able to continuously process new data and update their predictions in light of the most recent information.

Fig 2 ANN Flow Chart

[4] This paper discusses the use of a back propagation neural network to forecast rainfall in India. According to the authors, 52% of all jobs in India are in the agricultural sector, making it the country's most common occupation. However, irrigation infrastructure is insufficient: only 52.6% of the area was irrigated in 2009–2010. As a result, farmers continue to rely largely on rainfall, especially during the monsoon season. Estimating rainfall accurately is therefore essential for agricultural planning and management.

For data mining, a supervised machine learning method is used: the back propagation neural network model. In this instance, humidity, dew point, and atmospheric pressure data are used to train the model to forecast rainfall. The authors used 250 patterns for training and 120 patterns for testing, that is, roughly two-thirds of the data for training and one-third for testing. They report a testing accuracy of 94.28% and a training accuracy of 99.79%.

[3] This paper discusses the use of artificial neural networks (ANNs) for weather forecasting. It emphasizes that, owing to the irregular nature of weather, conventional forecasting techniques have limits, and that ANNs can offer more precise forecasts by learning from prior data and adjusting their parameters. The paper also reviews examples of ANN-based research on predicting temperature, thunderstorms, rain, and weather in particular places. It concludes that ANN is an effective tool for weather prediction and may be used in place of conventional meteorological methods.
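As a rough illustration of the back propagation training described in [4], the following minimal Python sketch trains a small feed-forward network on synthetic data. Only the 250/120 split between training and testing patterns mirrors the counts reported above. Everything else is an assumption made for illustration and is not taken from the cited work: the synthetic inputs, the labelling rule, the hidden-layer size, the learning rate, and the names `make_patterns` and `TinyBackpropNet`.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow for extreme values.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def make_patterns(n, rng):
    """Synthetic stand-in data: (humidity, dew point, pressure) scaled to [0, 1]."""
    patterns = []
    for _ in range(n):
        humidity, dew_point, pressure = rng.random(), rng.random(), rng.random()
        # Hypothetical labelling rule, chosen only so the toy problem is learnable.
        rain = 1.0 if humidity + dew_point - pressure > 0.5 else 0.0
        patterns.append(([humidity, dew_point, pressure], rain))
    return patterns


class TinyBackpropNet:
    """One hidden layer of sigmoid units and a single sigmoid output."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        # Each neuron computes a weighted sum and applies the activation.
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def train_step(self, x, target, lr):
        hidden, out = self.forward(x)
        # Gradient of the cross-entropy loss at the output pre-activation.
        delta_out = out - target
        # Propagate the error back through the hidden layer (sigmoid' = h * (1 - h)).
        delta_hidden = [delta_out * w * h * (1.0 - h)
                        for w, h in zip(self.w2, hidden)]
        for j, h in enumerate(hidden):
            self.w2[j] -= lr * delta_out * h
        self.b2 -= lr * delta_out
        for j, row in enumerate(self.w1):
            for i, xi in enumerate(x):
                row[i] -= lr * delta_hidden[j] * xi
            self.b1[j] -= lr * delta_hidden[j]

    def predict(self, x):
        return 1.0 if self.forward(x)[1] >= 0.5 else 0.0


def accuracy(net, patterns):
    hits = sum(1 for x, y in patterns if net.predict(x) == y)
    return hits / len(patterns)


rng = random.Random(0)
train_set = make_patterns(250, rng)  # training patterns
test_set = make_patterns(120, rng)   # held-out testing patterns

net = TinyBackpropNet(n_in=3, n_hidden=4, rng=rng)
for epoch in range(200):
    rng.shuffle(train_set)
    for x, y in train_set:
        net.train_step(x, y, lr=0.5)

train_acc = accuracy(net, train_set)
test_acc = accuracy(net, test_set)
print(f"train accuracy: {train_acc:.2%}, test accuracy: {test_acc:.2%}")
```

Reporting accuracy separately on the training and testing patterns, as [4] does, shows whether the network generalizes or merely memorizes its training data. The figures printed here come from a toy problem and are not comparable to the accuracies reported in the paper.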

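The four K-means steps outlined in the introduction (initialize, assign, recompute, repeat) can be sketched in a few lines of Python. This is a minimal sketch of plain K-means, not the incremental variant studied in [2]. The function name `kmeans`, the stopping tolerance, and the sample temperature and humidity readings are hypothetical.

```python
import math
import random


def kmeans(points, k, max_iters=100, tol=1e-6, seed=0):
    rng = random.Random(seed)
    # Step 1: initialize the centroids by picking k distinct data points at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iters):
        # Step 2: assign each point to the nearest centroid (Euclidean distance).
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Step 3: recompute each centroid as the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append([sum(col) / len(members)
                                      for col in zip(*members)])
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Step 4: repeat until the centroids hardly move.
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, labels


# Hypothetical readings: (temperature in degrees Celsius, relative humidity in %).
hot_dry = [(34.0, 20.0), (35.5, 18.0), (33.0, 25.0), (36.0, 22.0)]
cool_humid = [(18.0, 85.0), (17.0, 90.0), (19.5, 80.0), (16.0, 88.0)]
points = hot_dry + cool_humid

centroids, labels = kmeans(points, k=2)
print("cluster labels:", labels)
print("centroids:", centroids)
```

The resulting cluster labels are the kind of categorical "weather regime" feature that the Methods section describes feeding into a downstream forecasting model. K-means itself produces no forecast.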
[5] The authors present a temperature forecast model that makes use of advanced wireless data gathering techniques. The model combines artificial neural networks with the strength of statistical software. The artificial neural network works together with wireless technologies and statistical software to evaluate data, derive insights from them, and produce accurate temperature projections for the future.

[6] This paper proposes a novel method for classifying and predicting weather using a Levenberg-Marquardt back propagation feed-forward neural network. The authors regard Levenberg-Marquardt back propagation as the most effective back propagation algorithm available. The weather classification and prediction system built around the back propagation neural network (BPNN) functions as a predictive toolbox designed to capture data such as humidity, temperature, pressure, and wind direction; these variables serve as the input neurons of the network. Both historical and present atmospheric data are gathered to train the neural network.

[7] The authors developed a neural network-based algorithm to predict temperature and used a neural network software package that provides a number of training or learning options. The back propagation neural network (BPN) technique was one of the methods used. Using real-time data, the researchers assessed this approach and compared the results with projections made by the meteorological department. The outcomes show that the model has potential for making accurate temperature predictions.

[8] It has been noted that major climate modes also affect Australian rainfall, yet there have been few attempts to thoroughly evaluate the combined impact of these indices on rainfall in order to improve understanding and forecasting ability. Given the complexity of rain as an atmospheric phenomenon, traditional linear approaches might not be able to adequately capture its features. Using Artificial Neural Networks (ANN), this study aims to identify a non-linear link between rainfall in Victoria and the lagged indices which affect the area. The results show that when using the lagged indices for springtime rainfall predictions, ANN modeling produces much better correlations than conventional linear techniques. Incorporating these indices into an ANN model results in a striking improvement in model correlation, with values for the three case study stations in Victoria, Australia (Horsham, Melbourne, and Orbost) reaching 99%, 98%, and 43%, respectively.

[9] This paper is a survey article on rainfall prediction using artificial neural networks. It discusses the challenges and techniques involved in predicting rainfall accurately. The article covers various neural network architectures such as MLP, BPN, RBFN, SOM, and SVM, and compares them with other forecasting techniques such as statistical and numerical methods. The survey concludes that the forecasting techniques that use MLP, BPN, RBFN, SOM, and SVM are more suitable for predicting rainfall than other forecasting techniques. The article also provides extensive references in support of the different developments of ANN research.

[10] This paper discusses the use of artificial neural networks (ANNs) to forecast rainfall in an urban catchment in western Sydney, Australia. Based on previously known spatial and temporal rainfall patterns, the research analyzes the effectiveness of three different types of ANN in forecasting rainfall at numerous sites simultaneously over the next 15 minutes. The study also covers the fundamentals of ANNs and their construction, along with the crucial questions of determining the complexity and lag order of the networks. The authors point out that while none of the networks accurately predicted the peak rainfall rate, all three types of network gave fair predictions of rainfall one time step in advance. According to the article, better predictions should be expected when more data are gathered for model training and more controlling factors are identified and added to the network inputs. The research concludes that the networks with smaller lags may have performed marginally better than those with higher lags because, given the data at hand, there is an optimum network complexity for the problem under consideration.

V. COMPARISON

Table no. 1: ANN vs K-Means
ARTIFICIAL NEURAL NETWORK | K-MEANS CLUSTERING
ANNs are like weather forecasters who learn from past weather data to predict future conditions. | K-means is like a tool that helps group similar weather data together.
They analyze various factors like temperature, humidity, and wind patterns to make predictions. | It doesn't predict the weather itself but can organize historical data into clusters based on similarities.
ANNs are good at capturing complex patterns in weather data, providing detailed forecasts. | K-means can be used to identify regions with similar weather patterns but doesn't make future weather predictions.

VI. LIMITATIONS

• A limitation of K-means clustering in weather forecasting is its restricted treatment of the temporal dimension. K-means groups similar weather situations using fixed parameters like temperature and humidity, but it does not account for how these conditions change dynamically over time. Since weather is constantly changing, K-means struggles to capture the time-dependent intricacies of weather patterns and is therefore less useful for forecasting dynamic weather conditions.

• A limitation of artificial neural networks (ANNs) in weather forecasting is that they may not handle the full complexity of weather systems. Weather is made up of many interconnected components, and ANNs might not be able to capture all of them. As a result, they might not provide completely accurate
forecasts, especially in cases where weather conditions are severe or changing quickly.

VII. CONCLUSION

Artificial neural networks (ANNs) have been shown to be effective tools in weather prediction because they can directly predict future weather conditions from existing data. K-means clustering, by contrast, typically analyzes and clusters the data within meteorological datasets rather than producing weather predictions. Weather forecasting systems typically use a number of strategies, including physical models, statistical methods, and machine learning algorithms like ANNs, in order to provide thorough and reliable forecasts.

REFERENCES

[1]. Mark Holmstrom, Dylan Liu, Christopher Vo, “Machine Learning Applied to Weather Forecasting”, Stanford University, December 15, 2016.
[2]. Sanjay Chakraborty, N. K. Nagwani, Lopamudra Dey, “Weather Forecasting using Incremental K-means Clustering”, June 18, 2014.
[3]. Abhishek Saxena, Neeta Verma, Dr K. C. Tripathi, “A Review Study of Weather Forecasting Using Artificial Neural Network Approach”, International Journal of Engineering Research & Technology (IJERT), ISSN: 2278-0181, Vol. 2, Issue 11, November 2013.
[4]. Enireddy Vamsidhar, K. V. S. R. P. Varma, P. Sankara Rao, Ravikanth Satapati, “Prediction of Rainfall Using Back propagation Neural Network Model”, International Journal on Computer Science and Engineering, Volume 02, No. 04, pp 1119-1121, 2010.
[5]. P. P. Kadu, Prof. K. P. Wagh, Dr. P. N. Chatur, “A Review on Efficient Temperature Prediction System Using Back Propagation Neural Network”, International Journal of Emerging Technology and Advanced Engineering (ISSN 2250-2459), Volume 2, Issue 1, pp 52-55, January 2012.
[6]. Arti R. Naik, Prof S. K. Pathan, “Weather Classification and Forecasting using Back Propagation Feed-Forward Neural Network”, International Journal of Scientific and Research Publications, Volume 2, Issue 12, December 2012.
[7]. Ch. Jyosthna Devi, B. Syam Prasad Reddy, K. Vagdhan Kumar, B. Musala Reddy, N. Raja Nayak, “ANN Approach for Weather Prediction Using Back Propagation”, International Journal of Engineering Trends and Technology, ISSN: 2231-5381, Volume 3, Issue 1, pp 19-23, 2012.
[8]. F. Mekanik and M. A. Imteaz, “A Multivariate Artificial Neural Network Approach for Rainfall Forecasting: Case Study of Victoria, Australia”, Proceedings of the World Congress on Engineering and Computer Science 2012, Vol I, WCECS 2012, October 24-26, 2012, San Francisco, USA.
[9]. Deepak Ranjan Nayak, Amitav Mahapatra, Pranati Mishra, “A Survey on Rainfall Prediction using Artificial Neural Network”, International Journal of Computer Applications (0975 – 8887), Volume 72, No. 16, pp 32-40, June 2013.
[10]. Kin C. Luk, J. E. Ball and A. Sharma, “An Application of Artificial Neural Networks for Rainfall Forecasting”, Mathematical and Computer Modeling, Volume 33, pp 683-693, 2001.
