DATA MANAGEMENT IN
SENSOR NETWORKS
CONTENTS
• INTRODUCTION
• SENSOR NETWORKS
• CHALLENGES
• CONCLUSION
• REFERENCES
INTRODUCTION
Sensor
networks have attracted a lot of attention lately and have been increasingly
adopted in a wide range of applications and diverse environments, from
healthcare and traffic management to weather forecasting and satellite imaging.
A vast number of small, inexpensive, energy-efficient, and reliable sensors
with wireless networking capabilities are available worldwide, increasing the
number of sensor network deployments. Such networks can be connected to the
Internet or to specific gateways and provide remote access for management and
configuration. The adoption of IPv6 also provides a huge address space, large
enough to address sensor networks on a global scale, and at the same time leads to
the rapid development of many useful applications. Thus, it is not unreasonable
to expect that in the near future, many segments of the networking world will
be covered with sensor networks accessible via the Internet. Archived and
real-time sensor data will be available worldwide, accessible using standard
protocols and application programming interfaces (APIs).
Nevertheless, as has been noted in the literature,
too much attention has been placed on the networking of distributed sensing
and too little on tools to manage, analyze, and understand the collected
data. In order to exploit the data collected from sensor network
deployments, to map it to a suitable
representation scheme, to extract meaningful information from it and to
increase interoperability and efficient cooperation among sensor nodes, we have
to devise and apply appropriate data management techniques. Towards this
direction, data aggregation and processing have to be done in a way that
renders the data valuable to applications that receive stored or real-time input and
undertake specific actions. It is important to note that the special characteristics
of sensor nodes, such as their resource constraints (low battery power, limited
signal processing, limited computation and communication capabilities, and a
small amount of memory), have to be considered when designing data
management schemes.
Sensory data has to be
collected and stored before being aggregated. Various techniques for data
aggregation have been proposed in accordance with the type of the network and
the imposed requirements. However, aggregated data is raw data that has little
meaning by itself. Hence, it is crucial to interpret it according to
information that is relevant to the deployed applications. This will increase
interoperability among different types of sensors as well as provide contextual
information essential for situational knowledge in the sensor network. Towards
this direction, the Open Geospatial Consortium (OGC) recently established the
Sensor Web Enablement (SWE) initiative to address this aim by developing a
suite of sensor related specifications, data models and Web services that will
enable accessibility to and controllability of such data and services via the
Web. The Sensor Web is a special type of Web-centric information infrastructure
for collecting, modeling, storing, retrieving, sharing, manipulating,
analyzing, and visualizing information about sensors and sensor observations of
phenomena.
DATA MANAGEMENT IN SENSOR NETWORKS
The increasing
availability of small-size sensor devices during the last few years and the
large amount of data that they generate has led to the necessity for more
efficient methods regarding data management. In this chapter, we review the
techniques that are being used for data gathering and information management in
sensor networks and the advantages that are provided through the proliferation
of Semantic Web technologies. We present the current trends in the field of
data management in sensor networks and propose a three-layer flexible
architecture which intends to help developers as well as end users to take
advantage of the full potential that modern sensor networks can offer. This
architecture deals with issues regarding data aggregation, data enrichment and
finally, data management and querying using Semantic Web technologies.
Semantics are used to extract meaningful information from the sensors’
raw data and thus facilitate the development of smart applications over large-scale
sensor networks.
SENSOR NETWORKS
Sensor Nodes: Functionality and
Characteristics
A sensor node, also
known as a “mote”, is an idea introduced by the Smart Dust project in the early 2000s
(2001). Smart Dust was a promising research project that first studied and
supported the design of autonomous sensing and communication micro-computing
devices of size as small as a cubic millimeter (or the size of a “dust
particle”). In other words, this project acted as the cornerstone for the
development of today’s wireless sensor networks. The key functionality of a
modern sensor node, in addition to sensory data gathering, is the partial
processing and transmission of the collected data to the neighbouring nodes or
to some central facility. A modern node could be considered as a microscopic
computer embedding all the units required for sensing, processing,
communicating and storing sensory information, as well as power supply units
able to support such operations. The most important units that are present in a
sensor node are the following:
• The Processing Unit, which is responsible not only for processing the
collected data, but also for orchestrating the cooperation and synchronization
of all the mote's other units towards realizing the promised functionality. Its
operation is most often supported by on-chip memory modules.
• The Communication Unit, also known as the transceiver, which enables motes to
communicate with each other for disseminating the gathered sensory data and
aggregating them in the sink nodes (nodes that usually have higher hardware
specifications than simple sensor nodes). The
two most popular technologies here are Radio Frequency
(RF), where the unlicensed industrial, scientific and medical (ISM)
spectrum band is freely usable by anyone worldwide, and Optical or
Infrared (IR), where line-of-sight between communicating nodes is
required, making communication extremely sensitive to atmospheric
conditions.
• The Power Supply Unit, which provides power for the operation of such tiny
devices. A typical power source does not exceed 0.5 Ah at a voltage of
1.2 V and is most commonly a battery or a capacitor. While operations like data
sensing and processing consume some power, communication between
neighbouring nodes has proved to be the most energy-consuming task.
• The Sensor Unit, which is responsible for sensing the environment and
measuring physical data. Sensors are sensitive electronic circuits that turn the
sensed analog signals into digital ones by using Analog-to-Digital converters.
There is a large variety of sensors available today, with the most popular of
them being able to sense sounds, light, speed, acceleration, distance,
position, angle, pressure, temperature, proximity, electric or magnetic fields,
chemicals, and even weather-related signals. Such units must be able to provide
the accuracy the supported application demands, while consuming the lowest
possible energy.
Modern sensor nodes are required to be inexpensive,
multifunctional, cooperative, microscopic, as well as able to cope efficiently
with low power supplies and computational capacity.
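As a rough illustration of the power budget quoted above, the following Python sketch converts 0.5 Ah at 1.2 V into joules and estimates how long a node might last. The capacity and voltage come from the text; the per-operation energy costs, the duty cycle, and the function names are invented placeholders chosen only to reflect that communication dominates, and sleep-mode consumption is ignored.

```python
# Capacity and voltage are taken from the text; all other numbers are assumptions.
CAPACITY_AH = 0.5      # battery capacity in ampere-hours
VOLTAGE_V = 1.2        # supply voltage in volts

# Hypothetical energy cost per operation, in joules (illustrative only).
SENSE_J = 0.0001
PROCESS_J = 0.0002
TRANSMIT_J = 0.002     # assumed largest, since communication costs the most


def battery_energy_joules(capacity_ah: float, voltage_v: float) -> float:
    """Energy = charge (Ah converted to coulombs) times voltage."""
    return capacity_ah * 3600.0 * voltage_v


def lifetime_days(cycles_per_hour: float, transmit_every: int = 1) -> float:
    """Estimated lifetime when only every `transmit_every`-th reading is sent."""
    per_cycle = SENSE_J + PROCESS_J + TRANSMIT_J / transmit_every
    total_cycles = battery_energy_joules(CAPACITY_AH, VOLTAGE_V) / per_cycle
    return total_cycles / cycles_per_hour / 24.0


if __name__ == "__main__":
    print(battery_energy_joules(CAPACITY_AH, VOLTAGE_V), "J available")
    print(lifetime_days(60, transmit_every=1), "days, sending every reading")
    print(lifetime_days(60, transmit_every=10), "days, sending every 10th reading")
```

Under these assumed costs, transmitting less often extends the estimated lifetime, which is the motivation for the aggregation techniques discussed later.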
Sensor Networks Topologies
When a number of
sensor nodes are clustered together, a special type of autonomic and
power-efficient network is formed, a so-called
Wireless Sensor Network (WSN). WSNs mainly consist of the Sensor Nodes,
the Sink Nodes that aggregate the measured data from a number of Sensor Nodes,
and the Gateway Nodes that interconnect the Sink Nodes with the network
infrastructure (e.g. the Internet) and route the traffic to the proper destinations.
There are cases where the Sink Nodes have embedded network interfaces for data
forwarding and thus coincide with the Gateway Nodes. Regarding
its topology, a sensor network may form either a single-hop network,
where each Sensor Node sends its data directly to the Sink Node through a star
topology, or a multi-hop network, where each Sensor Node relies on its
neighbours to forward its sensory data to the respective Sink Node.
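The difference between the two topologies can be made concrete with a small sketch. The Python code below represents a network as an adjacency list and uses breadth-first search to count how many hops each node is from the sink. The node names and the helper `hops_to_sink` are hypothetical and serve only as an illustration.

```python
from collections import deque


def hops_to_sink(neighbours: dict[str, list[str]], sink: str) -> dict[str, int]:
    """Breadth-first search from the sink; returns the hop count per reachable node."""
    hops = {sink: 0}
    queue = deque([sink])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, []):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops


# Star (single-hop): every sensor node is a direct neighbour of the sink.
star = {"sink": ["s1", "s2", "s3"], "s1": ["sink"], "s2": ["sink"], "s3": ["sink"]}

# Chain (multi-hop): s3 relies on s2 and s1 to forward its data to the sink.
chain = {"sink": ["s1"], "s1": ["sink", "s2"], "s2": ["s1", "s3"], "s3": ["s2"]}

if __name__ == "__main__":
    print(hops_to_sink(star, "sink"))
    print(hops_to_sink(chain, "sink"))
```

In the star every sensor node is one hop away, whereas in the chain the hop count grows with distance, so intermediate nodes spend energy relaying their neighbours' data.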
Application Areas
Sensor networks have been
adopted in a wide set of scenarios and applications where proper data
management can be deemed of high importance. Some of the application areas
where deployments of sensor networks with advanced capabilities are popular are
the following:
• Health Monitoring: Biometric sensors are usually used for collecting and
monitoring patient data, handling administrative tasks in hospitals,
providing patient care, and supporting the operation of special
chemical and biological measurement devices (e.g. blood pressure monitoring).
The collected sensory data are also archived so that they can be used for
further study of disease management and prognosis. In many cases, efficient
representation and correlation of the acquired data
enable doctors and students to draw
useful conclusions.
• Meteorology and Environment Observation: Environmental sensors are used for weather
forecasting, wildfire and pollution detection, as well as for agricultural
purposes. Special observation stations collect and transmit major parameters
which are used in decision making. For example, in
agriculture, air temperature, relative humidity, precipitation and leaf wetness
data are needed for applying disease prediction models, while soil moisture data are
crucial for irrigation decisions and for understanding the progress of
water into the soil and the roots.
• Industrial applications: Different kinds of sensors are deployed to serve
industries including aerospace, construction, food processing, environmental
and automotive. Applications are being developed for tracking products and
vehicles in transportation companies, satellite imaging, traffic management,
monitoring the health of large structures such as office buildings, and several other
industry-specific fields.
• Smart Homes: Home automation applications are being developed in order to
support intelligent artifacts and make users’ lives more comfortable.
Special sensors are attached to home appliances, while the resulting sensor
network can be managed or monitored by remote servers accessible via the
Internet (e.g. from the user’s office, a police station or a hospital). Sensor networks
also play a significant role in facilitating assisted living for the elderly or
persons in need of special care.
• Defense (Military, Homeland Security): Sensors are also used for military
purposes in order to detect and gain as much information as possible about
enemy movements, explosions, and other phenomena of interest. Battlefield
surveillance, reconnaissance (or scouting) of opposing forces, battle damage
assessment and targeting are some of the fields where large sensor networks
have already been deployed.
Sensor Web: Data and Services in a Sensor
Network
The term
Sensor Web is used by the Open Geospatial Consortium (OGC) to describe
a system that comprises diverse, location-aware sensing devices that
report data through the Web. In a Sensor Web, entire networks can be seen as
single interconnected nodes that communicate via the Internet and can be
controlled and accessed through a web interface. The Sensor Web focuses on the
sharing of information among nodes, its proper interpretation, and the
cooperation of the nodes as a whole, in order to sense and respond to changes in their
environment and extract knowledge. Hence, one could say that the process of
managing the available data is not just a secondary process that simply enhances
the functionality of a Sensor Web, but rather the very reason for its
existence. Data storage can be either external (all data are
collected on a central infrastructure), local (every node stores its data
locally) or data-centric (a certain category of data is stored on a predefined
node). External storage is not considered a viable solution, because of the
high energy cost of data transmission from each sensor node to the central infrastructure.
Local storage overcomes this drawback, since every node stores only
self-generated data. The option of local storage is also referred to as Data-Centric
Routing (DCR), where a routing algorithm is needed in order to answer a query
or to perform an aggregation, focusing on minimizing the cost of communication
between sensor nodes.
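To make the data-centric option more tangible, here is a minimal Python sketch in which each data category is mapped to one predefined node, so that a query for that category only needs to visit a single node. The use of a SHA-256 hash for the mapping, the class `DataCentricStore`, and the node and category names are all assumptions made for illustration; the text does not prescribe how the responsible node is chosen.

```python
import hashlib


def home_node(category: str, node_ids: list[str]) -> str:
    """Map a data category to one predefined node via a stable hash (assumed scheme)."""
    digest = hashlib.sha256(category.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(node_ids)
    return sorted(node_ids)[index]


class DataCentricStore:
    """Toy in-memory stand-in for a network that uses data-centric storage."""

    def __init__(self, node_ids: list[str]) -> None:
        self.node_ids = list(node_ids)
        self.storage: dict[str, list[tuple[str, float]]] = {n: [] for n in node_ids}

    def put(self, category: str, value: float) -> str:
        """Store a reading on the node responsible for its category."""
        node = home_node(category, self.node_ids)
        self.storage[node].append((category, value))
        return node

    def get(self, category: str) -> list[float]:
        """Answer a query by visiting only the node responsible for the category."""
        node = home_node(category, self.node_ids)
        return [v for c, v in self.storage[node] if c == category]


if __name__ == "__main__":
    store = DataCentricStore(["n1", "n2", "n3"])
    store.put("temperature", 21.5)
    store.put("temperature", 22.0)
    print(store.get("temperature"))
```

Because every node computes the same mapping, producers and queriers agree on where a category lives without consulting a central directory.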
Knowledge Management in Sensor Networks
As stated
earlier, the rapid development and deployment of sensor technology involves
many different types of sensors, both remote and in-situ, with diverse
capabilities. The absence of ontological infrastructures for high-level rules
and queries restricts the potential of end users to exploit the acquired
information, to match events from different sources and to deploy smart
applications which will be capable of following semantic-oriented rules.
Current efforts at the OGC Sensor Web Enablement (SWE) aim at providing
interoperability at the service interface and message encoding levels. Sensor
Web Enablement presents many opportunities for adding a real-time sensor
dimension to the Internet and the Web. It is focused on developing standards to
enable the discovery, exchange, and processing of sensor observations. The
functionality that the OGC aims to provide in a Sensor Web includes discovery of
sensor systems, determination of a sensor’s capabilities and quality of
measurements, access to sensor parameters that automatically allow software to
process and geo-locate observations, retrieval of real-time or time-series
observations, tasking of sensors to acquire observations of interest, and
subscription to and publishing of alerts to be issued by sensors or sensor
services based upon certain criteria.
Technologies
and standards issued by the World Wide Web Consortium (W3C) will be used in
this context to implement the Semantic Sensor Web (SSW) vision, an extension of
the Sensor Web, where sensor nodes will be able to discover their respective
capabilities and exchange and process data automatically without human
intervention. Components playing a key role in Semantic Sensor Web are
ontologies, semantic annotation, query languages and rule languages. Rules can
be defined using SWRL (Semantic Web Rule Language) and additional knowledge can
be extracted by applying rule-based reasoning. Moreover, complex queries
written in SPARQL, the query language for RDF and a W3C Recommendation (that is,
a standard for the Web), can be submitted to the Sensor Web for meaningful knowledge
extraction and not just for simple retrieval of sensor readings. The application
of these technologies will transform the Sensor Web Enablement service standards
into Semantic Web Service interfaces, enabling sensor nodes to act as autonomous agents
that are able to discover neighbouring nodes and communicate with each other.
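The roles of semantic annotation, querying and rules can be sketched without any Semantic Web library. The plain-Python example below stores annotated observations as subject-predicate-object triples, matches simple patterns against them (a very rough stand-in for a SPARQL query), and applies one hand-written rule (a very rough stand-in for an SWRL rule). The vocabulary (`observedBy`, `hasProperty`, `hasValue`, `HighTemperatureEvent`), the data, and the function names are invented for illustration and do not come from any standard ontology.

```python
from typing import Optional

Triple = tuple[str, str, str]

# Hypothetical annotated observations; the vocabulary is invented for this sketch.
triples: set[Triple] = {
    ("obs1", "observedBy", "sensorA"),
    ("obs1", "hasProperty", "temperature"),
    ("obs1", "hasValue", "71"),
    ("obs2", "observedBy", "sensorB"),
    ("obs2", "hasProperty", "temperature"),
    ("obs2", "hasValue", "19"),
}


def match(store: set[Triple], s: Optional[str], p: Optional[str],
          o: Optional[str]) -> list[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard variable."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def apply_high_temperature_rule(store: set[Triple], threshold: float) -> set[Triple]:
    """Rule: a temperature observation above `threshold` is a HighTemperatureEvent."""
    inferred: set[Triple] = set()
    for obs, _, _ in match(store, None, "hasProperty", "temperature"):
        for _, _, value in match(store, obs, "hasValue", None):
            if float(value) > threshold:
                inferred.add((obs, "type", "HighTemperatureEvent"))
    return inferred


if __name__ == "__main__":
    print(match(triples, None, "observedBy", "sensorB"))
    print(apply_high_temperature_rule(triples, 50.0))
```

The inferred triples are new knowledge that was not present in the raw readings, which is the kind of result the text contrasts with simple retrieval of sensor values.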
Current Approaches
Many approaches are
available today for managing sensor networks, especially regarding the aggregation
and processing of data, and several architectures have been proposed that
provide services to the end user through the exploitation of the collected
data. Existing approaches combine data from sensors in order to carry out
high-level tasks and offer the end user a unified view of the underlying
sensor network. They usually provide a software infrastructure that permits
users to query globally distributed collections of high-bitrate sensor data
powerfully and efficiently. Following this approach, the SWAP framework
proposes a three-tier architecture comprising a sensor, a knowledge and a
decision layer, each consisting of a number of agents. Special care
is taken over the semantic description of the services available to the end
user, allowing the composition of new applications.
CHALLENGES FOR THE DATABASE COMMUNITY
Given the view of
the sensor network as a huge distributed database system where each sensor node
corresponds to a database site that holds part of the data, we would like to
adapt existing techniques from distributed and heterogeneous database systems
to the sensor network environment. On close investigation, however, we can
distinguish four major differences between sensor networks and traditional
distributed and heterogeneous database systems.
Physical Characteristics.
Sensor networks have physical characteristics that are very different from
regular desktop computers or dedicated equipment in data centers. Sensors might
fail at any time, the networking layer might only provide very weak quality of
service, and the sensor nodes have strict resource limitations such as limited memory,
computational and battery power. Query processing has to be aware of these
physical constraints. One way of thinking about such constraints is by
analogy with how traditional database systems interact with the operating
system.
Database systems bypass the operating
system buffer to have direct control over the disk. For a sensor network
database system, the analogous resource is the networking layer, and for
intelligent resource management we have to ensure that the query processing
layer is tightly integrated with the networking layer. We can distinguish
several types of queries in a sensor network. Long-running queries deal with
the status of the sensor network over a user-defined time period. Other queries
are ad-hoc or snapshot queries that query the current status of the sensor
network. Strategies for evaluating these two types of queries are likely to be
very different: for long-running queries, we can pay a higher up-front cost that
can be amortized over the lifetime of the query.
Due
to inherent resource limitations of sensor networks, users should be able to
trade off the accuracy of a query answer against the quantity of resources used
to compute it. As a simple example, assume that the sensor
network consists of N temperature sensor nodes. To compute the
average temperature exactly, all sensor nodes need to be contacted and their
temperatures aggregated. But the user might be sufficiently confident in the
average of M << N sensor readings, given that the sensors chosen are a
random sample of the overall set of sensors. Computing the average of M sensor
readings requires much less energy since only a small subset of all sensors is
contacted. This is analogous to the computation of an aggregate through a
sample in a database. Further research is necessary to understand these
tradeoffs in detail, as it is not obvious how to select a truly random sample of
sensor nodes that satisfy a given geographic constraint without complete
knowledge about the sensor network at a central query optimization node.
Data Streams. Sensors produce data continuously in data streams, and sensor nodes
have only limited memory and computational resources. We need to develop new
query processing techniques for the online processing of data streams that do
not assume that relations are materialized on secondary storage. Intelligent
data reduction at individual sensor nodes through the computation of stream
aggregates will be important for data stream processing. In addition to the
computation of such statistics, we need to be able to process these synopsis
data structures themselves when we combine synopsis data from several sensors.
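The two ideas above, answering from a sample of M out of N nodes and combining per-node stream synopses, can be sketched together. In the Python example below each node reduces its stream to a constant-size `Synopsis` (count, sum, minimum, maximum), and the sink merges the synopses of M randomly chosen nodes. The class, the function names and the choice of statistics are assumptions for illustration. The sketch also assumes that the sink knows the full list of nodes, which is exactly the global knowledge the text notes may not be available in practice.

```python
import random
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Synopsis:
    """Constant-size summary of a stream: count, sum, minimum and maximum."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def add(self, reading: float) -> None:
        """Fold one reading into the summary without storing the reading itself."""
        self.count += 1
        self.total += reading
        self.minimum = min(self.minimum, reading)
        self.maximum = max(self.maximum, reading)

    def merge(self, other: "Synopsis") -> "Synopsis":
        """Combine two summaries as if their underlying streams had been joined."""
        return Synopsis(
            self.count + other.count,
            self.total + other.total,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )

    @property
    def mean(self) -> float:
        return self.total / self.count


def summarise(stream: Iterable[float]) -> Synopsis:
    """Run on a node: reduce its local stream to one synopsis."""
    syn = Synopsis()
    for reading in stream:
        syn.add(reading)
    return syn


def approximate_mean(node_synopses: list[Synopsis], m: int, rng: random.Random) -> float:
    """Run at the sink: merge the synopses of m randomly chosen nodes instead of all N."""
    combined = Synopsis()
    for syn in rng.sample(node_synopses, m):
        combined = combined.merge(syn)
    return combined.mean
```

With m equal to the number of nodes the result is the exact mean; smaller values of m contact fewer nodes at the price of a less accurate answer.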
Many sensor networks
will include actuators — devices that allow manipulation of properties of the
physical world; simple examples are temperature controls, door locks, or light
switches. Scalable, distributed trigger management is a considerable research
challenge for large-scale monitoring and control sensor systems.
Data Layer
This layer handles
raw sensor data discovery, collection and aggregation in a central entity. Efficient
data aggregation is crucial for reducing communication cost, thereby extending
the lifetime of sensor networks. Based on the topology of the network, the
location of sources and the aggregation function, an optimal aggregation
structure can be constructed. Optimal aggregation can be defined in terms of
total energy consumption, bandwidth utilization and delay for transporting the
collected information from simple nodes to the sink nodes. Data gathering can
be realized following structured or structure-free approaches:
• Structured approaches are suited for data
gathering applications where the sensor nodes follow a specific strategy
for forwarding the data to the sink nodes. Due to the unchanging traffic
pattern, structured aggregation techniques incur low maintenance overhead and
are therefore suited for such applications. However, in dynamic environments,
the overhead of constructing and maintaining the structure may outweigh the
benefits of data aggregation. Furthermore, structured approaches are sensitive
to the delay imposed by the intermediate nodes, the frequency of data transmission
and the size of the sensor network.
The central entity is responsible for the discovery
of new nodes and the specification of the data acquisition policy. Data acquisition
can be event-based, where data are sent from the source and a method is called to
collect them (serial ports, wireless cameras), or polling-based, where the
central node periodically queries the data from the managed sensors.
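The communication saving that motivates structured aggregation can be estimated with a small sketch. The Python code below models a fixed aggregation tree as a child-to-parent map and compares the number of single-hop transmissions needed when every raw reading is relayed to the sink with the number needed when each node sends one combined partial result to its parent. The tree, the node names and the function names are hypothetical, and the sketch assumes an aggregate such as a sum or maximum whose partial result has constant size.

```python
def depth(node: str, parent: dict[str, str]) -> int:
    """Number of hops from `node` up to the sink (the only node without a parent)."""
    hops = 0
    while node in parent:
        node = parent[node]
        hops += 1
    return hops


def messages_without_aggregation(parent: dict[str, str]) -> int:
    """Every reading is relayed hop by hop all the way to the sink."""
    return sum(depth(node, parent) for node in parent)


def messages_with_aggregation(parent: dict[str, str]) -> int:
    """Each node waits for its children, then sends one combined partial result."""
    return len(parent)


# Hypothetical tree: a reports to the sink, b and c report to a, d reports to c.
tree = {"a": "sink", "b": "a", "c": "a", "d": "c"}

if __name__ == "__main__":
    print(messages_without_aggregation(tree), "transmissions without aggregation")
    print(messages_with_aggregation(tree), "transmissions with aggregation")
```

The gap between the two counts widens as the tree gets deeper, although the sketch leaves out the cost of building and maintaining the tree, which the text identifies as the weakness of structured approaches in dynamic environments.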
Processing Layer
Because sensory data is raw
and does not by itself support high-level
information extraction, several XML-based models are being used in order to
interpret it. This increases its usability, allows further processing and
finally makes it meaningful for the end user. Proper processing is necessary,
especially when data from many heterogeneous sources are aggregated and
possible correlations among the aggregated data need to be discovered.
Furthermore, the processed data can be distributed to other network devices
(e.g. PDAs) without the need for sensor-specific software. Different XML
templates can interpret the sensory data in different ways according to the
application-related view. The aggregated data has to be processed and
integrated in a manner that shortens data
exchange transactions. Integrating the
data and transforming it into an XML (possibly a Sensor) format makes it
meaningful for the end user. Initially, the Processing Layer integrates the
bulk of the incoming data.
It is neither
necessary nor optimal, in certain cases, to maintain the total amount of
data. Consider for instance a sensor network consisting of some dozens of
sensors measuring the temperature over a field. While keeping track of the temperature
levels is useful, processing every single datum originating from every single
sensor is not needed. Such practice would overload the network, increase its
maintenance needs and consequently decrease its autonomy. Moreover, the
volume of the archived information would soon require substantial storage
capacity. Aggregated reports (such as the maximum or an average of the values
reported) may be sufficient to describe the conditions that are present in the
area of interest. Subsequently, the integrated information collected by the
sensors has to be forwarded to the upper Semantic Layer. In order for this to
be achieved, the information needs to be encapsulated in messages suitable for
further machine processing.
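As a sketch of the processing step described above, the Python code below reduces the temperature readings of several sensors to an aggregated report (maximum and average) and encapsulates it in an XML message using the standard library. The element and attribute names (`report`, `sensorCount`, `maximum`, `average`, `area`, `property`) are made up for this illustration and do not follow SensorML or any other standard schema.

```python
import xml.etree.ElementTree as ET
from statistics import mean


def build_report(area: str, readings: dict[str, list[float]]) -> str:
    """Summarise per-sensor temperature readings and wrap them in an XML message."""
    values = [v for series in readings.values() for v in series]
    root = ET.Element("report", attrib={"area": area, "property": "temperature"})
    ET.SubElement(root, "sensorCount").text = str(len(readings))
    ET.SubElement(root, "maximum").text = f"{max(values):.1f}"
    ET.SubElement(root, "average").text = f"{mean(values):.1f}"
    return ET.tostring(root, encoding="unicode")


# Hypothetical readings from two sensors in the same field.
message = build_report("field-1", {"s1": [18.0, 19.0], "s2": [21.0, 22.0]})

if __name__ == "__main__":
    print(message)
```

Only the summary travels upwards, so the message size stays the same however many raw readings were collected, and any XML-aware consumer can parse it without sensor-specific software.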
Semantic Layer
The Semantic Layer abstracts the processed
outputs from the heterogeneous, low-level data sources such as sensors and
feature extraction algorithms, combined with metadata, thus enabling context
capturing in varying conditions. Context annotation is configured through
application-specific ontologies and can be initiated automatically without
any further human intervention. It must be noted that the Semantic Layer is not
an indispensable part of a sensor network architecture, in the same way that
semantics do not necessarily need to be part of every system.
CONCLUSIONS
This paper outlines a
research program that addresses fundamental problems in sensor networks:
data streams,
uncertainty about sensor measurements, query processing, and trigger
management. While developing techniques that address the problems above,
we must not forget that scalability of the techniques with the size of the
network, the data volume, and the query workload is an intrinsic consideration
in any design decision. We believe that sensor networks are a research area with
challenging data management problems for years to come.
REFERENCES
[1] M. Balazinska, A. Deshpande, M. J.
Franklin, P. B. Gibbons, J. Gray, M. Hansen, M. Liebhold, S. Nath, A. Szalay,
V. Tao, Data Management in the Worldwide Sensor Web, IEEE Pervasive
Computing, p. 30-40 (2007).
[2] K.W. Fan, S. Liu, P. Sinha,
Structure-Free Data Aggregation in Sensor Networks, IEEE Transactions on
Mobile Computing, p. 929-942 (2007).
[3] V. Cantoni, L. Lombardi, P. Lombardi,
Challenges for Data Mining in Distributed Sensor Networks, 18th
International Conference on Pattern Recognition (ICPR'06), p. 1000-1007
(2006).
[4] P. Sridhar, A.M. Madni, M. Jamshidi,
Hierarchical Data Aggregation in Spatially Correlated Distributed Sensor
Networks, World Automation Congress (WAC '06), p.1-6
(2006).
[5] K. Romer, F. Mattern, The Design Space
of Wireless Sensor Networks, IEEE Wireless Communications, p. 54-61
(2004).
[6] S. Rajeev, A. Ananda, C. M. Choon, O.
W. Tsang. Mobile, Wireless, and Sensor Networks - Technology, Applications, and
Future Directions, John Wiley and Sons,
2006.
[7] C. Reed, M. Botts, J. Davidson, G.
Percivall, OGC® Sensor Web Enablement: Overview and High Level Architecture, IEEE
Autotestcon, 2007, p.372-380 (2007).