Over the past few years, mesh networks have become more popular, following the trend to create more wireless things. As with other technology trends, as mesh networking has developed, so has a plethora of different mesh networking technologies and architectures. This article is intended to bring order to the mess of mesh networks. First, we'll discuss network basics, focusing on the specifics of wireless. Second, we'll present criteria for evaluating the different wireless mesh networking technologies. The second part of this article will present an overview of five different mesh-related technologies including their key characteristics, network architectures, strengths, and limitations. Finally, we'll aggregate the information and create an evaluation overview of these different technologies, including when they should be applied.
Network topologies describe the interaction and interconnection of the participants; how they communicate with each other and how they establish paths among each other. Network topologies are not always what they seem. In wired networks, topologies generally follow the path of the wires, so if devices are wired in a ring, then the network topology is a ring. In wireless networking we all share the same air space so the path and access method are not always obvious. For example, is a WiFi access point a star topology or a bus?
Before we begin our discussion, let's review some common terminology that will be used throughout this article.
- DSSS—Direct sequence spread spectrum. This is a method of signal encoding that distributes information over a wide section of the spectrum using a pseudorandom code. Because of the wide spread, the signal appears to be noise for those without the spreading code.
- FHSS—Frequency hopping spread spectrum. Similar to DSSS, the big differences are its use of a more constrained spreading algorithm and the fact that it changes channels as a function of time, theoretically making the transmission more immune to interference.
- TSMP—Time synchronized mesh protocol. This mesh protocol uses time slots to allocate parts of the spectrum for communication between two nodes. Because time slots differ between pairs, interference is minimized because channel access is controlled by time slot.
- AODV—Ad-hoc on-demand distance vector routing algorithm. AODV is a pure on-demand route acquisition algorithm—nodes that do not lie on active paths do not maintain any routing information nor do they participate in any periodic routing table exchanges. Furthermore, a node does not have to discover and maintain a route to another node until the two need to communicate.
- Cluster-Tree—A region-based mesh network routing algorithm in which routes are formed and maintained between clusters of nodes. Route discovery is completed and maintained between the clusters, providing access to the children of each cluster.
- ISM—Industrial Scientific and Medical bands. This describes license-free frequency bands. Generally we refer to the 2.4 GHz band, although in North America the ISM bands also include those at 900 MHz, 5.8 GHz, and 433 MHz. The 2.4 GHz band is used worldwide.
- IPv6—Internet Protocol Version 6. This is the latest version of the popular IP or Internet Protocol. With Version 6, the IP address structure, routing, and class of service change. IPv6 is part of the TCP/IP suite of protocols sponsored by the Internet Engineering Task Force (IETF).
- PAN ID—Personal Area Network Identifier. This is the term for the network name assigned to a particular personal area network (PAN).
- CSMA—Carrier sense multiple access. This protocol defines the channel access technique used by Ethernet, WiFi, and bus-oriented networks. It provides a method for detecting collisions and retransmits packets to acquire a communications channel.
- TDMA—Time division multiple access. This protocol defines the channel access technique used by TSMP and GSM networks in which a communications channel is divided into time slots. Each node is allocated a specific time slot for communication.
Basic Network Types. Figure 1 shows three common network topologies: star, bus, and ring. In the ring, nodes are connected from one to the next. Communications messages are forwarded around the ring in either a clockwise and/or counterclockwise fashion. As a message is forwarded, the node checks to see if the message is meant for itself; if so, it keeps the message, if not, it forwards it on. It is most common in cabled networks (wire or fiber), but could conceivably be used in a wireless fashion as well, although this would not be practical unless it were used over long distances.
Figure 1. Common network topologies
In the bus, all nodes share a common communications medium and compete to use it. Typically this means some kind of CSMA-type approach. Since a common medium is used, collisions and retransmissions increase as traffic increases (traffic loading). In the wireless case, open space is often a shared medium, so even if routing is handled in a star, ring, or other topology, open space often appears as a bus. More on this later.
In the star, nodes are connected through a master central node that is responsible for looking at each message and forwarding it out on the proper communications link. The Ethernet switch is a common version of this in wired networking. In terms of wireless communication, the WiFi access point is a familiar example, routing all messages through the access point. However, even though messages are routed through the access point, open space is accessed via CSMA, a bus-type protocol.
Figure 2. In a full mesh, each of the nodes is connected to all the others
Another important characteristic is that some or all nodes may be routers and some or all nodes may be end points. Typically, full interconnection is not achieved unless the network is very small. Full interconnection gets very complex very quickly.
Figure 3 illustrates three different instantiations of mesh networks. The green nodes are end devices, the yellow nodes are routers (which may also be end devices) and the purple node is the network coordinator responsible for allowing nodes to join and depart from the mesh. Note that one instantiation of a mesh can be a star—a mesh with one router and the rest as end points. The cluster-tree network is a combination of near full connectivity among routers with end points hanging off individual routers. A peer-to-peer mesh generally gives equal rights to all nodes, including routing and end point functionality.
Figure 3. Different implementations of mesh networks
Although mesh networks aren't really practical for wired networks, it is important to look at the other differences between wired and wireless domains. Wireless makes it possible to have more connections because it is neither practical nor cost effective to create a full mesh with wires. However, wires are predictable, reliable, and well understood. Wireless forces the sharing of an already noisy, uncontrollable medium called open space. While wireless gives us more flexibility, the uncertainty of wireless drives the need for more connection paths…and more complexity.
When evaluating wireless networks, particularly mesh networks, there are a number of hard problems that need to be solved.
- Accessing the medium. Since we all share open space, listening is more important than talking. If everyone talks at once, listening is difficult. So radios must be good listeners if they are going to have a chance to get a word in edgewise, so to speak.
- Discovering routes. Determining paths in a wireless mesh network is difficult because the environment is dynamic. In this case, there are two choices: Plan the trip in advance, or take it one step at a time. Oftentimes doing both is best; this usually involves retracing one's steps and repeating well-traveled routes.
- Adapting to a changing environment. In a wireless world, paths to nodes can disappear and reappear due to changing signal or traffic conditions.
- Sleeping and Waking. Once we go wireless, the next step often involves finding a way to do without the traditional power cable. This means using batteries and requiring effective power management. The most common way of handling power management is to put the nodes to sleep when they are not being used, which sounds well and good until it is time to wake up.
How to Compare
Given that we now understand the basics of networks—in particular mesh networks—how do we compare them? The criteria used to address this question are security, reliability, power management, scalability, data movement, and cost.
Security. This is as much about the perception of threat as actual threat. Nonetheless, we can evaluate security using the traditional factors that are well understood in the industry. The first is encryption—protecting the information itself. Modern encryption wants at least AES128 as an algorithm (using a 128-bit key). The next is authentication, which is validating that the users (or nodes) are who they say they are and this is typically handled by key exchange or an authenticated certificate. The last is authorization, which involves the granting of permission based on having the right key or certificate. Other factors are associated with the ease of distributing and configuring the authorization and authentication mechanisms.
Reliability. The best way to think of reliability is the ability of a message to be delivered to the desired destination on time. If the message always arrives at the destination when expected, the network is very reliable. We also want the message to arrive at the destination, even if it is a bit late. The components for evaluating reliability for wireless mesh networks involve the following:
- Frequency agility—Detecting potential interference and adapting the network around it.
- Message loss potential—A measure of whether messages get lost in the shuffle. With all the rerouting and different paths, the network must be very careful to ensure messages don't get lost and that duplicate messages following different routes are discarded.
- Adaptability—Best described as the network's ability to change the routing to accommodate for nodes disappearing while still preventing lost messages. This is most effective if done quickly.
- Single points of failure—Are there any single points of failure, what is the risk of them failing, and how is recovery handled?
Power Management. The most frequently asked question when discussing wireless sensor networks is how long will the batteries last, because everyone wants to keep maintenance needs low. Within the context of network architecture, power management is analyzed in terms of end nodes, router nodes, and network coordinators. It is most important to have low-powered, power-efficient end nodes because they are most likely to be located far from traditional wired power sources. In terms of routers, battery-powered routers or routers that sleep extend the flexibility of the architecture. Finally, the network coordinator is usually powered.
In the context of nodes that can sleep, we look at their average power consumption. This is best assessed by looking at how they wake up, how frequently they wake up, total transmitting time, and total listening time. Because the most power is consumed when radios transmit, it is important to keep this to a minimum.
Scalability. How big can the network get before it fails, at least on a practical level? All the networks have large physical limits (in terms of physical addresses) in the tens of thousands of nodes, but the practical design of the network is always much smaller. This is because scalability is related both to reliability mechanisms and to the nature of the application. If a network never experiences problems that cause rerouting, then network routing tables will never change, meaning cached routes will always work and that there will be few retransmissions or reroutes because of failures. The end result is a very stable network that can be very large.
The other aspect concerns the type and volume of data and these can be categorized as dribble data, bursty data, or streaming data—and they mean just what the names suggest. Dribble data are periodic, infrequent, and slow, while streaming data are constant, and so on. A network can be very large if the traffic is made up of dribble data because the flow follows consistent patterns, with plenty of bandwidth. Sleeping networks do well with dribble data, but scale poorly with streaming data.
Data Movement. Now we look in more detail at data flow to assess raw capacity. There is a classic trade-off in needs: Does the application require lots of data with low latency or does it require dribble data with long, nondeterministic latency? In evaluating networks for data movement, consider a combination of the following five variables: data rate, latency, packet size, fragmentation, and range.
Cost. Cost is measured by the individual unit cost as well as the cost to maintain the network. In this context, maintenance is often difficult to quantify and deployment cost is often forgotten. It is easiest to quantify those variables that are most perceptible, namely the actual purchase cost of a transceiver system per node. This becomes more complicated when trading off the number of battery-powered or sleeping nodes. For example, if you assume all end points need to be sleeping end points, then a point-to-multipoint system is not practical due to range. In contrast to a network that has sleeping routers, a network that doesn't have sleeping routers will need to deploy powered routers in addition to the end points. Even if all radios cost the same, more radios are needed in the powered router system. However, if power is available, then this becomes a non-issue. The cheapest radio may not be the best for the application. The other point is that the cost of the radio tends to be looked at in relation to the cost of the device to which it is connected.
Next month we'll discuss point-to-multipoint, ZigBee 2007, Wireless HART, 6LoWPAN, and Digi Mesh mesh networking technologies in greater detail.
Editor's note. The figures in parts 1 and 2 of this article are numbered consecutively.