Network coding offers new commmunications paradigm across multiple disciplines
The idea of network coding (NC, also known as netcode) is readily illustrated by the puzzle shown in Figure 1.1 Each arrow in the butterfly network is a channel for transmitting one bit of data. To receivers R1 and R2 the source intends to deliver a message represented by the two bits a and b. In straightforward fashion, bit a reaches node V and receiver R1 while bit b reaches V and R2. The dilemma concerns which bit node V should forward to node W upon receiving both data bits. If it forwards bit a, then receiver R1 will not get bit b. If it instead forwards b, R2 will not receive a.
The solution is for node V to send out a bit indicating whether a and b are equal; for example, bit 0 when they are equal or otherwise bit 1. This new bit is called the binary sum or the exclusive OR between a and b in computer logic. When the new bit is forwarded from node W, it is used by receiver R1 to decode the value of b and by R2 for the value of a. While the dilemma confounds the conventional store-and-forward mechanism (i.e., copying), the solution suggests a fundamental concept that blends coding with data forwarding. Partial hints along these lines have appeared sporadically in the literature, including redundant array of independent disks (RAID),2 access to multi-channel LAN,3 and satellite networks.4 The concept was formalized as ‘network coding’ over a generic abstract network and its superiority to store-and-forward has been mathematically proven.5 Advantages can be quantified in terms of bit rate, latency, energy consumption, and other transmission variables.
What types of communications networks can benefit from NC? One general scenario is broadcasting and data transmission in nature (such as wireless communication) including ad hoc and sensor networks. Another prospective use involves separate transmissions competing for a channel. To benefit from NC, the main effort must be to encode at intermediate nodes and to decode at receivers. Complexity and latency of coding and decoding are minimal when the code is linear. Indeed, purely linear codes enable NC to achieve the best possible outcome,1,6 which assures feasibility of implementation in practical hardware and software applications. To date, all NC applications refer to linear network coding, despite their considerable variety.
Until now, the two most prominent applications have been in wireless communications and peer-to-peer (P2P) content distribution. A relatively simple wireless scenario is as follows. Two stations are located at a distance of double the transmission range. Between them lies a relay transmitter, at which a function cycle consists in the exchange of one packet from each station to the other. In the conventional store-and-forward model a cycle would require four time slots: two to receive a packet from each station and two to transmit to each station. With NC, the cycle can consist of just three time slots, replacing the two transmissions by broadcasting the binary sum of the two received packets. This reduction translates into 33% gain in bandwidth efficiency, solely through mathematical manipulation.
The dominant traffic on the Internet is currently P2P content distribution. Some nodes in a P2P network download blocks of data from the server. Meanwhile, every node maintains connections with neighboring peers, with which it exchanges blocks. Since a node transmits a block to different peers at different times, this is logically an asynchronous form of broadcasting, and hence it has a good fit with network coding. In fact, NC is the theoretic underpinning of Avalanche, the recently announced Microsoft alternative to BitTorrent.7 Among other benefits, it enhances Avalanche's information rate by 20 to 30%.8
To understand this P2P application, imagine a video segment partitioned into 100 4kb blocks at the server. Linear network coding adopts a field such as GF(256) as the alphabet and interprets every byte of information as an element within it. A packet transmitted over the P2P network is a linear combination over GF(256) of the 100 blocks. The 100 coefficients associated with the packet constitute a 100-dim vector over GF(256), which is called the global encoding kernel.9 According to the random network-coding scheme,10,11 the kernel is stored in the packet header, a 0.25% overhead in this instance. A node produces a packet by random linear combination of all blocks it currently stores and sends to a peer. Eventually, when the accumulated packet headers include 100 linearly independent global encoding kernels, a receiver can decode the video segment. Linear network coding theory guarantees linear independence among the first 100 received global encoding kernels when the field of alphabet is large enough and the linear combinations are by design. With 101 or more kernels under the random network scheme over the moderately large field GF(256), the probability of successful decoding is essentially 1.
Network coding theory represents one of the most important recent breakthroughs in network communications. It has stimulated multi-disciplinary research with a great variety of potential applications, including multicast, peer-to-peer communications, wireless networks, sensor networks, personal communications, B2C commerce, and other modalities. Since initial publications by researchers at The Chinese University of Hong Kong, NC has rapidly generated research that has had a fundamental impact upon information theory, coding theory, networking, switching theory, wireless communications, computer science, cryptography, operations research, and matrix theory. It is not unreasonable to say that NC represents a new transmission paradigm that may prove dominant in future communications technology.
Research supported in part by grants No. CUHK4231/04E and 414005 from the Research Grants Council of the Hong Kong Special Administrative Region, China.