The idea is to have a lightweight alternative to a database where all the data resides in main memory across multiple machines, rather than on disk.
Goal is to create a single logical web cache by implementing a shared cache at a large scale requires spreading the cache over multiple machines.
DHTs characteristically emphasize the following properties:
1. Decentralization: the nodes collectively form the system without any central coordination.
2. Scalability: the system should function efficiently even with thousands or millions of nodes.
3. Fault tolerance: the system should be reliable (in some sense) even with nodes continuously joining, leaving, and failing.
A key technique used to achieve these goals is that any one node needs to coordinate with only a few other nodes in the system – most commonly, Θ(logn) of the n participants (see below) – so that only a limited amount of work needs to be done for each change in membership.
The structure of a DHT can be decomposed into several main components.[2][3] The foundation is an abstract keyspace, such as the set of 160-bit strings. A keyspace partitioning scheme splits ownership of this keyspace among the participating nodes. An overlay network then connects the nodes, allowing them to find the owner of any given key in the keyspace.
An overlay network and the underlay network on top of which the overlay network is built. Messages between the nodes in the overlay network logically follow the ring topology of the overlay network, but physically pass through the links and routers that form the underlay network.
DHT dynamically decide which node is responsible for which items. If the nodes currently responsible for certain items are removed from the system, the DHT self-manages by giving other nodes the responsibility over those items. Thus, nodes can continuously join and leave the system. The DHT will ensure that the routing tables are updated, and items are redistributed, such that the basic operations still work. This joining or leaving of nodes is referred to as churn or network dynamism
Another key feature of DHTs is that they are fault-tolerant. This implies that lookups should be possible even if some nodes fail. This is typically achieved by replicating items. Hence failures can be tolerated to a certain degree as long as there are some replicas of the items on some alive nodes. faulttolerance and the accompanying replication are self-managed by the system. This means that the system will automatically ensure that whenever a node fails, some other node actively starts replicating the items of the failed node to restore the replication degree.
Functionality of DHTs
Range Queries In some applications, it might be useful to ask the DHT to find values associated to all keys in a numerical or an alphabetical range. F
Group Communication The routing information which exists in DHTs can be used for group communication. This is a dual use of DHTs, whereby they are not really used to do lookups for items, but rather just used to facilitate group communication among many hosts. For instance, the routing tables in the DHT can be used to broadcast a message from one node to every other node in the overlay network. The advantage of this is that every node gets the message in few time steps, while every node only needs to forward the message to a few other nodes.
Comments
Post a Comment