Monday, October 12, 2009

Review: Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications

In most P2P networks, all computers in the network, known as nodes, are considered equivalent in their capacity for sharing resources with other nodes. The paper "Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications" [1] presents Chord as a scalable, decentralized distributed lookup service that may be used as the basis for general-purpose P2P systems. It also discusses how Chord can be used in cooperative web caching to find the cache that contains the desired web data, with the data's URI as the key for the DHT. Chord gives a way of locating documents while placing few limitations on the applications that use it; to support this claim, the paper outlines how Chord's functionality is useful in the development of P2P applications.
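To make the key-to-node mapping concrete, here is a minimal sketch of how a URI could be assigned to a cache node on a Chord-style identifier ring. It is an illustration under assumptions, not the paper's code: the 16-bit identifier space, the function names `chord_id` and `successor`, and the host names are all made up for the example, and the sketch looks up the owner with global knowledge of the ring, whereas Chord itself resolves it with messages between nodes.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits; a small value assumed for illustration


def chord_id(name, m=M):
    """Hash a node address or a key into an m-bit identifier using SHA-1."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(node_ids, key_id):
    """Return the first node identifier equal to or following key_id on the ring."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key_id)
    return ring[i % len(ring)]  # wrap around past the largest identifier


# Hypothetical cache nodes and a hypothetical URL used as the key.
nodes = {chord_id(f"cache{i}.example.org"): f"cache{i}.example.org" for i in range(8)}
key = chord_id("http://example.org/index.html")
owner = nodes[successor(list(nodes), key)]
print(f"key {key} is stored at {owner}")
```

Because both node addresses and keys are hashed into the same space, a node joining or leaving only changes the owner of the keys adjacent to it on the ring.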

As we all know, every system has its own strengths and weaknesses, and the paper also explains some of the shortcomings of Chord. During load balancing, Chord doesn't consider that some data items may be larger than others. In web caching, for example, most cached data may be textual web data mixed with a few large videos, which can cause problems for the nodes that have to host the videos. Thus, researchers should be careful about how to map data to nodes and at what granularity to store documents. Another problem of the Chord implementation is the organization of data based on distributed hash tables for such applications. To improve performance, researchers exploit a property of the Chord lookup algorithm: the paths that searches for a given successor take through the Chord ring are likely to intersect. These intersections are more likely to occur near the target of the search, where each step of the algorithm makes a smaller "hop" through the identifier space, and they provide an opportunity to cache data. This improves the performance of the Chord lookup service.
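The intersection property can be seen in a small simulation. The sketch below is a simplified model under assumptions, not the paper's implementation: it uses a 6-bit identifier space, a ten-node ring with identifiers chosen for illustration, finger tables built from global knowledge of the membership, and hypothetical helper names.

```python
M = 6
RING = 2 ** M
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # illustrative node identifiers


def successor_of(ids, k):
    """First node whose identifier is >= k, wrapping around the ring."""
    ring = sorted(ids)
    for n in ring:
        if n >= k:
            return n
    return ring[0]


def in_open(x, a, b):
    """True if x lies in the ring interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)


def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)


def fingers(ids, n):
    """Finger i of node n points at the successor of n + 2**i."""
    return [successor_of(ids, (n + 2 ** i) % RING) for i in range(M)]


def lookup_path(ids, start, key):
    """Nodes visited while resolving the successor of key, starting at start."""
    path, n = [start], start
    while True:
        succ = successor_of(ids, (n + 1) % RING)
        if in_half_open(key, n, succ):
            path.append(succ)  # succ is responsible for the key
            return path
        # Jump to the closest finger that still precedes the key.
        nxt = next((f for f in reversed(fingers(ids, n)) if in_open(f, n, key)), succ)
        n = nxt
        path.append(n)


print(lookup_path(NODES, 8, 54))
print(lookup_path(NODES, 1, 54))
```

The two lookups start at different nodes but both pass through node 51 before ending at node 56, and the hops get shorter as they approach the key. A copy of the data cached at node 51 would therefore be found by both searches.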

The performance of existing P2P systems has been limited by rigid infrastructures that attempt to solve many problems at once. Chord separates the problem of location from the problems of data distribution, so that P2P systems are able to decide where to compromise and, as a result, offer better reliability and security. We can also use the concept of Chord to provide a more centralized mapping between keys and the nodes that contain them. This is costly in terms of system resources, but it has one advantage: it reduces the time required to find the node that holds a data item.

References:
[1] Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan. Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications. ACM SIGCOMM '01, August 2001.


Review: Looking up data in P2P Systems

The paper entitled "Looking up data in P2P Systems" [1] opens with some attractive reasons for using peer-to-peer systems. It states that they give us a way to make use of the computation and resources of machines across the Internet. It also explains the idea of centralized and decentralized P2P systems. Decentralized systems are characterized by the absence of a centralized directory: nodes join an existing network at arbitrary points, and requests are broadcast by flooding. In centralized systems, a central server provides the directory service to all the nodes connected to the network. Connected nodes can download files from other peer nodes after they receive the location of the files from the central server. One disadvantage of this kind of system is that it has a single point of failure: if the central server goes down, no file transfer can start among the nodes on the network.
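The two lookup styles can be contrasted in a few lines. This is a toy sketch under assumptions, not code from the paper: the class `CentralDirectory`, the function `flood_search`, the four-peer overlay, and the file name are all hypothetical, and message passing is replaced by direct function calls.

```python
from collections import deque


class CentralDirectory:
    """Centralized lookup: one server maps each file name to the peers holding it."""

    def __init__(self):
        self.index = {}
        self.online = True

    def register(self, peer, filename):
        self.index.setdefault(filename, set()).add(peer)

    def locate(self, filename):
        if not self.online:  # single point of failure
            raise ConnectionError("directory server is down")
        return self.index.get(filename, set())


def flood_search(neighbors, files, start, filename, ttl):
    """Decentralized lookup: forward the query to neighbors until the TTL runs out."""
    seen = {start}
    queue = deque([(start, ttl)])
    hits = set()
    while queue:
        peer, hops_left = queue.popleft()
        if filename in files.get(peer, ()):
            hits.add(peer)
        if hops_left == 0:
            continue
        for nxt in neighbors.get(peer, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops_left - 1))
    return hits


# Hypothetical overlay: A - B - D, and A - C. Only D holds the file.
neighbors = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B"]}
files = {"D": {"song.mp3"}}

directory = CentralDirectory()
directory.register("D", "song.mp3")
print(directory.locate("song.mp3"))
print(flood_search(neighbors, files, "A", "song.mp3", ttl=2))
```

With `ttl=1` the flood from A never reaches D, which shows the other side of the trade-off: flooding has no central point to fail, but a bounded search can miss data that exists.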

Another point of this paper is that most P2P systems haven't seen widespread adoption for problems other than illegal sharing of music and videos, such as downloading torrents, which is an act of piracy in the first place. This is not the only problem P2P systems face as distributed systems. Another is the so-called lookup problem: given a data item X stored at some dynamic set of nodes in the system, find it. Many algorithms have been invented to solve this problem, such as DHT algorithms, but these have their own strengths and weaknesses. Come to think of it, why do many distributed systems still rely on components that are a single point of failure? Maybe it is because the performance and cost of making these systems entirely symmetrical are not fully justified. Instead, we apply replication to critical centralized services to provide reliability and security.

A fully symmetric system is most valuable when we do not want a central point at all. This leaves applications such as ad hoc file sharing, which is often illegal under piracy laws, as the main justification for using such symmetric systems.

Consequently, this paper is valuable in showing that DHT algorithms may prove to be a building block for large distributed systems on the Internet.

References:
[1] Hari Balakrishnan, M. Frans Kaashoek, David Karger, Robert Morris, and Ion Stoica. Looking Up Data in P2P Systems. Communications of the ACM, Vol. 46, No. 2. February 2003.

Thursday, July 2, 2009

Review: On Inferring Autonomous System Relationships in the Internet

The paper "On Inferring Autonomous System Relationships in the Internet" [1] explains that the Border Gateway Protocol (BGP) coordinates interdomain routing in the Internet, and it formalizes a way to infer the relationships among autonomous systems (ASes). BGP allows each AS to choose its own policy, and these policies are constrained by the commercial agreements between administrative domains. Because of this, AS relationships are an inherent part of the Internet's structure.

It also presents heuristic algorithms that infer AS relationships from BGP routing tables. The authors survey the network administrators of ASes to collect information on the actual connectivity and policies of the surveyed ASes. The survey results show that the new AS relationship inference techniques achieve high levels of accuracy.
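To give a feel for this kind of inference, here is a simplified sketch of a degree-based heuristic: in each AS path, the AS with the most neighbors is treated as the top provider, links before it are assumed to go from customer to provider, and links after it from provider to customer. It is a reduced illustration rather than the paper's full algorithm (peering inference is left out), the function name `infer_relationships` is hypothetical, and the AS numbers come from the range reserved for documentation.

```python
from collections import defaultdict


def infer_relationships(as_paths):
    """Label each adjacent AS pair seen in the paths with an inferred relationship."""
    # Phase 1: the degree of an AS is the number of distinct neighbors it has.
    neighbors = defaultdict(set)
    for path in as_paths:
        for a, b in zip(path, path[1:]):
            neighbors[a].add(b)
            neighbors[b].add(a)
    degree = {asn: len(peers) for asn, peers in neighbors.items()}

    # Phase 2: (a, b) in transit means b appears to provide transit for a.
    transit = set()
    for path in as_paths:
        if len(path) < 2:
            continue
        top = max(range(len(path)), key=lambda i: degree.get(path[i], 0))
        for i in range(len(path) - 1):
            a, b = path[i], path[i + 1]
            transit.add((a, b) if i < top else (b, a))

    # Phase 3: turn the transit evidence into a label for each ordered pair.
    relationship = {}
    for path in as_paths:
        for a, b in zip(path, path[1:]):
            uphill, downhill = (a, b) in transit, (b, a) in transit
            if uphill and downhill:
                relationship[(a, b)] = "sibling"
            elif downhill:
                relationship[(a, b)] = "provider-to-customer"
            else:
                relationship[(a, b)] = "customer-to-provider"
    return relationship


# Hypothetical routing table entries: AS 64500 connects three stub ASes.
paths = [
    [64501, 64500, 64502],
    [64503, 64500, 64501],
    [64502, 64500, 64503],
]
for pair, label in sorted(infer_relationships(paths).items()):
    print(pair, label)
```

In this toy input AS 64500 has the highest degree, so every link is labeled with 64500 as the provider.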

In general, this paper gives us a basis on which we can review and improve the existing heuristics. The authors propose that the results of the AS inference be made publicly available, which is great because they can then be used in future research.

References:
[1] Lixin Gao, "On Inferring Autonomous System Relationships in the Internet," IEEE/ACM Transactions on Networking, vol. 9, no. 6, pp. 733-744, December 2001.

Review: Interdomain Internet Routing

The paper "Interdomain Internet Routing" by Hari Balakrishnan and Nick Feamster [1] explains that a routing protocol is defined by a set of message formats for describing reachability information and preferences for network addresses, along with rules for processing these messages. Routing protocols play a vital role in networking: they ensure that information can be sent between computers connected to the ISP or network. They also make it possible to implement routing policy in a scalable manner within the network. One of the most important routing protocols is the Border Gateway Protocol (BGP), which is an interdomain routing protocol. It ties together the routers at the boundaries between ISPs, so that a user of one network can reach a resource that resides in a different network. The wide-area routing architecture is divided into autonomous systems (ASes) that exchange reachability information. BGP is used between ASes, while Interior Gateway Protocols (IGPs), which are concerned with optimizing a path metric, are used within each AS.

A strength of [1] is that it carries the discussion from an economic viewpoint into the technical details of BGP mechanisms and implementations. Many policies in an interdomain routing protocol are driven by economic objectives, which shape both the selection of routes and the exchange of reachability information.
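As a rough illustration of how economics turns into routing policy, the sketch below ranks routes by who they were learned from (customer over peer over provider, since customer routes earn money and provider routes cost money) and filters what gets advertised to each neighbor. It is a simplification under assumptions: the preference values, the route dictionaries, and the function names are made up for the example, and the real BGP decision process has more steps than the two shown here.

```python
# Illustrative local-preference values; real networks choose their own numbers.
LOCAL_PREF = {"customer": 300, "peer": 200, "provider": 100}


def best_route(routes):
    """Prefer the highest local preference, then the shortest AS path."""
    return max(routes, key=lambda r: (LOCAL_PREF[r["learned_from"]], -len(r["as_path"])))


def should_export(learned_from, neighbor):
    """Customer routes go to everyone; peer and provider routes go only to customers."""
    return learned_from == "customer" or neighbor == "customer"


# Two hypothetical routes to the same prefix (AS numbers from the documentation range).
routes = [
    {"learned_from": "provider", "as_path": [64500, 64510]},
    {"learned_from": "customer", "as_path": [64501, 64502, 64510]},
]
print(best_route(routes)["learned_from"])
print(should_export("peer", "peer"))
```

The customer route wins even though its AS path is longer, and a route learned from one peer is not passed on to another peer, because carrying that traffic would earn nothing.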

In general, [1] is an important text on the mechanism design that has been developed to understand and model the behavior of Internet Service Providers on the Internet. One of the captivating ideas about BGP is that, even though each ISP is an important component and there is little dependence between ISPs, BGP still does its job very well.

References:
[1] H. Balakrishnan and N. Feamster, "Interdomain Internet Routing," MIT Lecture Notes, January 2009.

Thursday, June 25, 2009

Review: The Design Philosophy of the DARPA Internet Protocols

The paper "The Design Philosophy of the DARPA Internet Protocols" by David D. Clark [1] explains that the current design of the protocols is the result of evolution, and that the features of TCP/IP were determined by the early goals of the Internet architecture. The fundamental goal of the DARPA Internet architecture was to develop an effective technique for multiplexed utilization of existing interconnected networks. To achieve this, its designers established a detailed set of goals for the development of the Internet architecture. These goals were ordered by priority, and that ordering shaped the design decisions within the architecture. The most important goal was continuity of communication despite the loss of networks, which led the Internet to use a datagram network design. To support multiple types of communication services, the Transmission Control Protocol (TCP) and the Internet Protocol (IP) were separated into two layers, and IP became the base protocol of the protocol stack. Interconnecting a variety of network technologies also pushed the Internet architecture toward separate TCP and IP layers. Flexibility across a number of services is achieved by making one basic assumption: that a network can transport a packet or datagram and deliver it with reasonable reliability. All features of the TCP/IP protocol stack are tailored to the goals of the early Internet.

I also understand that the Internet architecture succeeded in meeting the most important goals of its early days, but it was not designed to meet future priorities. It doesn't fully satisfy the other goals of the architecture, such as accountability and distributed management. Improvements are still needed before the protocols meet the whole set of goals effectively, and a detailed understanding of the layers of the Internet could help in addressing these limitations. I don't mean that the protocols should immediately satisfy all priorities, but rather that they should be able to adapt to meet the priorities of the future. Aside from these priorities, performance should be taken into consideration in the design phase. Because of today's enormous network systems, performance in terms of speed and reliability could determine the success or failure of any redesign.

We still rely on a version of TCP/IP today. It has proven very challenging to improve the present protocols, and nearly infeasible to change the underlying architecture of the Internet at this point. The push to switch to IPv6 is one example of the difficulties involved. Adopting it would give us far more IP addresses than the current IPv4 offers.
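The difference in address space is easy to check with Python's standard `ipaddress` module. This is only arithmetic on the address widths (32 bits for IPv4, 128 bits for IPv6); the variable names are chosen for the example.

```python
import ipaddress

# The networks 0.0.0.0/0 and ::/0 cover the whole IPv4 and IPv6 address spaces.
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses
ipv6_total = ipaddress.ip_network("::/0").num_addresses

print(ipv4_total)                # 2**32 addresses
print(ipv6_total)                # 2**128 addresses
print(ipv6_total // ipv4_total)  # IPv6 is larger by a factor of 2**96
```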

Scientists should recognize that in the ever-changing environment of the Internet, priorities often change. What matters is not the ability to immediately satisfy all priorities, but rather the ability to adapt to meet the priorities of the future. It was a large step to move from networks that had traditionally been circuit switched to the packet-switched Internet. The use of datagrams gives us more flexibility in dealing with dissimilar systems than continuous streams would allow. It is also important to realize that sessions can still be used in the datagram model by creating a virtual circuit. The issue of survivability, which was once so important to the military, is now much less of a concern because of the extreme redundancy built into the topology of the Internet today. Much more important today is the issue of performance. Performance was not a large issue when the Internet was created, but it could now be considered the most important one. TCP/IP has adapted to fit present-day needs, but the original ordering of priorities makes it evident that the protocol was not designed for the present-day Internet.

In general, the paper [1] was good material for helping me understand the design philosophy of the Internet.

Reference:
[1] D. D. Clark, "The design philosophy of the DARPA Internet protocols," ACM SIGCOMM Computer Communication Review, vol. 18, issue 4, August 1988.

Review: End-to-End Arguments in System Design

The paper "End-to-End Arguments in System Design" by J. H. Saltzer, D. P. Reed and D. D. Clark [1] presents a design principle called the end-to-end argument, which proposes that functions usually implemented in the low-level modules of a distributed system should be moved out of those modules. Instead, it recommends placing these functions in the higher-level modules, closer to the application that uses them, in order to provide the essential functionality. It also states that when a low-level module implements such a function, there are cases where it cannot perform the function in the way the higher-level application requires. On the other hand, there are cases where low-level functions provide better efficiency and performance. Thus, this principle should not be treated as an absolute, but as a valuable guide for placing functionality in a communication system.
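A file transfer makes the argument concrete: even if every link checks its own packets, only a check performed by the endpoints on the whole file can confirm that the transfer succeeded. The sketch below is a minimal illustration under assumptions: `careful_transfer` and `make_flaky_channel` are hypothetical names, the "network" is a function call, and the sender's checksum is assumed to reach the receiver intact.

```python
import hashlib


def checksum(data):
    """Checksum computed by the application over the whole file."""
    return hashlib.sha256(data).hexdigest()


def careful_transfer(data, send, max_attempts=3):
    """Send data, verify it end to end, and retry if the check fails."""
    expected = checksum(data)
    for attempt in range(1, max_attempts + 1):
        received = send(data)
        if checksum(received) == expected:
            return received, attempt
    raise IOError("transfer failed the end-to-end check")


def make_flaky_channel():
    """Hypothetical channel that corrupts the last byte of the first transmission."""
    calls = {"count": 0}

    def send(data):
        calls["count"] += 1
        if calls["count"] == 1:
            return data[:-1] + bytes([data[-1] ^ 0xFF])
        return data

    return send


received, attempts = careful_transfer(b"hello world", make_flaky_channel())
print(received, attempts)
```

The corruption on the first attempt is caught by the application-level check and repaired by a retry, regardless of what the lower layers did or did not verify. Lower-level checks can still be worth having as a performance optimization, which is the balance the paper argues for.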

The end-to-end arguments are a set of design principles that have guided how the Internet was designed [2]. But as the Internet has expanded from individual to commercial use, there is growing pressure to rethink its original design principles. A number of new requirements have emerged for the Internet and its applications, and these may be fulfilled by adding new mechanisms to the core of the network.

While multiple forces seem to support changes to the Internet's mechanisms that may be inconsistent with the end-to-end arguments, we can't deny the contribution of these arguments to preserving the flexibility and openness of the Internet. They also allow new applications to be developed. End-to-end implementations are supported by the need to ensure appropriate service and to facilitate network transparency and decentralization. That said, placing certain functionality in lower-level modules is appropriate for certain services and can improve their performance.

References:
[1] J. H. Saltzer, D. P. Reed and D. D. Clark, "End-to-end arguments in system design," ACM Transactions on Computer Systems, vol. 2, no. 4, pp. 277-288, November 1984.
[2] M. S. Blumenthal and D. D. Clark, "Rethinking the design of the Internet: The end to end arguments vs. the brave new world," ACM Transactions on Internet Technology, vol. 1, no. 1, pp. 70-109, August 2001.