Date: Wed, 30 Nov 88 13:37:42 PST
From: braden
Message-Id: <8811302137.AA02855@braden.isi.edu>
To: end2end-interest
Subject: Still more Minutes of the E2E TF

END-TO-END TASK FORCE
Minutes of the 12th Meeting
November 3-4 at MIT LCS

MINUTES TAKEN BY: Bill Nowicki

1. INTRODUCTION

The meeting was held in the Laboratory for Computer Science building on the MIT Campus. The major excitement was the discovery of a virus (more properly, a "worm") that had infected not only MIT, but thousands of machines throughout the Internet. Through Dave Clark we received blow-by-blow descriptions of the progress of its eradication.

Present:                         Unable to Attend:

  Bob Braden (ISI, Chair)          Lorenzo Aguilar (SRI)
  Eric Cooper (CMU)                Gerd Beling (FGAN)
  Dave Cheriton (Stanford)         Jon Crowcroft (UCL)
  Dave Clark (MIT)                 Steve Deering (Stanford)
  Joel Emer (DEC/MIT)              Van Jacobson (LBL)
  Bill Nowicki (Sun)
  Craig Partridge (BBN)

2. STATUS REPORTS

Bill Nowicki - Sun

The manager and some of the principal designers of NFS have started a new company (Legato Systems), but it is not known exactly what they are planning to do. NFS v3 is back to the original plan; copies of the spec were circulated to the NFS vendor group, and it is available on request. Both v3 and NFS v2 should be put out in RFC format.

Eric Cooper asked whether anyone was thinking about extending or changing the socket interface to support transactions. The answer appeared to be "no."

Sun is developing the TCP/IP software for System V release 4.0, which will use the AT&T streams framework rather than the BSD sockets framework, but will emulate the socket interface to user programs.
Bill has identified some laboratories for doing network performance testing, but the actual testing still needs to be done. Someone has contracted with Sun to port a ProNet-80 driver to the Sun-4/260, which should provide an interesting test-bed for extending network performance beyond Ethernet speeds.

Bill noted that the etherfind command in SunOS 4.0 has most of the capabilities of Van's tcpdump. Bill is looking at recasting the program into a more general form and adding a capability of saving packets in a file. Cooper suggested an RFC be written to describe a standard packet trace file format, to allow us to develop more modular trace analysis tools and to exchange traces. ACTION: Nowicki.

A proposed paper on datagram transport issues (discussing NFS, Domain Name Service, and SNMP) was discouraged, because of philosophical conflicts with the members of the task force.

Eric Cooper - CMU

Eric is spending 100% time on the CMU Nectar project, building a high-performance host interface that is generally similar to Dave Clark's proposed architecture. They are building an electronic/fiber-optic crossbar switch for 100 Mbps point-to-point. VME boards are being debugged, with a SPARC processor, some fast static RAM, and the AMD fiber-optic chip set. This technology could also be applied to FDDI ring interfaces, or to DS3 speed packet switches.

Eric has had a paper accepted for ASPLOS [Reference??]

Some version of MACH has Van's TCP improvements. [Bill Nowicki mentioned that Sun installed Van's header prediction but saw no performance improvement.] An implementation of NFS for Mach was obtained from NeXT and incorporated into the CMU Mach release.

There is some interest in using Multicast for replicated file systems, but it is currently not being stressed.

Craig Partridge - BBN (+Harvard, CSNET, etc.)

BBN's multicast funding went away in August. The code was sent to Stanford for release.
David Waitzman's RFC on his distance-vector-based multicast routing has been submitted to Postel for publication [see bibliography below].

For the experimental implementation, it was necessary to "tunnel" multicast datagrams through gateways that did not understand the class D addresses. This was done by using a Loose Source Route IP option, with the class D address hidden inside the option. This was necessary because "most" gateways discard packets containing unknown option types (!)

BBNCC is close to completing an enhanced SPF routing algorithm in the Butterfly gateways, to handle multicast routing. Craig feels that an SPF base is much better than a distance-vector base for multicast routing. There was a question whether this can be published, since the work was done at the request of a three-letter U.S. Government Agency. ACTION: Braden and Partridge, ask Hinden about publication.

It would be very desirable to get multicast routing into OSPFIGP. ACTION: Partridge -- get Alan Dahlbom (?) to give a paper on SPF multicast routing at next IETF meeting.

Craig would like to work on performance of data representation layers, and on transient IP address assignment.

Dave Cheriton - Stanford

Steve Deering was unable to attend since he is working furiously on his thesis.

Cheriton plans a release of the Unix (4.3BSD and SunOS 4.0) code for multicasting and VMTP on November 15, 1988. The multicasting code will include the BBN code. He is trying to resolve Stanford licensing issues on this release; there is hope that it can be easier (e.g. anonymous FTP). This release will include a major cleanup of VMTP code and many simplifications by the use of recursion (as described at previous meetings). There is only one new system call added to Unix, but no security or streaming.

A graduate student is looking at VMTP performance improvements using "header prediction", including "response prediction."
He is also looking at caching the mapping of process IDs to IP addresses; obtaining these mappings requires multicasting.

Some measurements have shown multicast traffic to be about 2% of the received packets, up from 1% in previous measurements, but still a very small part of the total traffic.

A "pigeon mode" (random (packet) droppings) in VMTP has helped to make the implementation more robust. VMTP is using Van's RTT estimator to set the packet rate. The group discussed whether it is important or meaningful to distinguish between read and write operation times; the consensus was "no."

There was some discussion about a flashy demo of multicasting, such as a nation-wide mazewar game. ACTION: Cheriton, Clark.

Finally, Dave is trying to complete an FDDI NAB board.

Joel Emer - DEC/MIT

Joel's work at Mercury is being wrapped up. He distributed two papers (see bibliography). One interesting demonstration of this mechanism was an Emacs with RPC in it, that could query a database of New York Times and Associated Press wire service articles.

Ultrix 3.0 will have the XTCP with Van's congestion control, and should be shipping Real Soon Now.

DEC is serious about OSF (Open Software Foundation). OSF is looking for a standard RPC package, and is looking seriously at Apollo's NCS. Partridge reminded us that Apollo's data description layer has the N X M problem.

Dave Clark - MIT

Dave distributed copies of a paper "An Analysis of TCP Processing Overhead" (see bibliography). This paper discusses the limits of TCP performance, and concludes that memory speeds become a bottleneck rather than TCP protocol overhead. On a 10 MIPS processor, it was estimated that TCP protocol processing could run at 800 Mbps, while most memory systems will require 4 cycles per byte, resulting in about 32 Mbps maximum. Dave would like to verify these figures experimentally (!)

A new professor, Tennenhouse, has been hired from Cambridge; he is interested in practical network problems.
He is discussing collaboration with Bellcore, building a fibre net using Bellcore's Batcher-Banyan switching fabrics.

Using the MIT simulation package, experiments have been done on adaptive NETBLT rate control algorithms, similar to the slow-start and congestion control policies of Jacobson for TCP. Although the rates can be made to converge, fairness is not necessarily guaranteed without putting state into gateways.

Lixia Zhang is working on the problem of measuring flows in gateways to see if they meet contracted rates; she sees a sharp "phase transition" at 81%-82% load.

The host multicast code has been installed in the MIT Unix kernels, but it has not been stressed.

Clark wants to have some private T1 lines across the country to experiment with real gateways. Cheriton indicated interest in collaboration on experiments. The MIT gateway code can handle about 1200 packets per second on a MicroVAX, and Dave Boggs is working on a simple T1 card. The MIT gateways will have SNMP and OSPFIGP, so they could provide a suitable testbed.

3. DISCUSSION

3.1 "CONNECTION-ORIENTED IP"

There have been a number of proposals for adding state at the IP level; terms which have been used include "soft state", "streams", and "connection-oriented IP". Bob Braden raised this topic to determine whether there are research topics in this area that ought to be of concern to the End-to-End Task Force. Is it something that can and should be addressed only in the context of high-speed networking, or is there a separable set of issues?

This issue is mixed up with ST, which apparently provides a valuable service that IP cannot. Dave Clark gave the background on the ST issue. ST provides the multi-media (packetized video and voice) transport for teleconferencing. As an expedient and with the agreement of the IAB, BBN put the ST protocol into the Butterfly gateways, but now "some people love it," and the IETF has started a working group on connection-oriented IP.
It was noted that an important player in this effort, Ross Callon, favors connection-oriented protocols as a general model.

Cheriton claimed that it only makes sense to look at totally new architectures, while Braden wanted to investigate intermediate approaches, including modifications to current IP architecture.

Cheriton: given that the real problem is how to provide the different types of service that real-time voice and video require, one solution would be just to increase the raw data rates; "there is no magic".

Clark: since offered load will rise to fill any capacity, control of that offered load is essential.

Partridge: the Arpanet was over-controlled in some ways, and too much overhead was wasted sending the control information ("thrashing").

Clark: when switches are the bottlenecks, then of course it is important to keep the control overhead very small.

Cheriton: need more memory in switches.

Clark: real resource bottlenecks are either switch (CPU) or link (bandwidth) capacity, not memory (buffers).

Clark: we need a model of how to control resource overloads in the network. One extreme is circuit switching, a very long term allocation of resources. Another approach would be to put all the state in the packet (e.g. source routing or Batcher-Banyan bits), but then the gateways have no control.

Cheriton: has a graduate student looking at rate-based hop-by-hop flow control in gateways.

There was some discussion on the difference between "soft state" and "caches": "soft state" may be visible to the higher layers, even if it can be recreated after a crash, while caches must be totally transparent without changing the architecture.

Clark: soft state is part of the architecture, while caches are "just" performance improvement; however, caches will become important. His measurements show that IP-level processing in current gateways is only 10% of total, so caches are not expected to help much today.
However, as we add additional complexity to gateway processing for accounting, policy-based routing, QOS routing and queueing, etc., caches will become vital for performance. We need more data on locality to figure out where to put caches. He notes that OMB Circular 3, which mandates usage accounting for all shared telecommunications systems used by more than one government agency, is a real requirement.

The topic of gateway state was structured somewhat into three areas:

   * Caching for gateway performance

   * History-based allocation to support policy-based routing

   * Multiple QOS, e.g., controlled latency for video, speech.

The second item led to another discussion of fair queueing vs. random dropping. More people are interested in random dropping, especially as the range of compute powers on the network (Cray-2 vs. PC) broadens.

Clark: does not see any driving force behind ST capabilities in IP. Braden suggested vendors and users will want packet video on a large scale, and that if we don't do something, there will be a proliferation of ST-like protocols from vendors, especially for use on isolated corporate internets. Clark claimed that real-time video across country was more interesting than down the hall, but Cooper said that higher speed services are always available locally first. There was some feeling that IETF would be the right place to address this issue, since it is specific to video.

Finally, the discussion focussed on the possible role for the End-to-End Task Force. As a general model, engineering of the current Internet architecture is consigned to the IETF, while other task forces are supposed to be taking longer-range views. In practice, of course, task force boundaries are fuzzy. Braden pointed out that the End-to-End Task Force early took on IP multicasting, which noticeably stretched its host-centric charter.
He suggested that if there were areas which are long-term and researchy in character and which are of interest to the current membership, we should consider taking them on. On the other hand, tackling new topics may suggest new members for the Task Force.

Clark offered the following list of important topics for new work:

   1. high-performance host interfaces
   2. multiple types of service
   3. congestion control
   4. accounting
   5. administration
   6. security
   7. addressing

It was observed that the TF already has 3 members actively working on high performance host interfaces, so we OUGHT to take (1) on officially. There was some feeling that the End-to-End TF should consider officially adopting (2) and (3) as well, and consider additional members who would contribute to these topics. The other topics seem to fall clearly into the bailiwick of another TF.

Since there were many absentees at this meeting, the chair postponed a final decision on these issues.

3.2 NATIONAL FILE SYSTEM

The CMU ITC is promoting the use of the Andrew File System across domains (called cells) as a national file system. There was a recent conference at CMU discussing this, and the main issue was concern about IBM's licensing policy. Andrew has its own RPC protocol, with actual file transfer as a side-effect. A new protocol called RX was invented, but not much was known about it. Drew Perkins was a contact. ACTION: Cooper.

3.3 IP TIME-TO-LIVE

Braden presented a set of foils on the problems with TTL as it is currently defined in IP and used by TCP to bound segment lifetime. There is an immediate issue of what the Host Requirements RFC should say about it, and there is a general transport protocol issue of how to bound segment lifetime to provide reliable delivery.

TTL is used for two different purposes:

   1. Transport layer life-time limit

      A. bounding segment lifetime to prevent wrap
         2**32 wrap: DS1 ~ 1 Day, FDDI ~ 6 minutes

      B. prevent confusion with packet from previous connection

   2. Internet layer routing loop prevention

The specification links reassembly timeouts to TTL on incoming fragments, but on long-delay connections you want LONGER reassembly timeouts, not shorter ones.

In general, maintaining TTL as a timer is an awkward issue for gateways since it is a "layer violation" -- once the IP layer has put a packet into the output driver's data-link or physical layer queue, the device driver usually knows nothing about the IP layer itself.

Some possible fixes were discussed:

   A. Use as a hop count, hope nobody sends 2**32 bytes.

   B. Expand sequence number space beyond 2**32. There was zero enthusiasm for this solution. Kludgy way: a session layer that opens a new TCP connection (with different port pair) if the first threatens to wrap around.

   C. Engineer gateways to bound delays. Again, zero enthusiasm, although Braden suggested that this could be the real-world answer to the problem.

   D. Expiration time in each segment. Cheriton: this is the architecturally-correct solution, so that bounded segment lifetime will be an end-to-end transport-level property. Note: this requires synchronized clocks in gateways and hosts. Someone suggested milliseconds, but seconds might be good enough. Also probabilistic synchronization might be good enough, especially if higher levels would prevent replays (e.g., by authentication).

Clark: if we did a new architecture, how much easier would it be if we assume synchronized clocks? Suspects quite a bit.

Cooper: can eliminate three-way handshakes; transactions are natural limit; replicated database updates easier.

Clark: mechanism must provide well-defined graceful recovery (e.g., simply lose efficiency, not reliability) if synchronization is lost.

Braden: but what should be in Host Requirements RFC?

The Internet Architect went into deep-hum mode, and then offered the following proposal, which was agreed to:

   (1) Think carefully about TCP exposures if TTL is not available to bound segment lifetime. ACTION: Braden.
   (2) Assuming (1) does not upset us too much, declare TTL is a hop count.

   (3) The TF to think about synchronized clocks, etc.

3.4 TOOL DEMOS

Tim Shepard (who was famous after appearing on the front page of the Boston Globe that morning, along with somebody else also named Tim Shepard who died) gave some demos of the tool that takes packet traces and draws modified "Jacobson diagrams". Currently the tool produces only hard copy graphs, but an interactive version is being developed. He showed us a number of bizarre examples.

Andrew Heybey gave some demonstrations of the network simulator running on X11 on a Color VAXstation. He is working on simulating rate control.

Eman Hashem was also doing simulations to detect oscillatory phenomena, varying retransmission algorithms and other parameters. Slow start works well with one or a few connections, but there are more interesting interactions between many simultaneous connections, including a remarkably constant sinusoidal oscillation with a 5-second period (Van should see the graph!)

3.5 MULTICASTING

Several people were interested in multicast transport protocols. There is commercial interest in the news service companies to use efficient multicast to distribute their information.

Braden wanted to understand the status of incorporation of multicast support into a Berkeley release, but the question had to be tabled since Van was absent. ACTION: Jacobson.

3.6 VMTP

Cheriton asked if VMTP should be proposed as a draft Internet standard, and if not, why not, and if so, how soon? Several issues were raised.

   * Too complex, too many features. Cooper: e.g., wants to be able to replace the security parts but use the rest. Cheriton replied that since the recursive design (as described in the RFC of February 1988) is now used, there are only two packet types on the wire, and other functions like security are built on top of these, so this issue should not stand in the way.
   * Rate-based Flow Control not tested over Internet. Braden: could test VMTP between Stanford and ISI over Arpanet, WideBand Net, and the Los Nettos - BARRnet connection. ACTION: Braden, Cheriton.

   * No independent implementation. It was pointed out that there are already two implementations (V-System and Unix), and a third ought not to be required for a DIS status, although it might be required before final standard status.

Cheriton discussed the proposal to use the 32-bit bitmask used in selective retransmission with 1024 bytes per bit instead of 512. Since Arpanet and SATnet are going away, only MILnet was of concern. Clark: could avoid the pain by a different ACK model: return left edge plus mask, where mask names first 32 segments from left window edge. This would allow the protocol to send an indefinite amount of data. Cheriton was not enthused. The group generally thought that the 512/1024 choice should not make a big difference.

Nowicki: an interesting bake-off would be between an implementation of NFS using VMTP transport, and the current experimental UDP-based NFS. It was estimated to be about one person-month of work for a "competent kernel hacker".

Authentication without encryption is not available in VMTP. According to Clark there are laws that specify that in many international transactions, data must NOT be encrypted. The lack of encryption hardware, and concern for integrity of data only and not disclosure (e.g. world-read databases), suggest that this compromise would be useful. Cheriton disagrees and wants the total solution. Steve Kent at BBN is the contact for the Privacy Task Force, and there is a working group formed in IETF to add security to SNMP.

4. NEXT MEETING

A video conference was tentatively scheduled for Friday February 24. This will be a three way meeting between ISI, BBN, and SRI.

The next face-to-face meeting will be sponsored by ISI, tentatively June 7-8. Several suggestions were offered for venues, including a beach house at Malibu.
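
Returning briefly to the selective-retransmission discussion in section 3.6: Clark's alternative ACK model (return the left window edge plus a 32-bit mask covering the first 32 segments from that edge) can be sketched roughly as below. This is only an illustration of the idea as minuted, not VMTP's wire format. The segment numbering from zero, the bit order (bit i stands for segment left_edge + i), the convention that a set bit means "received", and both function names are assumptions made for the example.

```python
MASK_BITS = 32  # width of the mask discussed in the minutes


def build_ack(received, mask_bits=MASK_BITS):
    """Receiver side: return (left_edge, mask) for a set of received segment numbers.

    left_edge is the lowest segment number not yet received (assumed convention),
    so bit 0 of the mask is always clear.
    """
    left_edge = 0
    while left_edge in received:
        left_edge += 1
    mask = 0
    for i in range(mask_bits):
        if left_edge + i in received:
            mask |= 1 << i
    return left_edge, mask


def segments_to_resend(left_edge, mask, highest_sent, mask_bits=MASK_BITS):
    """Sender side: segments inside the mask window that the ACK reports missing."""
    last = min(highest_sent, left_edge + mask_bits - 1)
    return [left_edge + i
            for i in range(last - left_edge + 1)
            if not (mask >> i) & 1]


if __name__ == "__main__":
    left_edge, mask = build_ack({0, 1, 2, 4, 5, 7})
    print(left_edge, bin(mask), segments_to_resend(left_edge, mask, highest_sent=8))
```

Because the mask is interpreted relative to a left edge that advances as data is acknowledged, the total amount of data sent is not capped by the mask width, which is the property Clark pointed to.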
5. ACTION ITEMS

Nowicki
   Try again to get high resolution timer information out to the people who need it. The person to contact is Chris Rigatuso, rifrig@Sun.COM. Please send him email and Cc me so that we can narrow down how many we are talking about. I think it will be one or two.

Nowicki
   Propose a standard packet trace file format, after discussing it with other tracers in the research community.

Cooper, Cheriton
   Produce reference on TOCS paper on timestamps for packet lifetime limitation.

Braden, Partridge
   Ask Hinden about the availability of the documentation on the SPF multicast routing implementation done for the Butterfly gateways.

Partridge
   Get Alan Dahlbom (?) to give paper on SPF multicast routing at next IETF meeting.

Cheriton
   Inform Task Force of progress on Stanford release conditions.

Cheriton
   Send a message when VMTP & multicast release is available.

Braden, Cheriton
   Set up some Internet VMTP tests at ISI.

Cooper
   Get new VMTP version installed into MACH and test.

Cooper
   Obtain documents for RX protocol.

Cheriton, Clark
   Develop multicast demo, such as mazewar.

Braden
   Write up resolved TCP TTL issues for the Host Requirements RFC.

6. BIBLIOGRAPHY

Clark, D., Romkey, J., and Salwen, H., "An Analysis of TCP Processing Overhead", 13th Local Area Network conference ??

Emer, J., "Mercury Communication Subsystem", Common System Design Note 27, MIT Laboratory for Computer Science, August 29, 1988.

Emer, J. and Weihl, W., "Integrated Interactive Access to Heterogeneous Distributed Services", ???

Feldmeier, D., "Estimated Performance of a Gateway Routing Cache", TM-352, MIT Laboratory for Computer Science, March 1988.

Waitzman, D., Partridge, C., and Deering, S., "Distance Vector Multicast Routing Protocol", RFC-1075, November 1988.
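
NOTE ON SECTION 3.3 (sequence-space wrap times)

The wrap times quoted in section 3.3 can be checked with simple arithmetic: the time to send 2**32 bytes at a given line rate. The sketch below is only a back-of-the-envelope check. The nominal line rates (1.544 Mbps for DS1, 100 Mbps for FDDI) and the function name are assumptions supplied for the example, and all framing and protocol overhead is ignored.

```python
SEQ_SPACE_BYTES = 2 ** 32  # size of the 32-bit sequence space, in bytes

# Nominal line rates in bits per second. These values are assumptions
# supplied for illustration; the minutes name the links but not the rates.
LINE_RATES_BPS = {
    "DS1": 1.544e6,
    "FDDI": 100e6,
}


def wrap_time_seconds(rate_bps, seq_space_bytes=SEQ_SPACE_BYTES):
    """Seconds to send seq_space_bytes at rate_bps, ignoring all overhead."""
    return seq_space_bytes * 8 / rate_bps


if __name__ == "__main__":
    for name, rate in LINE_RATES_BPS.items():
        seconds = wrap_time_seconds(rate)
        print(f"{name}: {seconds / 60:.1f} minutes ({seconds / 3600:.2f} hours)")
```

Under these assumptions FDDI comes out just under six minutes, in line with the "~ 6 minutes" in the minutes. A DS1 link driven continuously at full rate comes out at roughly six hours, so the "~ 1 Day" figure is best read as an order-of-magnitude estimate, perhaps for a link that is not saturated; that reading is a guess, not something the minutes state.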