BBoard-ID: 7621
BB-Posted: Tue, 25 Oct 88 2:06:08 EDT
To: tcp-ip@sri-nic.ARPA
Subject: 4BSD TCP Ethernet Throughput
Date: Mon, 24 Oct 88 13:33:13 PDT
From: Van Jacobson

Many people have asked for the Ethernet throughput data I showed at
Interop so it's probably easier to post it:

These are some throughput results for an experimental version of the
4BSD (Berkeley Unix) network code running on a couple of different
MC68020-based systems: Sun 3/60s (20MHz 68020 with AMD LANCE Ethernet
chip) and Sun 3/280s (25MHz 68020 with Intel 82586 Ethernet chip)
[note again the tests were done with Sun hardware but not Sun
software -- I'm running 4.?BSD, not Sun OS]. There are lots and lots
of interesting things in the data but the one thing that seems to
have attracted people's attention is the big difference in
performance between the two Ethernet chips.

The test measured task-to-task data throughput over a TCP connection
from a source (e.g., chargen) to a sink (e.g., discard). The tests
were done between 2am and 6am on a fairly quiet Ethernet (~100Kb/s
average background traffic). The packets were all maximum size (1538
bytes on the wire or 1460 bytes of user data per packet). The free
parameters for the tests were the sender and receiver socket buffer
sizes (which control the amount of 'pipelining' possible between the
sender, wire and receiver). Each buffer size was independently varied
from 1 to 17 packets in 1 packet steps. Four tests were done at each
of the 289 combinations. Each test transferred 8MB of data then
recorded the total time for the transfer and the send and receive
socket buffer sizes (8MB was chosen so that the worst case error due
to the system clock resolution was ~.1% -- 10ms in 10sec). The 1,156
tests per machine pair were done in random order to prevent any
effects from fixed patterns of resource allocation.
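
[A minimal sketch of the test schedule described above: 17 x 17 buffer
size combinations, four tests each, run in random order. This is an
illustration only, not the harness actually used for the measurements.
The names `build_schedule`, `BUFFER_SIZES` and `REPEATS`, and the fixed
seed, are hypothetical.]

```python
import random

BUFFER_SIZES = range(1, 18)  # 1..17 packets, in 1-packet steps
REPEATS = 4                  # tests per (sender, receiver) combination


def build_schedule(seed=None):
    """Return a shuffled list of (send_pkts, recv_pkts) test cases.

    Every sender/receiver buffer combination appears REPEATS times;
    shuffling avoids running the combinations in a fixed pattern.
    The seed parameter is an assumption added for reproducibility.
    """
    rng = random.Random(seed)
    cases = [
        (send_pkts, recv_pkts)
        for send_pkts in BUFFER_SIZES
        for recv_pkts in BUFFER_SIZES
        for _ in range(REPEATS)
    ]
    rng.shuffle(cases)
    return cases


schedule = build_schedule(seed=1988)
print(len(set(schedule)))  # 289 distinct combinations
print(len(schedule))       # 1156 tests per machine pair
```

[Each entry would drive one 8MB transfer with the given send and
receive socket buffer sizes, expressed here in packets.]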
In general, the maximum throughput was observed when the sender
buffer equaled the receiver buffer (the reason why is complicated but
has to do with collisions). The following table gives the
task-to-task data throughput (in KBytes/sec) and throughput on the
wire (in MBits/sec) for (a) a 3/60 sending to a 3/60 and (b) a 3/280
sending to a 3/60.

 ________________________________________________
|              3/60 to 3/60   |  3/280 to 3/60   |
|            (LANCE to LANCE) | (Intel to LANCE) |
|  socket                     |                  |
|  buffer     task to         |  task to         |
|   size       task    wire   |   task    wire   |
|(packets)    (KB/s)  (Mb/s)  |  (KB/s)  (Mb/s)  |
|    1          384    3.4    |    337    3.0    |
|    2          606    5.4    |    575    5.1    |
|    3          690    6.1    |    595    5.3    |
|    4          784    6.9    |    709    6.3    |
|    5          866    7.7    |    712    6.3    |
|    6          904    8.0    |    708    6.3    |
|    7          946    8.4    |    710    6.3    |
|    8          954    8.4    |    718    6.4    |
|    9          974    8.6    |    715    6.3    |
|   10          983    8.7    |    712    6.3    |
|   11          995    8.8    |    714    6.3    |
|   12         1001    8.9    |    715    6.3    |
|_____________________________|__________________|

The theoretical maximum data throughput, after you take into account
all the protocol overheads, is 1,104 KB/s (this task-to-task data
rate would put 10Mb/s on the wire). You can see that the 3/60s get
91% of the theoretical max. The 3/280, although a much faster
processor (the CPU performance is really dominated by the speed of
the memory system, not the processor clock rate, and the memory
system in the 3/280 is almost twice the speed of the 3/60), gets only
65% of theoretical max.

The low throughput of the 3/280 seems to be entirely due to the Intel
Ethernet chip: at around 6Mb/s, it saturates. (I put the board on an
extender and watched the bus handshake lines on the 82586 to see if
the chip or the Sun interface logic was pooping out. It was the chip
-- it just stopped asking for data.) (The CPU was loafing along with
at least 35% idle time during all these tests so it wasn't the
limit.)

[Just so you don't get confused: Stuff above was measurements. Stuff
below includes opinions and interpretation and should be viewed with
appropriate suspicion.]

If you graph the above, you'll see a large notch in the Intel data at
3 packets. This is probably a clue to why it's dying: TCP delivers
one ack for every two data packets. At a buffer size of three
packets, the collision rate increases dramatically since the sender's
third packet will collide with the receiver's ack for the previous
two packets (for buffer sizes of 1 and 2, there are effectively no
collisions). My suspicion is that the Intel is taking a long time to
recover from collisions (remember that you're 64 bytes into the
packet when you find out you've collided so the chip bus logic has to
back up 64 bytes -- Intel spent their silicon making the chip
"programmable", I doubt they invested as much as AMD in the bus
interface). This may or may not be what's going on: life is too short
to spend debugging Intel parts so I really don't care to investigate
further.

The one annoyance in all this is that Sun puts the fast Ethernet chip
(the AMD LANCE) in their slow machines (3/50s and 3/60s) and the slow
Ethernet chip (Intel 82586) in their fast machines (3/180s, 3/280s
and Sun-4s, i.e., all their file servers). [I've had to put delay
loops in the Ethernet driver on the 3/50s and 3/60s to slow them down
enough for the 3/280 server to keep up.]

Sun's not to blame for anything here: It costs a lot to design a new
Ethernet interface; they had a design for the 3/180 board set (which
was the basis of all the other VME machines--the [34]/280 and
[34]/110); and no market pressure to change it. If they hadn't
ventured out in a new direction with the 3/[56]0 -- the LANCE -- I
probably would have thought 700KB/s was great Ethernet throughput (at
least until I saw Dave Boggs' DEC-Titan/Seeq-chip throughput data).
But I think Sun is overdue in offering a high-performance VME
Ethernet interface.
That may change though -- VME controllers like the Interphase 4207
Eagle are starting to appear which should either put pressure on Sun
and/or offer a high performance 3rd party alternative (I haven't
actually tried an Eagle yet but from the documentation it looks like
they did a lot of things right). I'd sure like to take the delay
loops out of my LANCE driver...

 - Van

ps: I have data for Intel-to-Intel and LANCE-to-Intel as well as the
Intel-to-LANCE I listed above. Using an Intel chip on the receiver,
the results are MUCH worse -- 420KB/s max. I chose the data that put
the 82586 in its very best light.

I also have scope pictures taken at the transceivers during all these
tests. I'm sure there'll be a chorus of "so-and-so violates the
Ethernet spec" but that's a lie -- NONE OF THESE CHIPS OR SYSTEMS
VIOLATED THE ETHERNET SPEC IN ANY WAY, SHAPE OR FORM. I looked very
carefully for violations and have the pictures to prove there were
none.

Finally, all of the above is Copyright (c) 1988 by Van Jacobson. If
you want to reproduce any part of it in print, you damn well better
ask me first -- I'm getting tired of being misquoted in trade rags.