From van@lbl-csam.arpa Sun Jan 31 00:03:16 1988
Posted-Date: Sun, 31 Jan 88 00:01:12 PST
Received-Date: Sun, 31 Jan 88 00:03:16 PST
Received: from LBL-CSAM.ARPA by venera.isi.edu (5.54/5.51)
	id AA19754; Sun, 31 Jan 88 00:03:16 PST
Received: by lbl-csam.arpa (5.58/1.18)
	id AA11040; Sun, 31 Jan 88 00:01:14 PST
Message-Id: <8801310801.AA11040@lbl-csam.arpa>
To: Jon Crowcroft
Cc: end2end-interest@venera.isi.edu
Subject: Re: measurements
In-Reply-To: Your message of Fri, 29 Jan 88 15:59:31 GMT.
Date: Sun, 31 Jan 88 00:01:12 PST
From: Van Jacobson
Status: R

Jon -

Your latest measurements are absolutely fascinating.  I was (barely) able
to ftp the goony & cnuce tests (I sure hope purple isn't running the new
tcp -- if it is, looks like I've got a lot of tuning left to do).  I think
I understand most of what's there except for a clock mystery on the goony
test.

The clock mystery is that the ttcp output in your message said the goony
test took 693 seconds for a throughput of 2.88 kBps, while the tcpdump
output says the test took 770 sec for a throughput of 2.65 kBps.  I think
there's evidence in the trace for the 2.65 kBps rate so I'll make a wild
guess that it's the correct number (but I don't understand why the ttcp &
tcpdump output agree exactly for the cnuce test and disagree by 10% on the
goony test.  Also, the relative offset between ego's clock and the clock
of whatever was running tcpdump changed between the two tests: the clocks
read 16:07 & 16:01 at the start of the goony test, then 17:51 & 18:18 at
the start of the cnuce test.  Are you running timed or something else that
could change the clock while a test was running?  I did check that the
tcpdump clock was monotone with no jump discontinuities, but it's not
possible to check the ttcp machine since it printed only two numbers.)

(BTW, my real reason for believing the 2.65 kBps rate is that I have a
model that predicted you would get 2.667 kBps throughput for an echo test
under ideal conditions.)

I think the situation is better than you stated for the goony test, even
if we believe the lower rate.  With a user data size (mtu) of 420 bytes,
there should be 488 bytes going through the simp for each data packet
(2 frags, one w/ a 20 byte tcp hdr, both w/ a 20 byte ip hdr & a 4 byte
butterfly hdr: 420 + 20 + 2*(20 + 4) = 488).  There's also an ack for
every 2 data packets that contributes another 44 bytes (20 tcp, 20 ip,
4 butterfly).  If we charge each data packet for 1/2 an ack, the simp sees
488 + 44/2 = 510 bytes for every 420 bytes of user data.  In other words,
the total unidirectional throughput through the simp is 21% higher than
the data throughput: 2.65 kBps * 8 = 21 kbps of user data, which times
1.21 gives about 26 kbps through the simp.  Counting both directions of
the echo test, the simp should have been handling around 52 kbps (which
doesn't seem too shabby).
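In case that bookkeeping is easier to fiddle with than to read, here it is
as a throwaway awk sketch (the name overhead.awk is just a label, and every
number in it is one of the sizes assumed above rather than anything
measured from the trace; it prints 25.7 & 51.5 kbps, which round to the
26 & 52 quoted above):

# overhead.awk -- the per-packet overhead arithmetic spelled out;
# run it with "awk -f overhead.awk /dev/null" (it reads no input).
BEGIN {
	mtu = 420			# bytes of user data per send
	tcp = 20; ip = 20; bfly = 4	# header sizes assumed above
	# each send becomes 2 frags, one carrying the tcp header,
	# both carrying an ip header & a butterfly header
	data = mtu + tcp + 2*(ip + bfly)	# 488 bytes through the simp
	ack = tcp + ip + bfly			# 44 byte ack, one per 2 sends
	perpkt = data + ack/2			# charge each send half an ack
	overhead = perpkt/mtu			# ~1.21
	printf "simp bytes per send %g (overhead %.0f%%)\n", perpkt, 100*(overhead-1)
	kbps = 2.65 * 8 * overhead		# simp rate implied by 2.65 kBps of data
	printf "one way %.1f kbps, both ways %.1f kbps\n", kbps, 2*kbps
}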
Since your trace included all the outbound traffic, we can cross check
this.  I put together an awk script to compute total bits sent to the simp
vs. time (the script is at the end of this message).  A graph of its
output shows a very nice line whose slope turns out to be 26 kbps, with a
couple of brief (~30 sec) excursions to 30 kbps.  The script labels bits
due to packets vs. bits due to acks so you can plot them in different
colors (at least, that's what I did).

This shows a wonderful example of self-organizing behavior: at around
20 sec the connection has gotten to equilibrium.  Because of the way
slow-start works, the acks & data are pretty well interleaved at that
point.  But because of cookie crumb effects, the acks almost immediately
start diffusing towards the "late" end of the rtt slot and, by 100 sec,
all the sends show up at the early end of the slot & all the acks at the
late end (you can see this as a "staircase" effect in the plot that is
barely noticeable at 20 sec and grows to be substantial at 120 sec).  We
can measure the average amount of the slot occupied by the acks (which
seemed to be 1.6 sec) and the number of acks (which was 8 at equilibrium).
We know that there's one ack generated for every 2 packets and that the
acks are generated at exactly the rate that data packets come out of the
simp.  Thus 16 packets times 488 bytes/packet times 8 bits/byte in 1.6 sec
= 39,040 bps = 38 kbps.  This is amazingly close to Karen's 37 kbps
theoretical max for one 64 kb channel.  The agreement will be comforting
if goonhilly-echo uses one channel & a real mystery if it uses two
channels.

This flow separation of the sends & acks (which showed up in your earlier
data) is where I got the 2.65 kBps prediction.  We say that at equilibrium
the sends & the acks are going to occupy non-overlapping parts of an rtt
slot of length R.  Say that at equilibrium we generate n packets per slot,
each of size P bits.  If the channel bandwidth is B, the ack portion of
the slot must have duration nP/B.  Two packets are sent on the receipt of
each ack so, if the ack spacing were preserved as the acks flow back
through the channel, the send slot would also have duration nP/B.
However, because of the PODA piggybacking & the symmetry of an echo test,
the acks get jammed together & come out the other end only one packet time
apart rather than two (I can draw a picture of why this should happen but
I won't even attempt it on a terminal -- measure the slope of the steep
part of the stairstep & you'll see it's true).  Thus the duration of the
send slot is half the ack slot and we have

	R = 1/2 nP/B + nP/B = 3/2 nP/B

The effective throughput for the test is the amount of data divided by the
rtt, or nP/R.  Substituting the above expression for R, the nP's cancel
and we end up saying the effective throughput for an echo test will be
2/3 B.  That says that for a 38 kbps channel you should have gotten
25.33 kbps.  A least-squares fit to the tcpdump data says you got
26.43 kbps (and we'll agree not to quibble about the extra 2% since it's
an error in the right direction).
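If the algebra is easier to believe as numbers, here is the same model
ground through another throwaway awk sketch (echomodel.awk is again just a
label).  The only measured input is the 38 kbps channel rate from the
staircase; n & P cancel out of the final answer, but I've plugged in the
16 packets of 488 bytes from the equilibrium slot so you can watch the ack
slot come out near the measured 1.6 sec:

# echomodel.awk -- the R = 3/2 nP/B argument above, numerically;
# run with "awk -f echomodel.awk /dev/null".
BEGIN {
	B = 38			# kbps, channel rate from the staircase
	n = 16			# packets in an rtt slot at equilibrium
	P = 488 * 8 / 1000	# kbits per packet through the simp
	ackslot = n*P/B		# acks come out at the channel rate
	sendslot = ackslot/2	# acks jammed to one packet time apart
	R = sendslot + ackslot	# = 3/2 nP/B
	printf "ack slot %.2f sec (measured ~1.6), R = %.2f sec\n", ackslot, R
	printf "nP/R = %.2f kbps, 2/3 B = %.2f kbps\n", n*P/R, 2*B/3
}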
The cnuce data could have been interesting (since it wasn't an echo test)
but was limited by the receiver advertising only a 4KB window.  The
average send-to-ack time was 2.6 sec, so the maximum possible throughput
was 4/2.6 = 1.5 kBps.  Several short portions of the trace (short =
~30 sec) seemed to go at this rate but, overall, it looks like you got the
dismal 1.1 kBps because of a very high error rate (2%, or 97 packets
resent out of 4973 sent).  With an error rate that high it's going to be
hard to get good throughput even when the window is a reasonable size.
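The same sort of back-of-the-envelope check works here -- the window
divided by the round trip time bounds the throughput no matter what the
channel can do.  One more tiny sketch (window.awk is a made-up name; the
4KB window, 2.6 sec send-to-ack time and 97-of-4973 resends are just the
numbers quoted above):

# window.awk -- window/rtt bound for the cnuce transfer;
# run with "awk -f window.awk /dev/null".
BEGIN {
	win = 4			# KB of window the receiver offered
	rtt = 2.6		# sec, average send-to-ack time
	printf "window limit %.1f kBps\n", win/rtt
	printf "error rate %.1f%% (97 of 4973 resent)\n", 100*97/4973
}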
 - Van

ps: It seems like a crime that your satnet connection is being cut off in
two weeks, just at the time you're getting all these great measurements
and things are starting to make sense.  Is there any way to get an
extension?

---------------------------
# bps.awk:
# given a tcpdump ftp trace, output one line for each send
# in the form
#
#	<type>	<time>	<kbits>
#
# where <type> indicates whether this was a data packet or ack
# packet, <time> is the time the packet was sent (in seconds with
# zero at the time of the first packet) and <kbits> is the number of
# Kbits seen so far.  We compute total bits, i.e., appropriate sizes
# for the internet headers of data packets & ack packets are added
# to the measured amount of user data in the packet.

NR == 1 {
	n = split ($1,t,":")
	tim = t[1]*3600 + t[2]*60 + t[3]
	tzero = tim
	OFS = "\t"
	hdrsize = 68		# tcp + ip + butterfly hdrs on a 2-frag data packet
	ahdrsize = 44		# tcp + ip + butterfly hdrs on an ack
}
{
	# convert time to seconds
	n = split ($1,t,":")
	tim = t[1]*3600 + t[2]*60 + t[3]

	if ($6 !~ /^ack/) {
		# get amount of data in the packet
		i = index($6,"(")
		# add it to the amount of header & update the running total
		nbytes += substr($6,i+1,length($6)-i-1) + hdrsize
		printf "p\t%7.2f\t%g\n", tim-tzero, nbytes*(8/1024)
	} else {
		# acks are pure header
		nbytes += ahdrsize
		printf "a\t%7.2f\t%g\n", tim-tzero, nbytes*(8/1024)
	}
}
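In case it isn't obvious how to drive the above, something like this
should do (any awk will work, and the file names are made up -- substitute
whatever your tcpdump text output & plotting program are called).
Splitting the "p" & "a" lines apart is what lets you plot packets & acks
in different colors, as mentioned earlier:

awk -f bps.awk goony.trace > goony.bps	# goony.trace = tcpdump text output
grep '^p' goony.bps > goony.data	# running total at each data packet
grep '^a' goony.bps > goony.acks	# running total at each ack
# then plot column 2 (seconds) vs. column 3 (Kbits) of goony.data &
# goony.acks, each in its own color, with your favorite plotter.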