On the Self-Similar Nature of Ethernet Traffic (Extended Version) Will E. Leland, Murad S. Taqqu, Walter Willinger, and Daniel V. Wilson One-line summary: Overview/Main Points * Self-Similarity Definition: Intuitively: o time series is bursty at all time-scales, compared with classical models which smooth off at large time-scales (eg. exponential or Poisson arrival processes, stochastic models of packet traffic, ...) o adding more sources adds to the burstiness Mathematically: see paper for exact definitions. Implications of mathematical defintion: o variance of sample mean var(X^(m)) decays as m^(-beta), with 0 < beta < 1. o autocorrelation function decays hyperbolically in beta as well, meaning autocorrelation function is non-summable (this is long-range dependence. o spectral density is unbounded at the origin - f(lambda) ~ lambda^(-gamma), with gamma = 1 - beta. Hurst effect: E[R(n)/S(n)] ~ n^H, where H is Hurst parameter, and H = 1 - (beta/2). It is known (and can be mathematically demonstrated) that H=0.5 corresponds to non-selfsimilar data, while 0.5 < H < 1.0 corresponds to selfsimilar data. Larger H implies larger burstiness. * Detecting Self-Similarity: o Variance-time plots: since var(X^(m)) ~ m^(-beta), plotting log(var(X^(m)) vs. log(m) will yield a straight line or slope -beta. As H = 1 - (beta/2), can deduce H. o R/S analysis (Pox plots): plotting log(R(n)/S(n)) vs. log(n) will yield a straight line of slope H. o Periodogram analysis: an estimate of H (known as Whittle's approximate MLE) can be calculated along with a confidence interval, based on the power spectrum density function properties. Few details given in paper. o Proof by picture, of course. o Ethernet Traffic: + High-quality Ethernet packet logs from Bellcore. Logs have microsecond resolution for periods of 27 hours, taken on four different days across a 4 year period. + All three tests gave similar values for H - Ethernet traffic is therefore self-similar. + Both LAN and WAN traffic showed similar H values. + The busier times of day yielded more bursty traffic (seen by higher H values). + The type and load of traffic (increased by web and exponential growth of internet) seemed to not affect the self-similarity property, although slight perturbations were noticed in H and attributed to host-to-host vs router-to-router, diskless vs. diskfull workstations, and other such things. + Low-load WAN traffic was non-selfsimilar. (Inconclusive, really.) o Causative Factors: Mandelbrot showed if have a large number of ON/OFF source clients, with ON/OFF periods having a heavy-tailed distribution, then the aggration of of the clients' traffic would be self-similar. (Heavy tail: P[U >= u] ~ u^(-alpha), 0 < alpha < 2.) In a later paper, this hypothesis is verified. o Modelling Self-Similarity: + Fractional Gaussian noise, and fractional autoregressive integrated moving-average (ARIMA) processes. Essentially time series that satisfy certain properties regarding autocorrelation functions, can be shown that they are self-similar. Expensive to compute, reaonable approximations can be now made. + Model ON/OFF sources, aggregate. Behaves like fractional Brownian motion (related to fractional Gaussian noise) o Measuring Burstiness: + H. + Index of dispersion (for counts): for time interval L, IDC(L) = var(arrivals during L) / (expected value of arrivals during L). Valid, self-similar process have monotonically increasing (linear) IDC. + peak-to-mean ratio: peak bandwidth / mean bandwidth. But over which timescale?? (Visual picture of whitenoise.) + coefficient of variation: standard deviation of interarrival time : mean of interarrival time ratio. Infinite variance of interarrival time for heavy-tailed! Relevance Is revolutionizing assumptions and understanding of networks, and time-series/processes in computer systems in general. Self similar stuff now includes networks (WAN, LAN, ..), file system traffic, HTTP traffic, ... Flaws o overly mathematical for a CS paper (but necessarily so) o self-similar definitions have limits as t goes to infinity; we have finite processes. How much does this affect results? o to date, lack of practical implications ----------------------------------------------------------------------- ====== On the Self-Similar Nature of Ethernet Traffic ---------------------------------------------- Leland et al., SIGMETRICS '93 Self-similar phenomena (first defined by Mandelbrot in '65) display statistically identical structure across all time scales. In the case of Ethernet, this manifests as the lack of a "typical" burst length. The authors wrote a custom monitoring system to collects all seen Ethernet packets, along with accurate timestamps. One doesn't need to make any a priori decisions regarding their experiment, except the amount of saved data per packet. They got 4 data sets totaling on the order of 10^8 packets. Set #1 was collected on an isolated LAN with diskless workstations; set #2 came 2 months later, after a large upgrade. In both these sets, 95% of the traffic was intra-LAN. 3 months later they monitored an inter-LAN that served as an external connection as well. After 2 years, they got set #4 by monitoring a backbone segment (thus, mostly inter-router traffic). They found that real Ethernet traffic at Bellcore is very different from what is predicted by formal models for packet traffic, packet train models, and fluid flow models. Obviously, it's also very different from isochronous traffic typical in the telephone system. Ethernet traffic was highly self-similar, whereas all the models would predict white noise-like distributions for aggregated traffic. The Hurst parameter (H) is a measure of self-similarity. When H=0.5 you get white noise [?], but for Ethertraffic it's 0.75-0.85 and H goes up as utilization of the network increases (i.e., traffic becomes increasingly self-similar). An important consequence is that, as the number of Ethernet users increases, the traffic does not become smoother (as predicted by formal models), but rather it becomes even burstier. Mandelbrot showed how self-similar processes can be viewed as an aggregation of many simple renewal reward processes. The renewal rewards for each such process turns out to be the amount of traffic generated by a single user during successive time intervals, described by a heavy tailed distribution (i.e., no characteristic length of busy period or packet train). Common measures of burstiness include: - index of dispersion (variance in number of arrivals / expected value of arrivals over a given time interval); this index increases monotonically for Ethertraffic, but stays constant or converges rapidly in the formal models. - peak-to-mean ratio: inappropriate measure for self-similar traffic, because it could literally have any value, depending on the time interval you're looking at. - coefficient of variation (standard deviation of interarrival times / expected value of interarrival times): again, any value is possible, depending on number of samples. Switched Megabit Data Service (SMDS) buffers and caps the rate at which traffic is delivered into a LAN, to reduce burstiness. As it turns out, increasing the buffer size reduces overall packet loss only very slowly (in contrast to the exponential decrease predicted by Poisson models) and invariably increases packet delay (in spite of the limit predicted by formal models). This is all explained by self-similarity, which further explains the observation that buffering doesn't help manage congestion (losses are concentrated and greatly exceed the background loss rate). The only tractable models that are reasonably accurate: self-similar stochastic model and deterministic chaotic maps. Discussion issues: 1. CSMA/CD <--> nature (remember classroom example) log-log plots... 1/f noise: cities by rank, fjords by size, Etherbursts, fractals, English lang by rank, WWW traffic, co. revenue as func of rank 2. Why is this relevant/irrelevant to Van Jacobson's paper?