com.twitter.algebird.DecayingCMS

CMS

final class CMS extends Serializable

The idealized formula for updating the current value for a key (y0 -> y1), where n is the amount added at time t1, is:

delta = (t1 - t0) / halflife
y1 = y0 * 2^(-delta) + n

However, we want to avoid having to rescale every single cell every time we update; i.e. a cell with a zero value should continue to have a zero value when n=0.

Therefore, we introduce a change of variable to cell values (z) along with a scale factor (scale), and the following formula:

(1) zN = yN * scaleN

Our constraint is expressed as:

(2) If n=0, z1 = z0

In that case:

(3) If n=0, (y1 * scale1) = (y0 * scale0)
(4) Substituting for y1, (y0 * 2^(-delta) + 0) * scale1 = y0 * scale0
(5) 2^(-delta) * scale1 = scale0
(6) scale1 = scale0 * 2^(delta)

Also, to express z1 in terms of z0, we say:

(7) z1 = y1 * scale1
(8) z1 = (y0 * 2^(-delta) + n) * scale1
(9) z1 = ((z0 / scale0) * 2^(-delta) + n) * scale1
(10) z1 / scale1 = (z0 / (scale1 * 2^(-delta))) * 2^(-delta) + n
(11) z1 / scale1 = z0 / scale1 + n
(12) z1 = z0 + n * scale1

So, for cells where n=0, we just update scale0 to scale1, and for cells where n is non-zero, we update z1 in terms of z0 and scale1.
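
As a rough illustration of equations (6) and (12), the sketch below tracks one key both ways: directly as y with the idealized formula, and as a scaled value z with a shared scale factor. The object and method names (ScaledDecaySketch, idealStep, scaledStep, decode) are made up for this example and are not part of Algebird.

```scala
object ScaledDecaySketch {
  // Idealized update: y1 = y0 * 2^(-delta) + n
  def idealStep(y0: Double, delta: Double, n: Double): Double =
    y0 * math.pow(2.0, -delta) + n

  // Scaled update: returns (z1, scale1) following equations (6) and (12).
  def scaledStep(z0: Double, scale0: Double, delta: Double, n: Double): (Double, Double) = {
    val scale1 = scale0 * math.pow(2.0, delta) // (6)
    val z1 = z0 + n * scale1                   // (12)
    (z1, scale1)
  }

  // Recover the true value from the scaled representation: y = z / scale, from (1).
  def decode(z: Double, scale: Double): Double = z / scale

  def main(args: Array[String]): Unit = {
    // Start at y0 = 8 with scale0 = 1, so z0 = 8.
    // Step 1: one half-life passes and 4 is added. Step 2: two more half-lives, nothing added.
    val y1 = idealStep(8.0, 1.0, 4.0) // 8 * 0.5 + 4 = 8
    val y2 = idealStep(y1, 2.0, 0.0)  // 8 * 0.25 = 2

    val (z1, s1) = scaledStep(8.0, 1.0, 1.0, 4.0) // z1 = 8 + 4 * 2 = 16, s1 = 2
    val (z2, s2) = scaledStep(z1, s1, 2.0, 0.0)   // z2 stays 16, s2 = 8

    println(s"ideal: $y2, scaled: ${decode(z2, s2)}, z unchanged when n=0: ${z2 == z1}")
  }
}
```

In the second step only the scale changes, which is the point of the change of variable: a cell that receives no increment is never rewritten.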

If we convert scale to logscale, we have:

(13) logscale1 = logscale0 + delta * log(2)
(14) z1 = z0 + n * exp(logscale1)

When logscale1 gets big, we start to distort z1. For example, exp(36) is about 4.3e15, roughly half of 2^53 (the point beyond which a Double can no longer represent every integer exactly). We can detect when n * exp(logscale1) gets big, and in those cases rescale all our cells (set each z to its corresponding y) and reset the logscale to 0.

(15) y1 = z1 / scale1
(16) y1 = z1 / exp(logscale1)
(17) y1 = z1 * exp(-logscale1)
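
A minimal sketch of equations (13) through (17) follows. It keeps a row of cells plus a single logScale, and folds the scale back into the cells once logScale passes a threshold. The threshold value, the names LogScaleSketch and LogScaledCells, and the methods shown are assumptions for illustration; the actual DecayingCMS internals may differ.

```scala
object LogScaleSketch {
  // Hypothetical threshold, chosen only for this example.
  val MaxLogScale = 30.0

  final case class LogScaledCells(cells: Vector[Double], logScale: Double) {
    // Advance time by delta half-lives without touching any cell: (13).
    def advance(delta: Double): LogScaledCells =
      copy(logScale = logScale + delta * math.log(2.0)).rescaleIfNeeded

    // Add n to cell i at the current time: (14).
    def add(i: Int, n: Double): LogScaledCells =
      copy(cells = cells.updated(i, cells(i) + n * math.exp(logScale)))

    // Current decayed value of cell i: (17).
    def value(i: Int): Double = cells(i) * math.exp(-logScale)

    // Once logScale is too large, set each z to its y and reset logScale to 0.
    def rescaleIfNeeded: LogScaledCells =
      if (logScale <= MaxLogScale) this
      else LogScaledCells(cells.map(_ * math.exp(-logScale)), 0.0)
  }

  def main(args: Array[String]): Unit = {
    val c0 = LogScaledCells(Vector(0.0, 0.0), 0.0).add(0, 8.0)
    val c1 = c0.advance(3.0)  // three half-lives: the value decays from 8 to about 1
    val c2 = c1.advance(60.0) // pushes logScale past the threshold, triggering a rescale
    println(f"value after 3 half-lives: ${c1.value(0)}%.6f")
    println(s"logScale after rescale: ${c2.logScale}")
  }
}
```

Note that advance leaves the cells untouched until the threshold is crossed, so the cost of rescaling every cell is paid only occasionally.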

Linear Supertypes
Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new CMS(cells: Array[Vector[Double]], logScale: Double, timeInHL: Double)

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. def +(other: CMS): CMS

  4. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  5. def add(t: Long, k: K, n: Double): CMS

  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. def bulkAdd(items: Iterable[(Long, K, Double)]): CMS

  8. val cells: Array[Vector[Double]]

  9. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  10. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  11. def equals(any: Any): Boolean

    Definition Classes
    CMS → AnyRef → Any
  12. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  13. def get(k: K): DoubleAt

  14. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  15. def getScale(t: Double): Double

  16. def hashCode(): Int

    Definition Classes
    CMS → AnyRef → Any
  17. def innerProductRoot(that: CMS): DoubleAt

    Returns the square-root of the inner product of two decaying CMSs.

    We want the result to decay at the same rate as the CMS for this method to be valid. Taking the square root ensures that this is true. Without it, we would violate the following equality (assuming we had at() on a CMS):

    x.innerProduct(y).at(t) = x.at(t).innerProduct(y.at(t))

    This is why we don't support innerProduct, only innerProductRoot.

  18. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  19. def l2Norm: DoubleAt

  20. def lastUpdateTime: Long

  21. val logScale: Double

  22. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  23. final def notify(): Unit

    Definition Classes
    AnyRef
  24. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  25. def range: (DoubleAt, DoubleAt)

    Provide lower and upper bounds on values returned for any possible key.

    The first value is a lower bound: even keys that have never been counted will return this value or greater. This will be zero unless the CMS is saturated.

    The second value is an upper bound: the key with the largest cardinality will not be reported as being larger than this value (though it might be reported as being smaller).

    Together these values indicate how saturated and skewed the CMS might be.

  26. def scale(x: Double): CMS

  27. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  28. val timeInHL: Double

  29. def toString(): String

    Definition Classes
    CMS → AnyRef → Any
  30. def total: DoubleAt

    Get the total count of all items in the CMS.

    The total is the same as the l1Norm, since we don't allow negative values.

    Total is one of the few non-approximate statistics that DecayingCMS supports. We expect the total to be exact (except for floating-point error).

  31. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  32. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  33. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
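
To illustrate the argument given for innerProductRoot above, the sketch below uses plain vectors in place of sketches. Decaying every cell by a factor of 2^(-delta) multiplies the inner product by 2^(-2 * delta), so it decays twice as fast as the cells; the square root decays by 2^(-delta), matching the cells. The names here (InnerProductDecay, decay, innerProduct) are hypothetical and do not reflect how Algebird computes the result across rows.

```scala
object InnerProductDecay {
  // Decay every cell by delta half-lives.
  def decay(xs: Vector[Double], delta: Double): Vector[Double] =
    xs.map(_ * math.pow(2.0, -delta))

  def innerProduct(xs: Vector[Double], ys: Vector[Double]): Double =
    xs.zip(ys).map { case (x, y) => x * y }.sum

  def innerProductRoot(xs: Vector[Double], ys: Vector[Double]): Double =
    math.sqrt(innerProduct(xs, ys))

  def main(args: Array[String]): Unit = {
    val x = Vector(4.0, 0.0, 8.0)
    val y = Vector(1.0, 2.0, 2.0)
    val delta = 1.0 // one half-life

    // Plain inner product: decaying both inputs quarters the result instead of halving it.
    val ip = innerProduct(x, y)                                    // 4 + 0 + 16 = 20
    val ipDecayed = innerProduct(decay(x, delta), decay(y, delta)) // 1 + 0 + 4 = 5

    // Square root: decaying both inputs halves the result, like any single cell.
    val root = innerProductRoot(x, y)
    val rootDecayed = innerProductRoot(decay(x, delta), decay(y, delta))

    println(s"innerProduct ratio: ${ipDecayed / ip}, root ratio: ${rootDecayed / root}")
  }
}
```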
