Class ScalarQuantizer

java.lang.Object
org.apache.lucene.util.quantization.ScalarQuantizer

public class ScalarQuantizer extends Object
Quantizes float vectors into `int8` byte values. This is a lossy transformation. Scalar quantization first calculates the quantiles of the float vector values using the configured confidence interval. The resulting range [minQuantile, maxQuantile] is then used to scale the values into the range [0, 127], and each value is bucketed into the nearest byte value.

How Scalar Quantization Works

The underlying math is straightforward and based on min/max normalization. Given a float vector `v` and a confidenceInterval `q`, we can calculate the quantiles of the vector values, [minQuantile, maxQuantile]. Each float value is then mapped to a byte value, and back, as follows:

   byte = (float - minQuantile) * 127/(maxQuantile - minQuantile)
   float = (maxQuantile - minQuantile)/127 * byte + minQuantile
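
The two mappings above can be sketched in plain Java. This is an illustrative sketch, not the Lucene implementation: the class `MinMaxQuantizationSketch` and its methods are hypothetical, and clamping out-of-range values to the quantile bounds is an assumption made here so that the result always fits in [0, 127].

```java
// Illustrative sketch only; names are hypothetical and not part of Lucene.
final class MinMaxQuantizationSketch {
  static final int MAX_BUCKET = 127;

  // byte = (float - minQuantile) * 127 / (maxQuantile - minQuantile)
  static byte quantize(float value, float minQuantile, float maxQuantile) {
    // Assumption: values outside [minQuantile, maxQuantile] saturate at the bounds.
    float clamped = Math.max(minQuantile, Math.min(maxQuantile, value));
    float scale = MAX_BUCKET / (maxQuantile - minQuantile);
    return (byte) Math.round((clamped - minQuantile) * scale);
  }

  // float = (maxQuantile - minQuantile) / 127 * byte + minQuantile
  static float dequantize(byte bucket, float minQuantile, float maxQuantile) {
    float alpha = (maxQuantile - minQuantile) / MAX_BUCKET;
    return alpha * bucket + minQuantile;
  }

  public static void main(String[] args) {
    float min = -1.0f;
    float max = 1.0f;
    byte bucket = quantize(0.25f, min, max);
    float restored = dequantize(bucket, min, max);
    // The restored value differs slightly from 0.25 because the mapping is lossy.
    System.out.println("bucket=" + bucket + " restored=" + restored);
  }
}
```

The round trip loses at most half a bucket width, i.e. `(maxQuantile - minQuantile)/127/2`, for values inside the quantile range.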
 

This means that the product of two float values (e.g. for dot_product) can be approximated from their byte values as follows:

   float1 * float2 ~= (byte1 * (maxQuantile - minQuantile)/127 + minQuantile) * (byte2 * (maxQuantile - minQuantile)/127 + minQuantile)
   float1 * float2 ~= (byte1 * byte2 * (maxQuantile - minQuantile)^2)/(127^2) + (byte1 * minQuantile * (maxQuantile - minQuantile)/127) + (byte2 * minQuantile * (maxQuantile - minQuantile)/127) + minQuantile^2
   let alpha = (maxQuantile - minQuantile)/127
   float1 * float2 ~= (byte1 * byte2 * alpha^2) + (byte1 * minQuantile * alpha) + (byte2 * minQuantile * alpha) + minQuantile^2
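
Summed over all dimensions, the last line of the expansion lets most of the work happen on integers. The sketch below is one way to arrange that sum, assuming both vectors were quantized with the same `alpha` and `minQuantile`. The class and method names are hypothetical, and how Lucene itself distributes these terms between the byte dot product and the corrective offset is not shown here.

```java
// Illustrative sketch only; names are hypothetical and not part of Lucene.
final class QuantizedDotProductSketch {

  // Approximates sum_i(float1_i * float2_i) from byte values using
  // (b1 * b2 * alpha^2) + (b1 * minQuantile * alpha) + (b2 * minQuantile * alpha) + minQuantile^2.
  static float approxDot(byte[] a, byte[] b, float alpha, float minQuantile) {
    int dot = 0;  // sum of byte1 * byte2
    int sumA = 0; // sum of byte1
    int sumB = 0; // sum of byte2
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      sumA += a[i];
      sumB += b[i];
    }
    return alpha * alpha * dot
        + alpha * minQuantile * (sumA + sumB)
        + a.length * minQuantile * minQuantile;
  }

  public static void main(String[] args) {
    byte[] a = {0, 64, 127};
    byte[] b = {10, 20, 30};
    float alpha = 2.0f / 127; // (maxQuantile - minQuantile) / 127 for the range [-1, 1]
    System.out.println(approxDot(a, b, alpha, -1.0f));
  }
}
```

Only the `dot` term depends on both vectors at once. The remaining terms depend on a single vector or on constants, so they can be computed once per vector and added to the score afterwards.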
 

The expansion for square distance is much simpler:

  square_distance = (float1 - float2)^2
  (float1 - float2)^2 ~= (byte1 * alpha + minQuantile - byte2 * alpha - minQuantile)^2
  = (alpha*byte1 + minQuantile)^2 + (alpha*byte2 + minQuantile)^2 - 2*(alpha*byte1 + minQuantile)(alpha*byte2 + minQuantile)
  this can be simplified to:
  = alpha^2 (byte1 - byte2)^2
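
Because `minQuantile` cancels, the square distance needs no correction term. A minimal sketch follows, with hypothetical names and assuming both vectors share the same `alpha`:

```java
// Illustrative sketch only; names are hypothetical and not part of Lucene.
final class QuantizedSquareDistanceSketch {

  // Integer square distance between the two byte vectors.
  static int byteSquareDistance(byte[] a, byte[] b) {
    int sum = 0;
    for (int i = 0; i < a.length; i++) {
      int diff = a[i] - b[i];
      sum += diff * diff;
    }
    return sum;
  }

  // square_distance ~= alpha^2 * sum_i (byte1_i - byte2_i)^2
  static float approxSquareDistance(byte[] a, byte[] b, float alpha) {
    return alpha * alpha * byteSquareDistance(a, b);
  }

  public static void main(String[] args) {
    byte[] a = {0, 64, 127};
    byte[] b = {10, 20, 30};
    float alpha = 2.0f / 127; // (maxQuantile - minQuantile) / 127 for the range [-1, 1]
    System.out.println(approxSquareDistance(a, b, alpha));
  }
}
```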
 
  • Field Details

    • SCALAR_QUANTIZATION_SAMPLE_SIZE

      public static final int SCALAR_QUANTIZATION_SAMPLE_SIZE
    • SCRATCH_SIZE

      static final int SCRATCH_SIZE
    • alpha

      private final float alpha
    • scale

      private final float scale
    • bits

      private final byte bits
    • minQuantile

      private final float minQuantile
    • maxQuantile

      private final float maxQuantile
    • random

      private static final Random random
  • Constructor Details

    • ScalarQuantizer

      public ScalarQuantizer(float minQuantile, float maxQuantile, byte bits)
      Parameters:
      minQuantile - the lower quantile of the distribution
      maxQuantile - the upper quantile of the distribution
      bits - the number of bits to use for quantization
  • Method Details

    • quantize

      public float quantize(float[] src, byte[] dest, VectorSimilarityFunction similarityFunction)
      Quantize a float vector into a byte vector
      Parameters:
      src - the source vector
      dest - the destination vector
      similarityFunction - the similarity function used to calculate the quantile
      Returns:
      the corrective offset that needs to be applied to the score
    • quantizeFloat

      private float quantizeFloat(float v, byte[] dest, int destIndex)
    • recalculateCorrectiveOffset

      public float recalculateCorrectiveOffset(byte[] quantizedVector, ScalarQuantizer oldQuantizer, VectorSimilarityFunction similarityFunction)
      Recalculates the score corrective offset of a previously quantized vector for the current quantiles
      Parameters:
      quantizedVector - the vector quantized with the old quantizer
      oldQuantizer - the old quantizer
      similarityFunction - the similarity function used to calculate the quantile
      Returns:
      the new offset
    • deQuantize

      void deQuantize(byte[] src, float[] dest)
      Dequantize a byte vector into a float vector
      Parameters:
      src - the source vector
      dest - the destination vector
    • getLowerQuantile

      public float getLowerQuantile()
    • getUpperQuantile

      public float getUpperQuantile()
    • getConstantMultiplier

      public float getConstantMultiplier()
    • getBits

      public byte getBits()
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • reservoirSampleIndices

      private static int[] reservoirSampleIndices(int numFloatVecs, int sampleSize)
    • fromVectors

      public static ScalarQuantizer fromVectors(FloatVectorValues floatVectorValues, float confidenceInterval, int totalVectorCount, byte bits) throws IOException
      Reads the float vector values and calculates the quantiles. If the number of float vectors is less than SCALAR_QUANTIZATION_SAMPLE_SIZE, all values are read to calculate the quantiles. If it is greater than SCALAR_QUANTIZATION_SAMPLE_SIZE, a random sample of SCALAR_QUANTIZATION_SAMPLE_SIZE vectors is read instead.
      Parameters:
      floatVectorValues - the float vector values from which to calculate the quantiles
      confidenceInterval - the confidence interval used to calculate the quantiles
      totalVectorCount - the total number of live float vectors in the index. This is vital for accounting for deleted documents when calculating the quantiles.
      bits - the number of bits to use for quantization
      Returns:
      A new ScalarQuantizer instance
      Throws:
      IOException - if there is an error reading the float vector values
    • fromVectors

      static ScalarQuantizer fromVectors(FloatVectorValues floatVectorValues, float confidenceInterval, int totalVectorCount, byte bits, int quantizationSampleSize) throws IOException
      Throws:
      IOException
    • fromVectorsAutoInterval

      public static ScalarQuantizer fromVectorsAutoInterval(FloatVectorValues floatVectorValues, VectorSimilarityFunction function, int totalVectorCount, byte bits) throws IOException
      Throws:
      IOException
    • extractQuantiles

      private static void extractQuantiles(float[] confidenceIntervals, float[] quantileGatheringScratch, double[] upperSum, double[] lowerSum)
    • gatherSample

      private static void gatherSample(float[] vectorValue, float[] quantileGatheringScratch, List<float[]> sampledDocs, int i)
    • candidateGridSearch

      private static float[] candidateGridSearch(List<ScalarQuantizer.ScoreDocsAndScoreVariance> nearestNeighbors, List<float[]> vectors, float[] lowerCandidates, float[] upperCandidates, VectorSimilarityFunction function, byte bits)
    • findNearestNeighbors

      private static List<ScalarQuantizer.ScoreDocsAndScoreVariance> findNearestNeighbors(List<float[]> vectors, VectorSimilarityFunction similarityFunction)
      Parameters:
      vectors - The vectors to find the nearest neighbors for each other
      similarityFunction - The similarity function to use
      Returns:
      The top 10 nearest neighbors for each vector from the vectors list
    • getUpperAndLowerQuantile

      static float[] getUpperAndLowerQuantile(float[] arr, float confidenceInterval)
      Takes an array of floats, sorted or not, and returns a minimum and maximum value. These values reside at the `(1 - confidenceInterval)/2` and `1 - (1 - confidenceInterval)/2` percentiles. Example: providing floats `[0..100]` and a confidence interval of `0.9` will return `5` and `95`.
      Parameters:
      arr - array of floats
      confidenceInterval - the configured confidence interval
      Returns:
      lower and upper quantile values
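
The sampling described for fromVectors and the quantile selection described for getUpperAndLowerQuantile can be sketched together in plain Java. This is an illustration under stated assumptions, not Lucene's code. The class and method names are hypothetical. Indices are drawn with classic reservoir sampling. The upper quantile is taken at the position mirroring the lower one, which is chosen here because it matches the documented example of `5` and `95`. The quantiles are found by sorting a copy of the input, which is simple but not necessarily what the library does.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative sketch only; names are hypothetical and not part of Lucene.
final class QuantileSamplingSketch {

  // Picks sampleSize distinct indices from [0, numVectors) using reservoir sampling.
  // Assumption: when numVectors <= sampleSize, every index is returned.
  static int[] sampleIndices(int numVectors, int sampleSize, Random random) {
    if (numVectors <= sampleSize) {
      int[] all = new int[numVectors];
      for (int i = 0; i < numVectors; i++) {
        all[i] = i;
      }
      return all;
    }
    int[] reservoir = new int[sampleSize];
    for (int i = 0; i < sampleSize; i++) {
      reservoir[i] = i;
    }
    for (int i = sampleSize; i < numVectors; i++) {
      // Index i replaces a random slot with probability sampleSize / (i + 1).
      int j = random.nextInt(i + 1);
      if (j < sampleSize) {
        reservoir[j] = i;
      }
    }
    // Assumption: sorted so that the sampled vectors can be read in index order.
    Arrays.sort(reservoir);
    return reservoir;
  }

  // Returns {lower, upper}, the values at the (1 - confidenceInterval)/2 position
  // and at the mirrored position from the top of the sorted values.
  static float[] upperAndLowerQuantile(float[] values, float confidenceInterval) {
    float[] sorted = values.clone(); // leave the caller's array untouched
    Arrays.sort(sorted);
    int lowerIndex = Math.round((1f - confidenceInterval) / 2f * (sorted.length - 1));
    int upperIndex = sorted.length - 1 - lowerIndex;
    return new float[] {sorted[lowerIndex], sorted[upperIndex]};
  }

  public static void main(String[] args) {
    float[] values = new float[101];
    for (int i = 0; i <= 100; i++) {
      values[i] = i;
    }
    float[] quantiles = upperAndLowerQuantile(values, 0.9f);
    System.out.println("lower=" + quantiles[0] + " upper=" + quantiles[1]);
    System.out.println(Arrays.toString(sampleIndices(1_000, 10, new Random(42))));
  }
}
```

Reservoir sampling visits each index once and keeps memory bounded by the sample size, which suits a setting where the total number of vectors can be much larger than the sample.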