Tuesday, June 1, 2021

FloatVector Class - Vector API - Coming With Java 16

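Introduction
The Vector API provides classes such as IntVector and FloatVector. We include the following imports. Because the API lives in an incubator module, the code has to be compiled and run with --add-modules jdk.incubator.vector.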
import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.IntVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;
What Is the Vector API?
The explanation is as follows:
Vector computation consists of a sequence of operations on vectors. The Vector API is used to express vector computations that can be reliably compiled at runtime to the optimal vector instructions on supported CPU architectures, resulting in better performance than equivalent scalar computations. The goal of the vector API is to provide users with concise, easy-to-use and platform-independent expression of a wide range of vector calculations.

Another explanation is as follows:
To explain how the Java Vector API abstraction works, we need to explore different CPU architectures and provide a basic understanding of data-parallel computation. However, the concept is not so new. It has existed in C# for a while and has proven to be a good approach to leveraging data-parallel computation on modern hardware architectures.

In contrast to a regular computing operation, as in 1+1, where two “pieces of data” are added in one operation, a data-parallel operation is executing a simple operation (e.g., +) on multiple “pieces of data” at the same time. This mode of operation is called SIMD (Single Instruction, Multiple Data), whereas the traditional way of execution is called SISD (Single Instruction, Single Data). The performance speed-up results from applying the same operation on more than one “piece of data” within one CPU cycle. As a simple example: Instead of adding each element of an array A with each element of an array B, we take chunks of array A and array B and operate simultaneously. The two modes are illustrated below and should provide evidence of why SIMD should increase computational performance.
As a figure, it looks like this:
[Figure: SISD versus SIMD execution]
However, there is a problem. Every CPU architecture comes with its own SIMD implementation, so the JVM has to adapt to the processor architecture it runs on.

The explanation is as follows:
Java supports auto-vectorization for arithmetic algorithms, which means the JIT compiler transforms some scalar operations to vector operations when it sees fit. But the developer has no control over this.
With the new Vector APIs, developers can perform vector operations explicitly and make better use of multiple core capabilities.
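To make the auto-vectorization remark concrete, here is a minimal sketch of a scalar loop of the kind a JIT compiler may turn into vector instructions. The class and method names (ScalarAdd, add) are made up for illustration, and whether the loop is actually vectorized depends on the JVM and the CPU; the code itself gives no way to request or verify it.

```java
import java.util.Arrays;

// Hypothetical example class; not part of the Vector API.
final class ScalarAdd {

    // Plain element-wise addition. Each iteration is independent of the others,
    // which is what makes this loop shape a candidate for auto-vectorization.
    static float[] add(float[] a, float[] b) {
        float[] c = new float[a.length];
        for (int i = 0; i < a.length; i++) {
            c[i] = a[i] + b[i];
        }
        return c;
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f, 3f, 4f};
        float[] b = {5f, 8f, 10f, 12f};
        System.out.println(Arrays.toString(add(a, b))); // [6.0, 10.0, 13.0, 16.0]
    }
}
```

With the Vector API the same addition is written explicitly on vectors, so the vectorization no longer depends on whether the JIT recognizes the loop.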
fromArray method
Example
We do it like this:
float[] a = {1f, 2f, 3f, 4f};
float[] b = {5f, 8f, 10f, 12f};

FloatVector first = FloatVector.fromArray(FloatVector.SPECIES_128, a, 0);
FloatVector second = FloatVector.fromArray(FloatVector.SPECIES_128, b, 0);

FloatVector result = first
  .add(second)
  .pow(2)
  .neg();

//result = [-36.0, -100.0, -169.0, -256.0]

float[] resultArray = new float[4];
result.intoArray(resultArray,0);
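As a cross-check, the same computation can be written as a scalar loop. This is only a sketch: the class and method names (ScalarCheck, negSquaredSum) are made up, and s * s stands in for pow(2). SPECIES_128 holds four float lanes (128 bits / 32 bits), which is why one vector covers the four-element arrays above.

```java
import java.util.Arrays;

// Hypothetical scalar counterpart of first.add(second).pow(2).neg().
final class ScalarCheck {

    // Computes -(a[i] + b[i])^2 for every element.
    static float[] negSquaredSum(float[] a, float[] b) {
        float[] out = new float[a.length];
        for (int i = 0; i < a.length; i++) {
            float s = a[i] + b[i]; // add
            out[i] = -(s * s);     // pow(2), then neg
        }
        return out;
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f, 3f, 4f};
        float[] b = {5f, 8f, 10f, 12f};
        System.out.println(Arrays.toString(negSquaredSum(a, b)));
        // [-36.0, -100.0, -169.0, -256.0]
    }
}
```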
fma operation
Example
Normally we do it like this:
// FMA: Fused Multiply Add: c = c + (a * b)
public static float scalarFMA(float[] a, float[] b){
  var c = 0.0f;

  for(var i=0; i < a.length; i++){
    c = Math.fma(a[i], b[i], c);
  }
  return c;
}
With the Vector API we do it like this:
private static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

public static float vectorFMA(float[] a, float[] b){
  var upperBound = SPECIES.loopBound(a.length);
  var sum = FloatVector.zero(SPECIES);
  var i = 0;
  for (; i < upperBound; i += SPECIES.length()) {
    // va and vb are FloatVector chunks of a and b starting at index i
    var va = FloatVector.fromArray(SPECIES, a, i);
    var vb = FloatVector.fromArray(SPECIES, b, i);
    sum = va.fma(vb, sum);
  }
  var c = sum.reduceLanes(VectorOperators.ADD);
  for (; i < a.length; i++) { // Cleanup loop
    c += a[i] * b[i];
  }
  return c;
}
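The structure of vectorFMA (main loop up to loopBound, lane-wise accumulation, reduction, cleanup loop) can be modelled in plain Java without the incubator module. This is only a sketch of the idea, not how the JVM executes it: LaneModel and its methods are made-up names, and LANES = 4 is an assumed stand-in for SPECIES.length(), which in reality depends on the CPU.

```java
// Hypothetical plain-Java model of the vectorFMA loop structure.
final class LaneModel {

    // Assumed lane count; the real value comes from SPECIES.length().
    static final int LANES = 4;

    // Models VectorSpecies.loopBound: the largest multiple of LANES <= length.
    static int loopBound(int length) {
        return length - (length % LANES);
    }

    static float dot(float[] a, float[] b) {
        int upperBound = loopBound(a.length);
        float[] sum = new float[LANES]; // models FloatVector.zero(SPECIES)
        int i = 0;
        for (; i < upperBound; i += LANES) {
            // Models sum = va.fma(vb, sum): one fused multiply-add per lane.
            for (int lane = 0; lane < LANES; lane++) {
                sum[lane] = Math.fma(a[i + lane], b[i + lane], sum[lane]);
            }
        }
        float c = 0.0f;
        for (float partial : sum) { // models reduceLanes(VectorOperators.ADD)
            c += partial;
        }
        for (; i < a.length; i++) { // cleanup loop for the leftover elements
            c += a[i] * b[i];
        }
        return c;
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f, 3f, 4f, 5f, 6f, 7f, 8f, 9f, 10f};
        float[] b = {2f, 2f, 2f, 2f, 2f, 2f, 2f, 2f, 2f, 2f};
        System.out.println(loopBound(a.length)); // 8
        System.out.println(dot(a, b));           // 110.0
    }
}
```

With ten elements and four lanes, the main loop handles indices 0 to 7 and the cleanup loop handles the last two. Without the cleanup loop those two elements would be silently dropped.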

