Discussion:
Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?
Geoffrey De Smet
2018-02-19 10:42:51 UTC
Hi guys,

I ran the following JMH benchmark on JDK 9 and JDK 8.
Source code and detailed results below.

Benchmark on JDK 9       Score
staticMethodHandle       2.770
lambdaMetafactory        3.052    // 10% slower
nonStaticMethodHandle    5.250    // 90% slower

Why is LambdaMetafactory 10% slower than a static MethodHandle
but 80% faster than a non-static MethodHandle?


Source code (copy paste ready)
====================

import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

//Benchmark on JDK 9     Mode  Cnt  Score   Error  Units
//staticMethodHandle     avgt   30  2.770 ± 0.023  ns/op // Baseline
//lambdaMetafactory      avgt   30  3.052 ± 0.004  ns/op // 10% slower
//nonStaticMethodHandle  avgt   30  5.250 ± 0.137  ns/op // 90% slower

//Benchmark on JDK 8     Mode  Cnt  Score   Error  Units
//staticMethodHandle     avgt   30  2.772 ± 0.022  ns/op // Baseline
//lambdaMetafactory      avgt   30  3.060 ± 0.007  ns/op // 10% slower
//nonStaticMethodHandle  avgt   30  5.037 ± 0.022  ns/op // 81% slower

@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(3)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)
public class LamdaMetafactoryWeirdPerformance {

    // ************************************************************************
    // Set up of the 3 approaches.
    // ************************************************************************

    // Unusable for Java framework developers. Only usable by JVM language developers. Baseline.
    private static final MethodHandle staticMethodHandle;

    // Usable for Java framework developers. 30% slower
    private final Function lambdaMetafactoryFunction;

    // Usable for Java framework developers. 100% slower
    private final MethodHandle nonStaticMethodHandle;

    static {
        // Static MethodHandle setup
        try {
            staticMethodHandle = MethodHandles.lookup()
                    .findVirtual(Dog.class, "getName", MethodType.methodType(String.class))
                    .asType(MethodType.methodType(Object.class, Object.class));
        } catch (NoSuchMethodException | IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public LamdaMetafactoryWeirdPerformance() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();

            // LambdaMetafactory setup
            CallSite site = LambdaMetafactory.metafactory(lookup,
                    "apply",
                    MethodType.methodType(Function.class),
                    MethodType.methodType(Object.class, Object.class),
                    lookup.findVirtual(Dog.class, "getName", MethodType.methodType(String.class)),
                    MethodType.methodType(String.class, Dog.class));
            lambdaMetafactoryFunction = (Function) site.getTarget().invokeExact();

            // Non-static MethodHandle setup
            nonStaticMethodHandle = lookup
                    .findVirtual(Dog.class, "getName", MethodType.methodType(String.class))
                    .asType(MethodType.methodType(Object.class, Object.class));
        } catch (Throwable e) {
            throw new IllegalStateException(e);
        }
    }

    // ************************************************************************
    // Benchmark
    // ************************************************************************

    private Object dogObject = new Dog("Fido");


    @Benchmark
    public Object _1_staticMethodHandle() throws Throwable {
        return staticMethodHandle.invokeExact(dogObject);
    }

    @Benchmark
    public Object _2_lambdaMetafactory() {
        return lambdaMetafactoryFunction.apply(dogObject);
    }

    @Benchmark
    public Object _3_nonStaticMethodHandle() throws Throwable {
        return nonStaticMethodHandle.invokeExact(dogObject);
    }

    private static class Dog {
        private String name;

        public Dog(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }

    }

}


With kind regards,
Geoffrey De Smet
Vladimir Ivanov
2018-02-19 12:00:43 UTC
Geoffrey,

In both staticMethodHandle & lambdaMetafactory, Dog::getName is inlined,
but using different mechanisms.

In staticMethodHandle the target method is statically known [1], but in the
case of lambdaMetafactory [2] the compiler has to rely on profiling info to
devirtualize Function::apply(). The latter requires an exact type check on
the receiver at runtime, and that explains the difference you are seeing.
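
To make the two mechanisms concrete, here is a rough source-level sketch of the shapes the compiled code ends up with. It is only an illustration, not what C2 literally emits: the names GuardSketch and DogNameFn are made up, DogNameFn stands in for the class that LambdaMetafactory spins, and the fallback f.apply() call stands in for the uncommon path taken when the profile turns out to be wrong.

```java
import java.util.function.Function;

public class GuardSketch {
    public static final class Dog {
        private final String name;
        public Dog(String name) { this.name = name; }
        public String getName() { return name; }
    }

    // Hypothetical stand-in for the class generated by LambdaMetafactory.
    public static final class DogNameFn implements Function<Object, Object> {
        @Override
        public Object apply(Object o) { return ((Dog) o).getName(); }
    }

    // Shape of the static MethodHandle case: the target is a known constant,
    // so the call collapses to the accessor body with no check on a receiver.
    public static Object constantTarget(Object dog) {
        return ((Dog) dog).getName();
    }

    // Shape of the LambdaMetafactory case: the receiver class is only known
    // from the type profile, so an exact class check guards the inlined body.
    public static Object profiledTarget(Function<Object, Object> f, Object dog) {
        if (f.getClass() == DogNameFn.class) {   // the extra runtime type check
            return ((Dog) dog).getName();        // inlined body of DogNameFn.apply
        }
        return f.apply(dog);                     // stand-in for the uncommon path
    }

    public static void main(String[] args) {
        Object dog = new Dog("Fido");
        System.out.println(constantTarget(dog));
        System.out.println(profiledTarget(new DogNameFn(), dog));
    }
}
```

The only difference between the two methods is the class comparison on f, which is the kind of per-call cost the exact type check adds.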

But comparing that with nonStaticMethodHandle is not fair: there's no
inlining happening there.

If you want a fair comparison, then you have to measure with polluted
profile so no inlining happens. In that case [3] non-static
MethodHandles are on par (or even slightly faster):

LMF._4_lmf_fs avgt 10 20.020 ± 0.635 ns/op
LMF._4_lmf_mhs avgt 10 18.360 ± 0.181 ns/op

(scores for 3 invocations in a row.)

Best regards,
Vladimir Ivanov

[1] 715 126 b org.lmf.LMF::_1_staticMethodHandle (11 bytes)
    ...
    @ 37 java.lang.invoke.DirectMethodHandle$Holder::invokeVirtual (14 bytes) force inline by annotation
      @ 1 java.lang.invoke.DirectMethodHandle::internalMemberName (8 bytes) force inline by annotation
      @ 10 org.lmf.LMF$Dog::getName (5 bytes) accessor

[2] 678 117 b org.lmf.LMF::_2_lambdaMetafactory (14 bytes)
    @ 8 org.lmf.LMF$$Lambda$37/552160541::apply (8 bytes) inline (hot)
     \-> TypeProfile (6700/6700 counts) = org/lmf/LMF$$Lambda$37
      @ 4 org.lmf.LMF$Dog::getName (5 bytes) accessor

[3] http://cr.openjdk.java.net/~vlivanov/misc/LMF.java

    static Function make() throws Throwable {
        CallSite site = LambdaMetafactory.metafactory(LOOKUP,
                "apply",
                MethodType.methodType(Function.class),
                MethodType.methodType(Object.class, Object.class),
                LOOKUP.findVirtual(Dog.class, "getName", MethodType.methodType(String.class)),
                MethodType.methodType(String.class, Dog.class));
        return (Function) site.getTarget().invokeExact();
    }

    // Three distinct lambda classes, so the type profile at f.apply() is polluted.
    private Function[] fs = new Function[] {
        make(), make(), make()
    };

    private MethodHandle[] mhs = new MethodHandle[] {
        nonStaticMethodHandle,
        nonStaticMethodHandle,
        nonStaticMethodHandle
    };

    @Benchmark
    public Object _4_lmf_fs() throws Throwable {
        Object r = null;
        for (Function f : fs) {
            r = f.apply(dogObject);
        }
        return r;
    }

    @Benchmark
    public Object _4_lmf_mh() throws Throwable {
        Object r = null;
        for (MethodHandle mh : mhs) {
            r = mh.invokeExact(dogObject);
        }
        return r;
    }
_______________________________________________
mlvm-dev mailing list
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
Jochen Theodorou
2018-02-19 12:28:06 UTC
Post by Vladimir Ivanov
Geoffrey,
In both staticMethodHandle & lambdaMetafactory Dog::getName is inlined,
but using different mechanisms.
In staticMethodHandle target method is statically known [1], but in case
of lambdaMetafactory [2] compiler has to rely on profiling info to
devirtualize Function::apply(). The latter requires exact type check on
the receiver at runtime and that explains the difference you are seeing.
But comparing that with nonStaticMethodHandle is not fair: there's no
inlining happening there.

I actually never dared to ask, what kind of information is really
provided by the java compiler here to make the static version so fast?
Is it because the static final version becomes a member of the class
pool? Is the lambdafactory so fast, because here the handle will become
the member of the pool of the generated class? And is there a way for me
to bring nonStaticMethodHandle more near to staticMethodHandle, short of
making it static?

bye Jochen
Vladimir Ivanov
2018-02-19 13:31:42 UTC
Post by Jochen Theodorou
I actually never dared to ask, what kind of information is really
provided by the java compiler here to make the static version so fast?

The Java compiler doesn't do anything special in that case. All the "magic"
happens during JIT compilation: the JIT compiler extracts the method handle
instance from the static final field (as if it were a constant from the class
constant pool) and inlines through MH.invokeExact() down to the target
method.

Post by Jochen Theodorou
Is it because the static final version becomes a member of the class
pool? Is the lambdafactory so fast, because here the handle will become
the member of the pool of the generated class? And is there a way for me

In that particular case, no method handles are involved.
LambdaMetafactory produces a class file w/o any method handle constants.
The target method is directly referenced from bytecode [1].

Post by Jochen Theodorou
to bring nonStaticMethodHandle more near to staticMethodHandle, short of
making it static?

CallSites are the best you can get (JITs treat CallSite.target as
constant and aggressively inline through it), but you have to bind the
CallSite instance either to an invokedynamic call site or put it into a
static final field.

If such a scheme doesn't work for you, there's no way to match the
performance of invocations on constant method handles.

The best thing you can do is to wrap the method handle constant in a newly
created class (put it into the constant pool or a static final field) and
define a method which invokes the method handle constant (both indy &
MH.invokeExact() work). The method should either implement a method from
a super-interface or override a method from a super-class (so there's a
way to directly reference it at use sites). The latter is preferable,
because invokevirtual is faster than invokeinterface. (LambdaMetafactory
does the former and that's the reason it can't beat MH.invokeExact() on
non-constant MH).
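
A hand-written sketch of that wrapper idea might look like the following. The names ConstantHandleWrapper, Getter and DogNameGetter are hypothetical; a framework would presumably generate one such subclass per accessor at runtime rather than write it by hand, and this sketch makes no claim about the resulting performance.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class ConstantHandleWrapper {
    public static final class Dog {
        private final String name;
        public Dog(String name) { this.name = name; }
        public String getName() { return name; }
    }

    // Super-class referenced at use sites, so calls go through invokevirtual.
    public abstract static class Getter {
        public abstract Object get(Object bean);
    }

    // One wrapper class per accessor; the handle lives in a static final
    // field, which lets the JIT treat it as a constant.
    public static final class DogNameGetter extends Getter {
        private static final MethodHandle MH;
        static {
            try {
                MH = MethodHandles.lookup()
                        .findVirtual(Dog.class, "getName", MethodType.methodType(String.class))
                        .asType(MethodType.methodType(Object.class, Object.class));
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        @Override
        public Object get(Object bean) {
            try {
                return MH.invokeExact(bean);   // call site type is (Object)Object
            } catch (RuntimeException | Error e) {
                throw e;
            } catch (Throwable t) {
                throw new IllegalStateException(t);
            }
        }
    }

    public static void main(String[] args) {
        Getter getter = new DogNameGetter();
        System.out.println(getter.get(new Dog("Fido")));
    }
}
```

Use sites hold a Getter reference and call get(), so the virtual call still has to be devirtualized by the JIT, but once it is, the handle behind it is a constant.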

Best regards,
Vladimir Ivanov

[1]
final class org.lmf.LMF$$Lambda$37 implements java.util.function.Function
...
Constant pool:
...
  #19 = Methodref #15.#18 // org/lmf/LMF$Dog.getName:()Ljava/lang/String;
...

  public java.lang.Object apply(java.lang.Object);
    descriptor: (Ljava/lang/Object;)Ljava/lang/Object;
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=1, locals=2, args_size=2
         0: aload_1
         1: checkcast #15 // class org/lmf/LMF$Dog
         4: invokevirtual #19 // Method org/lmf/LMF$Dog.getName:()Ljava/lang/String;
         7: areturn
Jochen Theodorou
2018-02-19 14:13:33 UTC
On 19.02.2018 14:31, Vladimir Ivanov wrote:
[...]
Post by Vladimir Ivanov
CallSites are the best you can get (JITs treat CallSite.target as
constant and aggressively inlines through them), but you have to bind
CallSite instance either to invokedynamic call site or put it into
static final field.

And that really extends to MutableCallSite? In a dynamic language where
you depend on the instance types you cannot do all that much with a
non-mutable callsite.

[...]
Post by Vladimir Ivanov
The best thing you can do is to wrap method handle constant into a newly
created class (put it into constant pool or static final field) and
define a method which invokes the method handle constant (both indy &
MH.invokeExact() work). The method should either implement a method from
super-interface or overrides a method from a super-class (so there's a
way to directly reference it at use sites). The latter is preferable,
because invokevirtual is faster than invokeinterface. (LambdaMetafactory
does the former and that's the reason it can't beat MH.invokeExact() on
non-constant MH).

that is indeed something to try, nice idea. Now finding the time to
actually do it :(

bye Jochen
Vladimir Ivanov
2018-02-19 14:47:45 UTC
Post by Jochen Theodorou
[...]
Post by Vladimir Ivanov
CallSites are the best you can get (JITs treat CallSite.target as
constant and aggressively inlines through them), but you have to bind
CallSite instance either to invokedynamic call site or put it into
static final field.
And that really extends to MutableCallsite? In a dynamic language where
you depend on the instance types you cannot do all that much with a
non-mutable callsite.

Yes, it covers all flavors of CallSites. In the case of
Mutable/VolatileCallSite, the JIT compiler records a dependency on the
CallSite target value and invalidates all dependent nmethods when the
CallSite target changes. This doesn't induce any overhead at runtime and
allows peak performance to be reached again after every CallSite change
(due to recompilation), but it doesn't favor regularly changing CallSites
(this manifests as continuous recompilations at runtime).
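
As a small illustration of the static final field variant, the sketch below keeps a MutableCallSite and its dynamic invoker in static final fields and retargets the site once. The class and method names (MutableSiteSketch, hello, bye, retarget) are invented for the example, and it only demonstrates the API usage and the visible behaviour, not the JIT dependency tracking itself.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

public class MutableSiteSketch {
    public static String hello(String s) { return "hello " + s; }
    public static String bye(String s)   { return "bye " + s; }

    private static final MethodType TYPE =
            MethodType.methodType(String.class, String.class);

    // Both the call site and its invoker are static final, so the JIT can
    // treat the current target as a constant until it is changed.
    private static final MutableCallSite SITE = new MutableCallSite(find("hello"));
    private static final MethodHandle INVOKER = SITE.dynamicInvoker();

    private static MethodHandle find(String name) {
        try {
            return MethodHandles.lookup().findStatic(MutableSiteSketch.class, name, TYPE);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static String call(String arg) {
        try {
            return (String) INVOKER.invokeExact(arg);
        } catch (RuntimeException | Error e) {
            throw e;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Meant to be rare: every retarget invalidates code compiled against
    // the previous target.
    public static void retarget(String name) {
        SITE.setTarget(find(name));
        MutableCallSite.syncAll(new MutableCallSite[] { SITE });
    }

    public static void main(String[] args) {
        System.out.println(call("Fido"));
        retarget("bye");
        System.out.println(call("Fido"));
    }
}
```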

Best regards,
Vladimir Ivanov
Remi Forax
2018-02-19 19:07:06 UTC
----- Original message -----
Sent: Monday, 19 February 2018 15:47:45
Subject: Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?
Post by Jochen Theodorou
[...]
Post by Vladimir Ivanov
CallSites are the best you can get (JITs treat CallSite.target as
constant and aggressively inlines through them), but you have to bind
CallSite instance either to invokedynamic call site or put it into
static final field.
And that really extends to MutableCallsite? In a dynamic language where
you depend on the instance types you cannot do all that much with a
non-mutable callsite.
Yes, it covers all flavors of CallSites. In case of
Mutable/VolatileCallSite, JIT-compiler records a dependency on CallSite
target value and invalidates all dependent nmethods when CallSite target
changes. It doesn't induce any overhead at runtime and allows to reach
peak performance after every CallSite change (due to recompilation), but
it doesn't favor regularly changing CallSites (manifests as continuous
recompilations at runtime).

For the sake of completeness, I will just add that this is only true for
call sites that are attached to a bytecode, i.e. the ones that are returned
by a bootstrap method. If you allocate a CallSite and store it in a local
variable, it will not magically turn itself into a constant.

And the VM trusts you, i.e. if you mutate a MutableCallSite too
frequently (for example at each call), it will be dog slow because the JIT
will optimize/deoptimize at each call.

Rémi
Geoffrey De Smet
2018-02-19 13:41:32 UTC
Thank you for the insight, Vladimir.
Post by Vladimir Ivanov
In staticMethodHandle target method is statically known [1], but in
case of lambdaMetafactory [2] compiler has to rely on profiling info
to devirtualize Function::apply(). The latter requires exact type
check on the receiver at runtime and that explains the difference you
are seeing.

Ah, so it's unlikely that a future JDK version could eliminate
that 10% difference between LambdaMetafactory and staticMethodHandle?

Good to know.

Post by Vladimir Ivanov
But comparing that with nonStaticMethodHandle is not fair: there's no
inlining happening there.

Agreed.

However, for Java framework developers,
it would be really useful to have inlining for non-static method handles
too (see Charles's thread),
because - unlike JVM language developers - we can't use static method
handles and don't want to use code generation.

For example, if a JPA or JAXB implementation did use static fields,
the code to call methods on a domain hierarchy of classes would look
like this:

public final class MyAccessors {

    private static final MethodHandle handle1; // Person.getName()
    private static final MethodHandle handle2; // Person.getAge()
    private static final MethodHandle handle3; // Company.getName()
    private static final MethodHandle handle4; // Company.getAddress()
    private static final MethodHandle handle5; // ...
    private static final MethodHandle handle6;
    private static final MethodHandle handle7;
    private static final MethodHandle handle8;
    private static final MethodHandle handle9;
    ...
    private static final MethodHandle handle1000;

}

And furthermore, it would break down with domain hierarchies
that have more than 1000 getters/setters.


With kind regards,
Geoffrey De Smet
Vladimir Ivanov
2018-02-19 14:08:21 UTC
Post by Geoffrey De Smet
Ah, so it's unlikely that a future JDK version could eliminate
that 10% difference between LambdaMetafactory and staticMethodHandle?

Yes, that's correct.

Post by Geoffrey De Smet
However, for java framework developers,
it would be really useful to have inlining for non-static method handles
too (see Charles's thread),
because - unlike JVM language developers - we can't use static method
handles and don't want to use code generation.

Though inlining is desirable, the benefits quickly diminish with the number
of cases. (For example, C2 only inlines up to 2 targets and only in the case
of a bimorphic call site, i.e. one where only 2 receiver classes have ever
been seen.)

With non-constant method handles it's even worse: just by looking at the
call site we can't say anything about what will be called (and how!)
except its signature (reified as MethodType instance at runtime).
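
The point can be seen in a few lines of plain Java. In the sketch below (OpaqueHandleSketch and its helper methods are made-up names), a single invokeExact call site receives different handles; the only thing fixed at that site is the (String)String signature, and a handle of any other type is rejected at runtime with WrongMethodTypeException.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

public class OpaqueHandleSketch {
    public static String upper(String s) { return s.toUpperCase(); }
    public static String trim(String s)  { return s.trim(); }

    public static MethodHandle find(String name) {
        try {
            return MethodHandles.lookup().findStatic(OpaqueHandleSketch.class, name,
                    MethodType.methodType(String.class, String.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // One call site, arbitrary handle: nothing here says which target runs,
    // only that its type must be exactly (String)String.
    public static String invoke(MethodHandle mh, String arg) {
        try {
            return (String) mh.invokeExact(arg);
        } catch (RuntimeException | Error e) {
            throw e;   // includes WrongMethodTypeException
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(invoke(find("upper"), " fido "));
        System.out.println(invoke(find("trim"), " fido "));
        MethodHandle generic = find("upper")
                .asType(MethodType.methodType(Object.class, Object.class));
        try {
            invoke(generic, "fido");
        } catch (WrongMethodTypeException e) {
            System.out.println("rejected: " + generic.type());
        }
    }
}
```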

There were some discussions about implementing value profiling for MH
invokers (invoke()/invokeExact()), but it can only benefit cases where
the same MethodHandle instance is used always/most of the time.

I seriously doubt it scales well to the use cases you have in mind (like
JPA/JAXB).

Best regards,
Vladimir Ivanov
Wenlei Xie
2018-02-19 20:36:38 UTC
Permalink
Post by Geoffrey De Smet
However, for java framework developers,
it would be really useful to have inlining for non-static method handles
too (see Charles's thread),

Is the problem that a non-static MethodHandle doesn't get customized, or is
it that the benchmark uses a new MethodHandle from reflection each
time?

I remember that a MethodHandle is customized once it has been called more
than a threshold number of times (127 is the default). So as long as you
keep using the same MethodHandle over time, you get the performance benefit
from customization, right?
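
A minimal sketch of what "using the same MethodHandle over time" could look like in framework code: each handle is looked up once and cached, so every later call goes through the same instance. The AccessorCache class, its key format and the Person type are hypothetical. Whether and when the JVM customizes such a handle is an implementation detail that this code does not control.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class AccessorCache {

    // Hypothetical domain class, for illustration only.
    static final class Person {
        private final String name;
        Person(String name) { this.name = name; }
        public String getName() { return name; }
    }

    private static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();

    // One handle per "owner#method" key, created on first use and then reused.
    private final Map<String, MethodHandle> handles = new ConcurrentHashMap<>();

    MethodHandle getter(Class<?> owner, String methodName, Class<?> returnType) {
        String key = owner.getName() + "#" + methodName;
        return handles.computeIfAbsent(key, k -> {
            try {
                return LOOKUP.findVirtual(owner, methodName, MethodType.methodType(returnType))
                        .asType(MethodType.methodType(Object.class, Object.class));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        });
    }

    // Reads a property through the cached handle.
    Object read(Class<?> owner, String methodName, Class<?> returnType, Object target) {
        try {
            return getter(owner, methodName, returnType).invokeExact(target);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

The cache only guarantees that the same handle instance is reused; it says nothing about whether the call in read() gets inlined.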




Best,
Wenlei
--
Best Regards,
Wenlei Xie (谢文磊)

Email: ***@gmail.com
Wenlei Xie
2018-02-19 20:43:03 UTC
Permalink
Never mind. I missed some points in the previous discussion. Static method
JIT-compiler extracts method handle instance from static final field (as
if it were a constant from class constant pool) and inlines through
MH.invokeExact() down to the target method.

Is this an optimization orthogonal to MethodHandle customization?

Best,
Wenlei
--
Best Regards,
Wenlei Xie (谢文磊)

Email: ***@gmail.com
Vladimir Ivanov
2018-02-19 21:10:41 UTC
Permalink
Post by Wenlei Xie
Never mind. I miss some points in the previous discussion. Static method
JIT-compiler extracts method handle instance from static final field
(as if it were a constant from class constant pool) and inlines through
MH.invokeExact() down to the target method.
Is an orthogonal optimization with MethodHandle customization?
Yes, they are complementary. LambdaForm customization is applied to
method handles observed at MH.invokeExact()/invoke() call sites as
non-constants (in JIT-compiled code). There won't be any customization
applied (at least, at that particular call site) to a method handle
coming from a static final field.
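
The distinction can be sketched as follows, reusing the shape of the benchmark. The class ConstantVsNonConstant and its Dog type are stand-ins assumed for this example. Both methods return the same value; what differs is only what the JIT can know about the handle at each call site.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class ConstantVsNonConstant {

    // Stand-in for the benchmark's Dog class.
    static final class Dog {
        private final String name;
        Dog(String name) { this.name = name; }
        public String getName() { return name; }
    }

    // Read from a static final field: the JIT can treat this handle as a constant
    // and inline through invokeExact() down to Dog::getName.
    private static final MethodHandle STATIC_HANDLE = lookupGetName();

    // Read from an instance field: at the call site this handle is observed
    // as a non-constant, which is the case LambdaForm customization targets.
    private final MethodHandle instanceHandle = lookupGetName();

    private static MethodHandle lookupGetName() {
        try {
            return MethodHandles.lookup()
                    .findVirtual(Dog.class, "getName", MethodType.methodType(String.class))
                    .asType(MethodType.methodType(Object.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    Object viaStaticHandle(Object dog) {
        try {
            return STATIC_HANDLE.invokeExact(dog);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    Object viaInstanceHandle(Object dog) {
        try {
            return instanceHandle.invokeExact(dog);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```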

Best regards,
Vladimir Ivanov
Post by Geoffrey De Smet
Hi guys,
I ran the following JMH benchmark on JDK 9 and JDK 8.
Source code and detailed results below.
Benchmark on JDK 9        Score
staticMethodHandle          2.770
lambdaMetafactory          3.052    // 10% slower
nonStaticMethodHandle   5.250    // 90% slower
Why is LambdaMetafactory 10% slower than a static MethodHandle
but 80% faster than a non-static MethodHandle?
Source code (copy paste ready)
====================
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;
import org.openjdk.jmh.annotations.Be
<http://org.openjdk.jmh.annotations.Be>nchmark;
import org.openjdk.jmh.annotations.Be
<http://org.openjdk.jmh.annotations.Be>nchmarkMode;
import org.openjdk.jmh.annotations.Fo
<http://org.openjdk.jmh.annotations.Fo>rk;
import org.openjdk.jmh.annotations.Me
<http://org.openjdk.jmh.annotations.Me>asurement;
import org.openjdk.jmh.annotations.Mo
<http://org.openjdk.jmh.annotations.Mo>de;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Sc
<http://org.openjdk.jmh.annotations.Sc>ope;
import org.openjdk.jmh.annotations.St
<http://org.openjdk.jmh.annotations.St>ate;
import org.openjdk.jmh.annotations.Warmup;
//Benchmark on JDK 9     Mode  Cnt  Score   Error  Units
//staticMethodHandle     avgt   30  2.770 ± 0.023  ns/op
// Baseline
//lambdaMetafactory      avgt   30  3.052 ± 0.004  ns/op
// 10% slower
//nonStaticMethodHandle  avgt   30  5.250 ± 0.137  ns/op
// 90% slower
//Benchmark on JDK 8     Mode  Cnt  Score   Error  Units
//staticMethodHandle     avgt   30  2.772 ± 0.022  ns/op
// Baseline
//lambdaMetafactory      avgt   30  3.060 ± 0.007  ns/op
// 10% slower
//nonStaticMethodHandle  avgt   30  5.037 ± 0.022  ns/op
// 81% slower
@Warmup(iterations = 5, time = 1, timeUnit =
TimeUnit.SECONDS)
@Measurement(iterations = 10, time = 1, timeUnit =
TimeUnit.SECONDS)
@Fork(3)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)
public class LamdaMetafactoryWeirdPerformance {
     //
************************************************************************
     // Set up of the 3 approaches.
     //
************************************************************************
     // Unusable for Java framework developers. Only
usable by JVM language developers. Baseline.
     private static final MethodHandle staticMethodHandle;
     // Usuable for Java framework developers. 30% slower
     private final Function lambdaMetafactoryFunction;
     // Usuable for Java framework developers. 100% slower
     private final MethodHandle nonStaticMethodHandle;
     static {
         // Static MethodHandle setup
         try {
             staticMethodHandle = MethodHandles.lookup()
                     .findVirtual(Dog.class, "getName",
MethodType.methodType(String.class))
.asType(MethodType.methodType(Object.class, Object.class));
         } catch (NoSuchMethodException |
IllegalAccessException e) {
             throw new IllegalStateException(e);
         }
     }
     public LamdaMetafactoryWeirdPerformance() {
         try {
             MethodHandles.Lookup lookup =
MethodHandles.lookup();
             // LambdaMetafactory setup
             CallSite site =
LambdaMetafactory.metafactory(lookup,
                     "apply",
                     MethodType.methodType(Function.class),
MethodType.methodType(Object.class, Object.class),
                     lookup.findVirtual(Dog.class,
"getName", MethodType.methodType(String.class)),
MethodType.methodType(String.class, Dog.class));
             lambdaMetafactoryFunction = (Function)
site.getTarget().invokeExact();
             // Non-static MethodHandle setup
             nonStaticMethodHandle = lookup
                     .findVirtual(Dog.class, "getName",
MethodType.methodType(String.class))
.asType(MethodType.methodType(Object.class, Object.class));
         } catch (Throwable e) {
             throw new IllegalStateException(e);
         }
     }
     //
************************************************************************
     // Benchmark
     //
************************************************************************
     private Object dogObject = new Dog("Fido");
     public Object _1_staticMethodHandle() throws
Throwable {
         return staticMethodHandle.invokeExact(dogObject);
     }
     public Object _2_lambdaMetafactory() {
         return lambdaMetafactoryFunction.apply(dogObject);
     }
     public Object _3_nonStaticMethodHandle() throws
Throwable {
         return
nonStaticMethodHandle.invokeExact(dogObject);
     }
     private static class Dog {
         private String name;
         public Dog(String name) {
this.name <http://this.name> = name;
         }
         public String getName() {
             return name;
         }
     }
}
With kind regards,
Geoffrey De Smet
_______________________________________________
mlvm-dev mailing list
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
<http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev>
_______________________________________________
mlvm-dev mailing list
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
<http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev>
--
Best Regards,
Wenlei Xie (谢文磊)
Wenlei Xie
2018-02-19 22:54:56 UTC
Permalink
Thank you Vladimir for the explanation!
Post by Vladimir Ivanov
In both staticMethodHandle & lambdaMetafactory Dog::getName is inlined,
but using different mechanisms.
In staticMethodHandle target method is statically known [1], but in case
of lambdaMetafactory [2] compiler has to rely on profiling info to
devirtualize Function::apply(). The latter requires exact type check on the
receiver at runtime and that explains the difference you are seeing.
But comparing that with nonStaticMethodHandle is not fair: there's no
inlining happening there.

Sorry if it's a dumb question, but why can't nonStaticMethodHandle get
inlined here? In the benchmark it's always the same line with the same
final MethodHandle variable, so can't the JIT use some profiling info to
inline it (similar to the function object generated by LambdaMetafactory)?
Or can it not, since invokeExact's PolymorphicSignature makes it quite special?

Also, does that mean that if we try to pollute the LambdaMetafactory profile
(e.g. with 3 different function objects) to prevent inlining, we are likely
to see similar performance? :)

Best,
Wenlei


On Mon, Feb 19, 2018 at 4:00 AM, Vladimir Ivanov wrote:
Post by Vladimir Ivanov
Geoffrey,
In both staticMethodHandle & lambdaMetafactory Dog::getName is inlined,
but using different mechanisms.
In staticMethodHandle target method is statically known [1], but in case
of lambdaMetafactory [2] compiler has to rely on profiling info to
devirtualize Function::apply(). The latter requires exact type check on the
receiver at runtime and that explains the difference you are seeing.
But comparing that with nonStaticMethodHandle is not fair: there's no
inlining happening there.
If you want a fair comparison, then you have to measure with polluted
profile so no inlining happens. In that case [3] non-static MethodHandles
LMF._4_lmf_fs avgt 10 20.020 ± 0.635 ns/op
LMF._4_lmf_mhs avgt 10 18.360 ± 0.181 ns/op
(scores for 3 invocations in a row.)
Best regards,
Vladimir Ivanov
[1] 715 126 b org.lmf.LMF::_1_staticMethodHandle (11 bytes)
    ...
    @ 37 java.lang.invoke.DirectMethodHandle$Holder::invokeVirtual (14 bytes) force inline by annotation
    @ 1 java.lang.invoke.DirectMethodHandle::internalMemberName (8 bytes) force inline by annotation
    @ 10 org.lmf.LMF$Dog::getName (5 bytes) accessor
[2] 678 117 b org.lmf.LMF::_2_lambdaMetafactory (14 bytes)
    @ 8 org.lmf.LMF$$Lambda$37/552160541::apply (8 bytes) inline (hot)
    \-> TypeProfile (6700/6700 counts) = org/lmf/LMF$$Lambda$37
    @ 4 org.lmf.LMF$Dog::getName (5 bytes) accessor
[3] http://cr.openjdk.java.net/~vlivanov/misc/LMF.java
    // Each call links a new LambdaMetafactory call site and returns its Function.
    static Function make() throws Throwable {
        CallSite site = LambdaMetafactory.metafactory(LOOKUP,
                "apply",
                MethodType.methodType(Function.class),
                MethodType.methodType(Object.class, Object.class),
                LOOKUP.findVirtual(Dog.class, "getName",
                        MethodType.methodType(String.class)),
                MethodType.methodType(String.class, Dog.class));
        return (Function) site.getTarget().invokeExact();
    }

    private Function[] fs = new Function[] {
        make(), make(), make()
    };

    private MethodHandle[] mhs = new MethodHandle[] {
        nonStaticMethodHandle,
        nonStaticMethodHandle,
        nonStaticMethodHandle
    };

    // Three Function.apply() calls per operation through one call site.
    @Benchmark
    public Object _4_lmf_fs() throws Throwable {
        Object r = null;
        for (Function f : fs) {
            r = f.apply(dogObject);
        }
        return r;
    }

    // Three MethodHandle.invokeExact() calls per operation.
    @Benchmark
    public Object _4_lmf_mhs() throws Throwable {
        Object r = null;
        for (MethodHandle mh : mhs) {
            r = mh.invokeExact(dogObject);
        }
        return r;
    }
--
Best Regards,
Wenlei Xie (谢文磊)

Vladimir Ivanov
2018-02-19 23:14:42 UTC
Permalink
Post by Wenlei Xie
Sorry if it's a dumb question, but why nonStaticMethodHandle cannot get
inlined here? -- In the benchmark it's always the same line with the
same final MethodHandle variable, can JIT based on some profiling info
to inline it (similar to the function object generated by
LambdaMetafactory). -- Or it cannot since invokeExact's
PolymorphicSignature makes it quite special?
Yes, method handle invokers are special, and ordinary (class-based) type
profiling doesn't work for them.

There was an idea to implement value profiling for MH invokers: record
the individual MethodHandle instances observed at invoker call sites and use
that to guide devirtualization & inlining decisions. But it looked way
too specialized to be beneficial in practice.
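
The value-profiling idea can be approximated in user code with a small inline cache. What follows is only a minimal sketch under stated assumptions: the names MhInlineCacheSketch, InlineCache, call, dogGetName and misses are hypothetical, it is not the code behind any link in this thread, and it says nothing about what HotSpot does internally. It assumes every handle passed in has the type (Object)Object. The call site remembers the first MethodHandle instance it sees and guards later calls with an identity check, so the guarded path invokes a handle that is a constant of the call site.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

public class MhInlineCacheSketch {

    public static class Dog {
        private final String name;
        public Dog(String name) { this.name = name; }
        public String getName() { return name; }
    }

    /** Call site of type (MethodHandle, Object)Object that remembers the first handle it sees. */
    static final class InlineCache extends MutableCallSite {
        private static final MethodType TARGET_TYPE =
                MethodType.methodType(Object.class, Object.class);
        private static final MethodHandle FALLBACK;
        private static final MethodHandle SAME;

        static {
            try {
                MethodHandles.Lookup lookup = MethodHandles.lookup();
                FALLBACK = lookup.findVirtual(InlineCache.class, "fallback",
                        MethodType.methodType(Object.class, MethodHandle.class, Object.class));
                SAME = lookup.findStatic(InlineCache.class, "same",
                        MethodType.methodType(boolean.class, MethodHandle.class, MethodHandle.class));
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        int misses; // number of times the uncached path ran (demo only)

        InlineCache() {
            super(MethodType.methodType(Object.class, MethodHandle.class, Object.class));
            setTarget(FALLBACK.bindTo(this));
        }

        static boolean same(MethodHandle expected, MethodHandle actual) {
            return expected == actual;
        }

        Object fallback(MethodHandle mh, Object arg) throws Throwable {
            misses++;
            // Guard: is the incoming handle the instance observed on this first call?
            MethodHandle test = SAME.bindTo(mh);
            // Fast path: ignore the handle argument and call the observed handle directly.
            MethodHandle fast = MethodHandles.dropArguments(mh, 0, MethodHandle.class);
            // Slow path: generic exact invoker for any other handle of the same type.
            MethodHandle slow = MethodHandles.exactInvoker(TARGET_TYPE);
            setTarget(MethodHandles.guardWithTest(test, fast, slow));
            return mh.invokeExact(arg);
        }
    }

    private static final InlineCache CACHE = new InlineCache();
    private static final MethodHandle CALL = CACHE.dynamicInvoker();

    /** Invokes mh through the caching call site. */
    public static Object call(MethodHandle mh, Object arg) {
        try {
            return CALL.invokeExact(mh, arg);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    /** Same handle shape as nonStaticMethodHandle in the benchmark: (Object)Object. */
    public static MethodHandle dogGetName() {
        try {
            return MethodHandles.lookup()
                    .findVirtual(Dog.class, "getName", MethodType.methodType(String.class))
                    .asType(MethodType.methodType(Object.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static int misses() { return CACHE.misses; }

    public static void main(String[] args) {
        MethodHandle getName = dogGetName();
        Object dog = new Dog("Fido");
        System.out.println(call(getName, dog)); // first call installs the guard
        System.out.println(call(getName, dog)); // second call takes the guarded path
        System.out.println("uncached runs: " + misses());
    }
}
```

The sketch is monomorphic on purpose: once a handle is cached, any other handle goes through the generic invoker and is never re-profiled. It also ignores concurrent updates of the call site.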
Post by Wenlei Xie
Also, does that mean if we try to pollute the LambdaMetafactory (e.g. by
3 different function objects) to prevent inline, we are likely to see
similar performance :)
Yes, performance is on par with a polluted profile. The benchmark [1]
measures the non-inlined case for invokeinterface and MH.invokeBasic (3
invocations/iter):

LMF._4_lmf_fs 20.020 ± 0.635 ns/op
LMF._4_lmf_mhs 18.360 ± 0.181 ns/op
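
For readers who want to try the polluted-profile setup outside JMH, here is a minimal standalone sketch. The names PollutedProfileSketch, callAll and distinctClasses are hypothetical. It links three separate LambdaMetafactory call sites and sends all three Function instances through a single apply() call site. Whether the three instances really have three different classes depends on the JDK's LambdaMetafactory implementation, so the sketch reports the count instead of assuming it. It measures nothing; it only shows the shape of the setup.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

public class PollutedProfileSketch {

    public static class Dog {
        private final String name;
        public Dog(String name) { this.name = name; }
        public String getName() { return name; }
    }

    /** Links a fresh LambdaMetafactory call site and returns its Function instance. */
    @SuppressWarnings({"rawtypes", "unchecked"})
    public static Function<Object, Object> make() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            CallSite site = LambdaMetafactory.metafactory(lookup,
                    "apply",
                    MethodType.methodType(Function.class),
                    MethodType.methodType(Object.class, Object.class),
                    lookup.findVirtual(Dog.class, "getName",
                            MethodType.methodType(String.class)),
                    MethodType.methodType(String.class, Dog.class));
            Function f = (Function) site.getTarget().invokeExact();
            return f;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    /** One apply() call site that sees every element of fs as its receiver. */
    public static Object callAll(List<Function<Object, Object>> fs, Object arg) {
        Object r = null;
        for (Function<Object, Object> f : fs) {
            r = f.apply(arg);
        }
        return r;
    }

    /** Counts how many different receiver classes the call site would observe. */
    public static int distinctClasses(List<Function<Object, Object>> fs) {
        Set<Class<?>> classes = new HashSet<>();
        for (Function<Object, Object> f : fs) {
            classes.add(f.getClass());
        }
        return classes.size();
    }

    public static void main(String[] args) {
        List<Function<Object, Object>> fs = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            fs.add(make());
        }
        System.out.println(callAll(fs, new Dog("Fido")));
        System.out.println("distinct receiver classes: " + distinctClasses(fs));
    }
}
```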

Best regards,
Vladimir Ivanov

[1] http://cr.openjdk.java.net/~vlivanov/misc/LMF.java
Jochen Theodorou
2018-02-19 23:38:41 UTC
Permalink
Post by Vladimir Ivanov
Post by Wenlei Xie
Sorry if it's a dumb question, but why nonStaticMethodHandle cannot
get inlined here? -- In the benchmark it's always the same line with
the same final MethodHandle variable, can JIT based on some profiling
info to inline it (similar to the function object generated by
LambdaMetafactory). -- Or it cannot since invokeExact's
PolymorphicSignature makes it quite special?
Yes, method handle invokers are special and ordinary type profiling
(class-based) doesn't work for them.
I am absolutely not up to date here, but there was talk about trace-based
type profiling. Did that become reality?

bye Jochen
Remi Forax
2018-02-20 13:23:23 UTC
Permalink
----- Original Message -----
Sent: Tuesday, February 20, 2018 00:14:42
Subject: Re: Why is LambdaMetafactory 10% slower than a static MethodHandle but 80% faster than a non-static MethodHandle?
Post by Wenlei Xie
Sorry if it's a dumb question, but why nonStaticMethodHandle cannot get
inlined here? -- In the benchmark it's always the same line with the
same final MethodHandle variable, can JIT based on some profiling info
to inline it (similar to the function object generated by
LambdaMetafactory). -- Or it cannot since invokeExact's
PolymorphicSignature makes it quite special?
Yes, method handle invokers are special and ordinary type profiling
(class-based) doesn't work for them.
There was an idea to implement value profiling for MH invokers: record
individual MethodHandle instances observed at invoker call sites and use
that to guide devirtualization & inlining decisions. But it looked way
too specialized to be beneficial in practice.
Here is some code that does exactly that:
https://gist.github.com/forax/7bf08669f58804991fd45656a671c381

[...]
Best regards,
Vladimir Ivanov
Rémi
Geoffrey De Smet
2018-02-20 09:09:37 UTC
Permalink
_______________________________________________
mlvm-dev mailing list
mlvm-***@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev