Roman Kennke rkennke

🏠
Working from home
  • Datadog
  • Zurich, Switzerland
  • X @rkennke
> - You are running with +AbortOnVMCompilationFailure and still see the
> performance drop or not?
On DEV we currently try to run with +AbortOnVMCompilationFailure to quickly find methods that C2 fails to compile. A few of them have good reason not to compile. This is likely not something to enable for PROD. If -TieredCompilation falls back to C1, then we can keep it enabled.
> - Are you still observing the performance drop when you run with raised node limits?
Raised node limits helped C2 compile some methods that would otherwise fail.
Unfortunately the issue is not reproducible. It happened 5 times in the last 30 days on 120 machines. The last time was on Friday, for a very critical component. From the application's point of view everything looked normal: several jstacks did not indicate anything wrong, there was nothing in the log file, and heap/GC were OK, but the process was running 20-50 times slower.
diff -r ab218a040145 src/share/vm/gc_implementation/shenandoah/shenandoahRootProcessor.cpp
--- a/src/share/vm/gc_implementation/shenandoah/shenandoahRootProcessor.cpp Tue Dec 08 16:07:59 2020 +0100
+++ b/src/share/vm/gc_implementation/shenandoah/shenandoahRootProcessor.cpp Tue Dec 08 20:39:28 2020 +0100
@@ -303,7 +303,7 @@
_serial_roots.oops_do(keep_alive, worker_id);
_dict_roots.oops_do(keep_alive, worker_id);
- _thread_roots.oops_do(keep_alive, &clds, &update_blobs, worker_id);
+ _thread_roots.oops_do(keep_alive, &clds, _update_code_cache ? NULL : &update_blobs, worker_id);
_cld_roots.cld_do(&clds, worker_id);
_thread_roots.oops_do(keep_alive, &clds, &update_blobs, worker_id); <--- updates *some* nmethods
_cld_roots.cld_do(&clds, worker_id);
if(_update_code_cache) {
_code_roots.code_blobs_do(&update_blobs, worker_id); <--- updates *all* nmethods
}
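The annotated snippet above suggests that, when _update_code_cache is set, nmethods reachable from thread stacks are visited twice: once through the thread roots and once through the full code-cache walk. Below is a minimal, self-contained sketch of that reasoning. NMethod, CountingBlobClosure, thread_roots_do, code_roots_do and process_roots are hypothetical stand-ins invented for illustration, not HotSpot types or functions; the sketch only counts closure applications and does not model the real root processor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an nmethod; only its identity matters here.
struct NMethod { int id; };

// Hypothetical closure that counts how often it is applied
// (plays the role of update_blobs in the snippet above).
struct CountingBlobClosure {
  int visits = 0;
  void do_blob(NMethod* nm) { assert(nm != nullptr); ++visits; }
};

// Models the thread-roots walk: it only sees nmethods found on thread stacks.
// A null closure means "do not process nmethods during this walk".
void thread_roots_do(const std::vector<NMethod*>& on_stack,
                     CountingBlobClosure* blobs) {
  if (blobs == nullptr) return;
  for (NMethod* nm : on_stack) blobs->do_blob(nm);
}

// Models the code-roots walk: it sees every nmethod in the code cache.
void code_roots_do(const std::vector<NMethod*>& all,
                   CountingBlobClosure* blobs) {
  for (NMethod* nm : all) blobs->do_blob(nm);
}

// Returns the total number of closure applications for one pass.
// skip_redundant_thread_walk mirrors the patched call, which passes NULL
// instead of &update_blobs when the whole code cache is walked anyway.
int process_roots(const std::vector<NMethod*>& all,
                  const std::vector<NMethod*>& on_stack,
                  bool update_code_cache,
                  bool skip_redundant_thread_walk) {
  CountingBlobClosure update_blobs;
  CountingBlobClosure* for_threads =
      (skip_redundant_thread_walk && update_code_cache) ? nullptr
                                                        : &update_blobs;
  thread_roots_do(on_stack, for_threads);
  if (update_code_cache) {
    code_roots_do(all, &update_blobs);
  }
  return update_blobs.visits;
}
```

With three nmethods in the code cache and two of them on a stack, the unpatched shape applies the closure five times, while the patched shape applies it three times. When update_code_cache is false, the thread walk still receives the closure, so on-stack nmethods are not skipped.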
Event: 295,288 Executing coalesced safepoint VM operation: ShenandoahInitMark
Event: 295,288 Pause Init Mark
Event: 295,413 Pause Init Mark done
Event: 295,413 Executing coalesced safepoint VM operation: ShenandoahInitMark done
Event: 295,413 Executing VM operation: RevokeBias done
Event: 295,414 Concurrent marking
Event: 295,417 Executing VM operation: RevokeBias
Event: 295,422 Executing VM operation: RevokeBias done
Event: 295,422 Executing VM operation: RevokeBias
Event: 295,423 Executing VM operation: RevokeBias done
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (/home/rkennke/src/openjdk/shenandoah-jdk8/hotspot/src/share/vm/gc_implementation/shenandoah/shenandoahForwarding.inline.hpp:71), pid=512644, tid=0x00007f8600dfd640
# Error: Shenandoah assert_correct failed; Object klass pointer must go to metaspace
Referenced from:
no interior location recorded (probably a plain heap scan, or detached oop)
Object:
[rkennke@localhost substratevm]$ mx native-image -H:+ReportExceptionStackTraces --allow-incomplete-classpath "-J-XX:FlightRecorderOptions=retransform=false" --no-fallback -ea -cp ~/src/graal-jfr/graalvm-jfr-tests/flat EventCommit
[eventcommit:35500] classlist: 988.29 ms, 0.96 GB
[eventcommit:35500] (cap): 628.34 ms, 0.96 GB
[eventcommit:35500] setup: 1,212.93 ms, 0.96 GB
Fatal error:java.lang.NoClassDefFoundError
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:603)
pick b3a827c707d Add stub for JfrThreadLocal and insert an instance into SubstrateVM Java Threads implementation
pick d1290475311 Add JfrOptions to read command line arguments for JFR
pick a344171200c Implement metadata parsing at build-time
pick 87c82a027ae Remove substitution for getTypeId(String), it does not exist in JDK11 and is probably not needed
pick fb669e9f838 Add javax.xml.datatype to build-time-initialized packages
pick edb52c55e72 Add some machinery to replace jfrTypeIds.hpp generation and lookup
pick cd0291ed2b0 Provide Metadata.getJfrTypeId() method as replacement for jfrTypeIds.hpp
pick 5b52723d267 Substitute Type.register() in order to get correct typeIDs for known types
pick ab877ac471e Simplify alias of knownTypes field
pick e23be2aab2f Remove substitution for MetadataRepository
Creating symlink jdk/lib/libverify.debuginfo
/usr/bin/ld: /home/rkennke/src/labs-openjdk-11/build/linux-x86_64-normal-server-release/support/native/java.base/libjava/childproc.o:/home/rkennke/src/labs-openjdk-11/src/java.base/unix/native/libjava/childproc.h:121: multiple definition of `parentPathv'; /home/rkennke/src/labs-openjdk-11/build/linux-x86_64-normal-server-release/support/native/java.base/libjava/ProcessImpl_md.o:/home/rkennke/src/labs-openjdk-11/src/java.base/unix/native/libjava/childproc.h:121: first defined here
collect2: error: ld returned 1 exit status
diff --git a/src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp b/src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp
index f19a93ea487..fb1b9ccbf20 100644
--- a/src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp
+++ b/src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp
@@ -348,27 +348,33 @@ void ShenandoahBarrierSetAssembler::load_reference_barrier(MacroAssembler* masm,
}
save_xmm_registers(masm);
+ address calladdr = NULL;
switch (kind) {
0x00007f2f2cb7650f: mov 0x10(%rbx),%rbp <==== Reference.get()
0x00007f2f2cb76513: testb $0x1,0x20(%r15)
0x00007f2f2cb76518: jne 0x00007f2f2cb76c42 <==== LRB may turn rbp into NULL
0x00007f2f2cb7651e: mov %rbp,%r12 <==== copy rbp (possibly NULL) to r12
0x00007f2f2cb76521: testb $0x2,0x20(%r15)
0x00007f2f2cb76526: jne 0x00007f2f2cb76c61 <==== keep-alive barrier
0x00007f2f2cb7652c: mov 0x50(%rsp),%r10
0x00007f2f2cb76531: cmp %r12,%r10
0x00007f2f2cb76534: jne 0x00007f2f2cb76562 <==== Not sure what that is, but branches to crashing subroutine, with r12 still NULL
0x00007f2f2cb76536: mov 0x38(%rbx),%r10