JVM Internals

Here’s a quick cheat sheet on JVM internals, highlighting key concepts and components that are useful for understanding how the JVM works under the hood:


1. Java Virtual Machine Overview

  • The JVM executes Java bytecode on any platform that has a JVM implementation. It provides an abstraction layer between compiled Java code and the underlying hardware and operating system.

2. JVM Architecture

  • Class Loader Subsystem: Responsible for loading, linking, and initializing classes.
    • Class Loading: Classes are loaded into memory from classpath or JAR files.
    • Class Linking: Includes verification, preparation, and resolution.
    • Class Initialization: Executes static blocks and initializes static fields.
  • Runtime Data Areas: Memory areas used by the JVM to store data during execution.
    • Method Area: Stores class-level data (e.g., method definitions, static variables). Shared by all threads.
    • Heap: Stores instances of classes (objects), including arrays. Shared by all threads.
    • Stack: One per thread. Stores a frame for each method call, including local variables and the operand stack.
    • PC Register (Program Counter): Each thread has a PC register that holds the address of the current instruction.
    • Native Method Stack: One per thread. Holds the frames of native (non-Java) methods that the thread calls.
  • Execution Engine: Executes bytecode.
    • Interpreter: Executes bytecode instructions one by one.
    • JIT (Just-In-Time) Compiler: Compiles frequently executed (“hot”) bytecode to native machine code at runtime for better performance.
    • Garbage Collector: Automatically manages memory (reclaims unreachable objects).
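
The Class Initialization step above can be observed directly. The following is a minimal sketch, assuming a hypothetical demo class named InitOrderDemo with a nested class Lazy (neither name comes from the JDK). It records events to show that a class's static block runs on first active use, not when the program starts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; all names here are illustrative only.
final class InitOrderDemo {
    // Records events in the order they happen.
    static final List<String> EVENTS = new ArrayList<>();

    static class Lazy {
        // Not a compile-time constant, so reading it counts as an
        // "active use" that triggers initialization of Lazy.
        static int value = 42;

        static {
            EVENTS.add("Lazy initialized");
        }
    }

    static List<String> run() {
        EVENTS.add("before first use");
        int v = Lazy.value; // on the first call, Lazy's static block runs here
        EVENTS.add("read value " + v);
        return EVENTS;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On the first call to run(), the "Lazy initialized" event lands between the other two, because the JVM initializes Lazy only when its static field is first read.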

3. Garbage Collection (GC)

  • GC Basics: Manages the lifecycle of objects in the heap, reclaiming memory for unused objects.
  • Young Generation: Newly created objects are allocated here. It includes:
    • Eden Space: Where new objects are allocated.
    • Survivor Spaces (S0 and S1): Store objects that survive garbage collection.
  • Old Generation (Tenured Generation): Objects that have lived longer are moved here.
  • Permanent Generation (PermGen) (Java 7 and earlier): Stored class metadata. Replaced in Java 8 by Metaspace, which is allocated from native memory.
  • GC Algorithms:
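    • Serial GC: Uses a single thread for garbage collection.
    • Parallel GC: Uses multiple threads for both young- and old-generation collections, optimizing for throughput.
    • CMS (Concurrent Mark-Sweep): Tries to minimize pause times by doing most of its marking and sweeping concurrently with the application. Deprecated in JDK 9 and removed in JDK 14.
    • G1 GC (Garbage First): Region-based collector that targets predictable pause times while keeping throughput high. The default collector since Java 9.
    • ZGC & Shenandoah: Low-latency collectors that do most of their work concurrently with application threads.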
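
To see which of the collectors above a running JVM actually uses, one option is the standard java.lang.management API. The sketch below assumes a hypothetical class name, GcInfo; the reported collector names depend on the JVM and the GC flags in effect, so no particular output is implied.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name is illustrative only.
final class GcInfo {
    // Returns the names of the collectors registered in this JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and accumulated time of collections since JVM start.
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Running it once with -XX:+UseSerialGC and once with -XX:+UseG1GC is a quick way to confirm that a GC flag took effect.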

4. Memory Layout

  • Heap:
    • Young Generation: Short-lived objects.
    • Old Generation: Long-lived objects.
  • Metaspace (Java 8+): Stores class metadata in native memory, outside the heap (replaces PermGen).
  • Stack: Each thread has its own stack for method invocation and local variables.
    • Thread Stack: Stores a frame for each method call, holding local variables and intermediate results.
  • Direct Memory (Off-Heap): Memory used for native code interactions and NIO buffers.
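
The split between heap, non-heap (such as Metaspace), and direct memory can be inspected at runtime. This is a rough sketch using the standard MemoryPoolMXBean and ByteBuffer APIs; the class name MemoryAreas is hypothetical, and the pool names printed vary by JVM and collector.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name is illustrative only.
final class MemoryAreas {
    // Lists memory pools of the given type (HEAP or NON_HEAP).
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    // Allocates a buffer outside the Java heap (direct memory).
    static ByteBuffer allocateOffHeap(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    public static void main(String[] args) {
        System.out.println("Heap pools:     " + poolNames(MemoryType.HEAP));
        System.out.println("Non-heap pools: " + poolNames(MemoryType.NON_HEAP));
        System.out.println("Direct buffer:  " + allocateOffHeap(1024).isDirect());
    }
}
```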

5. Threading and Concurrency

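  • Thread: Each Java thread (java.lang.Thread) is typically backed by an operating-system thread. The JVM also runs internal threads of its own (e.g., for GC and JIT compilation).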
  • Thread States: New, Runnable, Blocked, Waiting, Timed Waiting, and Terminated.
  • Synchronization: JVM supports synchronization using:
    • Intrinsic Locks (via synchronized blocks/methods).
    • Lock Objects (e.g., ReentrantLock).
    • Volatile Variables, which guarantee that writes are visible to other threads (but do not make compound operations such as count++ atomic).
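
The two locking styles above can be compared side by side. The following is a minimal sketch, assuming a hypothetical LockDemo class: several threads increment two counters, one guarded by an intrinsic lock and one by a ReentrantLock, and neither counter loses an update.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class; the name and fields are illustrative only.
final class LockDemo {
    private final Object monitor = new Object();
    private final ReentrantLock lock = new ReentrantLock();
    private int syncCount;
    private int lockCount;

    // Intrinsic lock: acquired and released by the synchronized block.
    void incrementSynchronized() {
        synchronized (monitor) {
            syncCount++;
        }
    }

    // Explicit lock: must be released in finally so it is never leaked.
    void incrementWithLock() {
        lock.lock();
        try {
            lockCount++;
        } finally {
            lock.unlock();
        }
    }

    // Runs the given number of threads and returns {syncCount, lockCount}.
    static int[] run(int threads, int perThread) {
        LockDemo demo = new LockDemo();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    demo.incrementSynchronized();
                    demo.incrementWithLock();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // join makes the workers' writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { demo.syncCount, demo.lockCount };
    }

    public static void main(String[] args) {
        int[] counts = run(4, 10_000);
        System.out.println(counts[0] + " " + counts[1]); // prints "40000 40000"
    }
}
```

Removing either lock would make the corresponding count nondeterministic, since count++ is a read-modify-write sequence rather than a single atomic step.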

6. Bytecode and Execution Flow

  • Bytecode: Intermediate code that the JVM interprets or compiles.
  • JVM Instructions: Operations like aload, astore, iconst, invokevirtual, etc.
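  • Method Invocation: Virtual calls (invokevirtual) are typically dispatched through a per-class method table (vtable), while invokedynamic calls are linked through a call site the first time they run.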
  • Bytecode Verification: Ensures the bytecode is valid, safe, and follows JVM rules.
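
To connect the instruction names above to source code, here is a small sketch with a hypothetical BytecodeDemo class. The comments show the instructions that javac typically emits for each method body, as displayed by javap -c; constant-pool indexes are omitted because they vary between compilations.

```java
// Hypothetical demo class; compile it and run "javap -c BytecodeDemo" to inspect.
final class BytecodeDemo {
    // Typical bytecode for this method body:
    //   iload_0    push the first int argument
    //   iload_1    push the second int argument
    //   iadd       pop both, push their sum
    //   ireturn    return the int on top of the stack
    static int add(int a, int b) {
        return a + b;
    }

    // Typical bytecode for this method body:
    //   aload_0        push the String reference
    //   invokevirtual  call String.length() via virtual dispatch
    //   ireturn        return the resulting int
    static int length(String s) {
        return s.length();
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));       // prints 5
        System.out.println(length("bytes")); // prints 5
    }
}
```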

7. Just-In-Time Compilation (JIT)

  • HotSpot: The default JVM in Oracle JDK and OpenJDK, which includes an adaptive JIT compiler.
  • JIT Compilation Process:
  1. Interpretation: Initially, bytecode is interpreted.
  2. Profiling: JVM collects data on which methods are “hot” (frequently executed).
  3. Compilation: Hot methods are compiled to machine code (native code).
  4. Inlining & Optimization: Further optimization techniques are applied (e.g., inlining, loop unrolling).
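
The process above is easiest to watch on a small hot loop. This sketch assumes a hypothetical HotLoop class; running it with the HotSpot flag -XX:+PrintCompilation should log square and sumOfSquares as they get compiled, though the exact log lines and timing depend on the JVM version and settings.

```java
// Hypothetical demo class. Try: java -XX:+PrintCompilation HotLoop
final class HotLoop {
    // Small method: a likely candidate for inlining once it becomes hot.
    static long square(int x) {
        return (long) x * x;
    }

    // Called repeatedly from main so the profiler marks it as hot.
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += square(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        long total = 0;
        // Enough iterations that the interpreter hands the loop to the JIT.
        for (int round = 0; round < 20_000; round++) {
            total += sumOfSquares(1_000);
        }
        System.out.println(total);
    }
}
```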

8. JVM Options and Configuration

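  • Memory Management:
    • -Xms (Initial heap size, e.g., -Xms512m).
    • -Xmx (Maximum heap size, e.g., -Xmx2g).
    • -XX:MaxMetaspaceSize (Cap on Metaspace size; unlimited by default).
  • GC Options:
    • -XX:+UseG1GC (Enable G1 garbage collector; already the default since Java 9).
    • -XX:+UseConcMarkSweepGC (Enable CMS GC; deprecated in JDK 9 and removed in JDK 14).
    • -XX:+PrintGCDetails (Print detailed GC logs; deprecated since JDK 9 in favor of unified logging, e.g., -Xlog:gc*).
  • JVM Flags:
    • -server (Select the server VM; 64-bit HotSpot builds use it by default, so the flag has no practical effect there).
    • -d64 (Use 64-bit JVM; deprecated in JDK 9 and since removed).
    • -XX:+AggressiveOpts (Enable experimental optimizations; deprecated in JDK 11 and since removed).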
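
To check which options a running JVM actually received, a program can query the standard management API. The following sketch assumes a hypothetical JvmSettings class; the values it prints depend entirely on how the JVM was launched.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class; the name is illustrative only.
final class JvmSettings {
    // Flags passed to the JVM itself (e.g. -Xmx2g), not program arguments.
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    // Upper bound on heap size in bytes; roughly reflects -Xmx.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("JVM arguments: " + jvmArguments());
        System.out.println("Max heap (MiB): " + maxHeapBytes() / (1024 * 1024));
    }
}
```

For example, launching it with java -Xmx256m JvmSettings should list -Xmx256m among the arguments and report a maximum heap close to 256 MiB.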

9. Java Native Interface (JNI)

  • JNI: A framework to allow Java to interact with native (non-Java) code.
  • Native Methods: Methods defined in a native language (like C or C++) but used in Java programs.
  • JNI Functions: Provide functions like GetStringUTFChars, NewObject, CallObjectMethod, etc., to interact with Java objects from native code.
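
On the Java side, a native method is just a declaration plus a library load. The sketch below is illustrative only: NativeGreeter, greet, and the library name "greeter" are all hypothetical, and no such native library ships with the JDK. It therefore guards the load and reports a missing library instead of calling into native code.

```java
// Hypothetical example; there is no real "greeter" native library.
final class NativeGreeter {
    // Declared in Java, implemented in C or C++. For a class in the default
    // package, the matching C function would be named Java_NativeGreeter_greet.
    static native String greet(String name);

    // Tries to load a native library by its platform-independent name.
    static boolean tryLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // Thrown when the library is not found on java.library.path.
            return false;
        }
    }

    public static void main(String[] args) {
        if (tryLoad("greeter")) {
            System.out.println(greet("JVM"));
        } else {
            System.out.println("Native library not found; skipping native call.");
        }
    }
}
```

The native implementation would use JNI functions such as GetStringUTFChars to read the String argument passed in from Java.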

10. Class File Structure

  • Class File: Java bytecode compiled into .class files. Structure:
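    • Magic Number: Identifies the file as a class file (always 0xCAFEBABE).
    • Minor and Major Versions: Version of the class file format, which determines the minimum JVM release able to load the class.
    • Constant Pool: Pool of constants referenced by the class (strings, class names, method references, etc.).
    • Access Flags: Class-level modifiers (e.g., public, final, abstract, interface).
    • Fields and Methods: Definition of fields and methods within the class, each with its own access flags.
    • Attributes: Additional metadata (e.g., code, exceptions).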
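
The first fields of this structure are easy to read by hand. The following is a minimal sketch, assuming a hypothetical ClassFileHeader class that reads the magic number and version fields of a class's own .class resource; it relies on the class file being reachable through getResourceAsStream, which holds for ordinary JDK and classpath classes.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper class; the name is illustrative only.
final class ClassFileHeader {
    final int magic;
    final int minor;
    final int major;

    private ClassFileHeader(int magic, int minor, int major) {
        this.magic = magic;
        this.minor = minor;
        this.major = major;
    }

    // Reads the first 8 bytes of the class file backing the given class.
    static ClassFileHeader read(Class<?> type) {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();           // u4 magic
            int minor = data.readUnsignedShort(); // u2 minor_version
            int major = data.readUnsignedShort(); // u2 major_version
            return new ClassFileHeader(magic, minor, major);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassFileHeader header = read(Object.class);
        System.out.printf("magic=0x%08X major=%d minor=%d%n",
                header.magic, header.major, header.minor);
    }
}
```

The major version maps to a Java release: for example, 52 corresponds to Java 8 and 61 to Java 17.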

11. Java Security Manager

  • Security Manager: Provides a security model for Java applications by enforcing security policies. Deprecated for removal since Java 17 (JEP 411), so it should not be relied on in new code.
  • Permissions: Controls access to system resources (e.g., file I/O, network).
  • Policy Files: Define rules about what classes can or cannot do.

12. JVM Languages and Polyglot JVM

  • Polyglot JVM: JVM supports many languages beyond Java, including Kotlin, Scala, Groovy, Clojure, and JRuby.
  • Bytecode Compatibility: Different languages compile to JVM bytecode, allowing interoperability.

13. JMX (Java Management Extensions)

  • JMX: Provides management and monitoring capabilities for the JVM and its running applications.
  • MBeans (Managed Beans): Java objects that expose system or application management functions.
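
MBeans can be queried in-process through the platform MBean server. This sketch assumes a hypothetical JmxRead class; the object name java.lang:type=Runtime and its VmName attribute belong to the standard Runtime MXBean that the JVM registers itself.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical helper class; the name is illustrative only.
final class JmxRead {
    // Reads the JVM name by attribute lookup on the Runtime MXBean.
    static String vmName() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName runtime = new ObjectName("java.lang:type=Runtime");
            return (String) server.getAttribute(runtime, "VmName");
        } catch (JMException e) {
            throw new IllegalStateException("JMX lookup failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("VM name via JMX: " + vmName());
    }
}
```

Tools such as JConsole read the same attributes, only over a remote or local JMX connection instead of an in-process call.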

14. JVM Debugging and Profiling

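  • JVM Debugging:
    • Remote Debugging: Start the JVM with the JDWP agent, e.g., -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005. The older -Xdebug and -Xrunjdwp options are legacy equivalents.
    • JDB: Command-line debugger.
  • Profiling:
    • VisualVM (formerly JVisualVM): A profiling tool for monitoring JVM performance. Bundled with the JDK through Java 8 and distributed separately since.
    • JConsole: For managing and monitoring the JVM over JMX.
    • Java Flight Recorder (JFR): For collecting low-level JVM events with low overhead.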

15. JVM Life Cycle and Execution Flow

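  1. Bootstrap ClassLoader: Loads core libraries (rt.jar up to Java 8; core modules such as java.base from Java 9 onward).
  2. Extension ClassLoader: Loads classes from the JDK extensions directory (Java 8 and earlier). Java 9 replaced it with the Platform ClassLoader.
  3. System (Application) ClassLoader: Loads classes from the classpath. Each loader delegates to its parent first and loads a class itself only if the parent cannot find it.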
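
The loader hierarchy above can be walked at runtime. The following is a small sketch with a hypothetical ClassLoaderChain class; the concrete loader class names it prints differ between Java versions, so none are shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name is illustrative only.
final class ClassLoaderChain {
    // Follows getParent() from the given loader up to the bootstrap loader.
    static List<String> chainFrom(ClassLoader start) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            chain.add(cl.getClass().getName());
        }
        // The bootstrap loader has no Java object; it is represented as null.
        chain.add("bootstrap (null)");
        return chain;
    }

    // Core classes such as String report a null class loader.
    static boolean loadedByBootstrap(Class<?> type) {
        return type.getClassLoader() == null;
    }

    public static void main(String[] args) {
        chainFrom(ClassLoader.getSystemClassLoader()).forEach(System.out::println);
        System.out.println("String loaded by bootstrap: " + loadedByBootstrap(String.class));
    }
}
```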

16. Native vs Managed Code

  • Managed Code: Code running under the control of the JVM (Java bytecode).
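  • Native Code: Code compiled to machine instructions for a specific platform (e.g., from C or C++) and executed directly by the CPU, outside the JVM's memory management and safety checks.
  • JVM Interoperability: JNI and JNA (Java Native Access, a third-party library built on top of JNI) allow Java programs to interact with native code.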

Key JVM Internals Concepts

  • Just-in-Time Compilation (JIT) vs Ahead-of-Time Compilation (AOT)
  • Method Handles and InvokeDynamic (introduced in Java 7)
  • JVM HotSpot optimizations and profiling techniques
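
As a brief illustration of the method handle concept listed above, the sketch below looks up String.length() as a MethodHandle and invokes it. The class name MethodHandleDemo is hypothetical; the java.lang.invoke calls are the standard API.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo class; the name is illustrative only.
final class MethodHandleDemo {
    // Calls String.length() through a method handle instead of a direct call.
    static int lengthViaHandle(String s) {
        try {
            MethodHandle length = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
            // invokeExact requires the call to match the handle type (String)int.
            return (int) length.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException("method handle call failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthViaHandle("hello")); // prints 5
    }
}
```

Unlike reflection, a method handle is type-checked when it is looked up, and the JIT can optimize calls through it much like ordinary calls.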

This cheat sheet covers the core components and concepts of the JVM internals. Understanding these will help you optimize your Java application’s performance, debug issues, and work with JVM-level configurations and options.
