Understanding the JVM in Depth: Bytecode Execution


昨日不可追 · Published 2017-04-16 22:29:30 · 3385 views

Contents

Preface
Stack frame structure
Method invocation
- Resolution
- Dispatch

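Preface

A physical machine executes instructions on top of the CPU, the hardware, the instruction set, and the operating system. A virtual machine, by contrast, is free to implement instruction execution itself. The JVM Specification defines the execution engine as a conceptual model that acts as a uniform facade for every JVM implementation. An execution engine usually runs bytecode in one of two ways: interpretation (decoding and executing the bytecode one instruction at a time) or compilation (just-in-time compiling the bytecode to native code and executing that).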

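Stack frame structure

Every method invocation, from the moment it starts until it exits, corresponds to one stack frame being pushed onto and later popped off the virtual machine stack.
Runtime stack frame

A stack frame is the data structure the virtual machine uses to support method invocation and method execution. It holds what a method needs in order to run: the local variable table, the operand stack, the dynamic linking information, and the method return address.
Executing one method may involve several levels of nested calls. For the execution engine only the current frame, the one that belongs to the current method, is active. That is the frame on top of the stack (a stack is last in, first out, LIFO).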
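The nesting of frames can be observed from inside a running program. The sketch below is an illustration added here, not part of the original example: the class FrameDemo and its method names are made up, and it assumes Java 9 or later for java.lang.StackWalker. It records the method names of the frames on the current thread's stack, topmost frame first.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class: all names are chosen for this illustration only.
class FrameDemo {

    // Its frame stays on the stack while middle() and innermost() run.
    static List<String> outer() {
        return middle();
    }

    static List<String> middle() {
        return innermost();
    }

    // Takes a snapshot of the stack as seen from the current (topmost) frame.
    static List<String> innermost() {
        return StackWalker.getInstance().walk(frames ->
                frames.map(StackWalker.StackFrame::getMethodName)
                      .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        // The list starts with innermost, middle, outer: last pushed, first listed.
        System.out.println(outer());
    }
}
```

Whatever appears after outer in the list depends on how the program was started, so only the first entries are meaningful here.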


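Stack frame: local variable table

The local variable table is the part of the stack frame that stores variable values, mainly the method's parameters and the local variables declared inside the method.
Its maximum capacity is already fixed when the source is compiled to a class file: the compiler records it in the method's Code attribute (the max_locals item). The following class and its disassembly show the slots in use: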

 public class TestClass {

	public static void main(String[] args) {
		int m = 0;
		inc(m);
	}

	public static int inc(int m) {
		int f = 2;
		return incNext(m + f);

	}

	public static int incNext(int m) {
		int g = 2;
		return m + g;
	}
}

javap -c .\TestClass.class
Compiled from "TestClass.java"
public class com.zs.jvm.byteCode.TestClass {
  public com.zs.jvm.byteCode.TestClass();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: iconst_0
       1: istore_1
       2: iload_1
       3: invokestatic  #2                  // Method inc:(I)I
       6: pop
       7: return

  public static int inc(int);
    Code:
       0: iconst_2
       1: istore_1
       2: iload_0
       3: iload_1
       4: iadd
       5: invokestatic  #3                  // Method incNext:(I)I
       8: ireturn

  public static int incNext(int);
    Code:
       0: iconst_2
       1: istore_1
       2: iload_0
       3: iload_1
       4: iadd
       5: ireturn
}

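The smallest unit of storage in the local variable table is the variable slot. Each slot must be able to hold one value of type boolean, byte, char, short, int, float, reference, or returnAddress (that is, every primitive type except long and double, plus references).

In HotSpot a slot is usually described as 32 bits wide. A 64-bit long or double is assigned two consecutive slots, each read or write is carried out as two single-slot accesses, and the JVM does not allow an instruction to access only one slot of the pair. Because the local variable table is private to its thread, reading and writing 64-bit values here raises no thread-safety concern.

The JVM specification says little about the reference type, but a reference should guarantee two things:
1. From the reference, the starting address of the object in the heap can be found, directly or indirectly.
2. From the reference, the type information of the object stored in the method area can be found, directly or indirectly.

Slots are addressed by index, starting at 0 and ending at the number of slots minus 1. Reading or writing the value at index n accesses slot n. If that value is 64 bits wide, the access covers slots n and n+1 together.

For an instance method, the slot at index 0 holds the reference to the object the method was invoked on, that is, this.

The returnAddress type is rarely seen nowadays, so it is not covered here.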
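To make the slot numbering concrete, here is a small sketch. SlotDemo and its members are hypothetical names introduced for this illustration; the slot numbers in the comments simply apply the rules described above (parameters first, two slots for a long, slot 0 for this in instance methods).

```java
// Hypothetical demo class; the slot numbers in the comments follow the rules above.
class SlotDemo {

    private final int base = 10;

    // Static method: there is no this, so parameters start at slot 0.
    //   a -> slot 0
    //   b -> slots 1 and 2 (a long occupies two consecutive slots)
    //   c -> slot 3
    static long sum(int a, long b, int c) {
        return a + b + c;
    }

    // Instance method: slot 0 holds this, the parameter x goes to slot 1
    // and the local variable y to slot 2.
    int addBase(int x) {
        int y = base + x;
        return y;
    }

    public static void main(String[] args) {
        System.out.println(sum(1, 2L, 3));             // 6
        System.out.println(new SlotDemo().addBase(5)); // 15
    }
}
```

Running javap -v on the compiled class should report locals=4 for sum and locals=3 for addBase, which is the max_locals value stored in each method's Code attribute.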

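Stack frame: operand stack

The JVM's interpreter is a stack-based execution engine, and the stack in that phrase is the operand stack.

The operand stack is a last-in, first-out (LIFO) structure. While bytecode instructions execute, they continually push values onto it and pop values off it.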
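How the operand stack and the local variable table cooperate can be sketched with a toy interpreter. The class below is hypothetical and handles only the six instructions that javap printed for incNext(int) earlier in this article; a real interpreter works on encoded bytes, not on strings.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy interpreter, written for this illustration only.
class OperandStackDemo {

    // Executes the six instructions javap printed for incNext(int).
    static int runIncNext(int m) {
        int[] locals = new int[2];                 // slot 0 = m, slot 1 = g
        Deque<Integer> stack = new ArrayDeque<>(); // the operand stack
        locals[0] = m;                             // arguments arrive in the locals

        String[] code = {"iconst_2", "istore_1", "iload_0", "iload_1", "iadd", "ireturn"};
        for (String op : code) {
            switch (op) {
                case "iconst_2": stack.push(2); break;                 // push constant 2
                case "istore_1": locals[1] = stack.pop(); break;       // pop into slot 1
                case "iload_0":  stack.push(locals[0]); break;         // push slot 0
                case "iload_1":  stack.push(locals[1]); break;         // push slot 1
                case "iadd": {                                         // pop two, push sum
                    int right = stack.pop();
                    int left = stack.pop();
                    stack.push(left + right);
                    break;
                }
                case "ireturn":  return stack.pop();                   // result is on top
                default: throw new IllegalStateException("unknown opcode " + op);
            }
        }
        throw new IllegalStateException("method ended without a return instruction");
    }

    public static void main(String[] args) {
        System.out.println(runIncNext(5)); // 7
    }
}
```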

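Stack frame: dynamic linking

Each stack frame holds a reference to the run-time constant pool of the method the frame belongs to. Method invocation instructions (invokespecial, invokestatic and so on, covered in detail later) take a symbolic reference in that constant pool as their operand.

Some of these symbolic references are converted into direct references to memory during class loading (the resolve step of linking, see 深入理解JVM一加载机制) or on first use. This conversion is called static resolution.

The rest are converted into direct references at run time, on each method invocation. This part is called dynamic linking.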
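The step from a symbolic reference to a direct reference happens inside the JVM, but the java.lang.invoke API offers a rough analogy that can be run. In the sketch below (ResolveDemo and callBySymbol are hypothetical names), a method is described only by its owner class, name and descriptor, and the lookup turns that description into a MethodHandle that can be called directly. This is an analogy under those assumptions, not how the interpreter itself resolves constant pool entries.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo class for this illustration.
class ResolveDemo {

    static int incNext(int m) {
        return m + 2;
    }

    // Turns a symbolic description (owner class, name, descriptor) into a
    // MethodHandle, which plays the role of the direct reference, and calls it.
    static int callBySymbol(String name, String descriptor, int arg) {
        try {
            MethodType type = MethodType.fromMethodDescriptorString(
                    descriptor, ResolveDemo.class.getClassLoader());
            MethodHandle target = MethodHandles.lookup()
                    .findStatic(ResolveDemo.class, name, type);
            return (int) target.invokeExact(arg);
        } catch (Throwable t) {
            throw new IllegalStateException("could not resolve " + name + descriptor, t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callBySymbol("incNext", "(I)I", 5)); // 7
    }
}
```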

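Stack frame: method return address

The return address determines where execution continues after the current method invocation exits.
A method invocation can end in two ways: normal completion, or abrupt completion through an exception.

On normal completion, the caller's PC register value can serve as the return address, and it is usually saved in the stack frame.
On exceptional exit, the return address is determined through the exception handler table, and the frame generally does not store it.

When a method exits, the caller's local variable table and operand stack are restored, the return value (if any) is pushed onto the caller's operand stack, and the PC register is adjusted to point at the instruction that follows the invocation.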
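The normal-completion case can be modelled in a few lines. The sketch below is a simplified, hypothetical model (the Frame class and the simulate method are invented here, and real frames are laid out very differently). It replays the moment when inc(int) from the earlier listing invokes incNext(int) at offset 5 and later resumes at offset 8.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of frames; a real JVM lays frames out very differently.
class ReturnDemo {

    static final class Frame {
        final int[] locals;
        final Deque<Integer> operands = new ArrayDeque<>();
        int pc; // offset of the next instruction to execute in this frame

        Frame(int maxLocals) {
            this.locals = new int[maxLocals];
        }
    }

    static int simulate(int m) {
        Deque<Frame> vmStack = new ArrayDeque<>();

        // Frame of inc(int), about to run "5: invokestatic incNext" with m + 2 on its operand stack.
        Frame caller = new Frame(2);
        caller.locals[0] = m;
        caller.locals[1] = 2;
        caller.operands.push(m + 2);
        caller.pc = 5;
        vmStack.push(caller);

        // invokestatic: pop the argument, create the callee frame, remember where to resume.
        Frame callee = new Frame(2);
        callee.locals[0] = caller.operands.pop();
        caller.pc = 8; // return address: the instruction after invokestatic
        vmStack.push(callee);

        // Body of incNext, condensed: g = 2, then push m + g.
        callee.locals[1] = 2;
        callee.operands.push(callee.locals[0] + callee.locals[1]);

        // ireturn: pop the callee frame and hand the value to the caller's operand stack.
        int result = callee.operands.pop();
        vmStack.pop();
        Frame current = vmStack.peek(); // the caller is the current frame again
        current.operands.push(result);

        // The caller resumes at offset 8 (its own ireturn) and returns the value on top.
        return current.operands.pop();
    }

    public static void main(String[] args) {
        System.out.println(simulate(5)); // 9, the same as inc(5)
    }
}
```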

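Method invocation

Method invocation is the step that determines which method is actually going to be called. It happens extremely often.

In a compiled class file, the targets of invocation instructions are symbolic references into the constant pool, not entry addresses in memory (direct references). This gives Java great room for dynamic extension, but it also adds complexity: which method really runs is often only settled during class loading (the resolve step of linking, see 深入理解JVM一加载机制) or even at run time.

Resolution

Executing bytecode mostly means executing the instructions inside methods. When the concrete method to run can already be determined at compile time, settling such a call is called resolution.

In the bytecode produced from Java source, every method call refers to a symbolic reference in the constant pool. Resolution converts that symbolic reference into a direct reference that contains the method's entry point in memory.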

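The bytecode has the following method invocation instructions:

//(private methods, instance constructors, superclass methods)
invokespecial
//(static methods)
invokestatic
//(virtual methods, including methods declared final)
invokevirtual
//(interface methods; the implementing method is determined at run time)
invokeinterface
//(the call site is linked at run time by a bootstrap method, then the linked target is invoked)
invokedynamic

Example: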

public class TestGCChild implements GCTest {

    static String TEST_NAME = "Test";

    static GCTest t = new TestGCChild();

    public static void main(String[] args) throws Exception {

        TestGCChild.easyStatic();// static method

        t.easy();// interface method

        new TestGCChild().easyUnStatic();// ordinary instance method

        new TestGCChild().easyFinal();// final instance method

    }

    @Override
    public String easy() {
        return "easy";
    }

    public static String easyStatic() {
        return "easy";
    }

    public String easyUnStatic() {
        return "easy";
    }

    public final String easyFinal() {
        return "easy";
    }
}

Compiled from "TestGCChild.java"
public class com.zs.test.TestGCChild implements com.zs.test.GCTest {
  static java.lang.String TEST_NAME;

  static com.zs.test.GCTest t;

  static {};
    Code:
       0: ldc           #14                 // String Test
       2: putstatic     #16                 // Field TEST_NAME:Ljava/lang/String;
       5: new           #1                  // class com/zs/test/TestGCChild
       8: dup
       9: invokespecial #18                 // Method "<init>":()V
      12: putstatic     #21                 // Field t:Lcom/zs/test/GCTest;
      15: return

  public com.zs.test.TestGCChild();
    Code:
       0: aload_0
       1: invokespecial #25                 // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]) throws java.lang.Exception;
    Code:
       0: invokestatic  #33                 // Method easyStatic:()Ljava/lang/String;
       3: pop
       4: getstatic     #21                 // Field t:Lcom/zs/test/GCTest;
       7: invokeinterface #37,  1           // InterfaceMethod com/zs/test/GCTest.easy:()Ljava/lang/String;
      12: pop
      13: new           #1                  // class com/zs/test/TestGCChild
      16: dup
      17: invokespecial #18                 // Method "<init>":()V
      20: invokevirtual #40                 // Method easyUnStatic:()Ljava/lang/String;
      23: pop
      24: new           #1                  // class com/zs/test/TestGCChild
      27: dup
      28: invokespecial #18                 // Method "<init>":()V
      31: invokevirtual #43                 // Method easyFinal:()Ljava/lang/String;
      34: pop
      35: return

  public java.lang.String easy();
    Code:
       0: ldc           #48                 // String easy
       2: areturn

  public static java.lang.String easyStatic();
    Code:
       0: ldc           #48                 // String easy
       2: areturn

  public java.lang.String easyUnStatic();
    Code:
       0: ldc           #48                 // String easy
       2: areturn

  public final java.lang.String easyFinal();
    Code:
       0: ldc           #48                 // String easy
       2: areturn
}

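Calls to private methods, instance constructors, superclass methods and static methods all satisfy the requirement of being known at compile time and unchangeable at run time: a private method cannot be overridden, and a static method belongs to its class and cannot be overridden either. A method declared final also has a fixed target because it cannot be overridden, so although it is invoked with invokevirtual, its target is known at compile time as well. Methods whose actual implementation can be settled this way are called non-virtual methods: everything invoked through invokespecial or invokestatic, plus final methods. All other methods are virtual methods, which need a polymorphic selection and are bound to the actual method late.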
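The difference between non-virtual and virtual methods shows up in behaviour as well as in bytecode. In the sketch below (NonVirtualDemo, Base and Derived are hypothetical names), the subclass declares methods with the same names as the static and private methods of its superclass, yet only the ordinary instance method is actually overridden.

```java
// Hypothetical classes for this illustration.
class NonVirtualDemo {

    static class Base {
        static String kind() { return "Base.kind"; }
        private String secret() { return "Base.secret"; }
        String callSecret() { return secret(); }      // always reaches Base.secret
        final String fixed() { return "Base.fixed"; } // cannot be overridden
        String name() { return "Base.name"; }         // virtual, may be overridden
    }

    static class Derived extends Base {
        static String kind() { return "Derived.kind"; }    // hides Base.kind, no override
        String secret() { return "Derived.secret"; }       // a new, unrelated method
        @Override String name() { return "Derived.name"; } // a real override
    }

    public static void main(String[] args) {
        Base b = new Derived();
        System.out.println(b.kind());        // Base.kind: chosen from the static type Base
        System.out.println(b.callSecret());  // Base.secret
        System.out.println(b.fixed());       // Base.fixed
        System.out.println(b.name());        // Derived.name: chosen from the actual type
    }
}
```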

Dispatch

Static dispatch:
Let us start with a small program. What does the following code print?

class Human {

}

class Man extends Human {

}

class Woman extends Human {

}

public class TestDispatch {

	public void test(Human h) {
		System.out.println("human");
	}

	public void test(Woman w) {
		System.out.println("Woman");
	}

	public void test(Man m) {
		System.out.println("Man");
	}

	public static void main(String[] args) {
		Human human = new Human();
		Human woman = new Woman();
		Human man = new Man();
		TestDispatch test = new TestDispatch();

		test.test(man);
		test.test(woman);
		test.test(human);
	}
}

//output:
//human
//human
//human

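In the example above, the declared type on the left of the equals sign (Human in Human man) is called the static type, or apparent type, of the variable. The type on the right of the equals sign is the actual type, the type of the object that is really created.
The static type is fixed at compile time and does not change, whereas the actual type is only known at run time and cannot be determined from the compiled bytecode.

Which of several overloaded methods runs is decided at compile time (the compiler picks the most suitable one) based on the static type of the argument. All three arguments have the static type Human, which is why the program prints human three times.

This leads to the concept of static dispatch: any dispatch that relies on the static type to choose the method version is static dispatch (dispatch can be understood as making the choice among polymorphic candidates).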
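Because the overload is picked from the static type, changing the static type changes the result even when the object stays the same. The following variant of the example is a sketch with hypothetical names (StaticDispatchDemo, and static overloads that return a string instead of printing) to make that visible.

```java
// Hypothetical variant of the example above; the overloads return a string instead of printing.
class StaticDispatchDemo {

    static class Human { }
    static class Man extends Human { }
    static class Woman extends Human { }

    static String test(Human h) { return "human"; }
    static String test(Man m) { return "Man"; }
    static String test(Woman w) { return "Woman"; }

    public static void main(String[] args) {
        Human man = new Man();
        System.out.println(test(man));        // human: the static type is Human
        System.out.println(test((Man) man));  // Man: the cast changes the static type
        Man realMan = new Man();
        System.out.println(test(realMan));    // Man: the static type is Man
    }
}
```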

Dynamic dispatch:

The most typical example of dynamic dispatch is overriding.
Consider this example:

class Human {

    String say() {
        System.out.println("human");
        return "human";
    }
}

class Son extends Human {

    public String say() {
        System.out.println("Son");
        return "Son";
    }

}

class Father extends Human {

    public String say() {
        System.out.println("Father");
        return "Father";
    }
}

public class TestGC {
    static Human human = new Human();

    static Human father = new Father();

    static Human son = new Son();

    public static void main(String[] args) throws Exception {
        human.say();// human
        father.say();// Father
        son.say();// Son
    }
}
//output:
//human
//Father
//Son

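As the output shows, the say() that runs is the one of the actual implementing type, not necessarily Human.say(), so the method that is executed clearly cannot be determined by static compilation. Now look at the bytecode of this code, in particular the instructions in main():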

public class com.zs.test.TestGC {
  static com.zs.test.Human human;

  static com.zs.test.Human father;

  static com.zs.test.Human son;

  static {};
    Code:
       0: new           #12                 // class com/zs/test/Human
       3: dup
       4: invokespecial #14                 // Method com/zs/test/Human."<init>":()V
       7: putstatic     #17                 // Field human:Lcom/zs/test/Human;
      10: new           #19                 // class com/zs/test/Father
      13: dup
      14: invokespecial #21                 // Method com/zs/test/Father."<init>":()V
      17: putstatic     #22                 // Field father:Lcom/zs/test/Human;
      20: new           #24                 // class com/zs/test/Son
      23: dup
      24: invokespecial #26                 // Method com/zs/test/Son."<init>":()V
      27: putstatic     #27                 // Field son:Lcom/zs/test/Human;
      30: return

  public com.zs.test.TestGC();
    Code:
       0: aload_0
       1: invokespecial #31                 // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]) throws java.lang.Exception;
    Code:
       0: getstatic     #17                 // Field human:Lcom/zs/test/Human;
       3: invokevirtual #39                 // Method com/zs/test/Human.say:()Ljava/lang/String;
       6: pop
       7: getstatic     #22                 // Field father:Lcom/zs/test/Human;
      10: invokevirtual #39                 // Method com/zs/test/Human.say:()Ljava/lang/String;
      13: pop
      14: getstatic     #27                 // Field son:Lcom/zs/test/Human;
      17: invokevirtual #39                 // Method com/zs/test/Human.say:()Ljava/lang/String;
      20: pop
      21: return
}

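As we can see, the compiled bytecode does not reveal the actual type of the receiver. So how does the JVM know which implementation to run, and how does it make the choice dynamically? The key is the invokevirtual instruction.

The dynamic lookup performed by invokevirtual works as follows:

1. Find the actual type of the object that the receiver reference on the operand stack points to, and call it C.
2. Look in C for a method whose name and descriptor match those in the constant. If one is found, check access permissions. If access is allowed, return the direct reference to this method and stop; otherwise throw IllegalAccessError.
3. If C has no matching method, repeat step 2 for each superclass of C, going up the inheritance hierarchy from bottom to top.
4. If no matching method is found in C or in any of its superclasses, throw AbstractMethodError.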
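The lookup steps above can be imitated with reflection. This is only a sketch under simplifying assumptions: VirtualLookupDemo and select are hypothetical names, only no-argument methods are handled, and the access check of step 2 is left out. The JVM does not use reflection for this, and in practice it typically relies on precomputed method tables.

```java
import java.lang.reflect.Method;

// Hypothetical demo that imitates the lookup with reflection.
class VirtualLookupDemo {

    static class Human { String say() { return "human"; } }
    static class Son extends Human { @Override String say() { return "Son"; } }
    static class Father extends Human { } // does not override say()

    // Walks from the receiver's actual class C up through its superclasses and
    // returns the first declared no-argument method with the given name.
    static Method select(Object receiver, String name) {
        for (Class<?> c = receiver.getClass(); c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredMethod(name); // step 2: is it declared in c?
            } catch (NoSuchMethodException notHere) {
                // step 3: not declared in c, continue with the direct superclass
            }
        }
        throw new AbstractMethodError(name); // step 4: nothing found
    }

    public static void main(String[] args) {
        System.out.println(select(new Son(), "say").getDeclaringClass().getSimpleName());    // Son
        System.out.println(select(new Father(), "say").getDeclaringClass().getSimpleName()); // Human
    }
}
```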

So if a subclass does not override a method of its superclass, the call executes the superclass method directly, as shown here:

class Human {

    String say() {
        System.out.println("human");
        return "human";
    }
}

class Son extends Human {

//    public String say() {
//        System.out.println("Son");
//        return "Son";
//    }

}

class Father extends Human {

    // public String say() {
    // System.out.println("Father");
    // return "Father";
    // }
}

public class TestGC {
    static Human human = new Human();

    static Human father = new Father();

    static Human son = new Son();

    public static void main(String[] args) throws Exception {
        human.say();// human
        father.say();// human
        son.say();// human
    }
}

//output:
// human
// human
// human

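Dynamic dispatch: a dispatch that chooses the method to execute at run time, according to the actual type of the object.

To be continued…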