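When learning Java's String.equals method, we often run into confusing behavior, especially once we step into the source code. This article looks closely at the internal logic of String.equals under JDK 18 and explains what is going on.
Problem description
When debugging with breakpoints, the following was observed:
Problem 1: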
return (anObject instanceof String aString)
        && (!COMPACT_STRINGS || this.coder == aString.coder)
        && StringLatin1.equals(value, aString.value);
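During debugging, this logic appears to run in a loop. Also, even when the characters are identical, as in "a".equals("a"), the array lengths of value and aString.value sometimes differ.
Problem 2: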
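- For "a".equals(new String("a"));, the parameters seen while debugging are as shown: [parameter image]
- For "a".equals("a");, the parameters seen while debugging are as shown: [parameter image]
In both examples, the argument that arrives in the equals method does not always appear to be the expected "a".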
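To reproduce the two cases, a minimal program along the following lines can be used. The class and method names (EqualsRepro, literalVsNew, literalVsLiteral) are made up for this illustration; the idea is simply to set a breakpoint inside String.equals and then step in from main.

```java
class EqualsRepro {
    // Case 1: the argument is a separately constructed String with the same content.
    static boolean literalVsNew() {
        return "a".equals(new String("a"));
    }

    // Case 2: both operands are the same string literal.
    static boolean literalVsLiteral() {
        return "a".equals("a");
    }

    public static void main(String[] args) {
        System.out.println(literalVsNew());     // true
        System.out.println(literalVsLiteral()); // true
    }
}
```

Both calls return true; what differs in the debugger is how the argument object was created, not the outcome.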
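Answer
Let's start from the source code of the String class and work through these observations step by step. First, the definition and documentation of COMPACT_STRINGS: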
/**
 * If String compaction is disabled, the bytes in {@code value} are
 * always encoded in UTF16.
 *
 * For methods with several possible implementation paths, when String
 * compaction is disabled, only one code path is taken.
 *
 * The instance field value is generally opaque to optimizing JIT
 * compilers. Therefore, in performance-sensitive place, an explicit
 * check of the static boolean {@code COMPACT_STRINGS} is done first
 * before checking the {@code coder} field since the static boolean
 * {@code COMPACT_STRINGS} would be constant folded away by an
 * optimizing JIT compiler. The idioms for these cases are as follows.
 *
 * For code such as:
 *
 *    if (coder == LATIN1) { ... }
 *
 * can be written more optimally as
 *
 *    if (coder() == LATIN1) { ... }
 *
 * or:
 *
 *    if (COMPACT_STRINGS && coder == LATIN1) { ... }
 *
 * An optimizing JIT compiler can fold the above conditional as:
 *
 *    COMPACT_STRINGS == true  => if (coder == LATIN1) { ... }
 *    COMPACT_STRINGS == false => if (false)           { ... }
 *
 * @implNote
 * The actual value for this field is injected by JVM. The static
 * initialization block is used to set the value here to communicate
 * that this static final field is not statically foldable, and to
 * avoid any possible circular dependency during vm initialization.
 */
static final boolean COMPACT_STRINGS;

static {
    COMPACT_STRINGS = true;
}
As this documentation states, if COMPACT_STRINGS is false, value is always encoded in UTF16. This setting is closely tied to the coder field.
Next, let's look at the definition of coder:
/**
 * The identifier of the encoding used to encode the bytes in
 * {@code value}. The supported values in this implementation are
 *
 * LATIN1
 * UTF16
 *
 * @implNote This field is trusted by the VM, and is a subject to
 * constant folding if String instance is constant. Overwriting this
 * field after construction will cause problems.
 */
private final byte coder;
coder has two possible values, LATIN1 and UTF16. There is also a method with the same name, coder:
byte coder() {
    return COMPACT_STRINGS ? coder : UTF16;
}
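The two coders also explain why the backing arrays can have different lengths for the same characters. The sketch below only approximates the private value field through the public getBytes API; the class CoderSketch and its helpers are hypothetical, and the choice of UTF_16LE is an assumption made for illustration (only the byte count matters here, not the byte order).

```java
import java.nio.charset.StandardCharsets;

class CoderSketch {
    // Hypothetical helper: true if every char fits in a single byte (0..255),
    // which is what a compact LATIN1 layout would need.
    static boolean fitsLatin1(String s) {
        return s.chars().allMatch(c -> c <= 0xFF);
    }

    // Approximate length of the backing byte array for the chosen layout.
    static int storedLength(String s, boolean compactStrings) {
        if (compactStrings && fitsLatin1(s)) {
            return s.getBytes(StandardCharsets.ISO_8859_1).length; // 1 byte per char
        }
        return s.getBytes(StandardCharsets.UTF_16LE).length;       // 2 bytes per char
    }

    public static void main(String[] args) {
        System.out.println(storedLength("a", true));        // 1
        System.out.println(storedLength("a", false));       // 2
        System.out.println(storedLength("a\u4e2d", true));  // 4
    }
}
```

Under these assumptions, "a" occupies one byte when compaction applies and two bytes otherwise, which is one way two value arrays for the same text could differ in length.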
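So the meaning of the condition (!COMPACT_STRINGS || this.coder == aString.coder) is clear:
If COMPACT_STRINGS == false, every string is encoded in UTF16, so evaluation moves straight on to the next condition. Otherwise, check whether the two coders are the same; if they differ, return false immediately. We can hand-write equivalent code to understand this logic: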
boolean flag = false;
if (!COMPACT_STRINGS) {
    flag = true; // per the COMPACT_STRINGS documentation, UTF16 is always used here, so the coder value is ignored
} else if (this.coder == aString.coder) {
    flag = true; // the coders match
}
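The same check can be written as a standalone, runnable method to see all combinations at once. This is only a sketch: the class CoderCheck and the method coderCheckPasses are hypothetical, compactStrings is passed as a parameter instead of being a JVM-injected constant, and the numeric values chosen for LATIN1 and UTF16 are placeholders for illustration.

```java
class CoderCheck {
    // Placeholder constants for the two coders (values chosen for illustration).
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    // Mirrors the condition (!COMPACT_STRINGS || this.coder == aString.coder).
    static boolean coderCheckPasses(boolean compactStrings, byte thisCoder, byte otherCoder) {
        return !compactStrings || thisCoder == otherCoder;
    }

    public static void main(String[] args) {
        System.out.println(coderCheckPasses(true, LATIN1, LATIN1));  // true
        System.out.println(coderCheckPasses(true, LATIN1, UTF16));   // false
        System.out.println(coderCheckPasses(false, UTF16, UTF16));   // true
        System.out.println(coderCheckPasses(false, LATIN1, UTF16));  // true (coder ignored)
    }
}
```

The last line shows the short-circuit: with compaction disabled, the coder fields are never consulted.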
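Then, once the conditions above hold, StringLatin1.equals(value, aString.value) compares the internal data stored in value. Because the coders are already known to match at this point, comparing the two byte arrays is enough. value is defined as follows: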
/**
 * The value is used for character storage.
 *
 * @implNote This field is trusted by the VM, and is a subject to
 * constant folding if String instance is constant. Overwriting this
 * field after construction will cause problems.
 *
 * Additionally, it is marked with {@link Stable} to trust the contents
 * of the array. No other facility in JDK provides this functionality (yet).
 * {@link Stable} is safe here, because value is never null.
 */
@Stable
private final byte[] value;
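To illustrate what comparing two value arrays amounts to, here is a minimal sketch of a byte-by-byte equality check. It is similar in spirit to what StringLatin1.equals is described as doing, but it is not copied from the JDK, and the real method may differ in detail. ByteCompare and sameBytes are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

class ByteCompare {
    // Equal only if the lengths match and every byte matches.
    static boolean sameBytes(byte[] value, byte[] other) {
        if (value.length != other.length) {
            return false;
        }
        for (int i = 0; i < value.length; i++) {
            if (value[i] != other[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] a1 = "a".getBytes(StandardCharsets.ISO_8859_1);
        byte[] a2 = "a".getBytes(StandardCharsets.ISO_8859_1);
        byte[] ab = "ab".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(sameBytes(a1, a2)); // true
        System.out.println(sameBytes(a1, ab)); // false (different lengths)
    }
}
```

Nothing in this comparison depends on the encoding itself, which is why a single byte-level routine can serve once the coders are known to be equal.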
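The complete logic of the equals statement is therefore:
- First check whether the argument is a String; if not, return false immediately.
- Check whether the two coders are the same (the value of COMPACT_STRINGS indirectly affects this comparison); if they differ, return false immediately.
- With the coders equal, compare the internal data; whether it matches determines the final result.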
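The three steps above can be sketched end to end with a small model class. MiniString is a hypothetical stand-in, not the JDK's String: it fixes COMPACT_STRINGS to true, picks placeholder coder values, fills value through getBytes (with UTF_16LE assumed for the two-byte layout), and uses Arrays.equals for the final byte comparison. The pattern-matching instanceof mirrors the quoted source and needs Java 16 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class MiniString {
    static final boolean COMPACT_STRINGS = true; // assumed setting for this model
    static final byte LATIN1 = 0;                // placeholder coder values
    static final byte UTF16 = 1;

    private final byte[] value;
    private final byte coder;

    MiniString(String s) {
        // Use the one-byte layout only if compaction is on and every char fits.
        boolean latin1 = COMPACT_STRINGS && s.chars().allMatch(c -> c <= 0xFF);
        this.coder = latin1 ? LATIN1 : UTF16;
        this.value = s.getBytes(latin1 ? StandardCharsets.ISO_8859_1
                                       : StandardCharsets.UTF_16LE);
    }

    @Override
    public boolean equals(Object anObject) {
        return (anObject instanceof MiniString other)                 // step 1: type check
                && (!COMPACT_STRINGS || this.coder == other.coder)    // step 2: coder check
                && Arrays.equals(value, other.value);                 // step 3: byte comparison
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(value);
    }

    public static void main(String[] args) {
        System.out.println(new MiniString("a").equals(new MiniString("a")));      // true
        System.out.println(new MiniString("a").equals(new MiniString("b")));      // false
        System.out.println(new MiniString("a").equals(new MiniString("\u4e2d"))); // false (coders differ)
        System.out.println(new MiniString("a").equals("a"));                      // false (not a MiniString)
    }
}
```

Each false result in main comes from a different step of the chain, which makes it easy to follow in a debugger without stepping through JDK internals.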
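Additional notes:
The equals source uses only the StringLatin1 comparison method. As long as both strings use the same encoding, though, the underlying operation is still a byte comparison, so it applies to either coder. To learn more about how UTF16 content is compared, you can study the implementation of StringLatin1 further.
In fact, the statement contains no loop, so the "running in a loop" effect seen at the breakpoint is not caused by a loop statement. A breakpoint placed inside String.equals is triggered by every caller of the method, not only by your own test code, which can make it look as if the same line runs repeatedly. The repeated stops may come from comparisons related to encodings. If the argument shown in the debugger is "gbk", encoding conversion may be involved in the comparison. Confirming this requires looking further at the StringLatin1 source and the call stack.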
