By Enbo Shi (Xihang)
If you frequently write utility classes, there is a good chance you will run into a practical problem: obtaining the runtime type of T in a generic expression such as A<T>. There is a well-known trick for this, but the explanation of why it works is rarely given. In this article, we start with Java generics and work through the Java Language Specification (JLS) and the Java Virtual Machine Specification (JVMS). We then look at how the Java compiler processes generics, read the reflection API source code in the JRE, and finally verify the technique against a virtual machine implementation (HotSpot in OpenJDK 8).
When writing utility classes or compact generic logic, we often need runtime type information to decide what to do next. Typical scenarios include plugin architectures in business systems and obtaining type details during deserialization. Many beginners therefore write code like the following to retrieve the runtime type through reflection:
public static <T> Class<T> typeOf(T obj) {
    return (Class<T>)obj.getClass();
}
The above code works well for non-generic types. However, once T is itself a parameterized type, the type arguments are lost. Ever since Java 1.5 introduced generics, handling them through reflection has been troublesome.
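To make the limitation concrete, here is a minimal sketch that reuses typeOf from above. The class name TypeOfDemo and the helper sameRuntimeClass are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class TypeOfDemo {
    // Same method as in the article; the cast is unchecked because
    // getClass() only knows the erased runtime class.
    @SuppressWarnings("unchecked")
    static <T> Class<T> typeOf(T obj) {
        return (Class<T>) obj.getClass();
    }

    // Hypothetical helper: compares the runtime classes of two objects.
    static boolean sameRuntimeClass(Object a, Object b) {
        return typeOf(a) == typeOf(b);
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        // Only the raw class is visible; the String argument is gone.
        System.out.println(typeOf(strings).getName());          // java.util.ArrayList
        // Both lists share one Class object, so they cannot be told apart.
        System.out.println(sameRuntimeClass(strings, numbers)); // true
    }
}
```

An instance only carries a reference to its erased class, which is why no amount of reflection on the object itself can recover String or Integer here.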
Generics were introduced in Java 1.5 to address type checking issues [1] and provide robust type constraints for writing generic code, especially libraries, without concerns about heap pollution caused by casting in previous versions [13].
To understand why generics were introduced and how they were designed, you can refer to the paper Gilad Bracha presented at the OOPSLA conference in 1998 [2]. Subsequently, JSR 14 specified the addition of generics to the Java programming language [14], and they were eventually incorporated into the JDK in Java 1.5.
The formal definition of generics can be found in the Java Language Specification (JLS) [3], [4], [5], and [6].
For an informal definition of generics, you can refer to the following simple code, which illustrates concepts that are often confused: type variables, type parameters, and type arguments.
/**
 * Define a generic class, where
 *
 * Type Parameter is T extends Number
 * Type Variable is T
 * Type Argument is Integer in Foo<Integer>
 */
class Foo<T extends Number> {}
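These terms can also be observed through reflection. The following is a minimal sketch, assuming the Foo class above; TypeParameterDemo and describeFirstTypeParameter are hypothetical names introduced for illustration.

```java
import java.lang.reflect.TypeVariable;

final class TypeParameterDemo {
    // Hypothetical helper: renders the first type parameter of a generic class
    // as "<variable name> extends <first bound>".
    static String describeFirstTypeParameter(Class<?> c) {
        TypeVariable<?> variable = c.getTypeParameters()[0];
        return variable.getName() + " extends " + variable.getBounds()[0].getTypeName();
    }

    public static void main(String[] args) {
        System.out.println(describeFirstTypeParameter(Foo.class)); // T extends java.lang.Number
    }
}

// The generic class from the article, repeated so the sketch is self-contained.
class Foo<T extends Number> {}
```

The type variable is the name T, while the type parameter is the whole declaration including its bound. Type arguments such as Integer only appear where Foo is used, not on Foo.class itself.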
Because generics were introduced in Java 1.5, reflection was extended to accommodate them [7]. In terms of implementation, reflection introduces the Type interface along with its derived interfaces and classes, which model the generic types defined in the Java Language Specification (JLS). Type has four subinterfaces (ParameterizedType, GenericArrayType, TypeVariable, and WildcardType) and is also implemented by Class.
One of the key concepts we frequently encounter is ParameterizedType [10].
ParameterizedType may be unfamiliar to some, but developers who regularly use the core reflection API will recognize it. ParameterizedType is one of the subinterfaces of Type. Put simply, it represents a generic type such as Foo with concrete type arguments supplied. For example, Foo<Integer> and Foo<Double> are both parameterized types of Foo.
Additionally, as part of the implementation of generics, a set of methods and classes with "Generic" in their names were added to the reflection API [8]. These serve as the foundation for retrieving the runtime types of generics.
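As an illustration of the difference between the original methods and their "Generic" counterparts, here is a minimal sketch that compares Field#getType with Field#getGenericType. The class GenericFieldDemo and its field names are hypothetical.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

final class GenericFieldDemo {
    // Hypothetical field used only for illustration.
    static List<String> names;

    // Pre-generics API: returns the erased type of the field.
    static Type erasedType() {
        try {
            return GenericFieldDemo.class.getDeclaredField("names").getType();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // "Generic" API added in Java 1.5: returns the type as declared in source.
    static Type genericType() {
        try {
            return GenericFieldDemo.class.getDeclaredField("names").getGenericType();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ParameterizedType declared = (ParameterizedType) genericType();
        System.out.println(erasedType().getTypeName());                         // java.util.List
        System.out.println(declared.getRawType().getTypeName());                // java.util.List
        System.out.println(declared.getActualTypeArguments()[0].getTypeName()); // java.lang.String
    }
}
```

The same pairing exists elsewhere in the API, for example Method#getReturnType and Method#getGenericReturnType, or Class#getSuperclass and Class#getGenericSuperclass.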
Although generics were introduced in Java, they were implemented with type erasure in order to maintain backward compatibility (compatibility at the JVM level, without changing the bytecode or the JVM design) and to improve compilation performance (compared with C++ templates, where new types are generated for each set of template arguments) [9]. With type erasure, Java does not need to modify the virtual machine implementation or create new classes for each ParameterizedType.
Every design has its trade-offs. While Java generics benefit from type erasure, it also leads to two major issues.
Type erasure presents challenges when writing utility classes. For instance, you cannot create an instance using only the type variable T, because new T() does not compile. If T is a non-generic class, we can work around this by passing in the type information directly as a Class<T>:
public static final <T> void foo(List<T> list, Class<T> tClass)
However, when T is a ParameterizedType, the Class<T> argument in the above interface can only carry the raw, non-generic part of that type. For example, if T is List<String>, the class will be List.class. In certain scenarios, such as deserialization, this limitation causes difficulties.
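The class-token approach can be sketched as follows. The article only gives the signature of foo, so the method body, the name addNewInstance, and the class ClassTokenDemo are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class ClassTokenDemo {
    // Hypothetical variant of foo: uses the Class<T> token to create the
    // instance that "new T()" cannot create.
    static <T> void addNewInstance(List<T> list, Class<T> tClass) {
        try {
            list.add(tClass.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + tClass.getName(), e);
        }
    }

    public static void main(String[] args) {
        List<StringBuilder> builders = new ArrayList<>();
        addNewInstance(builders, StringBuilder.class);
        System.out.println(builders.size()); // 1

        // There is no equivalent token for a parameterized T:
        // "List<String>.class" does not compile, and List.class says
        // nothing about the String type argument.
    }
}
```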
So, is there really no way to get the runtime types of generics in Java? There is, but we need to make some changes. For example, many serialization frameworks (such as Jackson and Fastjson) provide TypeReference or a similar mechanism. With it, we can obtain the runtime type of T while leaving the function signature essentially unchanged.
The approach is to first define a class:
class Wrapper<T> {
}
It is very simple, essentially just a wrapper class. Then, declare a simple method:
public static <T> Type getGenericRuntimeType(Wrapper<T> wrapper)
Finally, a small trick does the rest: create an instance of an anonymous derived class and use the reflection API to retrieve the generic information of its superclass. If the superclass is a ParameterizedType, we can read its actual type arguments, which gives us the runtime type of T.
public static <T> Type getGenericRuntimeType(Wrapper<T> wrapper) {
    Type type = wrapper.getClass().getGenericSuperclass();
    if (type == null) {
        return null;
    }
    if (type instanceof ParameterizedType) {
        Type[] types = ((ParameterizedType)type).getActualTypeArguments();
        return types[0];
    }
    return null;
}
For example, compare the following two statements. The only difference is that the second one creates an instance of an anonymous subclass of Wrapper.
Type type1 = getGenericRuntimeType(new Wrapper<List<String>>());
Type type2 = getGenericRuntimeType(new Wrapper<List<String>>() {});
Printing the two results gives:
null
java.util.List<java.lang.String>
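Frameworks usually package this trick as an abstract class, so that callers cannot forget the anonymous subclass. The following is a minimal sketch of that idea. TypeRef and TypeRefDemo are hypothetical names, and this is not the actual implementation of Jackson's or Fastjson's TypeReference.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

final class TypeRefDemo {
    // Captures List<String> through an anonymous subclass of TypeRef.
    static Type listOfString() {
        return new TypeRef<List<String>>() {}.getType();
    }

    public static void main(String[] args) {
        System.out.println(listOfString().getTypeName()); // java.util.List<java.lang.String>
    }
}

// Hypothetical type-token class; abstract so that every instance must come
// from a subclass whose declaration records the type argument.
abstract class TypeRef<T> {
    private final Type type;

    protected TypeRef() {
        Type superclass = getClass().getGenericSuperclass();
        if (!(superclass instanceof ParameterizedType)) {
            // Happens for raw usage such as "new TypeRef() {}".
            throw new IllegalStateException("TypeRef must be created with a type argument");
        }
        this.type = ((ParameterizedType) superclass).getActualTypeArguments()[0];
    }

    Type getType() {
        return type;
    }
}
```

Making the class abstract turns the null result seen with a plain Wrapper instance into a compile-time error, because the class can no longer be instantiated without subclassing it.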
So why does a single instance of an anonymous class make such a significant difference? Is it possible to obtain generics within the framework of type erasure? What is the underlying principle?
In fact, it relies on a technique described in JSR 14 [14]: the generic type information is saved in the Signature attribute of the classfile.
Classfiles need to carry generic type information in a backwards compatible way. This is accomplished by introducing a new "Signature" attribute for classes, methods, and fields.
First, the Java compiler writes the generic type information into the Signature attribute of the classfiles. Then, the JRE's reflection interface parses the string within the Signature. Finally, the hidden runtime type information is identified. In the following section, we will start with the definition of the Java Virtual Machine Specification (JVMS), study the process of compiling Java code and generating classfiles, and explore the JRE's reflection code.
A JVM classfile is the binary format generated by compiling Java source files. It can be compared to ELF on Linux or PE/COFF on Windows, and can be understood as the executable file of the JVM. The JVM reads and executes bytecode from the classfile to run the program. The format of a classfile is as follows:
ClassFile {
    u4             magic;
    u2             minor_version;
    u2             major_version;
    u2             constant_pool_count;
    cp_info        constant_pool[constant_pool_count-1];
    u2             access_flags;
    u2             this_class;
    u2             super_class;
    u2             interfaces_count;
    u2             interfaces[interfaces_count];
    u2             fields_count;
    field_info     fields[fields_count];
    u2             methods_count;
    method_info    methods[methods_count];
    u2             attributes_count;
    attribute_info attributes[attributes_count];
}
The attributes array is where the generic type information is stored, as mentioned in JSR14. JVMS points out [11] that:
A Java compiler must emit a signature for any class, interface, constructor, method, or field whose declaration uses type variables or parameterized types
In other words, the Java compiler must record the generic type information in the Signature attribute and store it in the compiled classfile.
Let's define a simple subclass of Wrapper and verify this conclusion with javap after compilation.
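The effect of this rule can be observed from Java code as well: a declaration that uses a parameterized type keeps that information, and reflection can read it back. Here is a minimal sketch for a method declaration. DeclarationSignatureDemo and its index method are hypothetical names used only for illustration.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Type;
import java.util.List;
import java.util.Map;

final class DeclarationSignatureDemo {
    // Hypothetical method used only for illustration. Its declaration uses
    // parameterized types, so the compiler must emit a Signature for it.
    static Map<String, List<Integer>> index() {
        return null;
    }

    private static Method indexMethod() {
        try {
            return DeclarationSignatureDemo.class.getDeclaredMethod("index");
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Comes from the method descriptor, which only holds erased types.
    static Type erasedReturnType() {
        return indexMethod().getReturnType();
    }

    // Comes from the Signature attribute of the method.
    static Type genericReturnType() {
        return indexMethod().getGenericReturnType();
    }

    public static void main(String[] args) {
        System.out.println(erasedReturnType().getTypeName());
        System.out.println(genericReturnType().getTypeName());
    }
}
```

The first line printed is the raw java.util.Map, while the second includes both type arguments, including the nested List<Integer>.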
public class ExtendedWrapper extends Wrapper<List<String>> {
}
After running javap, you can see that the Signature attribute on line 42 of the output already contains the corresponding type information (Lcom/aliyun/cwz/model/Wrapper<Ljava/util/List<Ljava/lang/String;>;>;). This is consistent with the JVMS requirement quoted above.
Classfile /Users/alibaba/myprojects/GenericsAndReflection/target/test-classes/com/aliyun/cwz/impl/ExtendedWrapper.class
  Last modified 2023-4-17; size 413 bytes
  MD5 checksum 96ca23aed30b94c2a445bbd76189e250
  Compiled from "ExtendedWrapper.java"
public class com.aliyun.cwz.impl.ExtendedWrapper extends com.aliyun.cwz.model.Wrapper<java.util.List<java.lang.String>>
  minor version: 0
  major version: 52
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #3.#15         // com/aliyun/cwz/model/Wrapper."<init>":()V
   #2 = Class              #16            // com/aliyun/cwz/impl/ExtendedWrapper
   #3 = Class              #17            // com/aliyun/cwz/model/Wrapper
   #4 = Utf8               <init>
   #5 = Utf8               ()V
   #6 = Utf8               Code
   #7 = Utf8               LineNumberTable
   #8 = Utf8               LocalVariableTable
   #9 = Utf8               this
  #10 = Utf8               Lcom/aliyun/cwz/impl/ExtendedWrapper;
  #11 = Utf8               Signature
  #12 = Utf8               Lcom/aliyun/cwz/model/Wrapper<Ljava/util/List<Ljava/lang/String;>;>;
  #13 = Utf8               SourceFile
  #14 = Utf8               ExtendedWrapper.java
  #15 = NameAndType        #4:#5          // "<init>":()V
  #16 = Utf8               com/aliyun/cwz/impl/ExtendedWrapper
  #17 = Utf8               com/aliyun/cwz/model/Wrapper
{
  public com.aliyun.cwz.impl.ExtendedWrapper();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method com/aliyun/cwz/model/Wrapper."<init>":()V
         4: return
      LineNumberTable:
        line 7: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       5     0  this   Lcom/aliyun/cwz/impl/ExtendedWrapper;
}
Signature: #12                          // Lcom/aliyun/cwz/model/Wrapper<Ljava/util/List<Ljava/lang/String;>;>;
SourceFile: "ExtendedWrapper.java"
So, how does the Java compiler work?
According to "The Hitchhiker's Guide to javac" [15], the JavaCompiler class serves as the driver for javac. Therefore, by studying the implementation of JavaCompiler, we can understand how javac compiles a file. (Tracing the compilation with strace javac also shows calls into JavaCompiler, which leads to the same conclusion.)
Let's use OpenJDK 8 and see how the Java compiler compiles ExtendedWrapper.
During compilation, the compile method of the JavaCompiler class is invoked. It is the core compilation method and ultimately turns a Java class into a classfile.
com.sun.tools.javac.main.JavaCompiler#compile(com.sun.tools.javac.util.List<javax.tools.JavaFileObject>, com.sun.tools.javac.util.List<java.lang.String>, java.lang.Iterable<? extends javax.annotation.processing.Processor>)
Within this method, the parser first parses the Java file into a JCTree.JCCompilationUnit, the basic unit of the abstract syntax tree. Using the visitor pattern, the class com.sun.tools.javac.comp.Enter and the method below then propagate the type information from the source file into the ClassSymbol generated for the ExtendedWrapper class. The ClassSymbol is subsequently stored in the symbol table.
com.sun.tools.javac.jvm.ClassReader#enterClass
The next step is code generation, which starts in
com.sun.tools.javac.main.JavaCompiler#generate
and leads to the following method:
com.sun.tools.javac.jvm.ClassWriter#writeClassFile
Here, the type of the superclass is obtained from the ClassSymbol in the symbol table through the method below, and ClassWriter writes it into the Signature attribute of the classfile.
com.sun.tools.javac.code.Types#supertype
So far, we have seen how the JavaCompiler writes generic type information into classfiles. For a deeper understanding, I recommend stepping through the source code of this process (in the com.sun.tools.javac package) in a debugger.
Now, let's analyze how the reflection API in the JRE converts the Signature string into a Type object. The main reflection method used in the earlier code is getGenericSuperclass of the Class class, so let's begin our analysis there. This method was introduced in Java 1.5 and returns the superclass together with its generic information. Here is its code:
public Type getGenericSuperclass() {
    ClassRepository info = getGenericInfo();
    if (info == null) {
        return getSuperclass();
    }
    // Historical irregularity:
    // Generic signature marks interfaces with superclass = Object
    // but this API returns null for interfaces
    if (isInterface()) {
        return null;
    }
    return info.getSuperclass();
}
The core of this method is the ClassRepository variable obtained on line 2, which represents the generic type information of the class. Its documentation describes it as follows:
This class represents the generic type information for a class. The code is not dependent on a particular reflective implementation. It is designed to be used unchanged by at least core reflection and JDI.
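The three branches of getGenericSuperclass can be exercised with a small experiment. This is a minimal sketch; GenericSuperclassBranches and its nested classes Plain and Typed are hypothetical names introduced for illustration.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;

final class GenericSuperclassBranches {
    // No generics in the declaration, so no Signature attribute is emitted.
    static class Plain {}

    // Parameterized superclass, so the declaration carries a Signature.
    static class Typed extends ArrayList<String> {}

    public static void main(String[] args) {
        // info == null: falls back to getSuperclass() and returns a plain Class.
        Type plain = Plain.class.getGenericSuperclass();
        System.out.println(plain == Object.class);                // true

        // Signature present: the superclass comes back as a ParameterizedType.
        Type typed = Typed.class.getGenericSuperclass();
        System.out.println(typed instanceof ParameterizedType);   // true

        // Interfaces: the API returns null.
        Type iface = Comparable.class.getGenericSuperclass();
        System.out.println(iface == null);                        // true
    }
}
```

The first case also explains the null result for new Wrapper<List<String>>() earlier: the class of that instance is Wrapper itself, whose superclass is the non-parameterized Object.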
Looking further into getGenericInfo, we can see how info is generated.
private ClassRepository getGenericInfo() {
    ClassRepository genericInfo = this.genericInfo;
    if (genericInfo == null) {
        String signature = getGenericSignature0();
        if (signature == null) {
            genericInfo = ClassRepository.NONE;
        } else {
            genericInfo = ClassRepository.make(signature, getFactory());
        }
        this.genericInfo = genericInfo;
    }
    return (genericInfo != ClassRepository.NONE) ? genericInfo : null;
}
As we can see, ClassRepository builds the generic information from the signature string returned by getGenericSignature0. So, where does this string come from? It turns out to be a native method, implemented by the JVM.
// Generic signature handling
private native String getGenericSignature0();
Since it is a JVM method, we can read the JVM source code to verify whether it matches the JVMS requirement mentioned earlier.
After studying the JVMS and the Class#getGenericSignature0 method in the JRE, it is worth exploring the specific implementation within the JVM. Since the compiler source code we examined earlier came from OpenJDK, we again refer to the OpenJDK implementation, specifically the widely used JDK 8 [12].
By searching for the function name getGenericSignature0, we can find a configuration array of JNI methods in the file ./jdk/src/share/native/java/lang/Class.c.
static JNINativeMethod methods[] = {
    {"getName0",         "()" STR,          (void *)&JVM_GetClassName},
    {"getSuperclass",    "()" CLS,          NULL},
    {"getInterfaces0",   "()[" CLS,         (void *)&JVM_GetClassInterfaces},
    {"isInterface",      "()Z",             (void *)&JVM_IsInterface},
    {"getSigners",       "()[" OBJ,         (void *)&JVM_GetClassSigners},
    {"setSigners",       "([" OBJ ")V",     (void *)&JVM_SetClassSigners},
    {"isArray",          "()Z",             (void *)&JVM_IsArrayClass},
    {"isPrimitive",      "()Z",             (void *)&JVM_IsPrimitiveClass},
    {"getComponentType", "()" CLS,          (void *)&JVM_GetComponentType},
    {"getModifiers",     "()I",             (void *)&JVM_GetClassModifiers},
    {"getDeclaredFields0","(Z)[" FLD,       (void *)&JVM_GetClassDeclaredFields},
    {"getDeclaredMethods0","(Z)[" MHD,      (void *)&JVM_GetClassDeclaredMethods},
    {"getDeclaredConstructors0","(Z)[" CTR, (void *)&JVM_GetClassDeclaredConstructors},
    {"getProtectionDomain0", "()" PD,       (void *)&JVM_GetProtectionDomain},
    {"getDeclaredClasses0",  "()[" CLS,     (void *)&JVM_GetDeclaredClasses},
    {"getDeclaringClass0",   "()" CLS,      (void *)&JVM_GetDeclaringClass},
    {"getGenericSignature0", "()" STR,      (void *)&JVM_GetClassSignature},
    {"getRawAnnotations",      "()" BA,     (void *)&JVM_GetClassAnnotations},
    {"getConstantPool",     "()" CPL,       (void *)&JVM_GetClassConstantPool},
    {"desiredAssertionStatus0","("CLS")Z",(void *)&JVM_DesiredAssertionStatus},
    {"getEnclosingMethod0", "()[" OBJ,      (void *)&JVM_GetEnclosingMethodInfo},
    {"getRawTypeAnnotations", "()" BA,      (void *)&JVM_GetClassTypeAnnotations},
};
getGenericSignature0 corresponds to one JNINativeMethod entry in this array:
{"getGenericSignature0", "()" STR, (void *)&JVM_GetClassSignature}
JNINativeMethod is defined as follows:
typedef struct {
    char *name;
    char *signature;
    void *fnPtr;
} JNINativeMethod;
You can see that the JVM implementation corresponding to getGenericSignature0 is JVM_GetClassSignature, referenced through a function pointer. This function is implemented in ./hotspot/src/share/vm/prims/jvm.cpp, wrapped in the JVM_ENTRY macro.
JVM_ENTRY(jstring, JVM_GetClassSignature(JNIEnv *env, jclass cls))
  assert (cls != NULL, "illegal class");
  JVMWrapper("JVM_GetClassSignature");
  JvmtiVMObjectAllocEventCollector oam;
  ResourceMark rm(THREAD);
  // Return null for arrays and primatives
  if (!java_lang_Class::is_primitive(JNIHandles::resolve(cls))) {
    Klass* k = java_lang_Class::as_Klass(JNIHandles::resolve(cls));
    if (k->oop_is_instance()) {
      Symbol* sym = InstanceKlass::cast(k)->generic_signature();
      if (sym == NULL) return NULL;
      Handle str = java_lang_String::create_from_symbol(sym, CHECK_NULL);
      return (jstring) JNIHandles::make_local(env, str());
    }
  }
  return NULL;
JVM_END
As you can see, the value returned by getGenericSignature0 ultimately comes from InstanceKlass::cast(k)->generic_signature(). This method uses _generic_signature_index to look up the corresponding symbol in the constant pool of the class (_constants). This is consistent with the source code of the javac compilation process and with the JVMS.
  // for adding methods, ConstMethod::UNSET_IDNUM means no more ids available
  inline u2 next_method_idnum();
  void set_initial_method_idnum(u2 value)             { _idnum_allocated_count = value; }
  // generics support
  Symbol* generic_signature() const {
    return (_generic_signature_index == 0) ?
      (Symbol*)NULL : _constants->symbol_at(_generic_signature_index);
  }
  u2 generic_signature_index() const {
    return _generic_signature_index;
  }
  void set_generic_signature_index(u2 sig_index) {
    _generic_signature_index = sig_index;
  }
This corresponds to the Signature attribute [11] of the classfile format defined in the JVMS.
We began by exploring Java generics and examining how reflection was extended for generics and how type erasure affects it. We then looked at a technique for accessing the runtime type of generics by creating an instance of an anonymous subclass.
Furthermore, by referring to the JVMS, following the javac compilation process, and analyzing the JRE and HotSpot source code, we have studied how the generic type information reaches the reflection API and gained an understanding of the underlying principles. In conclusion, we have found a satisfactory solution to this problem.
1: https://docs.oracle.com/javase/tutorial/java/generics/why.html
2: https://homepages.inf.ed.ac.uk/wadler/gj/Documents/gj-oopsla.pdf
3: https://docs.oracle.com/javase/specs/jls/se8/html/jls-8.html#jls-8.1.2
4: https://docs.oracle.com/javase/specs/jls/se8/html/jls-8.html#jls-8.4.4
5: https://docs.oracle.com/javase/specs/jls/se8/html/jls-8.html#jls-8.8.4
6: https://docs.oracle.com/javase/specs/jls/se8/html/jls-9.html#jls-9.1.2
7: https://docs.oracle.com/javase/1.5.0/docs/guide/reflection/enhancements.html
8: https://docs.oracle.com/javase/8/docs/api/java/lang/reflect/class-use/Type.html
9: https://docs.oracle.com/javase/specs/jls/se8/html/jls-4.html#jls-4.6
10: https://docs.oracle.com/javase/specs/jls/se8/html/jls-4.html#jls-4.5
11: https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.7.9.1
12: https://github.com/openjdk/jdk8u
13: https://docs.oracle.com/javase/tutorial/java/generics/nonReifiableVarargsType.html#heap_pollution
14: https://jcp.org/aboutJava/communityprocess/review/jsr014/index.html
15: https://openjdk.org/groups/compiler/doc/hhgtjavac/index.html
Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.