US20010013117A1 - Method and apparatus for finding bugs related to garbage collection in a virtual machine - Google Patents

Method and apparatus for finding bugs related to garbage collection in a virtual machine Download PDF

Info

Publication number
US20010013117A1
US20010013117A1 US09/153,382 US15338298A US2001013117A1 US 20010013117 A1 US20010013117 A1 US 20010013117A1 US 15338298 A US15338298 A US 15338298A US 2001013117 A1 US2001013117 A1 US 2001013117A1
Authority
US
United States
Prior art keywords
computer
virtual machine
map
stack
live
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US09/153,382
Other versions
US6327701B2 (en
Inventor
David Ungar
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oracle America Inc
Original Assignee
Sun Microsystems Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Microsystems Inc filed Critical Sun Microsystems Inc
Priority to US09/153,382 priority Critical patent/US6327701B2/en
Assigned to SUN MICROSYSTEMS, INC. reassignment SUN MICROSYSTEMS, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: UNGAR, DAVID
Priority to PCT/US1999/018685 priority patent/WO2000016196A1/en
Priority to AU55678/99A priority patent/AU5567899A/en
Publication of US20010013117A1 publication Critical patent/US20010013117A1/en
Application granted granted Critical
Publication of US6327701B2 publication Critical patent/US6327701B2/en
Assigned to Oracle America, Inc. reassignment Oracle America, Inc. MERGER AND CHANGE OF NAME (SEE DOCUMENT FOR DETAILS). Assignors: Oracle America, Inc., ORACLE USA, INC., SUN MICROSYSTEMS, INC.
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0718Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in an object-oriented system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751Error or fault detection not based on redundancy
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0766Error or fault reporting or storing
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99941Database schema or data structure
    • Y10S707/99942Manipulating data structure, e.g. compression, compaction, compilation
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99951File or database maintenance
    • Y10S707/99952Coherency, e.g. same view to multiple users
    • Y10S707/99953Recoverability

Definitions

  • This invention relates to the field of computer systems, and, more specifically, to memory management garbage collection processes.
  • Sun, Sun Microsystems, the Sun logo, Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks of SPARC International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.
  • Garbage collection refers to the process of reclaiming portions of main memory that are no longer in use by the system or any running applications.
  • garbage collection is typically carried out to reclaim memory allocated to objects and other data structures (e.g., arrays, etc.) that are no longer referenced by an application. The reclaimed memory can then be re-allocated to store new objects or data structures.
  • garbage collection is performed to reclaim memory space from a region of memory known as the heap.
  • the heap is used to store objects and arrays that are referenced by pointers stored as local variables in activation records, or “stack frames,” of a stack associated with an individual thread of execution in the virtual machine.
  • the invocation of a method by a given thread results in the creation of a new stack frame that is “pushed” onto the stack of that thread.
  • References to objects on the heap may be removed by an active (i.e., currently executing) method setting the respective pointer to a “null” value, or by removal of a respective stack frame in response to completion of its associated method.
  • garbage collection points In any thread of execution, there may be many garbage collection points, or “gc-points,” where garbage collection can occur. However, actual garbage collection typically takes place at only a fraction of these possible gc-points each time the given thread of execution is run.
  • the compiler provides information at each gc-point about the set of locations in the stack frames that contain pointers to objects or arrays. Garbage collection is performed by determining which objects and arrays in the heap are referenced from within the set of locations specified by the compiler, and reclaiming those objects and arrays that are no longer referenced.
  • the compiler may have an error (i.e., a “bug”) that causes a stack location to be mistakenly omitted from the specified set of pointer locations.
  • This type of compiler bug can result in the reclaiming of an object or array when a reference still exists.
  • this compiler bug may result in a failure to update a pointer reference to point to the appropriate copy of the associated object or array. In either case, future references made to the object or array through the omitted stack location can result in improper execution of an application.
  • This bug is garbage collection-related, but it may appear to be a code generation bug, making detection and correction difficult.
  • Garbage collection may be either conservative or exact.
  • Conservative garbage collection involves scanning memory space for stored values that match the address of an object (or other memory structure) that is being considered for collection. If a matching value is not found in the memory being scanned, then no references to the object exist, and the object may be safely collected. If a matching value is found, it is assumed that the value is a reference (e.g., a pointer) to the object under consideration, and the object is not collected. This assumption means that an object is not collected even if the matching memory value is not a reference to the object, but rather a data value that coincidentally matches the base address of the object.
  • a reference e.g., a pointer
  • copying garbage collection is commonly implemented.
  • the memory space is divided into regions and an object transfer is performed.
  • garbage collection is carried out, objects in a portion of memory referred to as “from” space are copied to a portion referred to as “to” space.
  • Those objects in “from” space that are considered “garbage” by the scan process are not copied to “to” space.
  • the process of copying the objects results in reduced fragmentation of the memory space and better compaction.
  • FIG. 1 is a flow diagram illustrating a copying garbage collection process.
  • the set of references to be scanned is determined. For example, a mechanism may be provided that tracks the creation of references, and maintains a list of current references for exact garbage collection. This list may be used to define the set to be scanned in step 100 .
  • the garbage collection process obtains the first reference from the set of references.
  • the reference is analyzed to determine if the reference points to an object in “from” space. If the reference does not point to an object in “from” space, the process jumps to step 107 . If, however, the reference does point to an object in “from” space, the process continues at step 103 .
  • step 103 the referenced object in “from” space is examined to determine whether the object is marked as copied. If the referenced object is marked, the process jumps to step 106 . However, if the referenced object in “from” space is not marked as copied, the process continues at step 104 , in which the referenced object is copied into “to” space. In subsequent step 105 , the referenced object in “from” space is marked as copied (e.g., replaced with a marker), with the location of the new copy in “to” space identified in the marker. In step 106 , the current reference is updated to point to the location of the new copy of the object in “to” space, as identified by the marked object in “from” space. The process continues in step 107 .
  • step 107 the referenced object in “from” space is examined to determine whether the object is marked as copied. If the referenced object is marked, the process jumps to step 106 . However, if the referenced object in “from” space is not marked as copied, the process continues at step 104
  • step 107 a check is performed to determine whether the current reference is the last reference in the set of references to be scanned. If the current reference is not the last reference, in step 108 , the next reference in the set is obtained, and the process returns to Step 102 . If, however, in step 107 , the current reference is the last reference in the set, the process completes in step 109 where “from” space is collected in its entirety. Ideally, no references will be made in the future to objects in “from” space. In a subsequent garbage collection, “from” space becomes “to” space and “to” space becomes “from” space for purposes of copying.
  • the copying garbage collection scheme described above may be expanded to implement a generational approach.
  • Generational collection schemes are predicated on the general assumption that newly created objects are more prone to collection than objects that have survived several garbage collection cycles.
  • objects are segregated into generational groups of objects according to the number of garbage collection cycles survived, with each generational group having its own respective “to” space and “from” space. Garbage collection is then carried out separately for each generational group of objects, with garbage collection being carried out more frequently for younger generations.
  • Exact garbage collection is required when objects are copied, to prevent a coincidental data value match from causing mutation of the data value during updating of object references.
  • exact garbage collection requires information about which locations contain active or “live” references to objects. Problems can arise when this information is incorrect. For example, if a live reference fails to be identified in step 100 of FIG. 1 due to misidentification of a live reference, the garbage collection process may erroneously collect the associated object without copying, causing unpredictable performance when the reference is used by a method to access the object in the future.
  • the misidentified reference is not updated in step 106 to refer to the new object copy.
  • object access through other identified references will address the new object copy
  • object access through the misidentified reference will continue to address the obsolete object with unknown and undesirable consequences. This undesirable behavior will appear as a code generation bug associated with the executing application, when it is in fact associated with garbage collection, and more specifically associated with the component that provides the information about live object references.
  • a stress testing application may be insufficient to bring about conditions that will result in misidentification of a stack location by the compiler, whereas another application may consistently trigger such a bug in the compiler.
  • a method and apparatus for finding bugs related to garbage collection in a virtual machine are described. For each possible garbage collection point in a stream of execution, a compiler in the virtual machine provides a map that specifies live pointer locations in the stack. In addition, the map identifies those locations in the stack that contain other forms of live data, such as integers. All other locations are considered “dead,” i.e., no longer in use or never used. At each garbage collection point, “dead” locations in the stack are overwritten with an invalid pointer value. Because of the overwriting process, any bug in the compiler that causes a live pointer to be omitted from the map also causes the omitted pointer to be overwritten with the invalid pointer value.
  • garbage collection is actually performed at the garbage collection point where the pointer was omitted from the compiler-generated map
  • subsequent execution steps that reference the omitted pointer trigger an invalid pointer error.
  • the invalid pointer error may be trapped and identified as a compiler bug related to map generation in the garbage collection process.
  • FIG. 1 is a flow diagram of a copying garbage collection process.
  • FIG. 2 is a block diagram of compile and runtime environments.
  • FIG. 3 is a block diagram of the runtime data areas of an embodiment of a virtual machine.
  • FIG. 4A is a block diagram illustrating an example of the use of pointers on a stack to reference objects on a heap.
  • FIG. 4B is a block diagram illustrating the application of copying garbage collection to the stack and heap example of FIG. 4A.
  • FIG. 5 is a flow diagram, in accordance with an embodiment of the invention, of a process for finding bugs related to garbage collection in a virtual machine.
  • FIG. 6 is a block diagram, in accordance with an embodiment of the invention, of apparatus for finding bugs related to garbage collection.
  • FIG. 7 is a block diagram of one embodiment of a computer system capable of providing a suitable execution environment for an embodiment of the invention.
  • the invention is a method and apparatus for finding bugs related to garbage collection in a virtual machine.
  • numerous specific details are set forth to provide a more thorough description of embodiments of the invention. It will be apparent, however, to one skilled in the art, that the invention may be practiced without these specific details. In other instances, well known features have not been described in detail so as not to obscure the invention.
  • the invention may be implemented in any environment that supports object or data access through references, and that provides information about live object references for use in garbage collection.
  • the component providing the information about live object references is the compiler.
  • the compiler is responsible for compiling method code, and is therefore knowledgeable about the contents of stack frames at garbage collection points.
  • Object-oriented programming is a method of creating computer programs by combining certain fundamental building blocks, and creating relationships among and between the building blocks.
  • the building blocks in object-oriented programming systems are called “objects.”
  • a software application can be written using an object-oriented programming language whereby the program's functionality is implemented using these objects.
  • An object is a programming unit that groups together a data structure (one or more instance variables) and the operations (methods) that can use or affect that data.
  • a data structure one or more instance variables
  • operations methods
  • an object consists of data and one or more operations or procedures that can be performed on that data.
  • the joining of data and operations into a unitary building block is called “encapsulation.”
  • An object can be instructed to perform one of its methods when it receives a “message.”
  • a message is a command or instruction sent to the object to execute a certain method.
  • a message consists of a method selection (e.g., method name) and zero or more arguments.
  • a message tells the receiving object what operations to perform.
  • One advantage of object-oriented programming is the way in which methods are invoked. When a message is sent to an object, it is not necessary for the message to instruct the object how to perform a certain method. It is only necessary to request that the object execute the method. This greatly simplifies program development.
  • Object-oriented programming languages are predominantly based on a “class” scheme.
  • An example of a class-based object-oriented programming scheme is generally described in “Smalltalk-80: The Language,” by Adele Goldberg and David Robson, published by Addison-Wesley Publishing Company, 1989.
  • An object class provides a definition for an object which typically includes both fields (e.g., variables) and methods.
  • An object class is used to create a particular object “instance.” (The term “object” by itself is often used interchangeably to refer to a particular class or a particular instance.)
  • An instance of an object class includes the variables and methods defined for that class. Multiple instances can be created from the same object class. Each instance that is created from the object class is said to be of the same type or class.
  • an employee object class can include “name” and “salary” instance variables and a “set_salary” method. Instances of the employee object class can be created, or instantiated, for each employee in an organization. Each object instance is said to be of type “employee.” Each employee object instance includes “name” and “salary” instance variables and the “set_salary” method. The values associated with the “name” and “salary” variables in each employee object instance contain the name and salary of an employee in the organization. A message can be sent to an employee's employee object instance to invoke the “set_salary” method to modify the employee's salary (i.e., the value associated with the “salary” variable in the employee's employee object).
  • a hierarchy of classes can be defined such that an object class definition has one or more subclasses.
  • a subclass inherits its parent's (and grandparent's etc.) definition.
  • Each subclass in the hierarchy may add to or modify the behavior specified by its parent class.
  • Some object-oriented programming languages support multiple inheritance where a subclass may inherit a class definition from more than one parent class.
  • Other programming languages such as the Java programming language, support only single inheritance, where a subclass is limited to inheriting the class definition of only one parent class.
  • the Java programming language also provides a mechanism known as an “interface” which comprises a set of constant and abstract method declarations.
  • An object class can implement the abstract methods defined in an interface. Both single and multiple inheritance are available to an interface. That is, an interface can inherit an interface definition from more than one parent interface.
  • Java applications typically comprise one or more object classes and interfaces. Unlike many programming languages in which a program is compiled into machine-dependent, executable program code, classes written in the Java programming language are compiled into machine independent bytecode class files. Each class contains code and data in a platform-independent format called the class file format. The computer system acting as the execution vehicle contains a program called a virtual machine, which is responsible for executing the code in each class.
  • Applications may be designed as standalone Java applications, or as Java “applets” which are identified by an applet tag in an HTML (hypertext markup language) document, and loaded by a browser application.
  • the class files associated with an application or applet may be stored on the local computing system, or on a server accessible over a network. Each class is loaded into the Java virtual machine, as needed, by the “class loader.”
  • a web server application is executed on the server to respond to HTTP (hypertext transport protocol) requests containing URLs (universal resource locators) to HTML documents, also referred to as “web pages.”
  • HTTP hypertext transport protocol
  • URLs universal resource locators
  • a browser application executing on a client platform receives an HTML document (e.g., as a result of requesting an HTML document by forwarding a URL to the web server)
  • the browser application parses the HTML and automatically initiates the download of the specified bytecode class files when it encounters an applet tag in the HTML document.
  • the classes of a Java applet are loaded on demand from the network (stored on a server), or from a local file system, when first referenced during the Java applet's execution.
  • the virtual machine locates and loads each class file, parses the class file format, allocates memory for the class's various components, and links the class with other already loaded classes. This process makes the code in the class readily executable by the virtual machine.
  • FIG. 2 illustrates the compile and runtime environments for a processing system.
  • a software developer creates source files 200 , which contain the programmer readable class definitions written in the Java programming language, including data structures, method implementations and references to other classes.
  • Source files 200 are provided to Java compiler 201 , which compiles source files 200 into compiled “.class” files 202 that contain bytecodes executable by a Java virtual machine.
  • Bytecode class files 202 are stored (e.g., in temporary or permanent storage) on a server, and are available for download over a network. Alternatively, bytecode class files 202 may be stored locally in a directory on the client platform.
  • Java runtime environment contains a Java virtual machine (JVM) 205 which is able to execute bytecode class files and execute native operating system (“O/S”) calls to operating system 209 when necessary during execution.
  • Java virtual machine 205 provides a level of abstraction between the machine independence of the bytecode classes and the machine-dependent instruction set of the underlying computer hardware 210 , as well as the platform-dependent calls of operating system 209 .
  • Class loader and bytecode verifier (“class loader”) 203 is responsible for loading bytecode class files 202 and supporting class libraries 204 into Java virtual machine 205 as needed. Class loader 203 also verifies the bytecodes of each class file to maintain proper execution and enforcement of security rules. Within the context of runtime system 208 , either an interpreter 206 executes the bytecodes directly, or a “just-in-time” (JIT) compiler 207 transforms the bytecodes into machine code, so that they can be executed by the processor (or processors) in hardware 210 .
  • JIT just-in-time
  • the runtime system 208 of virtual machine 205 supports a general stack architecture.
  • the manner in which this general stack architecture is supported by the underlying hardware 210 is determined by the particular virtual machine implementation, and reflected in the way the bytecodes are interpreted or JIT-compiled.
  • Other elements of the runtime system include thread management (e.g., scheduling) and garbage collection mechanisms.
  • FIG. 3 illustrates runtime data areas which support the stack architecture within runtime system 208 .
  • runtime data areas 300 comprise one or more thread-based data areas 307 .
  • Each thread-based data area 307 comprises a program counter register (PC REG) 308 , a local variables pointer register (VARS REG) 309 , a frame register (FRAME REG) 310 , an operand stack pointer register (OPTOP REG) 311 , and a stack 312 .
  • Stack 312 comprises one or more frames 313 which contain an operand stack 314 and local variables 315 . Separate frame formats may be implemented for interpreted code and compiled code.
  • Runtime data areas 300 further comprises shared heap 301 .
  • Heap 301 is the runtime data area from which memory for all class instances and arrays is allocated.
  • Shared heap 301 comprises method area 302 , which is shared among all threads.
  • Method area 302 comprises one or more class-based data areas 303 for storing information extracted from each loaded class file.
  • class-based data area 303 may comprise class structures such as constant pool 304 , field and method data 305 , and code for methods and constructors 306 . Methods access class structures by reference. Pointers to classes are stored in local variables 315 or in registers associated with a given stack.
  • a virtual machine can support many threads of execution at once. Each thread has its own thread-based data area 307 . At any point, each thread is executing the code of a single method, the “current method” for that thread.
  • the program counter register 308 contains the address of the virtual machine instruction currently being executed.
  • Frame register 310 points to the location of the current method in method area 302 .
  • Each thread has a private stack 312 , created at the same time as the thread.
  • Stack 312 stores one or more frames 313 associated with methods invoked by the thread.
  • Frames 313 are used to store data and partial results, as well as to perform dynamic linking, return values for methods and dispatch exceptions.
  • a new frame is created and pushed onto the stack each time a method is invoked, and an existing frame is popped from the stack and destroyed when its method completes.
  • a frame that is created by a thread is local to that thread and typically cannot be directly referenced by any other thread.
  • This frame is referred to as the “current frame,” and its method is known as the “current method.”
  • a frame ceases to be current if its method invokes another method or if its method completes.
  • a method is invoked, a new frame is created and becomes current when control transfers to the new method.
  • the current frame passes back the results of its method invocation, if any, to the previous frame. The current frame is then discarded while the previous frame becomes the current one.
  • Each frame 313 has its own set of local variables 315 and its own operand stack 314 .
  • the local variables pointer register 309 contains a pointer to the base of an array of words containing local variables 315 of the current frame.
  • the operand stack pointer register 311 points to the top of operand stack 314 of the current frame. Most virtual machine instructions take values from the operand stack of the current frame, operate on them, and return results to the same operand stack. Operand stack 314 is also used to pass arguments to methods and receive method results.
  • FIG. 4A is a block diagram illustrating the use of pointers on the stack to reference objects on the heap in a virtual machine.
  • the data structures shown include stack 408 comprising stack locations 400 - 407 , and heap 409 comprising “from” space 410 and “to” space 411 .
  • “From” space 410 comprises object A ( 412 ), object B ( 413 ) and object C ( 414 ).
  • FIG. 4 illustrates the state of the stack and heap prior to a garbage collection cycle, so no objects are shown copied to “to” space 411 .
  • stack locations 400 - 407 appear as a single array of local variables. However, in actual application, each stack frame contains its own set of local variables. Further, pointers to objects may be stored in registers as well. Stack locations 400 - 407 may therefore represent registers and local variables associated with multiple stack frames.
  • stack locations 400 and 405 contain references to object A in the form of a pointer value.
  • stack locations 401 and 404 contain references to object C
  • stack locations 402 and 407 contain references to object B.
  • Local variables and registers in stack frames may also contain actual data, i.e., data that is not indirectly referenced, such as integer data.
  • stack locations 403 and 406 contain integers X and Y, respectively.
  • FIG. 4B illustrates the stack and heap of FIG. 4A after a garbage collection cycle has taken place at a gc-point in the virtual machine's execution of a program. It is assumed in FIG. 4B that prior to garbage collection, the references to object B were explicitly released by a method assigning a “null” value to the respective local variables 402 and 407 .
  • the compiler of the virtual machine prepares a map of stack locations containing live pointers. An updated map is generated by the compiler for each gc-point.
  • FIG. 4B illustrates the stack and heap of FIG. 4A after a garbage collection cycle has taken place at a gc-point in the virtual machine's execution of a program. It is assumed in FIG. 4B that prior to garbage collection, the references to object B were explicitly released by a method assigning a “null” value to the respective local variables 402 and 407 .
  • the compiler of the virtual machine prepares a map of stack locations containing live pointers. An updated map is generated by the compiler
  • the map generated by the compiler for the recent gc-point is OOP map 417 (OOP referring to object-oriented pointers) which indicates that stack locations 400 , 404 and 405 contain live OOPs or pointers.
  • OOP map 417 OOP referring to object-oriented pointers
  • stack location 401 is erroneously omitted from the live OOPs designated by OOP map 417 .
  • references to object A were found in stack locations 400 and 405 , and a reference to object C was found in stack location 404 .
  • No references to object B were found in the stack locations of stack 408 , because locations 402 and 407 were assigned null values and/or were not specified by OOP map 417 as live pointers.
  • objects A and C are copied to “to” space 411 as object A.copy 415 and object C.copy 416 .
  • stack locations 400 and 405 are updated to point to object A.copy 415 in “to” space 411
  • stack location 404 is updated to point to object C.copy 416 .
  • stack location 401 is not updated. Therefore, any future reference to object C made via stack location 401 will erroneously access the obsolete object C 414 in “from” space 410 , whereas reference made via stack location 404 will access object C.copy 416 appropriately.
  • the multiple copies of object C are likely to diverge in respective data values over time, causing inconsistent performance.
  • obsolete object C 414 may be written over in a subsequent garbage collection cycle, resulting in indeterminate behavior and possibly a terminal error if accessed via stack location 401 .
  • a debugging mode is provided in an embodiment of the invention, that, when enabled, initiates a debugging process within the virtual machine to find bugs associated with compiler error in the identification of live OOPs.
  • the activation of the debugging mode enables a “zapping” process that overwrites unused stack locations at each gc-point in the virtual machine's execution of a program. Those stack locations that the compiler fails to correctly identify as live OOPs are also overwritten by the zapping process.
  • attempted object access via a misidentified live OOP generates an error that may be trapped and identified.
  • the zapping process takes place at each gc-point, thus providing an exhaustive test of compiler OOP identification for a given executed application.
  • the OOP map generated by the compiler is extended in debugging mode to identify live stack locations that do not contain OOPS. This permits the zapping process to identify unused stack locations as those locations which are not identified by the compiler either as live OOPS or as live non-OOPs (e.g., directly referenced data such as integers).
  • a comparison process is carried out, in addition to the zapping process, to handle another possible form of misidentification that may occur in the revised OOP map.
  • This new form of misidentification is that of a live OOP being misidentified as a live non-OOP. If a live OOP is misidentified in this manner, the zapping process assumes that the register location contains a live non-OOP value, such as an integer, and thus does not overwrite the register location.
  • live non-OOPs are compared with possible OOP values (e.g., valid object reference values) to ascertain whether the live non-OOPs may, in fact, be misidentified live OOPs. If a match is obtained for any live non-OOP, a warning is issued. The person performing the debugging process may then determine where a possible bug of this nature may be occurring based on the issued warning.
  • FIG. 5 is a flow diagram of a debugging process in accordance with an embodiment of the invention.
  • the debugging mode is enabled and the virtual machine begins execution of an application.
  • a branch occurs based on whether a gc-point has been reached. If execution is not at a gc-point, the debugging process continues at step 508 , where any attempts to use an invalid pointer value during program execution are trapped. If, at step 501 , execution has reached a gc-point, the debugging process continues at step 502 .
  • step 502 the debugging process determines, based on the compiler OOP map, which stack locations contain a live OOP and which stack locations do not contain a live OOP.
  • step 503 for those stack locations specified as not containing live OOPs, the process determines which stack locations contain other forms of live data, or live non-OOPs. Those stack locations that are not identified by the compiler OOP map as containing live OOPs or live non-OOPs are assumed to be “dead” or unused locations.
  • step 504 those dead or unused stack locations are overwritten with an invalid pointer value, such as “0000.” Steps 501 - 504 may be implemented by the zapping process previously described.
  • step 505 those locations specified as live non-OOPs (or a subset thereof) in the OOP map are compared with possible object reference values to determine whether each live non-OOP is possibly a misidentified live OOP. If, in step 506 , no live non-OOP matches a reference to an object, the debugging process continues at step 508 . If, in step 506 , the value stored in a live non-OOP matches a possible pointer value of an object, a warning is issued in step 507 before proceeding to step 508 . The warning may comprise, for example, a dialog message sent to a display, or a warning entry written to a log file. Steps 505 - 507 may be implemented by the comparison process previously described. It will be obvious to one skilled in the art that the invention may be practiced without steps 505 - 507 .
  • step 508 during program execution, any attempts to use invalid pointers (i.e., pointers with invalid pointer values such as “0000”) to access objects on the heap are trapped, or otherwise registered as errors, and identified.
  • the step of trapping the use of invalid pointers is carried out continuously during program execution, and may be implemented in a virtual machine process separate from the zapping and comparison process(es).
  • step 509 if program execution is completed, the debugging process also completes in step 510 . If, in step 509 , execution has not yet completed, the debugging process returns to step 501 .
  • FIG. 6 is a block diagram of a virtual machine implementation comprising a zapping/comparing component 600 in accordance with an embodiment of the invention.
  • Zapping/comparing component 600 performs the zapping and comparison processes described with respect to FIG. 5, and may be implemented as one or more objects, components, methods, procedures, or elements thereof, within a virtual machine environment.
  • stack 408 comprises stack locations 400 - 407 and heap 409 comprises “from” space 410 and “to” space 411 .
  • “From” space 410 comprises object A 412 , object B 413 and object C 414 .
  • “To” space 411 comprises object A.copy 415 and object B.copy 416 .
  • Stack locations 400 and 405 point to object A.copy 415 ; stack location 404 points to object C.copy 416 ; and stack locations 401 , 402 and 407 point to an invalid address 601 .
  • Compiler-generated OOP map 617 identifies stack locations 400 , 404 and 405 as live OOPs. Further, in accordance with an embodiment of the invention, OOP map 617 identifies stack locations 403 and 406 as live non-OOPs, because they contain integer data values.
  • An example representation of OOP map 617 is as follows: Stack Location* 400 Live OOP — 401 (Not Live OOP) (Dead) 402 (Not Live OOP) (Dead) 403 (Not Live OOP) Live Non-OOP 404 Live OOP — 405 Live OOP — 406 (Not Live OOP) Live Non-OOP 407 (Not Live OOP) (Dead)
  • the implementation of the OOP map may be as flag bits or bytes associated with stack locations.
  • a first bit or byte may be used to indicate whether the corresponding stack location is a Live OOP.
  • a second bit or byte may be used to indicate, in the event the first bit or byte is not set, whether the corresponding stack location is a live non-OOP.
  • Any map implementation may be used which provides three states for each stack location: “Live OOP,” “Live non-OOP,” and “dead.”
  • Zapping/comparing component 600 accesses OOP map 617 to identify “dead” locations in stack 408 for the zapping process and to identify “Live Non-OOP” locations for the comparing process. Zapping/comparing component 600 accesses stack 408 to carry out the overwriting operations of the zapping process, as well as the individual comparing operations of the comparison process. When directed by the comparison process, zapping/comparing component 600 may issue warnings to warning destination 602 .
  • stack locations 401 , 402 and 407 are overwritten with the pointer value 0000, which points to invalid location 601 .
  • the zapping of stack locations 402 and 407 results in no effect because those locations are dead stack locations previously assigned a null value.
  • the zapping of stack location 401 redirects the unidentified OOP from object C 414 to invalid location 601 .
  • Attempted access of object C via stack location 401 results in a trap that may be used to detect the error in the OOP map, and to correct the source of the error in the compiler.
  • the entry for location 401 in the OOP map were to erroneously identify the location as containing a live non-OOP rather than a “dead” value, the entry for location 401 in OOP map 617 would read, for example, as follows: 401 (Not Live OOP) Live Non-OOP
  • the zapping process of zapping/comparing component 600 does not overwrite location 401 because the location is not “dead.”
  • the comparison process of zapping/comparing component 600 examines locations identified as “Live Non-OOP” (e.g., locations 401 , 403 and 406 ), and compares the stored values with possible OOP reference values.
  • Location 401 is flagged for storing a value matching the OOP reference value for object C 414 in “from” space 410 , and a warning is issued to warning destination 602 to note the occurrence of this match.
  • the comparison process is carried out at each gc-point. Thus, a warning may be issued by zapping/comparing component 600 regardless of whether a garbage collection cycle is carried out.
  • An embodiment of the invention can be implemented as computer software in the form of computer readable code executed on a general purpose computer such as computer 700 illustrated in FIG. 7, or in the form of bytecode class files executable within a Java runtime environment running on such a computer.
  • a keyboard 710 and mouse 711 are coupled to a bi-directional system bus 718 .
  • the keyboard and mouse are for introducing user input to the computer system and communicating that user input to processor 713 .
  • Other suitable input devices may be used in addition to, or in place of, the mouse 711 and keyboard 710 .
  • I/O (input/output) unit 719 coupled to bidirectional system bus 718 represents such I/O elements as a printer, A/V (audio/video) I/O, etc.
  • Computer 700 includes a video memory 714 , main memory 715 and mass storage 712 , all coupled to bidirectional system bus 718 along with keyboard 710 , mouse 711 and processor 713 .
  • the mass storage 712 may include both fixed and removable media, such as magnetic, optical or magnetic optical storage systems or any other available mass storage technology.
  • Bus 718 may contain, for example, address lines for addressing video memory 714 or main memory 715 .
  • the system bus 718 also includes, for example, a data bus for transferring data between and among the components, such as processor 713 , main memory 715 , video memory 714 and mass storage 712 .
  • multiplex data/address lines may be used instead of separate data and address lines.
  • the processor 713 is a microprocessor manufactured by Motorola, such as the 680 ⁇ 0 processor or a microprocessor manufactured by Intel, such as the 80 ⁇ 86, or Pentium processor, or a SPARC microprocessor from Sun Microsystems, Inc.
  • Main memory 715 is comprised of dynamic random access memory (DRAM).
  • Video memory 714 is a dual-ported video random access memory. One port of the video memory 714 is coupled to video amplifier 716 .
  • the video amplifier 716 is used to drive the cathode ray tube (CRT) raster monitor 717 .
  • Video amplifier 716 is well known in the art and may be implemented by any suitable apparatus.
  • This circuitry converts pixel data stored in video memory 714 to a raster signal suitable for use by monitor 717 .
  • Monitor 717 is a type of monitor suitable for displaying graphic images.
  • the video memory could be used to drive a flat panel or liquid crystal display (LCD), or any other suitable data presentation device.
  • Computer 700 may also include a communication interface 720 coupled to bus 718 .
  • Communication interface 720 provides a two-way data communication coupling via a network link 721 to a local network 722 .
  • ISDN integrated services digital network
  • communication interface 720 provides a data communication connection to the corresponding type of telephone line, which comprises part of network link 721 .
  • LAN local area network
  • communication interface 720 provides a data communication connection via network link 721 to a compatible LAN.
  • Communication interface 720 could also be a cable modem or wireless interface. In any such implementation, communication interface 720 sends and receives electrical, electromagnetic or optical signals which carry digital data streams representing various types of information.
  • Network link 721 typically provides data communication through one or more networks to other data devices.
  • network link 721 may provide a connection through local network 722 to local server computer 723 or to data equipment operated by an Internet Service Provider (ISP) 724 .
  • ISP 724 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet” 725 .
  • Internet 725 uses electrical, electromagnetic or optical signals which carry digital data streams.
  • the signals through the various networks and the signals on network link 721 and through communication interface 720 which carry the digital data to and from computer 700 , are exemplary forms of carrier waves transporting the information.
  • Computer 700 can send messages and receive data, including program code, through the network(s), network link 721 , and communication interface 720 .
  • remote server computer 726 might transmit a requested code for an application program through Internet 725 , ISP 724 , local network 722 and communication interface 720 .
  • the received code may be executed by processor 713 as it is received, and/or stored in mass storage 712 , or other non-volatile storage for later execution.
  • computer 700 may obtain application code in the form of a carrier wave.
  • an example of such a downloaded application is the apparatus for debugging a virtual machine described herein.
  • Application code may be embodied in any form of computer program product.
  • a computer program product comprises a medium configured to store or transport computer readable code or data, or in which computer readable code or data may be embedded.
  • Some examples of computer program products are CD-ROM disks, ROM cards, floppy disks, magnetic tapes, computer hard drives, servers on a network, and carrier waves.
  • An embodiment of the invention may be implemented in any type of computer system or programming or processing environment, including embedded devices (e.g., web phones, etc.) and “thin” client processing environments (e.g., network computers (NC's), etc.) that support a virtual machine.
  • embedded devices e.g., web phones, etc.
  • client processing environments e.g., network computers (NC's), etc.

Abstract

A method and apparatus for finding bugs related to garbage collection in a virtual machine. For each possible garbage collection point in a stream of execution, a compiler in the virtual machine provides a map that specifies live pointer locations in the stack. In addition, the map identifies those locations in the stack that contain other forms of live data, such as integers. All other locations are considered “dead,” i.e., no longer in use or never used. At each garbage collection point, “dead” locations in the stack are overwritten with an invalid pointer value. Because of the overwriting process, any bug in the compiler that causes a live pointer to be omitted from the map also causes the omitted pointer to be overwritten with the invalid pointer value. Regardless of whether garbage collection is actually performed at the garbage collection point where the pointer was omitted from the compiler-generated map, subsequent execution steps that reference the omitted pointer trigger an invalid pointer error. The invalid pointer error may be trapped and identified as a compiler bug related to map generation in the garbage collection process.

Description

    BACKGROUND OF THE INVENTION
  • 1. Field of the Invention [0001]
  • This invention relates to the field of computer systems, and, more specifically, to memory management garbage collection processes. [0002]
  • Sun, Sun Microsystems, the Sun logo, Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks of SPARC International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. [0003]
  • 2. Background Art [0004]
  • An important aspect of memory management in any computer system is garbage collection. Garbage collection (GC) refers to the process of reclaiming portions of main memory that are no longer in use by the system or any running applications. In an object-oriented system, garbage collection is typically carried out to reclaim memory allocated to objects and other data structures (e.g., arrays, etc.) that are no longer referenced by an application. The reclaimed memory can then be re-allocated to store new objects or data structures. [0005]
  • In a Java™ virtual machine, garbage collection is performed to reclaim memory space from a region of memory known as the heap. The heap is used to store objects and arrays that are referenced by pointers stored as local variables in activation records, or “stack frames,” of a stack associated with an individual thread of execution in the virtual machine. The invocation of a method by a given thread results in the creation of a new stack frame that is “pushed” onto the stack of that thread. References to objects on the heap may be removed by an active (i.e., currently executing) method setting the respective pointer to a “null” value, or by removal of a respective stack frame in response to completion of its associated method. [0006]
  • In any thread of execution, there may be many garbage collection points, or “gc-points,” where garbage collection can occur. However, actual garbage collection typically takes place at only a fraction of these possible gc-points each time the given thread of execution is run. In virtual machine implementations using a compiler, the compiler provides information at each gc-point about the set of locations in the stack frames that contain pointers to objects or arrays. Garbage collection is performed by determining which objects and arrays in the heap are referenced from within the set of locations specified by the compiler, and reclaiming those objects and arrays that are no longer referenced. [0007]
  • Unfortunately, the compiler may have an error (i.e., a “bug”) that causes a stack location to be mistakenly omitted from the specified set of pointer locations. This type of compiler bug can result in the reclaiming of an object or array when a reference still exists. Also, for a type of garbage collection known as “copying” garbage collection, this compiler bug may result in a failure to update a pointer reference to point to the appropriate copy of the associated object or array. In either case, future references made to the object or array through the omitted stack location can result in improper execution of an application. This bug is garbage collection-related, but it may appear to be a code generation bug, making detection and correction difficult. [0008]
  • To provide a better understanding of the problems associated with garbage collection in a virtual machine, an overview of garbage collection techniques is provided below. [0009]
  • Garbage Collection [0010]
  • Garbage collection may be either conservative or exact. Conservative garbage collection involves scanning memory space for stored values that match the address of an object (or other memory structure) that is being considered for collection. If a matching value is not found in the memory being scanned, then no references to the object exist, and the object may be safely collected. If a matching value is found, it is assumed that the value is a reference (e.g., a pointer) to the object under consideration, and the object is not collected. This assumption means that an object is not collected even if the matching memory value is not a reference to the object, but rather a data value that coincidentally matches the base address of the object. [0011]
  • In exact garbage collection, only true references (pointers) are considered in a scan, so coincidentally matching data values are ignored in the collection process. This means that an object without any associated references is always considered garbage in a scan, and more efficient collection is achieved. However, to perform exact garbage collection, the scanning process must have information regarding which memory locations contain live references (i.e., active, non-null references). Only those memory locations containing live references are scanned to determine reference matches for objects under consideration for collection. [0012]
  • To provide more efficient use of memory space in terms of compaction, “copying” garbage collection is commonly implemented. In copying garbage collection, the memory space is divided into regions and an object transfer is performed. When garbage collection is carried out, objects in a portion of memory referred to as “from” space are copied to a portion referred to as “to” space. Those objects in “from” space that are considered “garbage” by the scan process are not copied to “to” space. The process of copying the objects results in reduced fragmentation of the memory space and better compaction. [0013]
  • FIG. 1 is a flow diagram illustrating a copying garbage collection process. In [0014] step 100, the set of references to be scanned is determined. For example, a mechanism may be provided that tracks the creation of references, and maintains a list of current references for exact garbage collection. This list may be used to define the set to be scanned in step 100. In step 101, the garbage collection process obtains the first reference from the set of references. In step 102, the reference is analyzed to determine if the reference points to an object in “from” space. If the reference does not point to an object in “from” space, the process jumps to step 107. If, however, the reference does point to an object in “from” space, the process continues at step 103.
  • In [0015] step 103, the referenced object in “from” space is examined to determine whether the object is marked as copied. If the referenced object is marked, the process jumps to step 106. However, if the referenced object in “from” space is not marked as copied, the process continues at step 104, in which the referenced object is copied into “to” space. In subsequent step 105, the referenced object in “from” space is marked as copied (e.g., replaced with a marker), with the location of the new copy in “to” space identified in the marker. In step 106, the current reference is updated to point to the location of the new copy of the object in “to” space, as identified by the marked object in “from” space. The process continues in step 107.
  • In [0016] step 107, a check is performed to determine whether the current reference is the last reference in the set of references to be scanned. If the current reference is not the last reference, in step 108, the next reference in the set is obtained, and the process returns to Step 102. If, however, in step 107, the current reference is the last reference in the set, the process completes in step 109 where “from” space is collected in its entirety. Ideally, no references will be made in the future to objects in “from” space. In a subsequent garbage collection, “from” space becomes “to” space and “to” space becomes “from” space for purposes of copying.
  • The copying garbage collection scheme described above may be expanded to implement a generational approach. Generational collection schemes are predicated on the general assumption that newly created objects are more prone to collection than objects that have survived several garbage collection cycles. Using the generational approach, objects are segregated into generational groups of objects according to the number of garbage collection cycles survived, with each generational group having its own respective “to” space and “from” space. Garbage collection is then carried out separately for each generational group of objects, with garbage collection being carried out more frequently for younger generations. [0017]
  • Exact garbage collection is required when objects are copied, to prevent a coincidental data value match from causing mutation of the data value during updating of object references. As stated previously, exact garbage collection requires information about which locations contain active or “live” references to objects. Problems can arise when this information is incorrect. For example, if a live reference fails to be identified in [0018] step 100 of FIG. 1 due to misidentification of a live reference, the garbage collection process may erroneously collect the associated object without copying, causing unpredictable performance when the reference is used by a method to access the object in the future.
  • Also, if the associated object is copied to “to” space (e.g., because other references to the object exist and are correctly identified), the misidentified reference is not updated in [0019] step 106 to refer to the new object copy. Thus, while object access through other identified references will address the new object copy, object access through the misidentified reference will continue to address the obsolete object with unknown and undesirable consequences. This undesirable behavior will appear as a code generation bug associated with the executing application, when it is in fact associated with garbage collection, and more specifically associated with the component that provides the information about live object references.
  • In the prior art, stress tests have been performed to test for execution bugs. Stress tests attempt to test extreme execution conditions that will result in the triggering and resulting detection of any bugs in the system. However, with respect to garbage collection, a stress test will only result in testing at gc-points where garbage collection actually occurs. Because garbage collection occurs at only a subset of gc-points, and because that subset of gc-points may not differ from one execution to the next for a particular application or input data set, stress tests are insufficient to reliably and exhaustively find bugs associated with the misidentification of a live reference at possibly a single gc-point out of many in the execution of the application. Further, with respect to a virtual machine environment where a compiler identifies the live references for applications it compiles, a stress testing application may be insufficient to bring about conditions that will result in misidentification of a stack location by the compiler, whereas another application may consistently trigger such a bug in the compiler. [0020]
  • SUMMARY OF THE INVENTION
  • A method and apparatus for finding bugs related to garbage collection in a virtual machine are described. For each possible garbage collection point in a stream of execution, a compiler in the virtual machine provides a map that specifies live pointer locations in the stack. In addition, the map identifies those locations in the stack that contain other forms of live data, such as integers. All other locations are considered “dead,” i.e., no longer in use or never used. At each garbage collection point, “dead” locations in the stack are overwritten with an invalid pointer value. Because of the overwriting process, any bug in the compiler that causes a live pointer to be omitted from the map also causes the omitted pointer to be overwritten with the invalid pointer value. Regardless of whether garbage collection is actually performed at the garbage collection point where the pointer was omitted from the compiler-generated map, subsequent execution steps that reference the omitted pointer trigger an invalid pointer error. The invalid pointer error may be trapped and identified as a compiler bug related to map generation in the garbage collection process. [0021]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a flow diagram of a copying garbage collection process. [0022]
  • FIG. 2 is a block diagram of compile and runtime environments. [0023]
  • FIG. 3 is a block diagram of the runtime data areas of an embodiment of a virtual machine. [0024]
  • FIG. 4A is a block diagram illustrating an example of the use of pointers on a stack to reference objects on a heap. [0025]
  • FIG. 4B is a block diagram illustrating the application of copying garbage collection to the stack and heap example of FIG. 4A. [0026]
  • FIG. 5 is a flow diagram, in accordance with an embodiment of the invention, of a process for finding bugs related to garbage collection in a virtual machine. [0027]
  • FIG. 6 is a block diagram, in accordance with an embodiment of the invention, of apparatus for finding bugs related to garbage collection. [0028]
  • FIG. 7 is a block diagram of one embodiment of a computer system capable of providing a suitable execution environment for an embodiment of the invention. [0029]
  • DETAILED DESCRIPTION OF THE INVENTION
  • The invention is a method and apparatus for finding bugs related to garbage collection in a virtual machine. In the following description, numerous specific details are set forth to provide a more thorough description of embodiments of the invention. It will be apparent, however, to one skilled in the art, that the invention may be practiced without these specific details. In other instances, well known features have not been described in detail so as not to obscure the invention. [0030]
  • Though discussed herein with respect to the Java programming language and the Java virtual machine, the invention may be implemented in any environment that supports object or data access through references, and that provides information about live object references for use in garbage collection. [0031]
  • In virtual machines that implement compiling of bytecode input, such as the Java virtual machine, the component providing the information about live object references is the compiler. The compiler is responsible for compiling method code, and is therefore knowledgeable about the contents of stack frames at garbage collection points. An embodiment of a processing environment and virtual machine implementation are more fully described below. [0032]
  • The Processing Environment [0033]
  • The processing environment of the Java programming language is object-oriented in nature. To provide a better understanding of object-oriented principles, an overview of object-oriented programming follows. [0034]
  • 1. Object-Oriented Programming [0035]
  • Object-oriented programming is a method of creating computer programs by combining certain fundamental building blocks, and creating relationships among and between the building blocks. The building blocks in object-oriented programming systems are called “objects.” A software application can be written using an object-oriented programming language whereby the program's functionality is implemented using these objects. [0036]
  • An object is a programming unit that groups together a data structure (one or more instance variables) and the operations (methods) that can use or affect that data. Thus, an object consists of data and one or more operations or procedures that can be performed on that data. The joining of data and operations into a unitary building block is called “encapsulation.” [0037]
  • An object can be instructed to perform one of its methods when it receives a “message.” A message is a command or instruction sent to the object to execute a certain method. A message consists of a method selection (e.g., method name) and zero or more arguments. A message tells the receiving object what operations to perform. [0038]
  • One advantage of object-oriented programming is the way in which methods are invoked. When a message is sent to an object, it is not necessary for the message to instruct the object how to perform a certain method. It is only necessary to request that the object execute the method. This greatly simplifies program development. [0039]
  • Object-oriented programming languages are predominantly based on a “class” scheme. An example of a class-based object-oriented programming scheme is generally described in “Smalltalk-80: The Language,” by Adele Goldberg and David Robson, published by Addison-Wesley Publishing Company, 1989. [0040]
  • An object class provides a definition for an object which typically includes both fields (e.g., variables) and methods. An object class is used to create a particular object “instance.” (The term “object” by itself is often used interchangeably to refer to a particular class or a particular instance.) An instance of an object class includes the variables and methods defined for that class. Multiple instances can be created from the same object class. Each instance that is created from the object class is said to be of the same type or class. [0041]
  • To illustrate, an employee object class can include “name” and “salary” instance variables and a “set_salary” method. Instances of the employee object class can be created, or instantiated, for each employee in an organization. Each object instance is said to be of type “employee.” Each employee object instance includes “name” and “salary” instance variables and the “set_salary” method. The values associated with the “name” and “salary” variables in each employee object instance contain the name and salary of an employee in the organization. A message can be sent to an employee's employee object instance to invoke the “set_salary” method to modify the employee's salary (i.e., the value associated with the “salary” variable in the employee's employee object). [0042]
  • A hierarchy of classes can be defined such that an object class definition has one or more subclasses. A subclass inherits its parent's (and grandparent's etc.) definition. Each subclass in the hierarchy may add to or modify the behavior specified by its parent class. Some object-oriented programming languages support multiple inheritance where a subclass may inherit a class definition from more than one parent class. Other programming languages, such as the Java programming language, support only single inheritance, where a subclass is limited to inheriting the class definition of only one parent class. The Java programming language also provides a mechanism known as an “interface” which comprises a set of constant and abstract method declarations. An object class can implement the abstract methods defined in an interface. Both single and multiple inheritance are available to an interface. That is, an interface can inherit an interface definition from more than one parent interface. [0043]
  • 2. Programming and Execution [0044]
  • Java applications typically comprise one or more object classes and interfaces. Unlike many programming languages in which a program is compiled into machine-dependent, executable program code, classes written in the Java programming language are compiled into machine independent bytecode class files. Each class contains code and data in a platform-independent format called the class file format. The computer system acting as the execution vehicle contains a program called a virtual machine, which is responsible for executing the code in each class. [0045]
  • Applications may be designed as standalone Java applications, or as Java “applets” which are identified by an applet tag in an HTML (hypertext markup language) document, and loaded by a browser application. The class files associated with an application or applet may be stored on the local computing system, or on a server accessible over a network. Each class is loaded into the Java virtual machine, as needed, by the “class loader.” [0046]
  • To provide a client with access to class files from a server on a network, a web server application is executed on the server to respond to HTTP (hypertext transport protocol) requests containing URLs (universal resource locators) to HTML documents, also referred to as “web pages.” When a browser application executing on a client platform receives an HTML document (e.g., as a result of requesting an HTML document by forwarding a URL to the web server), the browser application parses the HTML and automatically initiates the download of the specified bytecode class files when it encounters an applet tag in the HTML document. [0047]
  • The classes of a Java applet are loaded on demand from the network (stored on a server), or from a local file system, when first referenced during the Java applet's execution. The virtual machine locates and loads each class file, parses the class file format, allocates memory for the class's various components, and links the class with other already loaded classes. This process makes the code in the class readily executable by the virtual machine. [0048]
  • FIG. 2 illustrates the compile and runtime environments for a processing system. In the compile environment, a software developer creates source files [0049] 200, which contain the programmer readable class definitions written in the Java programming language, including data structures, method implementations and references to other classes. Source files 200 are provided to Java compiler 201, which compiles source files 200 into compiled “.class” files 202 that contain bytecodes executable by a Java virtual machine. Bytecode class files 202 are stored (e.g., in temporary or permanent storage) on a server, and are available for download over a network. Alternatively, bytecode class files 202 may be stored locally in a directory on the client platform.
  • The Java runtime environment contains a Java virtual machine (JVM) [0050] 205 which is able to execute bytecode class files and execute native operating system (“O/S”) calls to operating system 209 when necessary during execution. Java virtual machine 205 provides a level of abstraction between the machine independence of the bytecode classes and the machine-dependent instruction set of the underlying computer hardware 210, as well as the platform-dependent calls of operating system 209.
  • Class loader and bytecode verifier (“class loader”) [0051] 203 is responsible for loading bytecode class files 202 and supporting class libraries 204 into Java virtual machine 205 as needed. Class loader 203 also verifies the bytecodes of each class file to maintain proper execution and enforcement of security rules. Within the context of runtime system 208, either an interpreter 206 executes the bytecodes directly, or a “just-in-time” (JIT) compiler 207 transforms the bytecodes into machine code, so that they can be executed by the processor (or processors) in hardware 210.
  • The [0052] runtime system 208 of virtual machine 205 supports a general stack architecture. The manner in which this general stack architecture is supported by the underlying hardware 210 is determined by the particular virtual machine implementation, and reflected in the way the bytecodes are interpreted or JIT-compiled. Other elements of the runtime system include thread management (e.g., scheduling) and garbage collection mechanisms.
  • FIG. 3 illustrates runtime data areas which support the stack architecture within [0053] runtime system 208. In FIG. 3, runtime data areas 300 comprise one or more thread-based data areas 307. Each thread-based data area 307 comprises a program counter register (PC REG) 308, a local variables pointer register (VARS REG) 309, a frame register (FRAME REG) 310, an operand stack pointer register (OPTOP REG) 311, and a stack 312. Stack 312 comprises one or more frames 313 which contain an operand stack 314 and local variables 315. Separate frame formats may be implemented for interpreted code and compiled code.
  • [0054] Runtime data areas 300 further comprises shared heap 301. Heap 301 is the runtime data area from which memory for all class instances and arrays is allocated. Shared heap 301 comprises method area 302, which is shared among all threads. Method area 302 comprises one or more class-based data areas 303 for storing information extracted from each loaded class file. For example, class-based data area 303 may comprise class structures such as constant pool 304, field and method data 305, and code for methods and constructors 306. Methods access class structures by reference. Pointers to classes are stored in local variables 315 or in registers associated with a given stack.
  • A virtual machine can support many threads of execution at once. Each thread has its own thread-based [0055] data area 307. At any point, each thread is executing the code of a single method, the “current method” for that thread. The program counter register 308 contains the address of the virtual machine instruction currently being executed. Frame register 310 points to the location of the current method in method area 302.
  • Each thread has a [0056] private stack 312, created at the same time as the thread. Stack 312 stores one or more frames 313 associated with methods invoked by the thread. Frames 313 are used to store data and partial results, as well as to perform dynamic linking, return values for methods and dispatch exceptions. A new frame is created and pushed onto the stack each time a method is invoked, and an existing frame is popped from the stack and destroyed when its method completes. A frame that is created by a thread is local to that thread and typically cannot be directly referenced by any other thread.
  • Only one frame, the frame for the currently executing method, is active at any point in a given thread of control. This frame is referred to as the “current frame,” and its method is known as the “current method.” A frame ceases to be current if its method invokes another method or if its method completes. When a method is invoked, a new frame is created and becomes current when control transfers to the new method. On method return, the current frame passes back the results of its method invocation, if any, to the previous frame. The current frame is then discarded while the previous frame becomes the current one. [0057]
  • Each [0058] frame 313 has its own set of local variables 315 and its own operand stack 314. The local variables pointer register 309 contains a pointer to the base of an array of words containing local variables 315 of the current frame. The operand stack pointer register 311 points to the top of operand stack 314 of the current frame. Most virtual machine instructions take values from the operand stack of the current frame, operate on them, and return results to the same operand stack. Operand stack 314 is also used to pass arguments to methods and receive method results.
  • FIG. 4A is a block diagram illustrating the use of pointers on the stack to reference objects on the heap in a virtual machine. The data structures shown include [0059] stack 408 comprising stack locations 400-407, and heap 409 comprising “from” space 410 and “to” space 411. “From” space 410 comprises object A (412), object B (413) and object C (414). FIG. 4 illustrates the state of the stack and heap prior to a garbage collection cycle, so no objects are shown copied to “to” space 411.
  • For purposes of example, stack locations [0060] 400-407 appear as a single array of local variables. However, in actual application, each stack frame contains its own set of local variables. Further, pointers to objects may be stored in registers as well. Stack locations 400-407 may therefore represent registers and local variables associated with multiple stack frames.
  • When an object is instantiated by invocation of a class constructor, and the object is assigned to a local variable, it is the pointer to the object that is stored in the local variable on the [0061] stack 408. As examples, stack locations 400 and 405 contain references to object A in the form of a pointer value. Similarly, stack locations 401 and 404 contain references to object C, and stack locations 402 and 407 contain references to object B. Local variables and registers in stack frames may also contain actual data, i.e., data that is not indirectly referenced, such as integer data. For example, stack locations 403 and 406 contain integers X and Y, respectively.
  • FIG. 4B illustrates the stack and heap of FIG. 4A after a garbage collection cycle has taken place at a gc-point in the virtual machine's execution of a program. It is assumed in FIG. 4B that prior to garbage collection, the references to object B were explicitly released by a method assigning a “null” value to the respective [0062] local variables 402 and 407. During compiling of the program code, the compiler of the virtual machine prepares a map of stack locations containing live pointers. An updated map is generated by the compiler for each gc-point. In FIG. 4B, the map generated by the compiler for the recent gc-point is OOP map 417 (OOP referring to object-oriented pointers) which indicates that stack locations 400, 404 and 405 contain live OOPs or pointers. In this example, stack location 401 is erroneously omitted from the live OOPs designated by OOP map 417.
  • In carrying out garbage collection, references to object A were found in [0063] stack locations 400 and 405, and a reference to object C was found in stack location 404. No references to object B were found in the stack locations of stack 408, because locations 402 and 407 were assigned null values and/or were not specified by OOP map 417 as live pointers. As a result, objects A and C are copied to “to” space 411 as object A.copy 415 and object C.copy 416. Further, using OOP map 417 to determine the locations of the live pointers, stack locations 400 and 405 are updated to point to object A.copy 415 in “to” space 411, and stack location 404 is updated to point to object C.copy 416.
  • Unfortunately, due to its omission from [0064] OOP map 417, stack location 401 is not updated. Therefore, any future reference to object C made via stack location 401 will erroneously access the obsolete object C 414 in “from” space 410, whereas reference made via stack location 404 will access object C.copy 416 appropriately. The multiple copies of object C are likely to diverge in respective data values over time, causing inconsistent performance. Also, obsolete object C 414 may be written over in a subsequent garbage collection cycle, resulting in indeterminate behavior and possibly a terminal error if accessed via stack location 401.
  • Zapping of Stack Locations in Debugging Mode [0065]
  • A debugging mode is provided in an embodiment of the invention, that, when enabled, initiates a debugging process within the virtual machine to find bugs associated with compiler error in the identification of live OOPs. The activation of the debugging mode enables a “zapping” process that overwrites unused stack locations at each gc-point in the virtual machine's execution of a program. Those stack locations that the compiler fails to correctly identify as live OOPs are also overwritten by the zapping process. As a result, during program execution, attempted object access via a misidentified live OOP generates an error that may be trapped and identified. The zapping process takes place at each gc-point, thus providing an exhaustive test of compiler OOP identification for a given executed application. [0066]
  • The OOP map generated by the compiler is extended in debugging mode to identify live stack locations that do not contain OOPS. This permits the zapping process to identify unused stack locations as those locations which are not identified by the compiler either as live OOPS or as live non-OOPs (e.g., directly referenced data such as integers). [0067]
  • In one embodiment of the invention, a comparison process is carried out, in addition to the zapping process, to handle another possible form of misidentification that may occur in the revised OOP map. This new form of misidentification is that of a live OOP being misidentified as a live non-OOP. If a live OOP is misidentified in this manner, the zapping process assumes that the register location contains a live non-OOP value, such as an integer, and thus does not overwrite the register location. However, in the additional comparison process, live non-OOPs are compared with possible OOP values (e.g., valid object reference values) to ascertain whether the live non-OOPs may, in fact, be misidentified live OOPs. If a match is obtained for any live non-OOP, a warning is issued. The person performing the debugging process may then determine where a possible bug of this nature may be occurring based on the issued warning. [0068]
  • Where actual OOP values are intentionally being manipulated in the form of live non-OOPs, these warnings may be ignored. Screening capability may be built into the compiler and/or comparison process to track where live OOP values have been written intentionally as live non-OOPs. The comparison process may then automatically omit a comparison operation, and subsequent warning issuance, at the register locations of tracked live non-OOPs. [0069]
  • FIG. 5 is a flow diagram of a debugging process in accordance with an embodiment of the invention. In [0070] step 500, the debugging mode is enabled and the virtual machine begins execution of an application. At step 501, a branch occurs based on whether a gc-point has been reached. If execution is not at a gc-point, the debugging process continues at step 508, where any attempts to use an invalid pointer value during program execution are trapped. If, at step 501, execution has reached a gc-point, the debugging process continues at step 502.
  • In [0071] step 502, the debugging process determines, based on the compiler OOP map, which stack locations contain a live OOP and which stack locations do not contain a live OOP. In step 503, for those stack locations specified as not containing live OOPs, the process determines which stack locations contain other forms of live data, or live non-OOPs. Those stack locations that are not identified by the compiler OOP map as containing live OOPs or live non-OOPs are assumed to be “dead” or unused locations. In step 504, those dead or unused stack locations are overwritten with an invalid pointer value, such as “0000.” Steps 501-504 may be implemented by the zapping process previously described.
  • In the embodiment of FIG. 5, in [0072] step 505, those locations specified as live non-OOPs (or a subset thereof) in the OOP map are compared with possible object reference values to determine whether each live non-OOP is possibly a misidentified live OOP. If, in step 506, no live non-OOP matches a reference to an object, the debugging process continues at step 508. If, in step 506, the value stored in a live non-OOP matches a possible pointer value of an object, a warning is issued in step 507 before proceeding to step 508. The warning may comprise, for example, a dialog message sent to a display, or a warning entry written to a log file. Steps 505-507 may be implemented by the comparison process previously described. It will be obvious to one skilled in the art that the invention may be practiced without steps 505-507.
  • In [0073] step 508, during program execution, any attempts to use invalid pointers (i.e., pointers with invalid pointer values such as “0000”) to access objects on the heap are trapped, or otherwise registered as errors, and identified. The step of trapping the use of invalid pointers is carried out continuously during program execution, and may be implemented in a virtual machine process separate from the zapping and comparison process(es). In step 509, if program execution is completed, the debugging process also completes in step 510. If, in step 509, execution has not yet completed, the debugging process returns to step 501.
  • FIG. 6 is a block diagram of a virtual machine implementation comprising a zapping/comparing [0074] component 600 in accordance with an embodiment of the invention. Zapping/comparing component 600 performs the zapping and comparison processes described with respect to FIG. 5, and may be implemented as one or more objects, components, methods, procedures, or elements thereof, within a virtual machine environment.
  • In FIG. 6, stack [0075] 408 comprises stack locations 400-407 and heap 409 comprises “from” space 410 and “to” space 411. “From” space 410 comprises object A 412, object B 413 and object C 414. “To” space 411 comprises object A.copy 415 and object B.copy 416. Stack locations 400 and 405 point to object A.copy 415; stack location 404 points to object C.copy 416; and stack locations 401, 402 and 407 point to an invalid address 601.
  • Compiler-generated [0076] OOP map 617 identifies stack locations 400, 404 and 405 as live OOPs. Further, in accordance with an embodiment of the invention, OOP map 617 identifies stack locations 403 and 406 as live non-OOPs, because they contain integer data values. An example representation of OOP map 617 is as follows:
    Stack Location*
    400 Live OOP
    401 (Not Live OOP) (Dead)
    402 (Not Live OOP) (Dead)
    403 (Not Live OOP) Live Non-OOP
    404 Live OOP
    405 Live OOP
    406 (Not Live OOP) Live Non-OOP
    407 (Not Live OOP) (Dead)
  • *register or local variable [0077]
  • The implementation of the OOP map may be as flag bits or bytes associated with stack locations. A first bit or byte may be used to indicate whether the corresponding stack location is a Live OOP. A second bit or byte may be used to indicate, in the event the first bit or byte is not set, whether the corresponding stack location is a live non-OOP. Any map implementation may be used which provides three states for each stack location: “Live OOP,” “Live non-OOP,” and “dead.” [0078]
  • Zapping/comparing [0079] component 600 accesses OOP map 617 to identify “dead” locations in stack 408 for the zapping process and to identify “Live Non-OOP” locations for the comparing process. Zapping/comparing component 600 accesses stack 408 to carry out the overwriting operations of the zapping process, as well as the individual comparing operations of the comparison process. When directed by the comparison process, zapping/comparing component 600 may issue warnings to warning destination 602.
  • In this example, stack [0080] locations 401, 402 and 407 are overwritten with the pointer value 0000, which points to invalid location 601. The zapping of stack locations 402 and 407 results in no effect because those locations are dead stack locations previously assigned a null value. The zapping of stack location 401, however, redirects the unidentified OOP from object C 414 to invalid location 601. Attempted access of object C via stack location 401 results in a trap that may be used to detect the error in the OOP map, and to correct the source of the error in the compiler.
  • The overwriting of [0081] stack location 401 occurs at the gc-point regardless of whether garbage collection actually takes place. Thus, even in execution situations where, due to the absence of a garbage collection cycle, an error in the OOP map does not result in a pointer error, the zapping process overwrites misidentified stack locations to force a pointer error to occur and be trapped.
  • If the entry for [0082] location 401 in the OOP map were to erroneously identify the location as containing a live non-OOP rather than a “dead” value, the entry for location 401 in OOP map 617 would read, for example, as follows:
    401 (Not Live OOP) Live Non-OOP
  • In this case, the zapping process of zapping/comparing [0083] component 600 does not overwrite location 401 because the location is not “dead.” However, the comparison process of zapping/comparing component 600 examines locations identified as “Live Non-OOP” (e.g., locations 401, 403 and 406), and compares the stored values with possible OOP reference values. Location 401 is flagged for storing a value matching the OOP reference value for object C 414 in “from” space 410, and a warning is issued to warning destination 602 to note the occurrence of this match. As with the zapping process, the comparison process is carried out at each gc-point. Thus, a warning may be issued by zapping/comparing component 600 regardless of whether a garbage collection cycle is carried out.
  • Embodiment of Computer Execution Environment (Hardware) [0084]
  • An embodiment of the invention can be implemented as computer software in the form of computer readable code executed on a general purpose computer such as [0085] computer 700 illustrated in FIG. 7, or in the form of bytecode class files executable within a Java runtime environment running on such a computer. A keyboard 710 and mouse 711 are coupled to a bi-directional system bus 718. The keyboard and mouse are for introducing user input to the computer system and communicating that user input to processor 713. Other suitable input devices may be used in addition to, or in place of, the mouse 711 and keyboard 710. I/O (input/output) unit 719 coupled to bidirectional system bus 718 represents such I/O elements as a printer, A/V (audio/video) I/O, etc.
  • [0086] Computer 700 includes a video memory 714, main memory 715 and mass storage 712, all coupled to bidirectional system bus 718 along with keyboard 710, mouse 711 and processor 713. The mass storage 712 may include both fixed and removable media, such as magnetic, optical or magnetic optical storage systems or any other available mass storage technology. Bus 718 may contain, for example, address lines for addressing video memory 714 or main memory 715. The system bus 718 also includes, for example, a data bus for transferring data between and among the components, such as processor 713, main memory 715, video memory 714 and mass storage 712. Alternatively, multiplex data/address lines may be used instead of separate data and address lines.
  • In one embodiment of the invention, the [0087] processor 713 is a microprocessor manufactured by Motorola, such as the 680×0 processor or a microprocessor manufactured by Intel, such as the 80×86, or Pentium processor, or a SPARC microprocessor from Sun Microsystems, Inc. However, any other suitable microprocessor or microcomputer may be utilized. Main memory 715 is comprised of dynamic random access memory (DRAM). Video memory 714 is a dual-ported video random access memory. One port of the video memory 714 is coupled to video amplifier 716. The video amplifier 716 is used to drive the cathode ray tube (CRT) raster monitor 717. Video amplifier 716 is well known in the art and may be implemented by any suitable apparatus. This circuitry converts pixel data stored in video memory 714 to a raster signal suitable for use by monitor 717. Monitor 717 is a type of monitor suitable for displaying graphic images. Alternatively, the video memory could be used to drive a flat panel or liquid crystal display (LCD), or any other suitable data presentation device.
  • [0088] Computer 700 may also include a communication interface 720 coupled to bus 718. Communication interface 720 provides a two-way data communication coupling via a network link 721 to a local network 722. For example, if communication interface 720 is an integrated services digital network (ISDN) card or a modem, communication interface 720 provides a data communication connection to the corresponding type of telephone line, which comprises part of network link 721. If communication interface 720 is a local area network (LAN) card, communication interface 720 provides a data communication connection via network link 721 to a compatible LAN. Communication interface 720 could also be a cable modem or wireless interface. In any such implementation, communication interface 720 sends and receives electrical, electromagnetic or optical signals which carry digital data streams representing various types of information.
  • Network link [0089] 721 typically provides data communication through one or more networks to other data devices. For example, network link 721 may provide a connection through local network 722 to local server computer 723 or to data equipment operated by an Internet Service Provider (ISP) 724. ISP 724 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet” 725. Local network 722 and Internet 725 both use electrical, electromagnetic or optical signals which carry digital data streams. The signals through the various networks and the signals on network link 721 and through communication interface 720, which carry the digital data to and from computer 700, are exemplary forms of carrier waves transporting the information.
  • [0090] Computer 700 can send messages and receive data, including program code, through the network(s), network link 721, and communication interface 720. In the Internet example, remote server computer 726 might transmit a requested code for an application program through Internet 725, ISP 724, local network 722 and communication interface 720.
  • The received code may be executed by [0091] processor 713 as it is received, and/or stored in mass storage 712, or other non-volatile storage for later execution. In this manner, computer 700 may obtain application code in the form of a carrier wave. In accordance with an embodiment of the invention, an example of such a downloaded application is the apparatus for debugging a virtual machine described herein.
  • Application code may be embodied in any form of computer program product. A computer program product comprises a medium configured to store or transport computer readable code or data, or in which computer readable code or data may be embedded. Some examples of computer program products are CD-ROM disks, ROM cards, floppy disks, magnetic tapes, computer hard drives, servers on a network, and carrier waves. [0092]
  • The computer systems described above are for purposes of example only. An embodiment of the invention may be implemented in any type of computer system or programming or processing environment, including embedded devices (e.g., web phones, etc.) and “thin” client processing environments (e.g., network computers (NC's), etc.) that support a virtual machine. [0093]
  • Thus, a method and apparatus for finding bugs related to garbage collection in a virtual machine have been described in conjunction with one or more specific embodiments. The invention is defined by the claims and their full scope of equivalents. [0094]

Claims (24)

1. In a computer system, a method comprising:
determining, at each gc-point, a memory location that does not contain a live pointer or live data, said determination based on a map; and
overwriting said memory location with an invalid pointer value.
2. The method of
claim 1
, further comprising implementing a trap for a reference made via said invalid pointer value.
3. The method of
claim 1
, further comprising generating said map in a compiler.
4. The method of
claim 1
, wherein said method is implemented within a virtual machine environment.
5. The method of
claim 1
wherein said memory location comprises a register or local variable of a stack.
6. The method of
claim 1
, further comprising:
determining from said map a second memory location designated as containing a live data value; and
issuing a warning if the contents of said second memory location match a possible object reference.
7. The method of
claim 6
, wherein said step of issuing a warning comprises logging said warning in a log file.
8. The method of
claim 6
, wherein said step of issuing a warning comprises displaying a warning dialog on a display device.
9. A computer program product comprising:
a computer usable medium having computer readable code embodied therein for debugging a garbage collection process, said computer program product comprising:
computer readable code configured to cause a computer to determine, at each gc-point, a memory location that does not contain a live pointer or live data, said determination based on a map; and
computer readable code configured to cause a computer to overwrite said memory location with an invalid pointer value.
10. The computer program product of
claim 9
, further comprising computer readable code configured to cause a computer to implement a trap for a reference made via said invalid pointer value.
11. The computer program product of
claim 9
, further comprising computer readable code configured to cause a computer to generate said map in a compiler.
12. The computer program product of
claim 9
, wherein said computer readable code is configured to be executed within a virtual machine environment.
13. The computer program product of
claim 9
wherein said memory location comprises a register or local variable of a stack.
14. The computer program product of
claim 9
, further comprising:
computer readable code configured to cause a computer to determine from said map a second memory location designated as containing a live data value; and
computer readable code configured to cause a computer to issue a warning if the contents of said second memory location match a possible object reference.
15. The computer program product of
claim 14
, wherein said computer readable code configured to cause a computer to issue a warning comprises computer readable code configured to cause a computer to log said warning in a log file.
16. The computer program product of
claim 14
, wherein said computer readable code configured to cause a computer to issue a warning comprises computer readable code configured to cause a computer to display a warning dialog on a display device.
17. A virtual machine comprising:
a stack having a plurality of memory locations;
a map associating a respective state with one or more of said plurality of memory locations; and
a process operatively coupled to said map and said stack, said process configured to overwrite at each gc-point one or more memory locations associated with a dead state, said overwriting comprising replacing the contents of a respective memory location with an invalid pointer value.
18. The virtual machine of
claim 17
, wherein said virtual machine is configured to implement a trap for operations that access said invalid pointer value.
19. The virtual machine of
claim 17
, further comprising a compiler configured to generate said map.
20. The virtual machine of
claim 17
, wherein said one or more memory locations comprise registers and local variables.
21. The virtual machine of
claim 17
, wherein said process is further configured to issue a warning regarding one or more memory locations that are associated with a live data state and whose contents match an object reference.
22. The virtual machine of
claim 21
, wherein said warning is issued as a log entry to a log file.
23. The virtual machine of
claim 21
, wherein said warning is issued as a warning dialog to a display.
24. A virtual machine comprising:
means for determining, at each gc-point, a memory location that does not contain a live pointer or live data, said determination based on a map; and
means for overwriting said memory location with an invalid pointer value.
US09/153,382 1998-09-15 1998-09-15 Method and apparatus for finding bugs related to garbage collection in a virtual machine Expired - Lifetime US6327701B2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US09/153,382 US6327701B2 (en) 1998-09-15 1998-09-15 Method and apparatus for finding bugs related to garbage collection in a virtual machine
PCT/US1999/018685 WO2000016196A1 (en) 1998-09-15 1999-08-16 Method and apparatus for finding bugs related to garbage collection in a virtual machine
AU55678/99A AU5567899A (en) 1998-09-15 1999-08-16 Method and apparatus for finding bugs related to garbage collection in a virtualmachine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US09/153,382 US6327701B2 (en) 1998-09-15 1998-09-15 Method and apparatus for finding bugs related to garbage collection in a virtual machine

Publications (2)

Publication Number Publication Date
US20010013117A1 true US20010013117A1 (en) 2001-08-09
US6327701B2 US6327701B2 (en) 2001-12-04

Family

ID=22546986

Family Applications (1)

Application Number Title Priority Date Filing Date
US09/153,382 Expired - Lifetime US6327701B2 (en) 1998-09-15 1998-09-15 Method and apparatus for finding bugs related to garbage collection in a virtual machine

Country Status (3)

Country Link
US (1) US6327701B2 (en)
AU (1) AU5567899A (en)
WO (1) WO2000016196A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020032719A1 (en) * 1998-11-16 2002-03-14 Insignia Solutions, Plc Method and system of dynamic memory management
US20030023768A1 (en) * 2001-07-24 2003-01-30 Foti David A. Designating an object for destruction
US20050004884A1 (en) * 2003-07-02 2005-01-06 International Business Machines Corporation System and method for reducing memory leaks in virtual machine programs
US20060156152A1 (en) * 2004-12-10 2006-07-13 Microsoft Corporation Critical finalizers
US7966480B2 (en) * 2001-06-01 2011-06-21 Microchip Technology Incorporated Register pointer trap to prevent errors due to an invalid pointer value in a register
US20110246543A1 (en) * 2010-04-01 2011-10-06 International Business Machines Corporation Write Barrier Elision for Reference Arrays
US20120246203A1 (en) * 2011-03-25 2012-09-27 Microsoft Corporation Reconstruction of garbage collection information
US9852046B1 (en) * 2015-05-07 2017-12-26 Cadence Design Systems, Inc. Method and system for automated debugging memory allocation and memory release

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6324685B1 (en) 1998-03-18 2001-11-27 Becomm Corporation Applet server that provides applets in various forms
US6442751B1 (en) * 1998-12-14 2002-08-27 International Business Machines Corporation Determination of local variable type and precision in the presence of subroutines
US6594820B1 (en) * 1999-09-28 2003-07-15 Sun Microsystems, Inc. Method and apparatus for testing a process in a computer system
US6427154B1 (en) * 1999-12-16 2002-07-30 International Business Machines Corporation Method of delaying space allocation for parallel copying garbage collection
US7085810B1 (en) * 2000-05-31 2006-08-01 International Business Machines Corporation Dynamic command sets in a computer messaging system in a method, system and program
US20040015874A1 (en) * 2001-05-14 2004-01-22 David Ungar Method and apparatus for finding resource allocation errors in virtual machine compilers
US6745213B2 (en) * 2001-11-21 2004-06-01 Sun Microsystems, Inc. Method and apparatus to facilitate testing of garbage collection implementations
EP1502196A4 (en) * 2002-05-02 2008-04-02 Sarvega Inc System and method for transformation of xml documents using stylesheets
US6862674B2 (en) * 2002-06-06 2005-03-01 Sun Microsystems Methods and apparatus for performing a memory management technique
US7062755B2 (en) * 2002-10-16 2006-06-13 Hewlett-Packard Development Company, L.P. Recovering from compilation errors in a dynamic compilation environment
US20040181782A1 (en) * 2003-03-13 2004-09-16 Piotr Findeisen System and method for optimizing memory usage by locating lingering objects
US20040181562A1 (en) * 2003-03-13 2004-09-16 Piotr Findeisen System and method for determining deallocatable memory in a heap
US7594234B1 (en) 2004-06-04 2009-09-22 Sun Microsystems, Inc. Adaptive spin-then-block mutual exclusion in multi-threaded processing
US7644409B2 (en) * 2004-06-04 2010-01-05 Sun Microsystems, Inc. Techniques for accessing a shared resource using an improved synchronization mechanism
WO2006008223A1 (en) * 2004-07-16 2006-01-26 Swisscom Mobile Ag Method, by which a terminal can retrieve information associated to an epc-code from an epc network
US7475397B1 (en) 2004-07-28 2009-01-06 Sun Microsystems, Inc. Methods and apparatus for providing a remote serialization guarantee
US20060048106A1 (en) * 2004-08-27 2006-03-02 International Business Machines Corporation Link-time profile-based method for reducing run-time image of executables
US8412751B2 (en) 2010-12-15 2013-04-02 International Business Machines Corporation Determining whether a Java object has been scan-missed by a garbage collector scan
US8997066B2 (en) * 2010-12-27 2015-03-31 Microsoft Technology Licensing, Llc Emulating pointers
US8738877B2 (en) 2011-12-14 2014-05-27 Advance Micro Devices, Inc. Processor with garbage-collection based classification of memory
US9875173B2 (en) * 2014-06-30 2018-01-23 Microsoft Technology Licensing, Llc Time travel debugging in managed runtime
US10042737B2 (en) 2016-08-31 2018-08-07 Microsoft Technology Licensing, Llc Program tracing for time travel debugging and analysis
US10031834B2 (en) 2016-08-31 2018-07-24 Microsoft Technology Licensing, Llc Cache-based tracing for time travel debugging and analysis
US10310977B2 (en) 2016-10-20 2019-06-04 Microsoft Technology Licensing, Llc Facilitating recording a trace file of code execution using a processor cache
US10324851B2 (en) 2016-10-20 2019-06-18 Microsoft Technology Licensing, Llc Facilitating recording a trace file of code execution using way-locking in a set-associative processor cache

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4775932A (en) * 1984-07-31 1988-10-04 Texas Instruments Incorporated Computer memory system with parallel garbage collection independent from an associated user processor
US4797810A (en) * 1986-06-26 1989-01-10 Texas Instruments Incorporated Incremental, multi-area, generational, copying garbage collector for use in a virtual address space
US4989134A (en) * 1987-03-20 1991-01-29 Hewlett-Packard Company Method and apparatus for enhancing data storage efficiency
US5088036A (en) 1989-01-17 1992-02-11 Digital Equipment Corporation Real time, concurrent garbage collection system and method
US5355483A (en) * 1991-07-18 1994-10-11 Next Computers Asynchronous garbage collection
US5832289A (en) * 1991-09-20 1998-11-03 Shaw; Venson M. System for estimating worst time duration required to execute procedure calls and looking ahead/preparing for the next stack operation of the forthcoming procedure calls
ATE233001T1 (en) 1992-06-15 2003-03-15 Microsoft Corp COMPUTER METHOD AND SYSTEM FOR MEMORY MANAGEMENT
US5560003A (en) * 1992-12-21 1996-09-24 Iowa State University Research Foundation, Inc. System and hardware module for incremental real time garbage collection and memory management
US5687368A (en) * 1994-07-22 1997-11-11 Iowa State University Research Foundation, Inc. CPU-controlled garbage-collecting memory module
US5590332A (en) * 1995-01-13 1996-12-31 Baker; Henry G. Garbage collection, tail recursion and first-class continuations in stack-oriented languages
JP2924705B2 (en) 1995-04-10 1999-07-26 富士ゼロックス株式会社 Memory management method and object management method
US5819304A (en) * 1996-01-29 1998-10-06 Iowa State University Research Foundation, Inc. Random access memory assembly
US5845298A (en) * 1997-04-23 1998-12-01 Sun Microsystems, Inc. Write barrier system and method for trapping garbage collection page boundary crossing pointer stores
US5920876A (en) * 1997-04-23 1999-07-06 Sun Microsystems, Inc. Performing exact garbage collection using bitmaps that identify pointer values within objects
US5900001A (en) * 1997-04-23 1999-05-04 Sun Microsystems, Inc. Method and apparatus for optimizing exact garbage collection using a bifurcated data structure
US5915255A (en) * 1997-04-23 1999-06-22 Sun Microsystems, Inc. Method and apparatus for referencing nodes using links
US5903900A (en) * 1997-04-23 1999-05-11 Sun Microsystems, Inc. Method and apparatus for optimizing exact garbage collection of array nodes in a carded heap
US5909579A (en) * 1997-04-23 1999-06-01 Sun Microsystems, Inc. Method and apparatus for encoding and decoding delta encoded information to locate live pointers in program data stacks
US5873105A (en) * 1997-06-26 1999-02-16 Sun Microsystems, Inc. Bounded-pause time garbage collection system and method including write barrier associated with a source instance of a partially relocated object
US6070073A (en) * 1997-12-18 2000-05-30 Nortel Networks Corporation Communication system and method for notification and call routing in a mobile satellite network
US6081665A (en) * 1997-12-19 2000-06-27 Newmonics Inc. Method for efficient soft real-time execution of portable byte code computer programs
US6185569B1 (en) * 1998-06-29 2001-02-06 Microsoft Corporation Linked data structure integrity verification system which verifies actual node information with expected node information stored in a table

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080016507A1 (en) * 1998-11-16 2008-01-17 Esmertec Ag Computer system
US8631219B2 (en) 1998-11-16 2014-01-14 Myriad Group Ag Method and system for dynamic memory management
US8127280B2 (en) 1998-11-16 2012-02-28 Myriad Group Ag Method and system for dynamic memory management
US20020032719A1 (en) * 1998-11-16 2002-03-14 Insignia Solutions, Plc Method and system of dynamic memory management
US7966480B2 (en) * 2001-06-01 2011-06-21 Microchip Technology Incorporated Register pointer trap to prevent errors due to an invalid pointer value in a register
US7237237B2 (en) * 2001-07-24 2007-06-26 The Mathworks, Inc. Designating an object for destruction
US20030023768A1 (en) * 2001-07-24 2003-01-30 Foti David A. Designating an object for destruction
US7340493B2 (en) * 2003-07-02 2008-03-04 International Business Machines Corporation System and method for reducing memory leaks in virtual machine programs
US20050004884A1 (en) * 2003-07-02 2005-01-06 International Business Machines Corporation System and method for reducing memory leaks in virtual machine programs
US7610579B2 (en) * 2004-12-10 2009-10-27 Microsoft Corporation Critical finalizers
US20060156152A1 (en) * 2004-12-10 2006-07-13 Microsoft Corporation Critical finalizers
US20110246543A1 (en) * 2010-04-01 2011-10-06 International Business Machines Corporation Write Barrier Elision for Reference Arrays
US8943109B2 (en) * 2010-04-01 2015-01-27 International Business Machines Corporation Write barrier elision for reference arrays
US20120246203A1 (en) * 2011-03-25 2012-09-27 Microsoft Corporation Reconstruction of garbage collection information
US8489651B2 (en) * 2011-03-25 2013-07-16 Microsoft Corporation Reconstruction of garbage collection information
US9852046B1 (en) * 2015-05-07 2017-12-26 Cadence Design Systems, Inc. Method and system for automated debugging memory allocation and memory release

Also Published As

Publication number Publication date
AU5567899A (en) 2000-04-03
WO2000016196A1 (en) 2000-03-23
US6327701B2 (en) 2001-12-04

Similar Documents

Publication Publication Date Title
US6327701B2 (en) Method and apparatus for finding bugs related to garbage collection in a virtual machine
US6282702B1 (en) Method and apparatus of translating and executing native code in a virtual machine environment
US6594820B1 (en) Method and apparatus for testing a process in a computer system
US6530080B2 (en) Method and apparatus for pre-processing and packaging class files
US6557023B1 (en) Method and apparatus for avoiding array class creation in virtual machines
US6651248B1 (en) Method and apparatus for efficient interface method dispatch
US6826750B1 (en) Method of automatically selecting program and data updates based upon versions
US6973646B1 (en) Method for compiling program components in a mixed static and dynamic environment
US6637025B1 (en) Dynamic selection/definition of which class/methods should or should not be jit'ed using information stored in a jar file
US6026237A (en) System and method for dynamic modification of class files
US6757890B1 (en) Methods and apparatus for enabling local Java object allocation and collection
US7114152B2 (en) Method, apparatus, and program to determine the mutability of an object at loading time
EP0945791A2 (en) Techniques for reducing the cost of dynamic class initialization checks in compiled code
US20080127143A1 (en) System and method for efficiently generating native code calls from byte code in virtual machines
US20060248523A1 (en) Annotation management
US7774717B2 (en) Method and apparatus for detecting device support in a graphical user interface
US6584478B1 (en) Transparent garbage collection of resources
US6779188B1 (en) Apparatus and method for improved devirtualization of method calls
US20040015874A1 (en) Method and apparatus for finding resource allocation errors in virtual machine compilers
WO2001009715A2 (en) A single-compiler architecture
US11568047B2 (en) Distinguished nest-based access control
US11243876B2 (en) Techniques for accessing off-heap memory

Legal Events

Date Code Title Description
AS Assignment

Owner name: SUN MICROSYSTEMS, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:UNGAR, DAVID;REEL/FRAME:009466/0345

Effective date: 19980914

STCF Information on status: patent grant

Free format text: PATENTED CASE

FPAY Fee payment

Year of fee payment: 4

FPAY Fee payment

Year of fee payment: 8

FPAY Fee payment

Year of fee payment: 12

AS Assignment

Owner name: ORACLE AMERICA, INC., CALIFORNIA

Free format text: MERGER AND CHANGE OF NAME;ASSIGNORS:ORACLE USA, INC.;SUN MICROSYSTEMS, INC.;ORACLE AMERICA, INC.;REEL/FRAME:037278/0592

Effective date: 20100212