US6202202B1 - Pointer analysis by type inference for programs with structured memory objects and potentially inconsistent memory object accesses - Google Patents


Info

Publication number
US6202202B1
Authority
US
United States
Prior art keywords
location
locations
type
representing
store usage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US08/719,144
Inventor
Bjarne Steensgaard
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corp
Priority to US08/719,144
Assigned to MICROSOFT CORPORATION. Assignment of assignors interest (see document for details). Assignors: STEENSGAARD, BJARNE
Application granted
Publication of US6202202B1
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC. Assignment of assignors interest (see document for details). Assignors: MICROSOFT CORPORATION
Anticipated expiration
Expired - Lifetime

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/40: Transformation of program code
    • G06F8/41: Compilation
    • G06F8/43: Checking; Contextual analysis
    • G06F8/436: Semantic checking
    • G06F8/437: Type checking

Definitions

  • This application contains a Microfiche Appendix consisting of 18 frames, 1 sheet.
  • the present invention relates generally to the field of computer program analysis. More particularly, the present invention relates to the field of pointer analysis for computer programs.
  • one typical compiler performs optimization techniques based on the expected run-time usage of memory locations for the compiled program as defined by a store model or storage shape graph.
  • the compiler may generate the store model by performing a pointer analysis to determine the effects of program statements referencing memory locations with constants, variables, or functions, for example.
  • One typical pointer analysis by type inference treats structured memory objects, such as structures or records, as single memory locations and may therefore generate an overly conservative store model.
  • Another typical pointer analysis by type inference describes structured memory objects and elements of structured memory objects for a program by types based only on the type declarations of the program. Such a pointer analysis, however, produces inaccurate store models for programs using arbitrary type casts, unions, and pointer arithmetic.
  • a method performs a pointer analysis for a program with a data processing system.
  • the method may be implemented in software stored by a memory for execution by a data processing system.
  • the method may perform the pointer analysis for a program browser or while compiling the program for execution by a data processing system.
  • a store usage in the program accessing locations is identified and may be identified based on a form of a program statement describing the store usage.
  • Locations for the identified store usage are represented with types describing access patterns for the locations for the identified store usage based on how the locations for the identified store usage are accessed in the program such that the types representing the locations for the identified store usage comply with a typing constraint.
  • Each type may be represented by a type variable and an associated type constructor.
  • a content of one of the locations for the identified store usage may be described with a location type and a function type.
  • One of the locations for the identified store usage may be represented with a type describing that the one location is accessed as a structured memory object if the one location is accessed as a structured memory object for the identified store usage.
  • the one location may also be represented with a type comprising location types describing locations of elements of the structured memory object.
  • one of the locations for the identified store usage may be represented with a type comprising a location type describing a location representing a structured memory object if the one location represents an element of the structured memory object.
  • One of the locations for the identified store usage may be represented with a type describing that the one location is accessed inconsistently if an access pattern of the one location as defined by the identified store usage is different from the access pattern described by the type representing the one location. Also, one of the locations for the identified store usage may be represented with a type describing that the one location is accessed inconsistently if the one location is accessed through an offset pointer.
  • a location pointer may be described with a location type representing a pointed-to location and with an offset describing how the location pointer points to the pointed-to location relative to a beginning of the pointed-to location.
  • One of the locations for the identified store usage may be represented with a type describing a size of the one location based on a size of an assigned value for the identified store usage.
  • the method may determine whether the types representing the locations for the identified store usage comply with the typing constraint and may determine whether the types representing the locations for the identified store usage comply with a type rule specifying the typing constraint for the identified program statement form. If the types representing the locations for the identified store usage do not comply with the typing constraint, the method modifies types representing locations for the identified store usage to comply with the typing constraint.
  • the method may identify any potential constraints for types representing locations for the identified store usage and may identify a potential constraint in a pending set.
  • the method may identify from the identified store usage an access pattern relationship for types representing locations for the identified store usage.
  • the method may identify from the identified store usage a pointer offset relationship for types representing locations for the identified store usage.
  • the method may identify from the identified store usage any potential points-to relationships for a type representing a non-pointer value. Types representing locations for any identified potential constraints affected by the modification of types representing locations for the identified store usage may also be modified.
  • the method may analyze each store usage for the program only one time in an order independent of program control flow.
  • a data processing system performing the pointer analysis comprises a translator for translating a program in a first language into code in a second language, a pointer analyzer for performing the pointer analysis for the program, a store model for storing the types representing locations for the program, and an optimizer for optimizing the code based on the store model.
  • FIG. 1 illustrates for one embodiment a compiler that uses a pointer analysis by type inference for a program with structured memory objects and potentially inconsistent memory object accesses;
  • FIG. 2 illustrates for one embodiment a flow diagram for performing a pointer analysis by type inference for a program with structured memory objects and potentially inconsistent memory object accesses;
  • FIG. 3 illustrates for one embodiment graphical representations of types for a sample program fragment;
  • FIGS. 4A and 4B illustrate for one embodiment graphical representations of types for another sample program fragment;
  • FIGS. 5A and 5B illustrate for one embodiment graphical representations of types for another sample program fragment;
  • FIGS. 6A and 6B illustrate for one embodiment graphical representations of types for another sample program fragment.
  • FIG. 7 illustrates for one embodiment a data processing system for performing a pointer analysis by type inference for a program with structured memory objects and potentially inconsistent memory object accesses.
  • a pointer analysis by type inference for a computer program with structured memory objects and potentially inconsistent memory object accesses helps approximate run-time store usage for the program.
  • This pointer analysis may supplement the pointer analysis described in U.S. patent application Ser. No. 08/664,441, filed Jun. 18, 1996, entitled POINTER ANALYSIS BY TYPE INFERENCE COMBINED WITH A NON-POINTER ANALYSIS, by Bjarne Steensgaard.
  • U.S. patent application Ser. No. 08/664,441 is herein incorporated by reference.
  • the pointer analysis may be used for any suitable programming tool or analyzer, such as a suitable program understanding and browsing tool or a suitable program compiler or interpreter for example.
  • FIG. 1 illustrates for one embodiment a program compiler 100 that uses this pointer analysis.
  • Compiler 100 is implemented in software for execution by a suitable data processing system and comprises a front end 102 , a translator 104 , and a back end 106 .
  • Compiler 100 compiles source code 112 in a suitable source language into target code 116 in a suitable target language.
  • Target code 116 may be executed directly by a suitable data processing system or linked by a suitable linker with other target code for execution by the data processing system.
  • In analyzing source code 112, front end 102 generates a suitable symbol table 122 in the form of a data structure for recording identifiers, such as variables and function names for example, used in source code 112. Suitable attribute information regarding each recorded identifier is also recorded and referenced for front end 102, translator 104, and back end 106.
  • translator 104 performs the pointer analysis.
  • Translator 104 evaluates statements referencing memory locations with variables and/or functions, for example, of source code 112 using symbol table 122 to determine store usages for the memory locations.
  • Translator 104 generates a store model 124 in the form of a data structure to represent an approximation of the run-time store usage for source code 112 .
  • front end 102 may perform the pointer analysis to generate store model 124 while parsing source code 112 . As illustrated in FIG.
  • translator 104 and/or back end 106 may use store model 124 to help optimize intermediate code 114 and target code 116 , respectively, with suitable techniques including, for example, code motion, common subexpression elimination, dead code elimination, peephole optimization, and register allocation techniques.
  • the pointer analysis may be performed for code in any suitable source language that supports, for example, pointers to locations, pointers to functions, dynamic data allocation, data address computation for variables, and/or pointer arithmetic.
  • the source language for one embodiment also supports structured memory objects, such as records or structures for example, and/or inconsistent access of memory objects, such as for unions or through offset location pointers for example.
  • the $x =_s \mathrm{fun}(f_1 \ldots f_n) \rightarrow (r_1 \ldots r_m)\ S^*$ statement form describes the definition of a function or procedure and the assignment of a pointer to the defined function to the variable x.
  • the function is defined with formal parameter variables $f_1 \ldots f_n$, return parameter variables $r_1 \ldots r_m$, and a body represented by the sequence of statements $S^*$.
  • the sequence of statements S* is executed when the function is called.
  • the pointer analysis is performed independent of program control flow, and therefore the source language may support any suitable control flow structures.
  • the pointer analysis for one embodiment is performed independent of type casts and type declarations for the source language.
  • the C programming language is one suitable source language for the pointer analysis.
  • the pointer analysis uses types to define a store model or storage shape graph representing an approximation of the run-time store usage for a program.
  • a type represents a set of one or more locations and describes the content of those locations.
  • the types define points-to relationships among the sets of locations as determined by the pointer analysis in accordance with typing constraints, and the set of types for a program define the store model for the program.
  • the types used for the pointer analysis are to be distinguished from the types, such as character, integer, long, float, double, etc., associated with type declarations in a program.
  • the pointer analysis represents each location representing a variable of the program and each dynamically allocated location of the program with a location type.
  • a location type represents a set of one or more locations and comprises a type representing a set of locations that may be pointed-to by the content of the location(s).
  • the pointer analysis describes each location pointer by a type comprising the type representing the pointed-to location(s).
  • the pointer analysis represents each location of the program with a location type describing how the represented location(s) are accessed, for example as a unit or as a structured object, and whether the represented location(s) represent an element of a structured object or objects.
  • a location type representing a set of one or more structured objects comprises types describing the location of each element of the represented structured object(s).
  • the pointer analysis represents each location of the program with a location type describing the size of the represented location(s) and whether the represented location(s) are accessed in an inconsistent manner.
  • the pointer analysis describes each location pointer by the type representing the location(s) pointed-to or pointed-into and by an offset describing how the location pointer points to the location(s) relative to the beginning of the location(s).
  • the pointer analysis represents each function of the program with a function type.
  • a function type represents a set of one or more functions and comprises types representing the locations of the formal and return parameter variables for the represented function(s).
  • the pointer analysis represents each location for a function pointer variable with a type comprising a function type representing the function(s) pointed-to by the content of the represented location.
  • the pointer analysis describes each function pointer by the type representing the pointed-to function(s).
  • the pointer analysis represents the location representing a variable with a type describing how the variable is accessed, a mapping of element types to describe the location of each element of the variable, the size of the variable, and/or a set of parent types to describe the structured object(s) of which the represented location represents an element.
  • the pointer analysis may also describe the value of the variable with a location pointer type to describe a location pointer and a function type to describe a function pointer.
  • the location type $\tau$ represents a set of one or more locations and describes how the represented location(s) are accessed.
  • the simple location type describes locations whose content values are used only as a unit.
  • This location type describes the represented location(s) with a size component s and a parent type component p and describes the content of the represented location(s) with a location pointer type component $\alpha$ and a function type component $\lambda$.
  • the struct location type describes locations whose content values are used only as structured objects and describes the represented location(s) with an element mapping type component m, a size component s, and a parent type component p.
  • the object location type describes locations whose content values are used in inconsistent manners.
  • This location type describes the represented location(s) with a size component s and a parent type component p and describes the content of the represented location(s) with a location pointer type component $\alpha$ and a function type component $\lambda$.
  • Location pointer type components $\alpha$ represent location pointers and comprise a location type $\tau$ and an offset type o.
  • the location type $\tau$ represents the location(s) pointed-to or pointed-into by the described location pointer and may be $\bot$ indicating the value described by the location type $\tau$ does not comprise a location pointer.
  • the offset type o indicates how the described location pointer points to the location(s) represented by the location type $\tau$ paired with the offset type o relative to the beginning of the represented location(s).
  • the offset type o may be zero indicating the described location pointer is a direct pointer that points to the beginning of the location(s) represented by the location type $\tau$ paired with the offset type o.
  • the offset type o may also be unknown indicating the described location pointer does not necessarily point to the beginning of the location(s) represented by the location type $\tau$ paired with the offset type o but rather is an offset pointer that may point into or around the represented location(s).
  • the location type $\tau$ paired with an unknown offset type o should be an object location type to describe the inconsistent access of the represented location(s) through the described offset pointer if indeed the described offset pointer is dereferenced.
  • the offset type o may also be negative or positive indicating an offset direction of the described location pointer relative to the beginning of the location(s) represented by the location type $\tau$ paired with the offset type o.
  • Element mapping type components m describe mappings of element specifiers to location types representing locations of elements of structured objects.
  • the element specifiers may be numeric or symbolic.
  • Size components s describe sizes of objects and may be numeric or symbolic.
  • the $\top$ designation for a size component s is used for types describing memory objects of different sizes and indicates the location type represents the entire memory object(s) or the rest of the memory object(s).
  • the parent type components p describe a set of struct types of which the location type is a component.
  • the $\top$ designation for a parent type component p indicates the location type is not a component of any struct types.
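The location types, pointer types, sizes, and parent components described above can be pictured as plain records. The following is only a minimal sketch under stated assumptions: the class names, the field names, the string encoding of the offsets, and the `TOP` constant are all hypothetical, and the function type component is reduced to an opaque placeholder.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union

TOP = "T"  # hypothetical encoding of the "top" designation for sizes and parents
Size = Union[int, str]  # numeric or symbolic size, or TOP


@dataclass
class PointerType:
    """Location pointer type component: a pointed-to location type and an offset."""
    target: Optional["LocationType"] = None  # None: the value holds no location pointer
    offset: str = "zero"                      # "zero" (direct) or "unknown" (offset pointer)


@dataclass
class Simple:
    """Locations whose content values are used only as a unit."""
    size: Size
    parents: Union[str, List["Struct"]] = TOP
    pointer: PointerType = field(default_factory=PointerType)
    function: Optional[str] = None  # placeholder for a function type component


@dataclass
class Struct:
    """Locations whose content values are used only as structured objects."""
    elements: Dict[str, "LocationType"]  # element specifier -> element location type
    size: Size
    parents: Union[str, List["Struct"]] = TOP


@dataclass
class Object:
    """Locations whose content values are used in inconsistent manners."""
    size: Size
    parents: Union[str, List["Struct"]] = TOP
    pointer: PointerType = field(default_factory=PointerType)
    function: Optional[str] = None


LocationType = Union[Simple, Struct, Object]


def offset_pairing_ok(p: PointerType) -> bool:
    """Check that an unknown offset is paired with an object location type.

    This is stricter than the text, which only requires the pairing when the
    offset pointer is actually dereferenced.
    """
    return p.offset != "unknown" or p.target is None or isinstance(p.target, Object)
```

For example, `PointerType(target=Object(size=TOP), offset="unknown")` passes the check, while pairing an unknown offset with a `Simple` target does not.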
  • the pointer analysis may also describe locations containing constants with types.
  • the pointer analysis for one embodiment may represent locations containing constants with types similarly as locations representing non-pointer variables.
  • the pointer analysis for one embodiment represents each type with a type variable in the form of a data structure and an associated type constructor in the form of a data structure.
  • Each type variable represents a set of one or more locations, an offset, or a set of one or more functions.
  • each type variable is implemented as an equivalence class representative (ECR) data structure.
  • ECR equivalence class representative
  • the data structure may be Tarjan's fast-union/find data structure, for example.
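As an illustration of the equivalence class representative idea, the following is a minimal union-find sketch with path compression and union by rank. The class and function names are hypothetical, and the type constructor that the text associates with each type variable is omitted.

```python
class ECR:
    """Equivalence class representative node for one type variable."""

    def __init__(self, name):
        self.name = name
        self.parent = self  # a root points to itself
        self.rank = 0


def find(node):
    """Return the representative of node's class, compressing the path."""
    root = node
    while root.parent is not root:
        root = root.parent
    while node.parent is not root:
        # Re-point each node on the path directly at the root.
        node.parent, node = root, node.parent
    return root


def union(a, b):
    """Merge the classes of a and b and return the new representative."""
    ra, rb = find(a), find(b)
    if ra is rb:
        return ra
    if ra.rank < rb.rank:
        ra, rb = rb, ra
    rb.parent = ra  # attach the shallower tree under the deeper one
    if ra.rank == rb.rank:
        ra.rank += 1
    return ra
```

Two type variables then denote the same type exactly when `find` returns the same node for both.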
  • the type constructor associated with a type variable representing a set of locations comprises other type variables describing the content of the represented location(s).
  • a simple type constructor and an object type constructor comprise a location type component $\tau$, an offset type component o, and a function type component $\lambda$.
  • the location type component $\tau$, offset type component o, and function type component $\lambda$ are each represented with a type variable and an associated type constructor.
  • the struct type constructor comprises location type components $\tau$ for structure elements. Each location type component $\tau$ is represented with a type variable and an associated type constructor.
  • the type constructor associated with a type variable representing an offset indicates how the location pointer described by the offset points to one or more locations relative to the beginning of the location(s). Using the above types, this type constructor may be either zero or unknown.
  • the type constructor associated with a type variable representing a set of functions comprises other type variables representing the locations of the formal and return parameter variables for the represented function(s).
  • a lam type constructor comprises a location type component $\tau$ for each formal and return parameter variable of the represented function(s).
  • Each location type component $\tau$ is represented with a type variable and an associated type constructor.
  • the pointer analysis describes the locations and functions for a program with types so the set of types defining the store model for the program is a valid description of all possible run-time storage configurations for the program.
  • the pointer analysis identifies store usages, including pointer relationships, for the locations and functions for the program and describes the locations and functions for the program with types in accordance with typing constraints based on the store usages.
  • the pointer analysis describes each location with a type describing how the location is accessed or used in the program.
  • the pointer analysis represents pointer locations with a type describing that the represented location(s) are accessed as a unit.
  • the pointer analysis represents locations accessed through address computation of an element of the represented location(s) with a type describing the represented location(s) are accessed as a structured object. If access through a pointer is offset from the beginning of the pointed-to location(s), the pointer analysis represents the pointed-to location(s) with a type describing the inconsistent access.
  • For assignments of memory objects, the pointer analysis describes the assigned-to location with a type describing how the assigned-from location is accessed, as the structure of a location is to reflect the structure of the value stored in the location.
  • the pointer analysis describes the assigned-to location with a type describing an inconsistent access if the assigned-to location is subject to assignment from assigned-from locations described by different access patterns, for example a unit access and a structured object access, or described by an inconsistent access.
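A minimal sketch of how two observed access patterns for one location might be combined, assuming only the three patterns named in the text (unit, structured object, inconsistent). The string labels and the function name are hypothetical.

```python
def join_access(a: str, b: str) -> str:
    """Combine two access patterns observed for the same location.

    Patterns are "simple" (unit access), "struct" (structured object access)
    and "object" (inconsistent access).
    """
    if a == b:
        return a
    # Differing patterns, or any inconsistent access, yield inconsistent access.
    return "object"
```

A location assigned from both a unit-accessed and a struct-accessed source therefore ends up as `"object"`, and once a location is `"object"` it stays so.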
  • the pointer analysis for one embodiment describes the assigned-to location with a type describing the same access pattern as the type representing the assigned-from location or with a type describing an access pattern greater than that for the assigned-from location in accordance with the following hierarchy.
  • the pointer analysis for a well-typed program also represents each location with a type describing the size of the represented location(s).
  • the pointer analysis describes the assigned-from and assigned-to locations with types such that the size of the representation of the assigned value is less than or equal to that of the types describing the assigned-from and the assigned-to locations.
  • the pointer analysis for a well-typed program describes each location pointer for the program with a type comprising the type representing the pointed-to location(s). If a location pointer may point to either one of two locations, the pointer analysis represents the two locations with the same type and describes the location pointer with a type comprising the type representing both locations. If two location pointers may point to the same location, the pointer analysis describes each of the two location pointers with a type comprising the type representing the pointed-to location.
  • the pointer analysis for a well-typed program likewise describes each function pointer for the program with a type representing the pointed-to function(s). If a function pointer may point to either one of two functions, the pointer analysis represents the two functions with the same type and describes the function pointer with the type representing both functions. If two function pointers may point to the same function, the pointer analysis describes each of the two function pointers with the type representing the pointed-to function.
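The rule that two locations reachable through the same pointer share one type can be sketched with a small unification pass. This is a deliberately simplified illustration, not the analysis described here: it handles only statements of the assumed forms `x = &y` and `x = y`, it unifies unconditionally, and it ignores sizes, offsets, structure elements, and function types. All names are hypothetical.

```python
class Node:
    """Type variable for a set of locations, with an optional pointed-to type."""

    def __init__(self, name):
        self.name = name
        self.parent = self
        self.pointee = None  # type of the location(s) the content may point to


def find(n):
    while n.parent is not n:
        n.parent = n.parent.parent  # path halving
        n = n.parent
    return n


def unify(a, b):
    """Make a and b the same type, then unify what they point to."""
    a, b = find(a), find(b)
    if a is b:
        return
    b.parent = a
    pa, pb = a.pointee, b.pointee
    if pa is None:
        a.pointee = pb
    elif pb is not None:
        unify(pa, pb)


def pointee(n):
    """Return the pointed-to type of n, creating a fresh one on demand."""
    n = find(n)
    if n.pointee is None:
        n.pointee = Node("*" + n.name)
    return n.pointee


def analyze(stmts):
    """Process each statement once; the statement order does not matter."""
    env = {}
    for kind, x, y in stmts:
        vx = env.setdefault(x, Node(x))
        vy = env.setdefault(y, Node(y))
        if kind == "addr":    # x = &y
            unify(pointee(vx), vy)
        elif kind == "copy":  # x = y
            unify(pointee(vx), pointee(vy))
    return env


def same_targets(env, p, q):
    """True if p and q are described as pointing to the same set of locations."""
    return find(pointee(env[p])) is find(pointee(env[q]))
```

With the statements `p = &a`, `p = &b`, `q = p`, the locations `a` and `b` end up in one class and `same_targets(env, "p", "q")` holds, whichever order the statements are processed in.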
  • the pointer analysis for one embodiment describes locations for the assignment of memory objects in accordance with a constraint described as follows.
  • ⁇ 2 ⁇ ⁇ _ s ⁇ ⁇ 1 ⁇ ⁇ ⁇ 2 blank ⁇ ( s 2 , p 2 ) ⁇ ⁇ 1 ⁇ ⁇ ⁇ ⁇ s ⁇ s 2 ⁇ s ⁇ sizeof ⁇ ⁇ ( ⁇ 1 )
  • ⁇ ⁇ ⁇ 1 simple ⁇ ( ⁇ 1 , ⁇ 1 , s 1 , p 1 )
  • the location type $\tau_2$ representing the assigned-from location and the location type $\tau_1$ representing the assigned-to location satisfy the constraint if the location type $\tau_1$ describes the same access pattern as the location type $\tau_2$ or describes an access pattern greater than that for the location type $\tau_2$ in the location type hierarchy described above, if the size $s_1$ of the assigned-to location and the size $s_2$ of the assigned-from location are each greater than or equal to the size $s$ of the representation of the assigned value, if the location pointer type $\alpha_2$ describing the content of the assigned-from location and the location pointer type $\alpha_1$ describing the content of the assigned-to location satisfy the constraint where the location types $\tau_1$ and $\tau_2$ are either simple or object location types, if the function type $\lambda_2$ describing the content of the assigned-from location and the function type $\lambda_1$ describing the content of the assigned-to location satisfy the constraint where the location types $\tau_1$ and $\tau_2$ are either simple or object location types, if the location types $m_2(n)$ and $m_1(n)$ and
  • the constraint $\alpha_2 \trianglelefteq \alpha_1$, or $(\tau_2 \times o_2) \trianglelefteq (\tau_1 \times o_1)$, is satisfied if the location type $\tau_2$ is $\bot$ or if both constraints $\tau_2 \trianglelefteq \tau_1$ and $o_2 \trianglelefteq o_1$ are satisfied.
  • the constraint $\tau_2 \trianglelefteq \tau_1$ is satisfied if the location type $\tau_2$ is $\bot$ or if the location type $\tau_1$ is equal to the location type $\tau_2$.
  • the constraint $o_2 \trianglelefteq o_1$ is satisfied if the offset type $o_2$ is zero or if the offset type $o_1$ is equal to the offset type $o_2$.
  • the constraint $\lambda_2 \trianglelefteq \lambda_1$ is satisfied if the function type $\lambda_2$ is $\bot$ or if the function type $\lambda_1$ is equal to the function type $\lambda_2$.
  • the constraint $s_2 \sqsubseteq s_1$ is satisfied if the size $s_1$ is equal to the size $s_2$ or if the size $s_1$ is $\top$ for one embodiment using symbolic sizes with no ordering relation. For another embodiment using numeric sizes, the constraint $s_2 \sqsubseteq s_1$ is satisfied if the size $s_1$ is greater than or equal to the size $s_2$ or if the size $s_1$ is $\top$.
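The component constraints just stated translate directly into small predicates. The sketch below is illustrative only: `BOTTOM`, `TOP`, the tuple encoding of a location pointer type, and the function names are assumptions, and location and function types are compared by plain equality.

```python
BOTTOM = None  # hypothetical encoding of "no pointer" for location and function types
TOP = "T"      # hypothetical encoding of the top size


def type_leq(t2, t1):
    """Location (or function) type constraint: t2 is bottom, or t1 equals t2."""
    return t2 is BOTTOM or t1 == t2


def offset_leq(o2, o1):
    """Offset constraint: o2 is zero, or o1 equals o2."""
    return o2 == "zero" or o1 == o2


def pointer_leq(a2, a1):
    """Pointer type constraint on (location type, offset) pairs."""
    (t2, o2), (t1, o1) = a2, a1
    return t2 is BOTTOM or (type_leq(t2, t1) and offset_leq(o2, o1))


def size_leq(s2, s1, numeric=False):
    """Size constraint for symbolic sizes, or for numeric sizes if numeric=True."""
    if s1 == TOP or s1 == s2:
        return True
    # Only numeric sizes carry an ordering relation.
    return numeric and s2 != TOP and s1 >= s2
```

For instance, `offset_leq("zero", "unknown")` holds while `offset_leq("unknown", "zero")` does not, and `size_leq(4, 8)` holds only when `numeric=True`.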
  • type $\alpha_1$ or $(\tau_1 \times o_1)$ describing the content of the assigned-to location therefore satisfy the constraint if the content of the assigned-from location does not comprise a location pointer.
  • the location pointer types $(\tau_2 \times o_2)$ and $(\tau_1 \times o_1)$ satisfy the constraint (1) if the location types $\tau_1$ and $\tau_2$ are the same and (2) if the offset type $o_2$ indicates the content of the assigned-from location does not comprise an offset pointer or if the offset types $o_1$ and $o_2$ indicate the content of both the assigned-from and assigned-to locations may comprise offset pointers.
  • the function type $\lambda_2$ describing the content of the assigned-from location and the function type $\lambda_1$ describing the content of the assigned-to location therefore satisfy the constraint if the content of the assigned-from location does not comprise a function pointer. If the content of the assigned-from location does comprise a potential function pointer, the function types $\lambda_2$ and $\lambda_1$ satisfy the constraint if the function types $\lambda_1$ and $\lambda_2$ are the same.
  • the pointer analysis applies the typing constraints for a well-typed program based on store usages for the locations and functions for the program. For one embodiment, the pointer analysis identifies store usages based on the form of each program statement referencing one or more locations or functions. The pointer analysis describes locations and functions affected by the store usages with types in accordance with a type rule specifying the typing constraints for the statement form so the description of the store as defined by the store model is valid both before and after execution of the statement. If each program statement referencing one or more locations or functions is typed in this manner, or well-typed, the program is well-typed. For one embodiment, the type rules for well-typed statements S are as follows.
  • the value of the variable y will be assigned to the variable x after execution of the statement.
  • the value of the variable x and the value of the variable y may each comprise a pointer pointing to the same location or function if the value of the variable y may comprise a pointer.
  • the typing environment A associates all variables for a program with a type and represents the store model for the program.
  • $A \vdash x : \mathrm{sim/obj}(\alpha_1, \lambda_1, s_1, p_1)$
  • $A \vdash \mathrm{welltyped}(x =_s \&y)$
  • the content of the location(s) pointed-to by the value of the variable y will be assigned to the variable x after execution of the statement. If the value of the variable y is an offset pointer, the pointed-to location(s) are accessed inconsistently.
  • the value of the variable x and the content of the location(s) pointed-to by the value of the variable y may each comprise a pointer pointing to the same location or function if the content of the location(s) pointed-to by the value of the variable y may comprise a pointer.
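  • $A \vdash x : \tau_1 \qquad A \vdash y : \mathrm{sim/obj}(\alpha_2, \lambda_2, s_2, p_2) \qquad \alpha_2 = (\tau_2 \times \mathrm{unknown})$
  • $\tau_2 = \mathrm{object}(\alpha_3, \lambda_3, s_3, p_3) \qquad \mathrm{sizeof}(\mathrm{void}\,{*}) \sqsubseteq s_2 \qquad \tau_2 \trianglelefteq_s \ldots$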
  • the value of the variable y will be assigned to be the content of the location(s) pointed-to by the value of the variable x after execution of the statement. If the value of the variable x is an offset pointer, the pointed-to location(s) are accessed inconsistently.
  • the content of the location(s) pointed-to by the value of the variable x and the value of the variable y may each comprise a pointer pointing to the same location or function if the value of the variable y may comprise a pointer.
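  • $$\frac{A \vdash x : \mathrm{sim/obj}(\alpha_1, \lambda_1, s_1, p_1) \qquad A \vdash y : \tau_2 \qquad \alpha_1 = (\tau_1 \times \mathrm{zero}) \qquad \mathrm{sizeof}(\mathrm{void}\,{*}) \sqsubseteq s_1 \qquad \tau_2 \trianglelefteq_s \tau_1}{A \vdash \mathrm{welltyped}({*x} =_s y)}$$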
  • $A \vdash x : \mathrm{sim/obj}(\alpha_1, \lambda_1, s_1, p_1) \qquad A \vdash y : \tau_2 \qquad \alpha_1 = (\tau_1 \times \mathrm{unknown})$
  • $\tau_1 = \mathrm{object}(\alpha_3, \lambda_3, s_3, p_3) \qquad \mathrm{sizeof}(\,\ldots$
  • the value of the variable x and the value of any one operand variable y i of the operand variables y 1 . . . y n may each comprise a pointer pointing to the same location or function if the value of the variable y i may comprise a pointer.
  • $A \vdash \mathit{welltyped}(x =_s \mathrm{op}(y_1 \ldots y_n))$
  • the type rule for a statement in the form $x =_s \mathrm{op}(y_1 \ldots y_n)$, for other embodiments, may depend on the operation identified by op. Where the operation may return a structured value, the variable x may be described with a struct location type. For some operations, such as a comparison operation for example, the result of the operation and therefore the value assigned to the variable x will not comprise a pointer regardless of whether the value of any one of the operand variables y 1 . . . y n may comprise a pointer.
  • the value of the variable x may be described with a type different from the type describing the value of any one of the operand variables y 1 . . . y n while maintaining the well-typedness of the statement.
  • the value of the variable x will be a pointer to the location representing the element n of the location pointed-to by the value of the variable y after execution of the statement.
  • a ⁇ x sim / obj ⁇ ⁇ ( ⁇ 1 , ⁇ 1 , s 1 , p 1 )
  • a ⁇ y sim / obj ⁇ ⁇ ( ⁇ 2 , ⁇ 2 , s 2 , p 2 )
  • ⁇ 1 ( ⁇ 1 ⁇ ⁇ x ⁇ ⁇ o 1 )
  • ⁇ 2 ( ⁇ 2 ⁇ ⁇ x ⁇ ⁇ zero )
  • ⁇ 2 struct ⁇ ⁇ ( m 3 , s 3 , p 3 ) compatible ⁇ ⁇ ( n , m 3 ) s ⁇ s 1 sizeof ⁇ ⁇ ( void ⁇ *) ⁇ s 2 m 3 ⁇ ( n ) ⁇ ⁇ 1
  • $A \vdash \mathit{welltyped}(x =_s \&y\texttt{->}n)$
  • the location type component ⁇ 2 is not a struct location type or the offset type component describing the value of the variable y is not zero
  • a statement in this form is well-typed under the typing environment A if the location type component ⁇ 2 describing the value of the variable y and the location type component ⁇ 1 describing the value of the variable x satisfy the constraint ⁇ 2 ⁇ 1 , if the location type component ⁇ 2 is an object location type, and if the offset type component describing the value of the variable x is unknown.
  • $A \vdash f_i : \tau_i$
  • $A \vdash r_j : \tau_{n+j}$
  • $x =_s \mathrm{fun}(f_1 \ldots f_n) \rightarrow (r_1 \ldots r_m)\ S^*$ is well-typed under the typing environment A if the location type representing the location representing the variable x is a simple or object location type, if the size s 0 of the location representing the variable x is greater than or equal to the size s of the representation of the assigned value, if the location type τ i representing the location representing each one f i of the variables f 1 . . .
  • the values of the variables y 1 . . . y n are assigned to the formal parameter variables for the function or procedure pointed-to by the value of the variable p before execution of the function, and the values of the return parameter variables for the called function are assigned to the variables x 1 . . . x m after execution of the called function.
  • the contents of the assigned-from and assigned-to variables may each comprise a pointer pointing to the same location or function if the assigned-from variable may comprise a pointer.
  • a ⁇ y i : ⁇ i s i sizeof ⁇ ⁇ ( y i ) ⁇ i ⁇ ⁇ ⁇ [ 1 ⁇ ⁇ ... ⁇ n ] : ⁇ i ′ ⁇ ⁇ Si ⁇ ⁇ i ⁇ j ⁇
  • the pointer analysis describes locations for the assignment of memory objects in accordance with a constraint, in lieu of the constraint, described as follows.
  • the location type T 2 representing the assigned-from location and the location type T 1 representing the assigned-to location satisfy the constraint if the location type T 1 describes the same access pattern as the location type T 2 , if the size s 1 of the assigned-to location and the size s 2 of the assigned-from location are each greater than or equal to the size s of the representation of the assigned value, if the location pointer type ⁇ 2 describing the content of the assigned-from location is the same as the location pointer type ⁇ 1 describing the content of the assigned-to location where the location types T 1 and T 2 are each simple location types or are each object location types, if the function type ⁇ 2 describing the content of the assigned-from location is the same as the function type ⁇ 1 describing the content of the assigned-to location where the location types ⁇ 1 and ⁇ 2 are each simple location types or are each object location types, if the location types m 2 (n) and m 1 (n) representing the locations representing each respective element n of the assigned-from and assigned-to locations
  • FIG. 2 illustrates for one embodiment a flow diagram 200 for performing a pointer analysis by type inference for a program with structured memory objects and potentially inconsistent memory object accesses.
  • the analysis represents each location for the program with a separate type as having no access pattern.
  • Each type describes the size of the represented location and describes the content of the represented location as a non-pointer value.
  • the analysis for one embodiment initially describes the value of each variable for the program with a blank(s, ∅) location type, where the size component s is the size of the representation of the variable.
  • the analysis defines a typing environment for the program.
  • the analysis for one embodiment identifies each location for the program using a symbol table having recorded identifiers for each variable for the program.
  • the analysis identifies a store usage that is described by the program and has not been processed by the analysis.
  • the analysis for one embodiment identifies the form of a statement of the program to identify the store usage or usages described by the statement.
  • the analysis for step 206 determines whether the location(s) and/or function(s) affected by the identified store usage are well-typed for the current typing environment under a typing constraint. For one embodiment, the analysis determines whether the program statement describing the identified store usage(s) is well-typed for the current typing environment in accordance with a type rule specifying the typing constraint for the form of the statement.
  • the analysis for step 208 identifies any potential constraints for types for the identified store usage and proceeds to process another store usage for step 204 .
  • the analysis identifies any potential constraints in case the analysis determines in processing other store usages that the modification of a type in a subsequent typing environment may impose a constraint on another type.
  • the analysis identifies access pattern relationships for types in case a modification of the description of the access pattern for one type may warrant the modification of the access pattern description for another type.
  • the analysis also identifies pointer offset relationships for types in case a modification of a type to describe an offset pointer warrants the modification of another type to describe an offset pointer and/or the modification of another type to describe the pointed-to location as being accessed in an inconsistent manner.
  • the analysis further identifies potential points-to relationships for types representing only non-pointer values in case a modification of a type to describe a pointer value warrants the modification of another type to describe the pointed-to location(s).
  • the analysis for one embodiment identifies each potential constraint in a separate pending set associated with the type the modification of which may impose the potential constraint.
  • Each pending set for one embodiment is a pending bag implemented as a binary tree data structure.
  • all potential constraints identified for the program may be identified in a single pending set.
  • the analysis for step 210 modifies types for location(s) and/or function(s) affected by the identified store usage as necessary so the store usage is well-typed under the typing constraint. For one embodiment, the analysis modifies types in accordance with a type rule specifying the typing constraint for the form of the program statement describing the identified store usage.
  • the analysis for one embodiment modifies types by modifying access pattern descriptions for types.
  • the analysis for one embodiment modifies access pattern descriptions for types by promoting a location type to a greater location type in the location type hierarchy described above.
  • a ⊥ location type may be promoted to a blank, simple, struct, or object location type.
  • a blank location type may be promoted to a simple, struct, or object location type.
  • a simple or struct location type may be promoted to an object location type.
  • the analysis for one embodiment also modifies types by expanding the sizes of types to accommodate the sizes of the representations of assigned values.
  • the analysis for one embodiment further modifies types by unifying types or setting types.
  • the analysis for step 212 identifies any potential constraints for types for the identified store usage similarly as for step 208 in case the analysis determines in processing other store usages that the modification of a type in a subsequent typing environment may impose a constraint on another type.
  • for step 214, the analysis determines whether any previously processed store usages were affected by the modification of types for step 210 .
  • the analysis determines whether any potential constraints identified for steps 208 , 212 , and 218 were affected by the modification of types for step 210 .
  • the modification of the description of the access pattern for one type may warrant the modification of the access pattern description for another type for an identified access pattern relationship.
  • the modification of a type to describe an offset pointer may warrant the modification of another type to describe an offset pointer and/or the modification of another type to describe the pointed-to location as being accessed in an inconsistent manner for identified pointer offset relationships.
  • the modification of a type to describe a pointer value may warrant the modification of another type to describe the pointed-to location for an identified potential points-to relationship.
  • the affected potential constraints are determined by the pending set associated with the type modified for step 210 . If any previously processed store usages were affected as determined for step 214 , the analysis for step 216 modifies types in accordance with the affected potential constraints and for step 218 identifies any potential constraints for the modified types similarly as for steps 208 and 212 . The analysis repeats steps 214 through 218 as potential constraints are imposed by the modification of types for step 216 .
  • if the analysis determines for step 214 that no previously processed store usages were affected by the modification of types for steps 210 or 216 , the analysis proceeds to process another store usage for steps 204 through 218 until the analysis determines for step 204 that all store usages for the program have been processed.
  • the analysis for one embodiment determines whether the last statement referencing a location and/or function for the program has been identified. When the analysis has processed all store usages for the program, the program is well-typed for step 220 .
  • the resulting set of types defines the store model for the program and is a valid description of all possible run-time storage configurations for the program.
  • the analysis for FIG. 2 may process each store usage described by the program in any order independent of program control flow as defined by the control flow structures for the program.
  • the analysis processes each store usage for the program only one time.
  • steps 204 through 218 are implemented with the above type rules for one embodiment in accordance with the representative pseudo-code of Appendix A for each identified program statement form.
  • the functions for the pseudo-code of Appendix A are implemented for one embodiment in accordance with the representative pseudo-code of Appendix B.
  • FIG. 3 illustrates for one embodiment graphical representations for types describing locations for the following sample pseudo-code program fragment.
  • This program fragment may be implemented in the C programming language as follows.
  • This sample program fragment implements a lookup function for operation with a binary tree data structure.
  • Each node of the binary tree has a key and a value data field.
  • Each node also has a left subtree containing all keys of lexicographic lower ordering than the node's key and a right subtree containing all keys of lexicographic higher ordering than the node's key.
  • the pointer analysis represents the locations representing *tree, tree->right, tree->left, tree->key, tree->value, tree, rightaddr, leftaddr, keyaddr, valueaddr, treekey, result, and key with types 310 , 311 , 312 , 313 , 314 , 320 , 321 , 322 , 323 , 324 , 333 , 334 , and 343 , respectively, as illustrated in FIG. 3 .
  • Type 353 represents the location(s) pointed-to by the content of the location represented by types 313 , 333 , and 343 .
  • Types 310 - 314 , 320 - 324 , 333 - 334 , 343 , and 353 describe these locations as follows.
  • FIGS. 4A and 4B illustrate for one embodiment graphical representations for types describing locations at different stages of the analysis for the following sample C program fragment.
  • This sample program fragment implements an assignment of the value of the variable bigint to the location representing s->a. As the value being assigned is larger in size than the location representing s->a, the assignment leads to the capture of the neighboring location representing s->b by the type representing the location representing s->a.
  • the pointer analysis represents, using the above types, the locations representing *s, s->a, s->b, s, and bigint with types 410 , 411 , 412 , 420 , and 430 , respectively, as illustrated in FIG. 4 A.
  • Types 410 - 412 , 420 , and 430 describe these locations as follows.
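  • $\tau_1 = \mathrm{simple}((\bot \times \mathrm{zero}),\ \bot,\ \langle\mathrm{int}\rangle,\ \{\tau_0\})$
  • $\tau_2 = \mathrm{simple}((\bot \times \mathrm{zero}),\ \bot,\ \langle\mathrm{int}\rangle,\ \{\tau_0\})$
  • $\tau_3 = \mathrm{simple}((\tau_0 \times \mathrm{zero}),\ \bot,\ \langle\mathrm{void*}\rangle,\ \emptyset)$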
  • the pointer analysis represents the locations representing *s, s->a, s->b, s, and bigint with types 410 , 413 , 413 , 420 , and 430 , respectively, as illustrated in FIG. 4 B.
  • Types 410 , 413 , 420 , and 430 describe these locations as follows.
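  • $\tau_5 = \mathrm{object}((\bot \times \mathrm{zero}),\ \bot,\ T,\ \{\tau_0\})$
  • $\tau_3 = \mathrm{simple}((\tau_0 \times \mathrm{zero}),\ \bot,\ \langle\mathrm{void*}\rangle,\ \emptyset)$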
  • FIGS. 5A and 5B illustrate for one embodiment graphical representations for types describing locations at different stages of the analysis for the following sample C program fragment.
  • This sample program fragment implements an assignment from the location representing the variable i to the location representing the element u.c.
  • the assigned-from and assigned-to locations have different access patterns, and therefore the assigned-to location must be described by a type describing the inconsistent access.
  • the pointer analysis represents, using the above types, the locations representing u, u.d.a, and i with types 510 , 511 , and 520 , respectively, as illustrated in FIG. 5 A.
  • the element u.d.b is not represented with a type because the element u.d.b has not been accessed at this point.
  • Types 510 - 511 and 520 describe the represented locations as follows.
  • $\tau_1 = \mathrm{simple}((\bot \times \mathrm{zero}),\ \bot,\ \langle\mathrm{int}\rangle,\ \{\tau_0\})$
  • the pointer analysis represents the locations representing u and i with types 510 and 520 , respectively, as illustrated in FIG. 5 B.
  • Types 510 and 520 describe these locations as follows.
  • FIGS. 6A and 6B illustrate for one embodiment graphical representations for types describing locations at different stages of the analysis for the following sample C program fragment.
  • This sample program fragment implements the inconsistent access of the structure variable *s due to pointer arithmetic on the address of s->b, an assignment of the created offset pointer to the location representing the variable c, and an assignment of a value to the location pointed-to by the offset pointer c.
  • the location representing the variable c must therefore be described by a type describing the offset pointer, and the location pointed-to by the offset pointer c, i.e., the location representing the structure variable *s, must therefore be described by a type describing the inconsistent access.
  • the pointer analysis represents, using the above types, the locations representing *s, s->a, s->b, s, and c with types 610 , 611 , 612 , 620 , and 630 , respectively, as illustrated in FIG. 6 A.
  • Types 610 - 612 , 620 , and 630 describe these locations as follows.
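  • $\tau_1 = \mathrm{blank}(\langle\mathrm{int}\rangle,\ \{\tau_0\})$
  • $\tau_2 = \mathrm{blank}(\langle\mathrm{int}\rangle,\ \{\tau_0\})$
  • $\tau_3 = \mathrm{simple}((\tau_0 \times \mathrm{zero}),\ \bot,\ \langle\mathrm{void*}\rangle,\ \emptyset)$
  • $\tau_4 = \mathrm{simple}((\tau_2 \times \mathrm{unknown}),\ \bot,\ \langle\mathrm{void*}\rangle,\ \emptyset)$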
  • the pointer analysis represents the locations representing *s, s, and c with types 610 , 620 , and 630 , respectively, as illustrated in FIG. 6 B.
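  • Types 610 , 620 , and 630 describe these locations as follows.
  • $\tau_3 = \mathrm{simple}((\tau_0 \times \mathrm{zero}),\ \bot,\ \langle\mathrm{void*}\rangle,\ \emptyset)$
  • $\tau_4 = \mathrm{simple}((\tau_0 \times \mathrm{unknown}),\ \bot,\ \langle\mathrm{void*}\rangle,\ \emptyset)$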
  • pointer analysis by type inference for a program with structured memory objects and potentially inconsistent memory object accesses may be implemented in software for execution by any suitable data processing system configured with any suitable combination of hardware devices.
  • FIG. 7 illustrates for one embodiment a data processing system 700 that may be programmed to perform pointer analysis by type inference for a program with structured memory objects and potentially inconsistent memory object accesses.
  • data processing system 700 comprises a processor 702 , a system bus 704 , a static memory 706 , a main memory 708 , a mass memory 710 , an alphanumeric input device 712 , a cursor control device 714 , and a display 716 .
  • System bus 704 couples processor 702 , static memory 706 , main memory 708 , mass memory 710 , alphanumeric input device 712 , cursor control device 714 , and display 716 .
  • Processor 702 comprises a suitable processing device such as a microprocessor, for example, and may comprise a plurality of suitable processing devices. Processor 702 may execute instructions stored in static memory 706 , main memory 708 , and/or mass memory 710 for example. Processor 702 may process data stored in static memory 706 , main memory 708 , and/or mass memory 710 for example.
  • Static memory 706 may comprise read only memory (ROM) or any other suitable memory device. Static memory 706 may store, for example, a boot program for execution by processor 702 to initialize data processing system 700 .
  • Main memory 708 may comprise random access memory (RAM) or any other suitable memory device.
  • Mass memory 710 may comprise a hard disk device, a floppy disk, an optical disk, a flash memory device, a file server device, or any other suitable memory device.
  • the term memory comprises a single memory device and any combination of suitable memory devices for the storage of data and instructions, for example.
  • System bus 704 provides for the communication of digital information between hardware devices for data processing system 700 .
  • Processor 702 may receive over system bus 704 information that is input by a user through alphanumeric input device 712 and/or cursor control device 714 .
  • Alphanumeric input device 712 may comprise a keyboard, for example, that comprises alphanumeric keys.
  • Alphanumeric input device 712 may comprise other suitable keys, comprising function keys for example.
  • Alphanumeric input device 712 may be used to input information or commands, for example, for data processing system 700 .
  • Cursor control device 714 may comprise a mouse, touch tablet, track-ball, and/or joystick, for example, for controlling the movement of a cursor displayed by display 716 .
  • Processor 702 may also output over system bus 704 information that is to be displayed on display 716 .
  • Display 716 may comprise a cathode ray tube (CRT) or a liquid crystal display (LCD), for example, for displaying information to a user.
  • Processor 702 may use system bus 704 to transmit information to and to receive information from other hardware devices such as mass memory 710 for example.
  • Data processing system 700 may be programmed to execute suitable program code or machine instructions directing data processing system 700 to perform pointer analysis by type inference for a program with structured memory objects and potentially inconsistent memory object accesses.
  • the executable program code or machine instructions for the analysis may be stored in main memory 708 and/or in mass memory 710 , such as on a suitable magnetic or optical disk for example, for execution by processor 702 .
  • the program analyzed by data processing system 700 may also be stored in main memory 708 and/or in mass memory 710 , such as on a suitable magnetic or optical disk for example.

Abstract

A pointer analysis by type inference for a computer program with structured memory objects and potentially inconsistent memory object accesses helps approximate run-time store usage for the program. The analysis represents locations for the program with types describing access patterns for the represented locations based on how the locations are accessed in the program. The analysis describes access patterns for structured memory objects, elements of structured memory objects, and memory objects accessed in inconsistent manners in the program. The analysis identifies store usages described by the program and determines whether the location(s) and/or function(s) affected by the identified store usages are well-typed under typing constraints. If the identified store usages are not well-typed, the analysis modifies types for location(s) and/or function(s) affected by the identified store usages as necessary so the store usages are well-typed. When the locations and/or functions for all identified store usages are well-typed, the program is well-typed with the set of types defining a store model for the program.

Description

This application contains a Microfiche Appendix consisting of 18 frames, 1 sheet.
FIELD OF THE INVENTION
The present invention relates generally to the field of computer program analysis. More particularly, the present invention relates to the field of pointer analysis for computer programs.
BACKGROUND OF THE INVENTION
Software compilers compile source code in a source language into target code in a target language. The target code may be executed directly by a data processing system or linked by a suitable linker with other target code for execution by the data processing system.
To help improve the execution of target code by the data processing system, one typical compiler performs optimization techniques based on the expected run-time usage of memory locations for the compiled program as defined by a store model or storage shape graph. The compiler may generate the store model by performing a pointer analysis to determine the effects of program statements referencing memory locations with constants, variables, or functions, for example.
One typical pointer analysis by type inference treats structured memory objects, such as structures or records, as single memory locations and may therefore generate an overly conservative store model. Another typical pointer analysis by type inference describes structured memory objects and elements of structured memory objects for a program by types based only on the type declarations of the program. Such a pointer analysis, however, produces inaccurate store models for programs using arbitrary type casts, unions, and pointer arithmetic.
SUMMARY OF THE INVENTION
A method performs a pointer analysis for a program with a data processing system. The method may be implemented in software stored by a memory for execution by a data processing system. The method may perform the pointer analysis for a program browser or while compiling the program for execution by a data processing system.
For the method, a store usage in the program accessing locations is identified and may be identified based on a form of a program statement describing the store usage. Locations for the identified store usage are represented with types describing access patterns for the locations for the identified store usage based on how the locations for the identified store usage are accessed in the program such that the types representing the locations for the identified store usage comply with a typing constraint.
Each type may be represented by a type variable and an associated type constructor. A content of one of the locations for the identified store usage may be described with a location type and a function type.
One of the locations for the identified store usage may be represented with a type describing the one location is accessed as a structured memory object if the one location is accessed as a structured memory object for the identified store usage. The one location may also be represented with a type comprising location types describing locations of elements of the structured memory object. Also, one of the locations for the identified store usage may be represented with a type comprising a location type describing a location representing a structured memory object if the one location represents an element of the structured memory object.
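The relationship between a structured memory object and its elements can be illustrated with a small sketch. It assumes, purely for illustration, that elements are laid out contiguously with concrete byte sizes and no padding (the analysis itself works with symbolic sizes); the function name `captured_elements` and the layout encoding are hypothetical. It shows which elements an assignment touches when the assigned value is larger than the first element it is written to.

```python
def captured_elements(layout, first, size):
    """Return the element names overlapped by writing `size` bytes at `first`.

    `layout` is a list of (element name, size in bytes) pairs in declaration
    order. Contiguous layout without padding is an assumption of this sketch.
    """
    names = [name for name, _ in layout]
    index = names.index(first)
    covered, remaining = [], size
    # Walk forward through neighboring elements until the write is exhausted
    # or the end of the structured object is reached.
    while remaining > 0 and index < len(layout):
        name, width = layout[index]
        covered.append(name)
        remaining -= width
        index += 1
    return covered


# A structure with two 4-byte elements, a and b.
layout = [("a", 4), ("b", 4)]
small_write = captured_elements(layout, "a", 4)   # touches only a
large_write = captured_elements(layout, "a", 8)   # also captures neighbor b
```

In the second call the written value spills over the first element, so a type representing that element would also have to represent its neighbor.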
One of the locations for the identified store usage may be represented with a type describing the one location is accessed inconsistently if an access pattern of the one location as defined by the identified store usage is different from the access pattern described by the type representing the one location. Also, one of the locations for the identified store usage may be represented with a type describing the one location is accessed inconsistently if the one location is accessed through an offset pointer.
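One way to picture how an access pattern description changes is as a promotion within the location type hierarchy, in which a ⊥ type sits below blank, blank below simple and struct, and simple and struct below object. The sketch below is an assumption-laden illustration: the enum `Kind`, the rank table, and the function `promote` are hypothetical names, and treating promotion as a least upper bound is this sketch's reading of the hierarchy.

```python
from enum import Enum


class Kind(Enum):
    BOTTOM = "bottom"   # location not accessed at all
    BLANK = "blank"     # size known, no access pattern yet
    SIMPLE = "simple"   # accessed as a simple memory object
    STRUCT = "struct"   # accessed as a structured memory object
    OBJECT = "object"   # accessed inconsistently


# Rank within the hierarchy: bottom < blank < {simple, struct} < object.
_RANK = {Kind.BOTTOM: 0, Kind.BLANK: 1, Kind.SIMPLE: 2, Kind.STRUCT: 2, Kind.OBJECT: 3}


def promote(current, accessed_as):
    """Return the least kind covering both the current and the new access pattern."""
    if current == accessed_as:
        return current
    if _RANK[current] < _RANK[accessed_as]:
        return accessed_as
    if _RANK[current] > _RANK[accessed_as]:
        return current
    # Same rank but different kinds (simple versus struct): the two access
    # patterns disagree, so the location is described as an object.
    return Kind.OBJECT
```

For example, `promote(Kind.SIMPLE, Kind.STRUCT)` yields `Kind.OBJECT`, reflecting a location accessed both as a simple value and as a structure.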
A location pointer may be described with a location type representing a pointed-to location and with an offset describing how the location pointer points to the pointed-to location relative to a beginning of the pointed-to location. One of the locations for the identified store usage may be represented with a type describing a size of the one location based on a size of an assigned value for the identified store usage.
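A minimal sketch of a location pointer carrying an offset may help here. It assumes only two offset descriptions, zero and unknown, and models locations with a hypothetical `Location` class that records an access pattern kind and an optional enclosing structured object. The names `address_of`, `add_offset`, and `store_through` are illustrative, and the retargeting of an offset pointer to the enclosing object is this sketch's simplification.

```python
from dataclasses import dataclass
from typing import Optional

ZERO, UNKNOWN = "zero", "unknown"


@dataclass(eq=False)
class Location:
    name: str
    kind: str = "blank"                    # "blank", "simple", "struct" or "object"
    parent: Optional["Location"] = None    # enclosing structured object, if any


@dataclass(eq=False)
class Pointer:
    target: Location
    offset: str = ZERO                     # how the pointer points into the target


def address_of(loc):
    """Taking an address yields a pointer to the beginning of the location."""
    return Pointer(loc, ZERO)


def add_offset(ptr):
    """Pointer arithmetic: the result may point anywhere relative to the target."""
    return Pointer(ptr.target, UNKNOWN)


def store_through(ptr):
    """Record an assignment to the location(s) the pointer points to."""
    if ptr.offset == UNKNOWN:
        # An offset pointer may reach outside the element it was derived from,
        # so the outermost enclosing object is described as accessed inconsistently.
        loc = ptr.target
        while loc.parent is not None:
            loc = loc.parent
        loc.kind = "object"
        ptr.target = loc
    elif ptr.target.kind == "blank":
        ptr.target.kind = "simple"
    return ptr.target
```

A pointer created by arithmetic on the address of a structure element thus ends up describing the whole structure once a value is stored through it.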
The method may unify types representing values of locations for the identified store usage if the types representing values of locations for the identified store usage are different and if a select one of the types representing values of locations for the identified store usage describes a potential pointer value.
The method may determine whether the types representing the locations for the identified store usage comply with the typing constraint and may determine whether the types representing the locations for the identified store usage comply with a type rule specifying the typing constraint for the identified program statement form. If the types representing the locations for the identified store usage do not comply with the typing constraint, the method modifies types representing locations for the identified store usage to comply with the typing constraint.
The method may identify any potential constraints for types representing locations for the identified store usage and may identify a potential constraint in a pending set. The method may identify from the identified store usage an access pattern relationship for types representing locations for the identified store usage. The method may identify from the identified store usage a pointer offset relationship for types representing locations for the identified store usage. The method may identify from the identified store usage any potential points-to relationships for a type representing a non-pointer value. Types representing locations for any identified potential constraints affected by the modification of types representing locations for the identified store usage may also be modified.
The method may analyze each store usage for the program only one time in an order independent of program control flow.
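The interplay of unification, pending sets, and order independence can be sketched for a much reduced setting. The following illustration handles only two statement forms, x = y and x = &y, and ignores structures, sizes, offsets, and functions. All names (`Loc`, `find`, `set_pointee`, `unify`, `analyze`, `points_to`) are hypothetical, and a union-find structure is assumed as the representation of unified types.

```python
from itertools import permutations


class Loc:
    """One location type; `parent` implements union-find over unified types."""

    def __init__(self, name):
        self.name = name
        self.parent = self
        self.pointee = None   # None stands for "content is a non-pointer value"
        self.pending = []     # potential constraints waiting on this type


def find(loc):
    while loc.parent is not loc:
        loc.parent = loc.parent.parent   # path halving
        loc = loc.parent
    return loc


def set_pointee(loc, target):
    """Describe the content of `loc` as a pointer to `target`."""
    loc = find(loc)
    if loc.pointee is None:
        loc.pointee = target
        waiting, loc.pending = loc.pending, []
        for other in waiting:            # impose the recorded potential constraints
            set_pointee(other, target)
    else:
        unify(loc.pointee, target)


def unify(a, b):
    a, b = find(a), find(b)
    if a is b:
        return
    b.parent = a
    targets = [t for t in (a.pointee, b.pointee) if t is not None]
    a.pointee, a.pending, b.pending = None, a.pending + b.pending, []
    for target in targets:
        set_pointee(a, target)


def analyze(statements):
    """Process each statement exactly once, in the order given."""
    locs = {}
    for op, x, y in statements:
        lx = locs.setdefault(x, Loc(x))
        ly = locs.setdefault(y, Loc(y))
        if op == "addr":                       # x = &y
            set_pointee(lx, ly)
        elif find(ly).pointee is None:         # x = y, y holds no pointer yet
            find(ly).pending.append(lx)
        else:                                  # x = y, y may hold a pointer
            set_pointee(lx, find(ly).pointee)
    return locs


def points_to(locs, name):
    target = find(locs[name]).pointee
    return None if target is None else find(target)


# p = &a; q = p; q = &b; r = q, processed in every possible order.
program = [("addr", "p", "a"), ("copy", "q", "p"), ("addr", "q", "b"), ("copy", "r", "q")]
for order in permutations(program):
    locs = analyze(order)
    assert points_to(locs, "p") is points_to(locs, "q") is points_to(locs, "r")
    assert points_to(locs, "r") is find(locs["a"]) is find(locs["b"])
```

In this sketch a copy from a variable that does not yet hold a pointer is only recorded in a pending list, and the recorded constraint is imposed later if that variable's type changes, which is why every processing order yields the same final grouping.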
A data processing system performing the pointer analysis comprises a translator for translating a program in a first language into code in a second language, a pointer analyzer for performing the pointer analysis for the program, a store model for storing the types representing locations for the program, and an optimizer for optimizing the code based on the store model.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention is illustrated by way of example and not by way of limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:
FIG. 1 illustrates for one embodiment a compiler that uses a pointer analysis by type inference for a program with structured memory objects and potentially inconsistent memory object accesses;
FIG. 2 illustrates for one embodiment a flow diagram for performing a pointer analysis by type inference for a program with structured memory objects and potentially inconsistent memory object accesses;
FIG. 3 illustrates for one embodiment graphical representations of types for a sample program fragment;
FIGS. 4A and 4B illustrate for one embodiment graphical representations of types for another sample program fragment;
FIGS. 5A and 5B illustrate for one embodiment graphical representations of types for another sample program fragment;
FIGS. 6A and 6B illustrate for one embodiment graphical representations of types for another sample program fragment; and
FIG. 7 illustrates for one embodiment a data processing system for performing a pointer analysis by type inference for a program with structured memory objects and potentially inconsistent memory object accesses.
DETAILED DESCRIPTION
The subject matter of this patent application is disclosed in a conference paper, Steensgaard, B., “Points-to Analysis by Type Inference of Programs with Structures and Unions,” Proceedings of the 1996 International Conference on Compiler Construction, Linköping, Sweden, Apr. 24-26, 1996, pp. 136-150, Vol. 1060, Lecture Notes in Computer Science, Springer-Verlag. This paper is herein incorporated by reference.
A pointer analysis by type inference for a computer program with structured memory objects and potentially inconsistent memory object accesses helps approximate run-time store usage for the program. This pointer analysis may supplement the pointer analysis described in U.S. patent application Ser. No. 08/664,441, filed Jun. 18, 1996, entitled POINTER ANALYSIS BY TYPE INFERENCE COMBINED WITH A NON-POINTER ANALYSIS, by Bjarne Steensgaard. U.S. patent application Ser. No. 08/664,441 is herein incorporated by reference. The pointer analysis may be used for any suitable programming tool or analyzer, such as a suitable program understanding and browsing tool or a suitable program compiler or interpreter for example. FIG. 1 illustrates for one embodiment a program compiler 100 that uses this pointer analysis.
Compiler 100 is implemented in software for execution by a suitable data processing system and comprises a front end 102, a translator 104, and a back end 106. Compiler 100 compiles source code 112 in a suitable source language into target code 116 in a suitable target language. Target code 116 may be executed directly by a suitable data processing system or linked by a suitable linker with other target code for execution by the data processing system.
Front end 102 is source language dependent and performs suitable lexical, syntax, and semantic analyses, for example, on source code 112. Translator 104 generates suitable intermediate code 114 based on the lexical, syntax, and semantic information generated by front end 102. Back end 106 is machine dependent and generates target code 116 based on intermediate code 114. Compiler 100 generates intermediate code 114 to represent source code 112 independent of the source language for source code 112 and independent of the specific machine or data processing system to execute target code 116.
In analyzing source code 112, front end 102 generates a suitable symbol table 122 in the form of a data structure for recording identifiers, such as variables and function names for example, used in source code 112. Suitable attribute information regarding each recorded identifier is also recorded and referenced for front end 102, translator 104, and back end 106.
As illustrated in FIG. 1, translator 104 performs the pointer analysis. Translator 104 evaluates statements referencing memory locations with variables and/or functions, for example, of source code 112 using symbol table 122 to determine store usages for the memory locations. Translator 104 generates a store model 124 in the form of a data structure to represent an approximation of the run-time store usage for source code 112. For another embodiment, front end 102 may perform the pointer analysis to generate store model 124 while parsing source code 112. As illustrated in FIG. 1, translator 104 and/or back end 106 may use store model 124 to help optimize intermediate code 114 and target code 116, respectively, with suitable techniques including, for example, code motion, common subexpression elimination, dead code elimination, peephole optimization, and register allocation techniques.
Source Language
The pointer analysis may be performed for code in any suitable source language that supports, for example, pointers to locations, pointers to functions, dynamic data allocation, data address computation for variables, and/or pointer arithmetic. The source language for one embodiment also supports structured memory objects, such as records or structures for example, and/or inconsistent access of memory objects, such as for unions or through offset location pointers for example. One suitable source language supports one or more of the following forms of abstract syntax statements S:
$$
\begin{aligned}
S ::={} & x =_s y \\
\mid{} & x =_s \&y \\
\mid{} & x =_s {*y} \\
\mid{} & x =_s \mathrm{allocate}(y) \\
\mid{} & {*x} =_s y \\
\mid{} & x =_s \mathrm{op}(y_1 \ldots y_n) \\
\mid{} & x =_s \&y\mathord{\rightarrow}n \\
\mid{} & x =_s \mathrm{fun}(f_1 \ldots f_n) \rightarrow (r_1 \ldots r_m)\ S^{*} \\
\mid{} & x_1 \ldots x_m =_{s_1 \ldots s_m} p(y_1 \ldots y_n)
\end{aligned}
$$
where x, y, f, r, and p range over a set of variable names, n ranges over a set of element names, op ranges over a set of primitive operator names and constants, and S* denotes a sequence of statements. The assignment operator = is annotated with a size s of the representation of the value(s) being assigned.
The x=s y statement form describes the assignment of the value of the variable y to the variable x.
The x=s &y statement form describes the assignment of the address of the variable y to the variable x so the value of the variable x points to y.
The x=s *y statement form describes the assignment of the content of the location pointed-to by the value of the variable y to the variable x.
The x=s allocate(y) statement form describes the allocation of a block of memory of size y and the assignment of the address of the allocated block to the variable x so the value of x points to the location for the allocated block.
The *x=s y statement form describes the assignment of the value of the variable y to the content of the location pointed-to by the value of the variable x.
The x=s op(y1 . . . yn) statement form describes the performance of the primitive operation identified by op, such as addition, subtraction, etc., on the values of the operand variables y1 . . . yn and the assignment of the operation result to the variable x. This statement form may describe the assignment of a constant to the variable x, such as for the statement x=<int>7 for example.
The x=s &y−>n statement form describes the assignment of the address of the element n of the location pointed-to by the value of the variable y to the variable x.
The x=s fun(f1 . . . fn)→(r1 . . . rm) S* statement form describes the definition of a function or procedure and the assignment of a pointer to the defined function to the variable x. The function is defined with formal parameter variables f1 . . . fn, return parameter variables r1 . . . rm, and a body represented by the sequence of statements S*. The sequence of statements S* is executed when the function is called.
The x1 . . . xm=s1 . . . sm p(y1 . . . yn) statement form describes the call of a function or procedure pointed-to by the value of the variable p with the values of the parameter variables y1 . . . yn passed as parameters to the called function and with the values of the return parameter variables for the called function respectively assigned to the variables x1 . . . xm after execution of the called function.
For one embodiment, the pointer analysis is performed independent of program control flow, and therefore the source language may support any suitable control flow structures. The pointer analysis for one embodiment is performed independent of type casts and type declarations for the source language. The C programming language is one suitable source language for the pointer analysis.
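As an illustration only, the statement forms above can be modeled as small records, and a C fragment can be lowered into them by hand. The following Python sketch is not part of the described embodiments: the `Stmt` record, the form names, the `render` helper, the temporaries `t1` and `t2`, and the C fragment itself are all hypothetical choices made for this example, and only six of the nine statement forms are covered.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Stmt:
    form: str                  # which abstract-syntax form this statement uses
    target: str                # the variable x on the left-hand side
    operands: Tuple[str, ...]  # names on the right-hand side (y, n, ...)
    size: str                  # the size annotation s on the assignment operator


def render(stmt: Stmt) -> str:
    """Print a statement in the abstract syntax, e.g. 'x =ptr &y'."""
    eq, x, ops = f"={stmt.size}", stmt.target, stmt.operands
    if stmt.form == "copy":
        return f"{x} {eq} {ops[0]}"
    if stmt.form == "addr":
        return f"{x} {eq} &{ops[0]}"
    if stmt.form == "load":
        return f"{x} {eq} *{ops[0]}"
    if stmt.form == "alloc":
        return f"{x} {eq} allocate({ops[0]})"
    if stmt.form == "store":
        return f"*{x} {eq} {ops[0]}"
    if stmt.form == "field_addr":
        return f"{x} {eq} &{ops[0]}->{ops[1]}"
    raise ValueError(f"form not covered by this sketch: {stmt.form}")


# Hand-lowered version of the hypothetical C fragment
#   struct pair { int *a; int *b; } *q;  int v, *r;
#   q = malloc(n);  q->a = &v;  r = q->a;
# The temporaries t1 and t2 are introduced by the lowering.
program = [
    Stmt("alloc", "q", ("n",), "ptr"),
    Stmt("field_addr", "t1", ("q", "a"), "ptr"),
    Stmt("addr", "t2", ("v",), "ptr"),
    Stmt("store", "t1", ("t2",), "ptr"),
    Stmt("load", "r", ("t1",), "ptr"),
]

for stmt in program:
    print(render(stmt))
```

Because the analysis is independent of control flow, the order of the entries in `program` would not matter to it; the list is ordered only for readability.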
Types
The pointer analysis uses types to define a store model or storage shape graph representing an approximation of the run-time store usage for a program. A type represents a set of one or more locations and describes the content of those locations. The types define points-to relationships among the sets of locations as determined by the pointer analysis in accordance with typing constraints, and the set of types for a program define the store model for the program. The types used for the pointer analysis are to be distinguished from the types, such as character, integer, long, float, double, etc., associated with type declarations in a program.
To define types for a program in a source language supporting location pointer variables, such as for pointers to locations, dynamic data allocation, and/or data address computation for example, the pointer analysis represents each location representing a variable of the program and each dynamically allocated location of the program with a location type. A location type represents a set of one or more locations and comprises a type representing a set of locations that may be pointed-to by the content of the location(s). The pointer analysis describes each location pointer by a type comprising the type representing the pointed-to location(s).
To define types for a program in a source language supporting structured memory objects, such as for records or structures for example, the pointer analysis represents each location of the program with a location type describing how the represented location(s) are accessed, for example as a unit or as a structured object, and whether the represented location(s) represent an element of a structured object or objects. A location type representing a set of one or more structured objects comprises types describing the location of each element of the represented structured object(s).
To define types for a program in a source language supporting inconsistent access of memory objects, such as for unions or through offset location pointers for example, the pointer analysis represents each location of the program with a location type describing the size of the represented location(s) and whether the represented location(s) are accessed in an inconsistent manner. The pointer analysis describes each location pointer by the type representing the location(s) pointed-to or pointed-into and by an offset describing how the location pointer points to the location(s) relative to the beginning of the location(s).
To define types for a program in a source language supporting function pointer variables, the pointer analysis represents each function of the program with a function type. A function type represents a set of one or more functions and comprises types representing the locations of the formal and return parameter variables for the represented function(s). The pointer analysis represents each location for a function pointer variable with a type comprising a function type representing the function(s) pointed-to by the content of the represented location. The pointer analysis describes each function pointer by the type representing the pointed-to function(s).
For one embodiment, the pointer analysis represents the location representing a variable with a type describing how the variable is accessed, a mapping of element types to describe the location of each element of the variable, the size of the variable, and/or a set of parent types to describe the structured object(s) of which the represented location represents an element. The pointer analysis may also describe the value of the variable with a location pointer type to describe a location pointer and a function type to describe a function pointer. This typing may be described as follows:
$$
\begin{array}{rcll}
\tau & ::= & \bot \mid \mathrm{blank}(s, p) \mid \mathrm{simple}(\alpha, \lambda, s, p) \mid \mathrm{struct}(m, s, p) \mid \mathrm{object}(\alpha, \lambda, s, p) & \text{(Locations)} \\
\alpha & ::= & (\tau \times o) & \text{(Location Pointers)} \\
o & ::= & \mathrm{zero} \mid \mathrm{unknown} & \text{(Offsets)} \\
\lambda & ::= & \bot \mid \mathrm{lam}(\tau_1 \ldots \tau_n)(\tau_{n+1} \ldots \tau_{n+m}) & \text{(Functions)} \\
m & ::= & (\mathrm{element} \rightarrow \tau)\ \mathrm{mapping} & \text{(Elements)} \\
s & ::= & \mathrm{SIZE} \mid \top & \text{(Sizes)} \\
p & ::= & \mathrm{Powerset}(\tau) \mid \top & \text{(Parents)}
\end{array}
$$
That is, the location type τ represents a set of one or more locations and describes how the represented location(s) are accessed.
The blank location type describes locations whose content is not being used and describes the represented location(s) with a size component s and a parent type component p.
The simple location type describes locations whose content values are used only as a unit. This location type describes the represented location(s) with a size component s and a parent type component p and describes the content of the represented location(s) with a location pointer type component α and a function type component λ.
The struct location type describes locations whose content values are used only as structured objects and describes the represented location(s) with an element mapping type component m, a size component s, and a parent type component p.
The object location type describes locations whose content values are used in inconsistent manners. This location type describes the represented location(s) with a size component s and a parent type component p and describes the content of the represented location(s) with a location pointer type component α and a function type component λ.
Location pointer type components α represent location pointers and comprise a location type τ and an offset type o. The location type τ represents the location(s) pointed-to or pointed-into by the described location pointer and may be ⊥ indicating the value described by the location type τ does not comprise a location pointer. The offset type o indicates how the described location pointer points to the location(s) represented by the location type τ paired with the offset type o relative to the beginning of the represented location(s).
For one embodiment, the offset type o may be zero indicating the described location pointer is a direct pointer that points to the beginning of the location(s) represented by the location type τ paired with the offset type o. The offset type o may also be unknown indicating the described location pointer does not necessarily point to the beginning of the location(s) represented by the location type τ paired with the offset type o but rather is an offset pointer that may point into or around the represented location(s). The location type τ paired with an unknown offset type o should be an object location type to describe the inconsistent access of the represented location(s) through the described offset pointer if indeed the described offset pointer is dereferenced. For another embodiment, the offset type o may also be negative or positive indicating an offset direction of the described location pointer relative to the beginning of the location(s) represented by the location type τ paired with the offset type o.
Function type components λ represent functions and function pointers. The function type component λ may comprise types representing the locations of the formal and return parameter variables for the represented functions or may be ⊥ indicating the value described by the function type component λ does not comprise a function pointer.
Element mapping type components m describe mappings of element specifiers to location types representing locations of elements of structured objects. The element specifiers may be numeric or symbolic. Size components s describe sizes of objects and may be numeric or symbolic. The ⊤ designation for a size component s is used for types describing memory objects of different sizes and indicates the location type represents the entire memory object(s) or the rest of the memory object(s). The parent type components p describe a set of struct types of which the location type is a component. The ⊤ designation for a parent type component p indicates the location type is not a component of any struct types.
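The four location types and their components may be pictured as plain records. The Python sketch below is a minimal illustration under stated assumptions: the class and field names, the `TOP` and `BOTTOM` stand-ins for ⊤ and ⊥, and the byte sizes 4, 8, and 16 are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict

TOP = "TOP"      # stand-in for the ⊤ size / parent designation
BOTTOM = None    # stand-in for ⊥ (no location pointer, no function pointer)


@dataclass(eq=False)             # eq=False keeps identity hashing, so types fit in sets
class Blank:                     # content is not used
    size: object = TOP
    parents: object = TOP


@dataclass(eq=False)
class Simple:                    # content is used only as a unit
    pointee: object = BOTTOM     # the τ of α = (τ × o)
    offset: str = "zero"         # the o of α: "zero" or "unknown"
    func: object = BOTTOM        # λ
    size: object = TOP
    parents: object = TOP


@dataclass(eq=False)
class Struct:                    # content is used only as a structured object
    elements: Dict[str, object] = field(default_factory=dict)   # m
    size: object = TOP
    parents: object = TOP


@dataclass(eq=False)
class Object:                    # content is used in inconsistent manners
    pointee: object = BOTTOM
    offset: str = "zero"
    func: object = BOTTOM
    size: object = TOP
    parents: object = TOP


# Types for the hypothetical C declarations
#   int v;  struct pair { int *a; int *b; } s;   with s.a pointing to v
v = Simple(size=4)
pair = Struct(size=16)
a = Simple(pointee=v, size=8, parents={pair})
b = Simple(size=8, parents={pair})
pair.elements.update({"a": a, "b": b})
```

Here the element `a` records both what its content may point to (`v`) and which struct type it is a component of (`pair`), mirroring the location pointer and parent components described above.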
The pointer analysis may also describe locations containing constants with types. The pointer analysis for one embodiment may represent locations containing constants with types similarly as locations representing non-pointer variables.
The pointer analysis for one embodiment represents each type with a type variable in the form of a data structure and an associated type constructor in the form of a data structure. Each type variable represents a set of one or more locations, an offset, or a set of one or more functions. For one embodiment, each type variable is implemented as an equivalence class representative (ECR) data structure. The data structure may be Tarjan's fast-union/find data structure, for example.
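A union/find structure of the kind mentioned above may be sketched as follows. This is a generic textbook formulation with path compression and union by rank, offered only as an illustration; the class name `ECR`, the `ctor` field, and the helper `union` are names assumed for this example and are not taken from the described embodiments.

```python
class ECR:
    """Equivalence class representative: one node of a union/find forest."""

    def __init__(self, name):
        self.name = name
        self.parent = self     # a root is its own parent
        self.rank = 0          # upper bound on the height of the tree below
        self.ctor = None       # associated type constructor, if any

    def find(self):
        """Return the representative of this node's class."""
        root = self
        while root.parent is not root:
            root = root.parent
        node = self
        while node.parent is not root:      # path compression
            node.parent, node = root, node.parent
        return root


def union(a, b):
    """Merge the classes of a and b and return the surviving representative."""
    ra, rb = a.find(), b.find()
    if ra is rb:
        return ra
    if ra.rank < rb.rank:                   # attach the shallower tree below
        ra, rb = rb, ra
    rb.parent = ra
    if ra.rank == rb.rank:
        ra.rank += 1
    return ra


x, y, z = ECR("x"), ECR("y"), ECR("z")
union(x, y)
print(x.find() is y.find())   # x and y now share a representative
print(z.find() is x.find())   # z is still in its own class
```

With this representation, asking whether two type variables denote the same type reduces to comparing the results of `find`.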
The type constructor associated with a type variable representing a set of locations comprises other type variables describing the content of the represented location(s). Using the above types, a simple type constructor and an object type constructor comprise a location type component τ, an offset type component o, and a function type component λ. The location type component τ, offset type component o, and function type component λ are each represented with a type variable and an associated type constructor. The struct type constructor comprises location type components τ for structure elements. Each location type component τ is represented with a type variable and an associated type constructor.
The type constructor associated with a type variable representing an offset indicates how the location pointer described by the offset points to one or more locations relative to the beginning of the location(s). Using the above types, this type constructor may be either zero or unknown.
The type constructor associated with a type variable representing a set of functions comprises other type variables representing the locations of the formal and return parameter variables for the represented function(s). Using the above types, a lam type constructor comprises a location type component τ for each formal and return parameter variable of the represented function(s). Each location type component τ is represented with a type variable and an associated type constructor.
Type Rules
The pointer analysis describes the locations and functions for a program with types so the set of types defining the store model for the program is a valid description of all possible run-time storage configurations for the program. For the program to be typed in this manner, or well-typed, the pointer analysis identifies store usages, including pointer relationships, for the locations and functions for the program and describes the locations and functions for the program with types in accordance with typing constraints based on the store usages.
For a well-typed program, the pointer analysis describes each location with a type describing how the location is accessed or used in the program. The pointer analysis represents pointer locations with a type describing that the represented location(s) are accessed as a unit. The pointer analysis represents locations accessed through address computation of an element of the represented location(s) with a type describing the represented location(s) are accessed as a structured object. If access through a pointer is offset from the beginning of the pointed-to location(s), the pointer analysis represents the pointed-to location(s) with a type describing the inconsistent access.
For assignments of memory objects, the pointer analysis describes the assigned-to location with a type describing how the assigned-from location is accessed as the structure of a location is to reflect the structure of the value stored in the location. The pointer analysis describes the assigned-to location with a type describing an inconsistent access if the assigned-to location is subject to assignment from assigned-from locations described by different access patterns, for example a unit access and a structured object access, or described by an inconsistent access.
Using the above types, the pointer analysis for one embodiment describes the assigned-to location with a type describing the same access pattern as the type representing the assigned-from location or with a type describing an access pattern greater than that for the assigned-from location in accordance with the following hierarchy.
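[Figure US06202202-20010313-C00001: location type hierarchy, with the blank location type lowest, the simple and struct location types each above blank, and the object location type above both simple and struct.]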
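The ordering of access patterns can be written down as a small lookup table. The Python sketch below assumes the ordering implied by the cases of the assignment constraint in this description (blank below simple and struct, and both below object); the table name `AT_OR_ABOVE` and the helpers `at_least` and `join` are hypothetical names for this example.

```python
# For each access pattern, the set of patterns at or above it in the hierarchy.
AT_OR_ABOVE = {
    "blank":  {"blank", "simple", "struct", "object"},
    "simple": {"simple", "object"},
    "struct": {"struct", "object"},
    "object": {"object"},
}


def at_least(to_kind, from_kind):
    """True if an assigned-to pattern may receive values of the assigned-from pattern."""
    return to_kind in AT_OR_ABOVE[from_kind]


def join(a, b):
    """Least access pattern that is at or above both a and b."""
    common = AT_OR_ABOVE[a] & AT_OR_ABOVE[b]
    # The least element of `common` is the one whose own upper set covers all of it.
    return next(k for k in common if AT_OR_ABOVE[k] >= common)


print(join("simple", "struct"))   # a location assigned both ways must be an object
print(join("blank", "struct"))
```

The `join` of simple and struct being object corresponds to describing a location with an inconsistent access when it is assigned from locations with different access patterns.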
The pointer analysis for a well-typed program also represents each location with a type describing the size of the represented location(s). For assignments of memory objects, the pointer analysis describes the assigned-from and assigned-to locations with types such that the size of the representation of the assigned value is less than or equal to that of the types describing the assigned-from and the assigned-to locations.
Furthermore, the pointer analysis for a well-typed program describes each location pointer for the program with a type comprising the type representing the pointed-to location(s). If a location pointer may point to either one of two locations, the pointer analysis represents the two locations with the same type and describes the location pointer with a type comprising the type representing both locations. If two location pointers may point to the same location, the pointer analysis describes each of the two location pointers with a type comprising the type representing the pointed-to location.
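The requirement that all locations a pointer may point to share one type can be illustrated with a deliberately reduced model. The Python sketch below tracks only variable names and points-to edges, ignoring sizes, offsets, structures, and functions; the names `parent`, `pointee`, `find`, `union`, `address_of`, and `target` are assumptions made for this example and do not describe the full analysis.

```python
parent = {}    # union/find forest over variable names
pointee = {}   # class representative of a pointer -> a member of the pointed-to class


def find(v):
    """Return the representative of v's class, registering v on first use."""
    parent.setdefault(v, v)
    while parent[v] != v:
        parent[v] = parent[parent[v]]   # path halving
        v = parent[v]
    return v


def union(a, b):
    """Merge two classes; the classes they point to must then merge as well."""
    ra, rb = find(a), find(b)
    if ra == rb:
        return ra
    parent[rb] = ra
    moved = pointee.pop(rb, None)
    if moved is not None:
        if ra in pointee:
            union(pointee[ra], moved)
        else:
            pointee[ra] = moved
    return find(ra)


def address_of(x, y):
    """Process x = &y: everything x may point to is placed in one class."""
    rx = find(x)
    if rx in pointee:
        union(pointee[rx], y)
    else:
        pointee[rx] = find(y)


def target(x):
    """Representative of the class of locations x may point to."""
    return find(pointee[find(x)])


# Hypothetical C fragment:  p = &a;  p = &b;  q = &c;
address_of("p", "a")
address_of("p", "b")
address_of("q", "c")
print(find("a") == find("b"))   # a and b are described by the same type
print(find("c") == find("a"))   # c is still separate
```

If a further pointer were made to point to both `p` and `q`, the `union` step would merge their classes and, recursively, the classes they point to.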
The pointer analysis for a well-typed program likewise describes each function pointer for the program with a type representing the pointed-to function(s). If a function pointer may point to either one of two functions, the pointer analysis represents the two functions with the same type and describes the function pointer with the type representing both functions. If two function pointers may point to the same function, the pointer analysis describes each of the two function pointers with the type representing the pointed-to function.
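Using the above types, the pointer analysis for one embodiment describes locations for the assignment of memory objects in accordance with a constraint ⊴ described as follows.
$$
\begin{aligned}
\tau_2 \trianglelefteq_s \tau_1 \iff{}
& \bigl(\tau_2 = \mathrm{blank}(s_2, p_2) \land \tau_1 \neq \bot \land s \sqsubseteq s_2 \land s \sqsubseteq \mathrm{sizeof}(\tau_1)\bigr) \\
\lor{} & \bigl(\tau_1 = \mathrm{simple}(\alpha_1, \lambda_1, s_1, p_1) \land \tau_2 = \mathrm{simple}(\alpha_2, \lambda_2, s_2, p_2) \land \alpha_2 \sqsubseteq \alpha_1 \land \lambda_2 \sqsubseteq \lambda_1 \land s \sqsubseteq s_1 \land s \sqsubseteq s_2\bigr) \\
\lor{} & \bigl(\tau_1 = \mathrm{object}(\alpha_1, \lambda_1, s_1, p_1) \land \tau_2 = \mathrm{simple}(\alpha_2, \lambda_2, s_2, p_2) \land \alpha_2 \sqsubseteq \alpha_1 \land \lambda_2 \sqsubseteq \lambda_1 \land s \sqsubseteq s_1 \land s \sqsubseteq s_2\bigr) \\
\lor{} & \bigl(\tau_1 = \mathrm{struct}(m_1, s_1, p_1) \land \tau_2 = \mathrm{struct}(m_2, s_2, p_2) \land \forall n \in \mathrm{Dom}(m_2) : m_2(n) \trianglelefteq_{s_n} m_1(n),\ \text{where } s_n = \mathrm{sizeof}(m_2(n)) \land s \sqsubseteq s_1 \land s \sqsubseteq s_2\bigr) \\
\lor{} & \bigl(\tau_1 = \mathrm{object}(\alpha_1, \lambda_1, s_1, p_1) \land \tau_2 = \mathrm{struct}(m_2, s_2, p_2) \land \forall n \in \mathrm{Dom}(m_2) : m_2(n) \trianglelefteq_{s_n} \tau_1,\ \text{where } s_n = \mathrm{sizeof}(m_2(n)) \land s \sqsubseteq s_1 \land s \sqsubseteq s_2\bigr) \\
\lor{} & \bigl(\tau_1 = \mathrm{object}(\alpha_1, \lambda_1, s_1, p_1) \land \tau_2 = \mathrm{object}(\alpha_2, \lambda_2, s_2, p_2) \land \alpha_2 \sqsubseteq \alpha_1 \land \lambda_2 \sqsubseteq \lambda_1 \land s \sqsubseteq s_1 \land s \sqsubseteq s_2\bigr)
\end{aligned}
$$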
That is, the location type τ2 representing the assigned-from location and the location type τ1 representing the assigned-to location satisfy the constraint τ2 ⊴s τ1 if the location type τ1 describes the same access pattern as the location type τ2 or describes an access pattern greater than that for the location type τ2 in the location type hierarchy described above; if the size s1 of the assigned-to location and the size s2 of the assigned-from location are each greater than or equal to the size s of the representation of the assigned value; if the location pointer type α2 describing the content of the assigned-from location and the location pointer type α1 describing the content of the assigned-to location satisfy the constraint α2 ⊑ α1 where the location types τ1 and τ2 are either simple or object location types; if the function type λ2 describing the content of the assigned-from location and the function type λ1 describing the content of the assigned-to location satisfy the constraint λ2 ⊑ λ1 where the location types τ1 and τ2 are either simple or object location types; if the location types m2(n) and m1(n) representing the locations representing each respective element n of the assigned-from and assigned-to locations satisfy the constraint m2(n) ⊴sn m1(n) where the location types τ1 and τ2 are each struct location types; and if each location type m2(n) representing the location representing each element n of the assigned-from location and the location type τ1 satisfy the constraint m2(n) ⊴sn τ1 where the location types τ1 and τ2 are object and struct location types, respectively.
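The constraint ⊑ for one embodiment is described as follows.
$$
\begin{aligned}
\alpha_2 \sqsubseteq \alpha_1 &\iff (\tau_2 \times o_2) \sqsubseteq (\tau_1 \times o_1) \\
(\tau_2 \times o_2) \sqsubseteq (\tau_1 \times o_1) &\iff (\tau_2 = \bot) \lor \bigl((\tau_2 \sqsubseteq \tau_1) \land (o_2 \sqsubseteq o_1)\bigr) \\
\tau_2 \sqsubseteq \tau_1 &\iff (\tau_2 = \bot) \lor (\tau_1 = \tau_2) \\
o_2 \sqsubseteq o_1 &\iff (o_2 = \mathrm{zero}) \lor (o_1 = o_2) \\
\lambda_2 \sqsubseteq \lambda_1 &\iff (\lambda_2 = \bot) \lor (\lambda_1 = \lambda_2) \\
s_2 \sqsubseteq s_1 &\iff (s_1 = s_2) \lor (s_1 = \top)
\end{aligned}
$$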
That is, the constraint α2 ⊑ α1 or (τ2×o2) ⊑ (τ1×o1) is satisfied if the location type τ2 is ⊥ or if both constraints τ2 ⊑ τ1 and o2 ⊑ o1 are satisfied. The constraint τ2 ⊑ τ1 is satisfied if the location type τ2 is ⊥ or if the location type τ1 is equal to the location type τ2. The constraint o2 ⊑ o1 is satisfied if the offset type o2 is zero or if the offset type o1 is equal to the offset type o2. The constraint λ2 ⊑ λ1 is satisfied if the function type λ2 is ⊥ or if the function type λ1 is equal to the function type λ2. The constraint s2 ⊑ s1 is satisfied if the size s1 is equal to the size s2 or if the size s1 is ⊤ for one embodiment using symbolic sizes with no ordering relation. For another embodiment using numeric sizes, the constraint s2 ⊑ s1 is satisfied if the size s1 is greater than or equal to the size s2 or if the size s1 is ⊤.
The location pointer type α2 or τ2×o2 describing the content of the assigned-from location and the location pointer type α1 or τ1×o1 describing the content of the assigned-to location therefore satisfy the constraint α2 ⊑ α1 if the content of the assigned-from location does not comprise a location pointer. If the content of the assigned-from location does comprise a potential location pointer, the location pointer types τ2×o2 and τ1×o1 satisfy the constraint (1) if the location types τ1 and τ2 are the same and (2) if the offset type o2 indicates the content of the assigned-from location does not comprise an offset pointer or if the offset types o1 and o2 indicate the content of both the assigned-from and assigned-to locations may comprise offset pointers.
The function type λ2 describing the content of the assigned-from location and the function type λ1 describing the content of the assigned-to location therefore satisfy the constraint λ2 ⊑ λ1 if the content of the assigned-from location does not comprise a function pointer. If the content of the assigned-from location does comprise a potential function pointer, the function types λ2 and λ1 satisfy the constraint if the function types λ1 and λ2 are the same.
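The two families of constraints discussed above can be read as executable predicates. The Python sketch below is an illustrative checker only, written under several assumptions: types are compared by object identity to stand for "the same type", sizes follow the numeric embodiment, parent components are omitted, a struct element missing from the assigned-to mapping is treated as a failure, and the names `Loc`, `size_leq`, `ptr_leq`, and `assign_ok` are hypothetical.

```python
TOP, BOTTOM = "TOP", None      # stand-ins for ⊤ and ⊥


class Loc:
    """A location type: kind is 'blank', 'simple', 'struct', or 'object'."""

    def __init__(self, kind, size=TOP, ptr=(BOTTOM, "zero"), lam=BOTTOM, elems=None):
        self.kind, self.size = kind, size
        self.ptr, self.lam = ptr, lam      # α = (τ, o) and λ
        self.elems = elems or {}           # m, used by struct types only


def size_leq(s2, s1):
    """s2 ⊑ s1, numeric embodiment: s1 is ⊤ or s1 >= s2."""
    if s1 == TOP:
        return True
    return s2 != TOP and s1 >= s2


def off_leq(o2, o1):
    return o2 == "zero" or o1 == o2


def loc_leq(t2, t1):
    return t2 is BOTTOM or t1 is t2


def lam_leq(l2, l1):
    return l2 is BOTTOM or l1 is l2


def ptr_leq(a2, a1):
    (t2, o2), (t1, o1) = a2, a1
    return t2 is BOTTOM or (loc_leq(t2, t1) and off_leq(o2, o1))


def assign_ok(t2, t1, s):
    """The assignment constraint: may a value of size s move from τ2 to τ1?"""
    if t1 is BOTTOM or t2 is BOTTOM:
        return False
    if not (size_leq(s, t2.size) and size_leq(s, t1.size)):
        return False
    k2, k1 = t2.kind, t1.kind
    if k2 == "blank":
        return True
    if k2 == "simple" and k1 in ("simple", "object") or k2 == k1 == "object":
        return ptr_leq(t2.ptr, t1.ptr) and lam_leq(t2.lam, t1.lam)
    if k2 == "struct" and k1 == "struct":
        return all(n in t1.elems and assign_ok(e, t1.elems[n], e.size)
                   for n, e in t2.elems.items())
    if k2 == "struct" and k1 == "object":
        return all(assign_ok(e, t1, e.size) for e in t2.elems.values())
    return False


v = Loc("simple", 4)
p = Loc("simple", 8, ptr=(v, "zero"))       # direct pointer to v
r = Loc("simple", 8, ptr=(v, "unknown"))    # possibly offset pointer into v
print(assign_ok(p, r, 8))   # zero ⊑ unknown: a direct pointer may flow to r
print(assign_ok(r, p, 8))   # unknown is not ⊑ zero: the reverse is rejected
```

The asymmetry in the last two lines reflects that a location whose content may be an offset pointer can also hold a direct pointer, but not the other way around.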
The pointer analysis applies the typing constraints for a well-typed program based on store usages for the locations and functions for the program. For one embodiment, the pointer analysis identifies store usages based on the form of each program statement referencing one or more locations or functions. The pointer analysis describes locations and functions affected by the store usages with types in accordance with a type rule specifying the typing constraints for the statement form so the description of the store as defined by the store model is valid both before and after execution of the statement. If each program statement referencing one or more locations or functions is typed in this manner, or well-typed, the program is well-typed. For one embodiment, the type rules for well-typed statements S are as follows.
For a statement in the form x=s y, the value of the variable y will be assigned to the variable x after execution of the statement. The value of the variable x and the value of the variable y may each comprise a pointer pointing to the same location or function if the value of the variable y may comprise a pointer.
Using the above types, the type rule for this statement form is described as:
$$
\frac{A \vdash x : \tau_1 \qquad A \vdash y : \tau_2 \qquad \tau_2 \trianglelefteq_s \tau_1}
     {A \vdash \mathit{welltyped}(x =_s y)}
$$
That is, a statement in the form x=s y is well-typed under the typing environment A if the location type τ2 representing the location representing the variable y and the location type τ1 representing the location representing the variable x satisfy the constraint τ2 ⊴s τ1.
The typing environment A associates all variables for a program with a type and represents the store model for the program. A ⊢ x : τ holds true if and only if the variable x is associated with the type τ in the typing environment A.
For a statement in the form x=s &y, the value of the variable x will be a pointer to the location representing the variable y after execution of the statement.
Using the above types, the type rule for this statement form is described as:
$$
\frac{A \vdash x : \mathrm{sim/obj}(\alpha_1, \lambda_1, s_1, p_1) \qquad A \vdash y : \tau_2 \qquad \alpha_1 = (\tau_1 \times o_1) \qquad s \sqsubseteq s_1 \qquad \tau_2 \sqsubseteq \tau_1}
     {A \vdash \mathit{welltyped}(x =_s \&y)}
$$
That is, a statement in the form x=s &y is well-typed under the typing environment A if the location type representing the location representing the variable x is a simple or object location type, if the location type τ2 representing the location representing the variable y and the location type component τ1 describing the value of the variable x satisfy the constraint τ2 ⊑ τ1, and if the size s1 of the location representing the variable x is greater than or equal to the size s of the representation of the assigned value.
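The premises of the address-of rule can be checked mechanically against a typing environment. The Python sketch below is a reduced illustration: the typing environment is a dictionary from variable names to types, function and parent components are omitted, "the same type" is modeled by object identity, and the names `Loc`, `size_leq`, and `welltyped_addr` as well as the byte sizes are assumptions for this example. Because the type of y is an actual location type and never ⊥, the pointee premise reduces to requiring that the pointee of x be the type of y.

```python
TOP = "TOP"   # stand-in for ⊤


class Loc:
    def __init__(self, kind, size, pointee=None, offset="zero"):
        self.kind, self.size = kind, size
        self.pointee, self.offset = pointee, offset   # α = (τ × o); None stands for ⊥


def size_leq(s2, s1):
    """s2 ⊑ s1 for numeric sizes: s1 is ⊤ or s1 >= s2."""
    if s1 == TOP:
        return True
    return s2 != TOP and s1 >= s2


def welltyped_addr(A, x, y, s):
    """Check the premises for the statement x =s &y under environment A."""
    tx, ty = A[x], A[y]
    return (tx.kind in ("simple", "object")   # x holds a unit or inconsistently used value
            and size_leq(s, tx.size)          # x is large enough for the assigned value
            and tx.pointee is ty)             # the pointee of x is the type of y


# Hypothetical C fragment:  int v, w;  int *p;  p = &v;  p = &w;
tv, tw = Loc("simple", 4), Loc("simple", 4)
A = {"v": tv, "w": tw, "p": Loc("simple", 8, pointee=tv)}
print(welltyped_addr(A, "p", "v", 8))   # holds
print(welltyped_addr(A, "p", "w", 8))   # fails: w has a different type than v

# Describing v and w with one shared type makes both statements well-typed.
tvw = Loc("simple", 4)
A2 = {"v": tvw, "w": tvw, "p": Loc("simple", 8, pointee=tvw)}
print(welltyped_addr(A2, "p", "v", 8) and welltyped_addr(A2, "p", "w", 8))
```

The second environment shows why a pointer that may point to either of two locations leads to both locations being represented by the same type.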
For a statement in the form x=s *y, the content of the location(s) pointed-to by the value of the variable y will be assigned to the variable x after execution of the statement. If the value of the variable y is an offset pointer, the pointed-to location(s) are accessed inconsistently. The value of the variable x and the content of the location(s) pointed-to by the value of the variable y may each comprise a pointer pointing to the same location or function if the content of the location(s) pointed-to by the value of the variable y may comprise a pointer.
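Using the above types, the type rule for this statement form is described as:
$$
\begin{gathered}
\frac{A \vdash x : \tau_1 \qquad A \vdash y : \mathrm{sim/obj}(\alpha_2, \lambda_2, s_2, p_2) \qquad \alpha_2 = (\tau_2 \times \mathrm{zero}) \qquad \mathrm{sizeof}(\mathrm{void}\,{*}) \sqsubseteq s_2 \qquad \tau_2 \trianglelefteq_s \tau_1}
     {A \vdash \mathit{welltyped}(x =_s {*y})} \\[1em]
\frac{\begin{array}{c}
      A \vdash x : \tau_1 \qquad A \vdash y : \mathrm{sim/obj}(\alpha_2, \lambda_2, s_2, p_2) \qquad \alpha_2 = (\tau_2 \times \mathrm{unknown}) \\
      \tau_2 = \mathrm{object}(\alpha_3, \lambda_3, s_3, p_3) \qquad \mathrm{sizeof}(\mathrm{void}\,{*}) \sqsubseteq s_2 \qquad \tau_2 \trianglelefteq_s \tau_1
      \end{array}}
     {A \vdash \mathit{welltyped}(x =_s {*y})}
\end{gathered}
$$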
That is, a statement in the form x=s *y is well-typed under the typing environment A if the location type representing the location representing the variable y is a simple or object location type, if the location type component τ2 describing the value of the variable y and the location type τ1 representing the location representing the variable x satisfy the constraint τ2 ⊴s τ1, and if the size s2 of the location representing the variable y is greater than or equal to the size of a pointer, where the location type component τ2 is an object location type if the offset type component describing the value of the variable y is unknown indicating the value of the variable y comprises an offset pointer.
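The effect of the offset component on a load through a pointer may be seen in a small check. The Python sketch below is a simplification made for illustration: it handles only blank, simple, and object location types, omits struct elements, function components, and parents, fixes the pointer size at an assumed 8 bytes, and uses the hypothetical names `Loc`, `size_leq`, `assign_ok`, and `welltyped_load`.

```python
TOP = "TOP"   # stand-in for ⊤


class Loc:
    def __init__(self, kind, size, ptr=(None, "zero")):
        self.kind, self.size, self.ptr = kind, size, ptr   # ptr is α = (τ, o); None is ⊥


def size_leq(s2, s1):
    if s1 == TOP:
        return True
    return s2 != TOP and s1 >= s2


def assign_ok(t2, t1, s):
    """Simplified assignment constraint for blank/simple/object types only."""
    rank = {"blank": 0, "simple": 1, "object": 2}
    if not (size_leq(s, t2.size) and size_leq(s, t1.size)):
        return False
    if rank[t1.kind] < rank[t2.kind]:
        return False
    if t2.kind == "blank":
        return True
    (p2, o2), (p1, o1) = t2.ptr, t1.ptr
    return p2 is None or (p1 is p2 and (o2 == "zero" or o1 == o2))


def welltyped_load(A, x, y, s, ptr_size=8):
    """Check the premises for the statement x =s *y under environment A."""
    tx, ty = A[x], A[y]
    if ty.kind not in ("simple", "object") or not size_leq(ptr_size, ty.size):
        return False
    t2, o2 = ty.ptr
    if t2 is None:
        return False                       # y is not described as holding a location pointer
    if o2 == "unknown" and t2.kind != "object":
        return False                       # an offset pointer must point into an object type
    return assign_ok(t2, tx, s)


cell = Loc("simple", 4)                    # a location used only as a unit
blob = Loc("object", TOP)                  # a location accessed inconsistently
x_simple, x_object = Loc("simple", 4), Loc("object", 4)

direct = {"x": x_simple, "y": Loc("simple", 8, ptr=(cell, "zero"))}
offset = {"x": x_simple, "y": Loc("simple", 8, ptr=(blob, "unknown"))}
print(welltyped_load(direct, "x", "y", 4))   # load through a direct pointer
print(welltyped_load(offset, "x", "y", 4))   # rejected: x would also need an object type
```

In this model, a load through an offset pointer is accepted only once both the pointed-to location and the receiving variable are described by object location types.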
For a statement in the form x=s allocate(y), the value of the variable x will be a pointer to the allocated memory block after execution of the statement.
Using the above types, the type rule for this statement form is described as:
$$
\frac{A \vdash x : \mathrm{sim/obj}(\alpha_1, \lambda_1, s_1, p_1) \qquad \alpha_1 = (\tau_1 \times o_1) \qquad \mathrm{blank}(\top, \top) \trianglelefteq_{\top} \tau_1 \qquad s \sqsubseteq s_1}
     {A \vdash \mathit{welltyped}(x =_s \mathrm{allocate}(y))}
$$
That is, a statement in the form x=s allocate(y) is well-typed under the typing environment A if the location type representing the location representing the variable x is a simple or object location type, if the location type component τ1 describing the value of the variable x is a non-⊥ location type of unknown size indicating the value of the variable x is a location pointer, and if the size s1 of the location representing the variable x is greater than or equal to the size s of the representation of the assigned value.
For a statement in the form *x=s y, the value of the variable y will be assigned to be the content of the location(s) pointed-to by the value of the variable x after execution of the statement. If the value of the variable x is an offset pointer, the pointed-to location(s) are accessed inconsistently. The content of the location(s) pointed-to by the value of the variable x and the value of the variable y may each comprise a pointer pointing to the same location or function if the value of the variable y may comprise a pointer.
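Using the above types, the type rule for this statement form is described as:
$$
\begin{gathered}
\frac{A \vdash x : \mathrm{sim/obj}(\alpha_1, \lambda_1, s_1, p_1) \qquad A \vdash y : \tau_2 \qquad \alpha_1 = (\tau_1 \times \mathrm{zero}) \qquad \mathrm{sizeof}(\mathrm{void}\,{*}) \sqsubseteq s_1 \qquad \tau_2 \trianglelefteq_s \tau_1}
     {A \vdash \mathit{welltyped}({*x} =_s y)} \\[1em]
\frac{\begin{array}{c}
      A \vdash x : \mathrm{sim/obj}(\alpha_1, \lambda_1, s_1, p_1) \qquad A \vdash y : \tau_2 \qquad \alpha_1 = (\tau_1 \times \mathrm{unknown}) \\
      \tau_1 = \mathrm{object}(\alpha_3, \lambda_3, s_3, p_3) \qquad \mathrm{sizeof}(\mathrm{void}\,{*}) \sqsubseteq s_1 \qquad \tau_2 \trianglelefteq_s \tau_1
      \end{array}}
     {A \vdash \mathit{welltyped}({*x} =_s y)}
\end{gathered}
$$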
That is, a statement in the form *x=s y is well-typed under the typing environment A if the location type representing the location representing the variable x is a simple or object location type, if the location type τ2 representing the location representing the variable y and the location type component τ1 describing the value of the variable x satisfy the constraint τ2 ⊴s τ1, and if the size s1 of the location representing the variable x is greater than or equal to the size of a pointer, where the location type component τ1 is an object location type if the offset type component describing the value of the variable x is unknown indicating the value of the variable x comprises an offset pointer.
For a statement in the form x=s op(y1 . . . yn), for one embodiment, the value of the variable x and the value of any one operand variable yi of the operand variables y1 . . . yn may each comprise a pointer pointing to the same location or function if the value of the variable yi may comprise a pointer.
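Using the above types, the type rule for this statement form is described as:
$$
\frac{A \vdash x : \tau \qquad A \vdash y_i : \tau_i \qquad \forall i \in [1 \ldots n] : \tau_i \trianglelefteq_s \tau \qquad \tau = \mathrm{sim/obj}(\alpha, \lambda, s, p) \qquad \alpha = (\tau' \times \mathrm{unknown})}
     {A \vdash \mathit{welltyped}(x =_s \mathrm{op}(y_1 \ldots y_n))}
$$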
That is, a statement in the form x=s op(y1 . . . yn) is well-typed under the typing environment A if the location type τi representing the location representing each one yi of the operand variables y1 . . . yn and the location type τ representing the location representing the variable x satisfy the constraint τi ⊴s τ and if the location type τ is a simple or object location type comprising an unknown offset type component indicating the value of the variable x comprises an offset pointer if indeed the value of the variable x does comprise a pointer.
The type rule for a statement in the form x=s op(y1 . . . yn), for other embodiments, may depend on the operation identified by op. Where the operation may return a structured value, the variable x may be described with a struct location type. For some operations, such as a < or ≠ comparison operation for example, the result of the operation and therefore the value assigned to the variable x will not comprise a pointer regardless of whether the value of any one of the operand variables y1 . . . yn may comprise a pointer. The value of the variable x may therefore be described with a type different from the type describing the value of any one of the operand variables y1 . . . yn while maintaining the well-typedness of the statement. For embodiments where the source language does not support pointer arithmetic, the value of the variable x may be described with a type different from the type describing the value of any one of the operand variables y1 . . . yn while maintaining the well-typedness of the statement.
For a statement in the form x=s &y−>n, the value of the variable x will be a pointer to the location representing the element n of the location pointed-to by the value of the variable y after execution of the statement.
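Using the above types, the type rule for this statement form is described as:

$$\frac{\begin{array}{c}
A \vdash x : \mathrm{sim/obj}(\alpha_1, \lambda_1, s_1, p_1) \qquad A \vdash y : \mathrm{sim/obj}(\alpha_2, \lambda_2, s_2, p_2) \\
\alpha_1 = (\tau_1 \times o_1) \qquad \alpha_2 = (\tau_2 \times \mathrm{zero}) \\
\tau_2 = \mathrm{struct}(m_3, s_3, p_3) \qquad \mathrm{compatible}(n, m_3) \\
s \le s_1 \qquad \mathrm{sizeof}(\texttt{void*}) \le s_2 \qquad m_3(n) \trianglelefteq \tau_1
\end{array}}{A \vdash \mathit{welltyped}(x =_s \texttt{\&}y\texttt{->}n)}$$

$$\frac{\begin{array}{c}
A \vdash x : \mathrm{sim/obj}(\alpha_1, \lambda_1, s_1, p_1) \qquad A \vdash y : \mathrm{sim/obj}(\alpha_2, \lambda_2, s_2, p_2) \\
\alpha_1 = (\tau_1 \times \mathrm{unknown}) \qquad \alpha_2 = (\tau_2 \times o_2) \\
\tau_2 = \mathrm{object}(\alpha_3, \lambda_3, s_3, p_3) \\
s \le s_1 \qquad \mathrm{sizeof}(\texttt{void*}) \le s_2 \qquad \tau_2 \trianglelefteq \tau_1
\end{array}}{A \vdash \mathit{welltyped}(x =_s \texttt{\&}y\texttt{->}n)}$$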
That is, a statement in the form x=s &y−>n is well-typed under the typing environment A if the location types representing the locations representing the variables x and y are each a simple or object location type, if the size s1 of the location representing the variable x is greater than or equal to the size s of the representation of the assigned value, if the size s2 of the location representing the variable y is greater than or equal to the size of a pointer, and if the location type m3(n) representing the location representing the element n and the location type component τ1 describing the value of the variable x satisfy the constraint $m_3(n) \trianglelefteq \tau_1$ where the location type component τ2 describing the value of the variable y is a struct location type and the offset type component describing the value of the variable y is zero. Here compatible(n,m3) is a predicate stating that the mapping m3 describes a structure having a prefix matching that of the structure being accessed up to and including the element n. Where the location type component τ2 is not a struct location type or the offset type component describing the value of the variable y is not zero, a statement in this form is well-typed under the typing environment A if the location type component τ2 describing the value of the variable y and the location type component τ1 describing the value of the variable x satisfy the constraint $\tau_2 \trianglelefteq \tau_1$, if the location type component τ2 is an object location type, and if the offset type component describing the value of the variable x is unknown.
For a statement in the form x=s fun(f1 . . . fn)→(r1 . . . rm) S*, the value of the variable x will be a pointer to a function with formal and return parameter variables f1 . . . fn and r1 . . . rm after execution of the statement.
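Using the above types, the type rule for this statement form is described as:

$$\frac{\begin{array}{c}
A \vdash x : \mathrm{sim/obj}(\alpha_0, \lambda_0, s_0, p_0) \qquad \lambda_0 = \mathrm{lam}(\tau_1 \ldots \tau_n)(\tau_{n+1} \ldots \tau_{n+m}) \\
A \vdash f_i : \tau'_i \qquad A \vdash r_j : \tau'_{n+j} \\
s_i = \mathrm{sizeof}(f_i) \qquad s_{n+j} = \mathrm{sizeof}(r_j) \qquad s \le s_0 \\
\forall i \in [1 \ldots n] : \tau_i \trianglelefteq_{s_i} \tau'_i \qquad \forall j \in [1 \ldots m] : \tau'_{n+j} \trianglelefteq_{s_{n+j}} \tau_{n+j} \\
\forall S \in S^* : A \vdash \mathit{welltyped}(S)
\end{array}}{A \vdash \mathit{welltyped}(x =_s \mathrm{fun}(f_1 \ldots f_n) \rightarrow (r_1 \ldots r_m)\; S^*)}$$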
That is, a statement in the form x=s fun(f1 . . . fn)→(r1 . . . rm) S* is well-typed under the typing environment A if the location type representing the location representing the variable x is a simple or object location type, if the size s0 of the location representing the variable x is greater than or equal to the size s of the representation of the assigned value, if the location type τ′i representing the location representing each one fi of the variables f1 . . . fn and the corresponding location type τi representing the location for each one of the formal parameter variables for all the functions potentially pointed-to by the value of the variable x satisfy the constraint $\tau_i \trianglelefteq_{s_i} \tau'_i$ where si=sizeof(fi), if the location type τ′n+j representing the location representing each one rj of the variables r1 . . . rm and the corresponding location type τn+j representing the location for each one of the return parameter variables for all the functions potentially pointed-to by the value of the variable x satisfy the constraint $\tau'_{n+j} \trianglelefteq_{s_{n+j}} \tau_{n+j}$ where sn+j=sizeof(rj), and if each statement S of the sequence of statements S* is well-typed under the typing environment A.
For a statement in the form x1 . . . xm=sn+1 . . . sn+m p(y1 . . . yn), the values of the variables y1 . . . yn are assigned to the formal parameter variables for the function or procedure pointed-to by the value of the variable p before execution of the function, and the values of the return parameter variables for the called function are assigned to the variables x1 . . . xm after execution of the called function. The contents of the assigned-from and assigned-to variables may each comprise a pointer pointing to the same location or function if the assigned-from variable may comprise a pointer.
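Using the above types, the type rule for this statement form is described as:

$$\frac{\begin{array}{c}
A \vdash p : \mathrm{sim/obj}(\alpha_0, \lambda_0, s_0, p_0) \qquad \mathrm{sizeof}(\texttt{void(*)()}) \le s_0 \\
\lambda_0 = \mathrm{lam}(\tau_1 \ldots \tau_n)(\tau_{n+1} \ldots \tau_{n+m}) \\
A \vdash x_j : \tau'_{n+j} \qquad A \vdash y_i : \tau'_i \qquad s_i = \mathrm{sizeof}(y_i) \\
\forall i \in [1 \ldots n] : \tau'_i \trianglelefteq_{s_i} \tau_i \qquad \forall j \in [1 \ldots m] : \tau_{n+j} \trianglelefteq_{s_{n+j}} \tau'_{n+j}
\end{array}}{A \vdash \mathit{welltyped}(x_1 \ldots x_m =_{s_{n+1} \ldots s_{n+m}} p(y_1 \ldots y_n))}$$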
That is, a statement in the form x1 . . . xm=sn+1 . . . sn+m p(y1 . . . yn) is well-typed under the typing environment A if the location type representing the location representing the variable p is a simple or object location type, if the size s0 of the location representing the variable p is greater than or equal to the size of a pointer to a function, if the location type τ′i representing the location representing each one yi of the variables y1 . . . yn and the corresponding location type τi representing the location for each one of the formal parameter variables for the potentially called functions pointed-to by the value of the variable p satisfy the constraint $\tau'_i \trianglelefteq_{s_i} \tau_i$ where si=sizeof(yi), and if the location type τ′n+j representing the location representing each one xj of the variables x1 . . . xm and the corresponding location type τn+j representing the location for each one of the return parameter variables for the potentially called functions pointed-to by the value of the variable p satisfy the constraint $\tau_{n+j} \trianglelefteq_{s_{n+j}} \tau'_{n+j}$.
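For another embodiment using the above types, the pointer analysis describes locations for the assignment of memory objects in accordance with an alternative constraint, used in lieu of the constraint described above, as follows.

$$\begin{array}{rcl}
\tau_2 \trianglelefteq_s \tau_1 & \Leftrightarrow & (\tau_1 = \mathrm{blank}(s_1, p_1) \land \tau_2 = \mathrm{blank}(s_2, p_2) \land s \le s_1 \land s \le s_2) \\
& \lor & (\tau_1 = \mathrm{simple}(\alpha_1, \lambda_1, s_1, p_1) \land \tau_2 = \mathrm{simple}(\alpha_2, \lambda_2, s_2, p_2) \land \alpha_1 = \alpha_2 \land \lambda_1 = \lambda_2 \land s \le s_1 \land s \le s_2) \\
& \lor & (\tau_1 = \mathrm{struct}(m_1, s_1, p_1) \land \tau_2 = \mathrm{struct}(m_2, s_2, p_2) \land \mathrm{Dom}(m_1) = \mathrm{Dom}(m_2) \\
& & \quad \land\; (\forall n \in \mathrm{Dom}(m_2) : m_2(n) \trianglelefteq_{s_n} m_1(n), \text{ where } s_n = \mathrm{sizeof}(m_2(n))) \land s \le s_1 \land s \le s_2) \\
& \lor & (\tau_1 = \mathrm{object}(\alpha_1, \lambda_1, s_1, p_1) \land \tau_2 = \mathrm{object}(\alpha_2, \lambda_2, s_2, p_2) \land \alpha_1 = \alpha_2 \land \lambda_1 = \lambda_2 \land s \le s_1 \land s \le s_2)
\end{array}$$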
That is, the location type τ2 representing the assigned-from location and the location type τ1 representing the assigned-to location satisfy the constraint if the location type τ1 describes the same access pattern as the location type τ2, if the size s1 of the assigned-to location and the size s2 of the assigned-from location are each greater than or equal to the size s of the representation of the assigned value, if the location pointer type α2 describing the content of the assigned-from location is the same as the location pointer type α1 describing the content of the assigned-to location where the location types τ1 and τ2 are each simple location types or are each object location types, if the function type λ2 describing the content of the assigned-from location is the same as the function type λ1 describing the content of the assigned-to location where the location types τ1 and τ2 are each simple location types or are each object location types, and if the location types m2(n) and m1(n) representing the locations representing each respective element n of the assigned-from and assigned-to locations satisfy the constraint where the location types τ1 and τ2 are each struct location types and where the assigned-from and assigned-to locations have the same domain of elements.
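The alternative constraint can be sketched in C as a recursive predicate. This is only an illustrative sketch under stated assumptions: the `LocType` record, the integer ids standing in for the α and λ components, the fixed-size element arrays, and the name `satisfies_alt` are hypothetical and do not come from the specification. The p components, which the constraint does not inspect, are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_ELEMS = 4 };

typedef enum { BLANK, SIMPLE, STRUCT, OBJECT } Kind;

typedef struct LocType LocType;
struct LocType {
    Kind kind;
    size_t size;                      /* size component s1 or s2 */
    int alpha, lambda;                /* hypothetical ids for pointer and function content */
    int nelems;                       /* struct only: number of elements in the mapping m */
    const char *names[MAX_ELEMS];     /* struct only: Dom(m), in declaration order */
    const LocType *elems[MAX_ELEMS];  /* struct only: m(n) for each element n */
};

/* Returns 1 if t2 (assigned-from) and t1 (assigned-to) satisfy the
 * alternative constraint for an assigned value of size s, else 0. */
static int satisfies_alt(const LocType *t2, size_t s, const LocType *t1)
{
    if (t1->kind != t2->kind) return 0;          /* same access pattern required */
    if (s > t1->size || s > t2->size) return 0;  /* s <= s1 and s <= s2 */
    switch (t1->kind) {
    case BLANK:
        return 1;
    case SIMPLE:
    case OBJECT:
        return t1->alpha == t2->alpha && t1->lambda == t2->lambda;
    case STRUCT:
        assert(t1->nelems <= MAX_ELEMS && t2->nelems <= MAX_ELEMS);
        if (t1->nelems != t2->nelems) return 0;
        for (int i = 0; i < t2->nelems; i++) {
            /* Assumption: equal domains list their elements in the same order. */
            if (strcmp(t1->names[i], t2->names[i]) != 0) return 0;
            /* s_n = sizeof(m2(n)) is taken from the assigned-from element. */
            if (!satisfies_alt(t2->elems[i], t2->elems[i]->size, t1->elems[i]))
                return 0;
        }
        return 1;
    }
    return 0;
}
```

Unlike a rule that repairs a mismatch by promoting or unifying types, this predicate only answers whether two types already agree, which matches the equality-based reading of the constraint given above.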
Analysis
FIG. 2 illustrates for one embodiment a flow diagram 200 for performing a pointer analysis by type inference for a program with structured memory objects and potentially inconsistent memory object accesses.
For step 202 of FIG. 2, the analysis represents each location for the program with a separate type as having no access pattern. Each type describes the size of the represented location and describes the content of the represented location as a non-pointer value. The analysis for one embodiment initially describes the value of each variable for the program with a blank (s,Ø) location type, where the size component s is the size of the representation of the variable. By representing each variable of the program with a type, the analysis defines a typing environment for the program. The analysis for one embodiment identifies each location for the program using a symbol table having recorded identifiers for each variable for the program.
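Step 202 can be pictured with a small C sketch that walks a symbol table and gives every variable its own blank location type. The `Symbol` and `LocType` records and the function name `init_typing_env` are assumptions made for illustration; the specification does not prescribe these data structures.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { KIND_BOTTOM, KIND_BLANK, KIND_SIMPLE, KIND_STRUCT, KIND_OBJECT } Kind;

/* Hypothetical symbol-table entry: a variable name and the size of its representation. */
typedef struct { const char *name; size_t size; } Symbol;

typedef struct {
    Kind kind;
    size_t size;  /* size component s */
    int np;       /* number of entries in the p component; 0 stands for the empty set */
} LocType;

/* Step 202: describe every variable with a separate blank(s, empty set) type. */
static void init_typing_env(const Symbol *symbols, LocType *types, size_t n)
{
    assert(symbols != NULL && types != NULL);
    for (size_t i = 0; i < n; i++) {
        types[i].kind = KIND_BLANK;       /* no access pattern observed yet */
        types[i].size = symbols[i].size;  /* size of the variable's representation */
        types[i].np = 0;
    }
}
```

The array `types`, indexed in parallel with the symbol table, plays the role of the typing environment A in this sketch.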
For step 204, the analysis identifies a store usage that is described by the program and has not been processed by the analysis. The analysis for one embodiment identifies the form of a statement of the program to identify the store usage or usages described by the statement.
If the analysis identifies a store usage that has not been processed, the analysis for step 206 determines whether the location(s) and/or function(s) affected by the identified store usage are well-typed for the current typing environment under a typing constraint. For one embodiment, the analysis determines whether the program statement describing the identified store usage(s) is well-typed for the current typing environment in accordance with a type rule specifying the typing constraint for the form of the statement.
If the analysis determines an identified store usage is well-typed, the analysis for step 208 identifies any potential constraints for types for the identified store usage and proceeds to process another store usage for step 204. The analysis identifies any potential constraints in case the analysis determines in processing other store usages that the modification of a type in a subsequent typing environment may impose a constraint on another type.
For one embodiment, the analysis identifies access pattern relationships for types in case a modification of the description of the access pattern for one type may warrant the modification of the access pattern description for another type. The analysis also identifies pointer offset relationships for types in case a modification of a type to describe an offset pointer warrants the modification of another type to describe an offset pointer and/or the modification of another type to describe the pointed-to location as being accessed in an inconsistent manner. For embodiments combining the points-to analysis with a non-pointer analysis as described in U.S. patent application Ser. No. 08/664,441, for example, the analysis further identifies potential points-to relationships for types representing only non-pointer values in case a modification of a type to describe a pointer value warrants the modification of another type to describe the pointed-to location(s).
The analysis for one embodiment identifies each potential constraint in a separate pending set associated with the type the modification of which may impose the potential constraint. Each pending set for one embodiment is a pending bag implemented as a binary tree data structure. For another embodiment, all potential constraints identified for the program may be identified in a single pending set.
If the analysis for step 206 determines the identified store usage is not well-typed for the current typing environment under the typing constraint, the analysis for step 210 modifies types for location(s) and/or function(s) affected by the identified store usage as necessary so the store usage is well-typed under the typing constraint. For one embodiment, the analysis modifies types in accordance with a type rule specifying the typing constraint for the form of the program statement describing the identified store usage.
The analysis for one embodiment modifies types by modifying access pattern descriptions for types. The analysis for one embodiment modifies access pattern descriptions for types by promoting a location type to a greater location type in the location type hierarchy described above. A ⊥ location type may be promoted to a blank, simple, struct, or object location type. A blank location type may be promoted to a simple, struct, or object location type. A simple or struct location type may be promoted to an object location type. The analysis for one embodiment also modifies types by expanding the sizes of types to accommodate the sizes of the representations of assigned values. The analysis for one embodiment further modifies types by unifying types or setting types.
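The promotions listed above can be written down as a small C predicate. This is a minimal sketch: the enumerator names and the helper `join_kinds` are assumptions of the example, and the text names the permitted promotions but not a join operation as such.

```c
#include <assert.h>

typedef enum { KIND_BOTTOM, KIND_BLANK, KIND_SIMPLE, KIND_STRUCT, KIND_OBJECT } Kind;

/* Returns 1 if a location type of kind `from` may be promoted to kind `to`. */
static int can_promote(Kind from, Kind to)
{
    switch (from) {
    case KIND_BOTTOM: return to != KIND_BOTTOM;
    case KIND_BLANK:  return to == KIND_SIMPLE || to == KIND_STRUCT || to == KIND_OBJECT;
    case KIND_SIMPLE:
    case KIND_STRUCT: return to == KIND_OBJECT;
    case KIND_OBJECT: return 0;
    }
    return 0;
}

/* Least kind that both a and b either equal or can be promoted to. */
static Kind join_kinds(Kind a, Kind b)
{
    Kind k = KIND_OBJECT;  /* simple versus struct: only object lies above both */
    if (a == b || can_promote(b, a)) k = a;
    else if (can_promote(a, b)) k = b;
    /* Sanity check: the result is an upper bound of both inputs. */
    assert((k == a || can_promote(a, k)) && (k == b || can_promote(b, k)));
    return k;
}
```

Under this reading, a location accessed once as a simple value and once as a structure ends up described as an object, which corresponds to an inconsistently accessed location.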
The analysis for step 212 identifies any potential constraints for types for the identified store usage similarly as for step 208 in case the analysis determines in processing other store usages that the modification of a type in a subsequent typing environment may impose a constraint on another type.
For step 214, the analysis determines whether any previously processed store usages were affected by the modification of types for step 210. The analysis determines whether any potential constraints identified for steps 208, 212, and 218 were affected by the modification of types for step 210.
For one embodiment, the modification of the description of the access pattern for one type may warrant the modification of the access pattern description for another type for an identified access pattern relationship. The modification of a type to describe an offset pointer may warrant the modification of another type to describe an offset pointer and/or the modification of another type to describe the pointed-to location as being accessed in an inconsistent manner for identified pointer offset relationships. For embodiments using a non-pointer analysis, the modification of a type to describe a pointer value may warrant the modification of another type to describe the pointed-to location for an identified potential points-to relationship.
For one embodiment, the affected potential constraints are determined by the pending set associated with the type modified for step 210. If any previously processed store usages were affected as determined for step 214, the analysis for step 216 modifies types in accordance with the affected potential constraints and for step 218 identifies any potential constraints for the modified types similarly as for steps 208 and 212. The analysis repeats steps 214 through 218 as potential constraints are imposed by the modification of types for step 216.
If the analysis determines for step 214 that no previously processed store usages were affected by steps 210 or 216, the analysis proceeds to process another store usage for steps 204 through 218 until the analysis determines for step 204 that all store usages for the program have been processed. The analysis for one embodiment determines whether the last statement referencing a location and/or function for the program has been identified. When the analysis has processed all store usages for the program, the program is well-typed for step 220. The resulting set of types defines the store model for the program and is a valid description of all possible run-time storage configurations for the program.
For one embodiment, the analysis for FIG. 2 may process each store usage described by the program in any order independent of program control flow as defined by the control flow structures for the program. The analysis processes each store usage for the program only one time.
The analysis for steps 204 through 218 is implemented with the above type rules for one embodiment in accordance with the representative pseudo-code of Appendix A for each identified program statement form. The functions for the pseudo-code of Appendix A are implemented for one embodiment in accordance with the representative pseudo-code of Appendix B.
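Independently of the Appendix A pseudo-code, the control structure of steps 204 through 218 can be sketched in C with a deliberately simplified constraint. Everything here is an assumption made for illustration: each type is reduced to an integer `level`, a store usage is a pair (src, dst) that is "well-typed" when the level of dst is at least the level of src, and pending sets are plain arrays rather than binary-tree bags. The sketch shows only how pending sets let each usage be visited once in the main loop while later type modifications are still propagated.

```c
#include <assert.h>

enum { MAX_TYPES = 8, MAX_USAGES = 8 };

typedef struct { int src, dst; } Usage;  /* toy store usage: dst = src */

typedef struct {
    int level[MAX_TYPES];                /* toy stand-in for a type's access pattern */
    int pending[MAX_TYPES][MAX_USAGES];  /* per-type pending set of usage indices */
    int npending[MAX_TYPES];
} Env;

static void env_init(Env *env)
{
    for (int t = 0; t < MAX_TYPES; t++) { env->level[t] = 0; env->npending[t] = 0; }
}

/* Steps 206 and 210: check one usage and repair it if needed.
 * Returns 1 if the type of dst was modified. */
static int settle(Env *env, const Usage *u)
{
    if (env->level[u->dst] >= env->level[u->src]) return 0;
    env->level[u->dst] = env->level[u->src];
    return 1;
}

static void analyze(Env *env, const Usage *usages, int n)
{
    int stack[MAX_TYPES], top = 0, queued[MAX_TYPES] = {0};
    assert(n >= 0 && n <= MAX_USAGES);
    for (int k = 0; k < n; k++) {                 /* step 204: next unprocessed usage */
        const Usage *u = &usages[k];
        assert(u->src >= 0 && u->src < MAX_TYPES && u->dst >= 0 && u->dst < MAX_TYPES);
        int changed = settle(env, u);             /* steps 206 and 210 */
        /* Steps 208 and 212: a later change to src may affect this usage again. */
        env->pending[u->src][env->npending[u->src]++] = k;
        if (changed) { stack[top++] = u->dst; queued[u->dst] = 1; }
        while (top > 0) {                         /* steps 214 through 218 */
            int t = stack[--top];
            queued[t] = 0;
            for (int i = 0; i < env->npending[t]; i++) {
                const Usage *p = &usages[env->pending[t][i]];
                if (settle(env, p) && !queued[p->dst]) {
                    stack[top++] = p->dst;
                    queued[p->dst] = 1;
                }
            }
        }
    }
}
```

For example, with type 0 starting at level 3 and the usages (1, 2) and then (0, 1), the first usage is already satisfied when visited, and the second raises type 1, which in turn re-examines the first usage through the pending set of type 1 and raises type 2.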
Types for Sample Programs
FIG. 3 illustrates for one embodiment graphical representations for types describing locations for the following sample pseudo-code program fragment.
struct treenode (right, left, key, value)
lookup =<void*> fun(tree, key)→(result)
    while (tree)
        keyaddr =<void*> &tree->key
        treekey =<void*> *keyaddr
        rel =<char> strcmp(key, treekey)
        if (rel > 0) then
            rightaddr =<void*> &tree->right
            tree =<struct treenode> *rightaddr
        elseif (rel < 0) then
            leftaddr =<void*> &tree->left
            tree =<struct treenode> *leftaddr
        else
            valueaddr =<void*> &tree->value
            result =<void*> *valueaddr
            return
        fi
    endwhile
This program fragment may be implemented in the C programming language as follows.
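#include <stddef.h>  /* NULL */
#include <string.h>  /* strcmp */

typedef struct treenode {
    struct treenode *right, *left;
    char *key, *value;
} treenode;

/* Returns the value stored under key, or NULL if key is not in the tree. */
char *lookup (treenode *tree, char *key)
{
    int rel;
    while (tree) {
        rel = strcmp(key, tree->key);
        if (rel > 0)
            tree = tree->right;
        else if (rel < 0)
            tree = tree->left;
        else
            return tree->value;
    }
    return NULL;  /* key not found */
}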
This sample program fragment implements a lookup function for operation with a binary tree data structure. Each node of the binary tree has a key and a value data field. Each node also has a left subtree containing all keys of lexicographic lower ordering than the node's key and a right subtree containing all keys of lexicographic higher ordering than the node's key.
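To make the correspondence between the two listings easier to follow, the sketch below renders the pseudo-code statement by statement in C, keeping the explicit address-of and dereference steps that the compact C version folds into expressions such as tree->key. The function name `lookup_lowered` and the explicit NULL result for a missing key are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct treenode {
    struct treenode *right, *left;
    char *key, *value;
} treenode;

/* Statement-by-statement C rendering of the pseudo-code lookup function. */
static char *lookup_lowered(treenode *tree, char *key)
{
    assert(key != NULL);
    while (tree) {
        char **keyaddr = &tree->key;              /* keyaddr = &tree->key */
        char *treekey = *keyaddr;                 /* treekey = *keyaddr */
        int rel = strcmp(key, treekey);           /* rel = strcmp(key, treekey) */
        if (rel > 0) {
            treenode **rightaddr = &tree->right;  /* rightaddr = &tree->right */
            tree = *rightaddr;                    /* tree = *rightaddr */
        } else if (rel < 0) {
            treenode **leftaddr = &tree->left;    /* leftaddr = &tree->left */
            tree = *leftaddr;                     /* tree = *leftaddr */
        } else {
            char **valueaddr = &tree->value;      /* valueaddr = &tree->value */
            return *valueaddr;                    /* result = *valueaddr; return */
        }
    }
    return NULL;  /* assumption: report a missing key as NULL */
}
```

Each temporary in this form (keyaddr, treekey, rightaddr, leftaddr, valueaddr) is a variable of the pseudo-code, which is why the analysis assigns each of them a location type of its own.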
Using the above types, the pointer analysis represents the locations representing *tree, tree->right, tree->left, tree->key, tree->value, tree, rightaddr, leftaddr, keyaddr, valueaddr, treekey, result, and key with types 310, 311, 312, 313, 314, 320, 321, 322, 323, 324, 333, 334, and 343, respectively, as illustrated in FIG. 3. Type 353 represents the location(s) pointed-to by the content of the location represented by types 313, 333, and 343. Types 310-314, 320-324, 333-334, 343, and 353 describe these locations as follows.
*tree: τ0 = struct([right→τ1,left→τ2,key→τ3,value→τ4],<struct treenode>,Ø)
tree->right: τ1 = simple((τ0 × zero),⊥,<void*>,{τ0})
tree->left: τ2 = simple((τ0 × zero),⊥,<void*>,{τ0})
tree->key: τ3 = simple((τ12 × zero),⊥,<void*>,{τ0})
tree->value: τ4 = simple((⊥ × zero),⊥,<void*>,{τ0})
tree: τ5 = simple((τ0 × zero),⊥,<void*>,Ø)
rightaddr: τ6 = simple((τ1 × zero),⊥,<void*>,Ø)
leftaddr: τ7 = simple((τ2 × zero),⊥,<void*>,Ø)
keyaddr: τ8 = simple((τ3 × zero),⊥,<void*>,Ø)
valueaddr: τ9 = simple((τ4 × zero),⊥,<void*>,Ø)
treekey: τ10 = simple((τ12 × zero),⊥,<void*>,Ø)
key: τ11 = simple((τ12 × zero),⊥,<void*>,Ø)
τ12 = object((⊥ × zero),⊥,⊤,Ø)
result: τ13 = simple((⊥ × zero),⊥,<void*>,Ø)
FIGS. 4A and 4B illustrate for one embodiment graphical representations for types describing locations at different stages of the analysis for the following sample C program fragment.
struct pair {int a; int b;} *s;
long bigint;
*(&s->a) = 4;
*(&s->b) = 5;
*((long*)&s->a) = bigint;
This sample program fragment implements an assignment of the value of the variable bigint to the location representing s−>a. As the value being assigned is larger in size than the location representing s−>a, the assignment leads to the capture of the neighboring location representing s−>b by the type representing the location representing s−>a.
After analysis of only the *(&s−>a)=4 and *(&s−>b)=5 statements, the pointer analysis represents, using the above types, the locations representing *s, s−>a, s−>b, s, and bigint with types 410, 411, 412, 420, and 430, respectively, as illustrated in FIG. 4A. Types 410-412, 420, and 430 describe these locations as follows.
*s: τ0=struct([a→τ1,b→τ2],<struct pair>,Ø)
s−>a: τ1=simple((⊥×zero),⊥,<int>,{τ0})
s−>b: τ2=simple((⊥×zero),⊥,<int>,{τ0})
s: τ3=simple((τ0×zero),⊥,<void*>,Ø)
bigint: τ4=blank(<long>,Ø)
After analysis of the *((long*)&s−>a)=bigint statement in addition to the previously analyzed statements, the pointer analysis represents the locations representing *s, s−>a, s−>b, s, and bigint with types 410, 413, 413, 420, and 430, respectively, as illustrated in FIG. 4B. Types 410, 413, 420, and 430 describe these locations as follows.
*s: τ0=struct([a→τ5,b→τ5],⊤,Ø)
s−>a: τ5=object((⊥×zero),⊥,⊤,{τ0})
s−>b: τ5
s: τ3=simple((τ0×zero),⊥,<void*>,Ø)
bigint: τ4=simple((⊥×zero),⊥,<long>,Ø)
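Whether the store through the cast pointer actually reaches s−>b depends on the sizes involved, which the following C sketch makes explicit with byte ranges. The helper names `write_touches` and `long_store_captures_b` are hypothetical, and the sketch only measures the layout on the compiling platform; it does not perform the store itself.

```c
#include <assert.h>
#include <stddef.h>

struct pair { int a; int b; };

/* Returns 1 if a write of `size` bytes starting at byte offset `start`
 * touches any byte of the element occupying [off, off + len). */
static int write_touches(size_t start, size_t size, size_t off, size_t len)
{
    assert(size > 0 && len > 0);
    return start < off + len && off < start + size;
}

/* Does *((long*)&s->a) = bigint reach into s->b on this platform? */
static int long_store_captures_b(void)
{
    return write_touches(offsetof(struct pair, a), sizeof(long),
                         offsetof(struct pair, b), sizeof(int));
}
```

On platforms where long is wider than int the function returns 1, matching the capture described above; where long and int have the same size it returns 0, and the store stays within s−>a.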
FIGS. 5A and 5B illustrate for one embodiment graphical representations for types describing locations at different stages of the analysis for the following sample C program fragment.
struct s {int a; int *b;};
union {int c; struct s d;} u;
int i;
u.d.a = 5;
u.c = i;
This sample program fragment implements an assignment from the location representing the variable i to the location representing the element u.c. The assigned-from and assigned-to locations have different access patterns, and therefore the assigned-to location must be described by a type describing the inconsistent access.
After analysis of only the u.d.a=5 statement, the pointer analysis represents, using the above types, the locations representing u, u.d.a, and i with types 510, 511, and 520, respectively, as illustrated in FIG. 5A. The element u.d.b is not represented with a type because the element u.d.b has not been accessed at this point. Types 510-511 and 520 describe the represented locations as follows.
u: τ0=struct([a→τ1],<struct s>,Ø)
u.d.a: τ1=simple((⊥×zero),⊥,<int>,{τ0})
i: τ3=blank(<int>,Ø)
After analysis of both the u.d.a=5 and the u.c=i statements, the pointer analysis represents the locations representing u and i with types 510 and 520, respectively, as illustrated in FIG. 5B. Types 510 and 520 describe these locations as follows.
u: τ0=object((⊥×zero),⊥,⊤,Ø)
i: τ3=simple((⊥×zero),⊥,<int>,Ø)
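The reason the two statements access the same location with different access patterns is that every member of a union starts at the beginning of the union. A minimal C sketch of that layout fact follows; the tag `u_t` and the function name `c_aliases_d_a` are hypothetical, since the union in the fragment is anonymous.

```c
#include <assert.h>
#include <stddef.h>

struct s { int a; int *b; };
union u_t { int c; struct s d; };  /* u_t is a hypothetical tag for the fragment's union */

/* Returns 1 if u.c and u.d.a start at the same byte of u. */
static int c_aliases_d_a(void)
{
    size_t c_off = offsetof(union u_t, c);
    size_t a_off = offsetof(union u_t, d) + offsetof(struct s, a);
    assert(c_off < sizeof(union u_t) && a_off < sizeof(union u_t));
    return c_off == a_off;
}
```

The statement u.d.a=5 treats the start of u as the first element of a structure, while u.c=i treats the same bytes as a plain int, which is the inconsistency the object location type records.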
FIGS. 6A and 6B illustrate for one embodiment graphical representations for types describing locations at different stages of the analysis for the following sample C program fragment.
struct pair {int a; int b;} *s;
int *c;
c=(&s−>b)−1;
*c=5;
This sample program fragment implements the inconsistent access of the structure variable *s due to pointer arithmetic on the address of s−>b, an assignment of the created offset pointer to the location representing the variable c, and an assignment of a value to the location pointed-to by the offset pointer c. The location representing the variable c must therefore be described by a type describing the offset pointer, and the location pointed-to by the offset pointer c, i.e., the location representing the structure variable *s, must therefore be described by a type describing the inconsistent access.
After analysis of only the c=(&s−>b)−1 statement, the pointer analysis represents, using the above types, the locations representing *s, s−>a, s−>b, s, and c with types 610, 611, 612, 620, and 630, respectively, as illustrated in FIG. 6A. Types 610-612, 620, and 630 describe these locations as follows.
*s: τ0=struct([a→τ1,b→τ2],<struct pair>,Ø)
s−>a: τ1=blank(<int>,{τ0})
s−>b: τ2=blank(<int>,{τ0})
s: τ3=simple((τ0×zero),⊥,<void*>,Ø)
c: τ4=simple((τ2×unknown),⊥,<void*>,Ø)
After analysis of both the c=(&s−>b)−1 statement and the *c=5 statement, the pointer analysis represents the locations representing *s, s, and c with types 610, 620, and 630, respectively, as illustrated in FIG. 6B. Types 610, 620, and 630 describe these locations as follows.
*s: τ0=object((⊥×zero),⊥,⊤,Ø)
s: τ3=simple((τ0×zero),⊥,<void*>,Ø)
c: τ4=simple((τ0×unknown),⊥,<void*>,Ø)
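The analysis itself does not compute where the offset pointer lands; it records the offset as unknown. The C sketch below, which is only an illustration with hypothetical helper names, shows why that is a reasonable abstraction: the byte addressed by (&s−>b)−1 lies inside *s, but which element begins there is a property of the structure layout rather than of the program text. The sketch uses offsetof arithmetic and deliberately avoids performing the pointer subtraction itself.

```c
#include <assert.h>
#include <stddef.h>

struct pair { int a; int b; };

/* Byte offset, relative to the start of *s, that (&s->b) - 1 would address. */
static size_t offset_addressed_by_c(void)
{
    assert(offsetof(struct pair, b) >= sizeof(int));
    return offsetof(struct pair, b) - sizeof(int);
}

/* Name of the element starting at byte offset `off`, or NULL if none starts there. */
static const char *element_at(size_t off)
{
    if (off == offsetof(struct pair, a)) return "a";
    if (off == offsetof(struct pair, b)) return "b";
    return NULL;
}
```

On a layout with no padding between a and b, `element_at(offset_addressed_by_c())` names element a; a layout-independent analysis cannot rely on that and therefore describes *s with an object location type.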
Data Processing System
For one embodiment, pointer analysis by type inference for a program with structured memory objects and potentially inconsistent memory object accesses may be implemented in software for execution by any suitable data processing system configured with any suitable combination of hardware devices. FIG. 7 illustrates for one embodiment a data processing system 700 that may be programmed to perform pointer analysis by type inference for a program with structured memory objects and potentially inconsistent memory object accesses.
As illustrated in FIG. 7, data processing system 700 comprises a processor 702, a system bus 704, a static memory 706, a main memory 708, a mass memory 710, an alphanumeric input device 712, a cursor control device 714, and a display 716. System bus 704 couples processor 702, static memory 706, main memory 708, mass memory 710, alphanumeric input device 712, cursor control device 714, and display 716.
Processor 702 comprises a suitable processing device such as a microprocessor, for example, and may comprise a plurality of suitable processing devices. Processor 702 may execute instructions stored in static memory 706, main memory 708, and/or mass memory 710 for example. Processor 702 may process data stored in static memory 706, main memory 708, and/or mass memory 710 for example.
Static memory 706 may comprise read only memory (ROM) or any other suitable memory device. Static memory 706 may store, for example, a boot program for execution by processor 702 to initialize data processing system 700. Main memory 708 may comprise random access memory (RAM) or any other suitable memory device. Mass memory 710 may comprise a hard disk device, a floppy disk, an optical disk, a flash memory device, a file server device, or any other suitable memory device. For this detailed description, the term memory comprises a single memory device and any combination of suitable memory devices for the storage of data and instructions, for example.
System bus 704 provides for the communication of digital information between hardware devices for data processing system 700. Processor 702 may receive over system bus 704 information that is input by a user through alphanumeric input device 712 and/or cursor control device 714. Alphanumeric input device 712 may comprise a keyboard, for example, that comprises alphanumeric keys. Alphanumeric input device 712 may comprise other suitable keys, comprising function keys for example. Alphanumeric input device 712 may be used to input information or commands, for example, for data processing system 700. Cursor control device 714 may comprise a mouse, touch tablet, track-ball, and/or joystick, for example, for controlling the movement of a cursor displayed by display 716.
Processor 702 may also output over system bus 704 information that is to be displayed on display 716. Display 716 may comprise a cathode ray tube (CRT) or a liquid crystal display (LCD), for example, for displaying information to a user. Processor 702 may use system bus 704 to transmit information to and to receive information from other hardware devices such as mass memory 710 for example.
Data processing system 700 may be programmed to execute suitable program code or machine instructions directing data processing system 700 to perform pointer analysis by type inference for a program with structured memory objects and potentially inconsistent memory object accesses. For one embodiment, the executable program code or machine instructions for the analysis may be stored in main memory 708 and/or in mass memory 710, such as on a suitable magnetic or optical disk for example, for execution by processor 702. The program analyzed by data processing system 700 may also be stored in main memory 708 and/or in mass memory 710, such as on a suitable magnetic or optical disk for example.
In the foregoing description, the invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit or scope of the present invention as defined in the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
[Figures US06202202-20010313-P00001 through US06202202-20010313-P00017 appear here in the original publication.]

Claims (71)

What is claimed is:
1. A method for performing a pointer analysis for a program with a data processing system, the method comprising the steps of:
(a) identifying in the program one or more store usages accessing locations; and
(b) generating a store model to approximate run-time store usage for the program, the store model comprising types having components representing locations for the identified store usage(s) such that the types describe access patterns for the locations for the identified store usage(s) based on how the identified store usage(s) access the locations and such that the types representing the locations for the identified store usage(s) comply with a typing constraint.
2. The method of claim 1, wherein the generating step (b) comprises the step of representing one location with a type describing the one location is accessed as a structured memory object.
3. The method of claim 2, wherein the generating step (b) comprises the step of representing the one location with a type comprising location types describing locations of elements of the structured memory object.
4. The method of claim 1, wherein the generating step (b) comprises the step of representing a location representing an element of a structured memory object with a type comprising a location type describing a location representing the structured memory object.
5. The method of claim 1, wherein the generating step (b) comprises the step of representing one location with a type describing the one location is accessed inconsistently if identified store usages define different access patterns for the one location.
6. The method of claim 1, wherein the generating step (b) comprises the step of representing one location with a type describing the one location is accessed inconsistently if an identified store usage accesses the one location through an offset pointer.
7. The method of claim 1, wherein the generating step (b) comprises the step of describing a location pointer for an identified store usage with a location type representing a pointed-to location and with an offset describing how the location pointer points to the pointed-to location relative to a beginning of the pointed-to location.
8. The method of claim 1, wherein the generating step (b) comprises the step of representing one location with a type describing a size of the one location.
9. The method of claim 1, wherein the generating step (b) comprises the step of describing a content of one location with a location type and a function type.
10. The method of claim 1, wherein each type is represented by a type variable and an associated type constructor.
11. The method of claim 1, wherein the generating step (b) comprises the step of unifying different types representing values of locations if a select one of the different types describes a potential pointer value.
12. The method of claim 1, wherein the method performs the pointer analysis while compiling the program for execution by a data processing system.
13. The method of claim 1, wherein the method performs the pointer analysis for a program browser.
14. The method of claim 1, wherein the generating step (b) comprises the steps of:
i) determining whether the types representing the locations for one identified store usage comply with the typing constraint, and
ii) if the types representing the locations for the one identified store usage do not comply with the typing constraint, modifying types representing locations for the one identified store usage to comply with the typing constraint.
15. The method of claim 14, wherein the identifying step (a) comprises the step of identifying a form of a program statement describing the one identified store usage, and wherein the determining step (b)(i) comprises the step of determining whether the types representing the locations for the one identified store usage comply with a type rule specifying the typing constraint for the identified program statement form.
16. A method for performing a pointer analysis for a program with a data processing system, the method comprising the steps of:
a) identifying in the program a store usage accessing locations;
b) determining whether types representing the locations for the identified store usage comply with a typing constraint;
c) identifying any potential constraints for types representing locations for the identified store usage;
d) if the types representing the locations for the identified store usage do not comply with the typing constraint, modifying types representing locations for the identified store usage to comply with the typing constraint,
wherein the modifying step (d) comprises the step of representing locations for the identified store usage with a hierarchy of types describing access patterns for the locations for the identified store usage based on how the identified store usage accesses the locations; and
e) modifying types representing locations for any identified potential constraints affected by the modifying step (d).
17. The method of claim 16, wherein the modifying step (d) comprises the step of representing one of the locations for the identified store usage with a type describing the one location is accessed as a structured memory object.
18. The method of claim 16, wherein the modifying step (d) comprises the step of representing one of the locations for the identified store usage with a type describing the one location is accessed inconsistently if the identified store usage and another store usage in the program define different access patterns for the one location.
19. The method of claim 16, wherein the modifying step (d) comprises the step of representing one of the locations for the identified store usage with a type describing the one location is accessed inconsistently if the identified store usage accesses the one location through an offset pointer.
20. The method of claim 16, wherein the modifying step (d) comprises the step of describing a location pointer for the identified store usage with a location type representing a pointed-to location and with an offset describing how the location pointer points to the pointed-to location relative to a beginning of the pointed-to location.
21. The method of claim 16, wherein the modifying step (d) comprises the step of representing one of the locations for the identified store usage with a type describing a size of the one location.
22. The method of claim 16, wherein one of the types comprises a location type and a function type, each describing a content of a location represented by the one type.
23. The method of claim 16, wherein each type is represented by a type variable and an associated type constructor.
24. The method of claim 16, wherein the identifying step (c) identifies a potential constraint in a pending set.
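One way to picture the pending set of claim 24 is the conditional join familiar from unification-based points-to analyses. The sketch below is a simplification under stated assumptions: a type is reduced to a single `is_pointer` flag, and the names `TypeVar`, `cjoin`, `join`, and `mark_pointer` are hypothetical. A potential constraint is recorded in a pending set while the source type holds no pointer, and is enforced only once that type is modified to describe a pointer.

```python
class TypeVar:
    """Type variable with a pending set of potential constraints (hypothetical layout)."""

    def __init__(self, name):
        self.name = name
        self.parent = None        # union-find link; None on the representative
        self.is_pointer = False   # stays False until a pointer value is seen
        self.pending = set()      # variables to unify with once this becomes a pointer


def find(v):
    """Return the representative of v's equivalence class."""
    while v.parent is not None:
        v = v.parent
    return v


def join(a, b):
    """Unconditionally unify a and b, then release pending constraints if needed."""
    ra, rb = find(a), find(b)
    if ra is rb:
        return ra
    rb.parent = ra
    pend = ra.pending | rb.pending
    ra.pending = set()
    if ra.is_pointer or rb.is_pointer:
        ra.is_pointer = True
        for p in pend:
            join(ra, p)
    else:
        ra.pending = pend
    return find(ra)


def cjoin(dst, src):
    """Conditional join for a value flowing from src to dst."""
    rs = find(src)
    if rs.is_pointer:
        join(dst, src)
    else:
        rs.pending.add(dst)   # potential constraint, enforced later if needed


def mark_pointer(v):
    """Modify v's type to describe a pointer and process its pending set."""
    r = find(v)
    if not r.is_pointer:
        r.is_pointer = True
        pend, r.pending = r.pending, set()
        for p in pend:
            join(r, p)


x, y = TypeVar("x"), TypeVar("y")
cjoin(x, y)                    # y holds no pointer yet, so nothing is unified
print(find(x) is find(y))
mark_pointer(y)                # the pending constraint is now enforced
print(find(x) is find(y))
```

Deferring the join in this way avoids merging types for values that never turn out to carry pointers.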
25. The method of claim 16, wherein the identifying step (c) comprises the step of identifying from the identified store usage an access pattern relationship for types representing locations for the identified store usage.
26. The method of claim 16, wherein the identifying step (c) comprises the step of identifying from the identified store usage a pointer offset relationship for types representing locations for the identified store usage.
27. The method of claim 16, wherein the identifying step (c) comprises the step of identifying from the identified store usage any potential points-to relationships for a type representing a non-pointer value.
28. The method of claim 16, wherein the program describes a plurality of store usages and wherein the method analyzes each described store usage only one time in an order independent of program control flow.
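The single-pass, flow-insensitive processing recited in claim 28 can be illustrated with a much-simplified unification-based points-to sketch. It ignores structured objects, sizes, and offsets, so it is not the claimed method itself. The statement encoding (`addr`, `copy`, `load`, `store`), the `PointsTo` class, and the sample program are assumptions made for this example.

```python
import itertools


class PointsTo:
    """Simplified unification-based points-to analysis (illustrative only)."""

    def __init__(self):
        self.parent = {}    # union-find links between location classes
        self.pointee = {}   # representative -> class its contents may point to

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def target(self, x):
        """Class that the contents of x may point to, created on demand."""
        r = self.find(x)
        if r not in self.pointee:
            self.pointee[r] = self.find(("deref", r))  # fresh placeholder class
        return self.find(self.pointee[r])

    def join(self, a, b):
        """Unify two classes and, recursively, the classes they point to."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        self.parent[rb] = ra
        ta, tb = self.pointee.get(ra), self.pointee.pop(rb, None)
        if ta is None:
            if tb is not None:
                self.pointee[ra] = tb
        elif tb is not None:
            self.join(ta, tb)

    def process(self, stmt):
        """Handle one statement exactly once, based only on its form."""
        op, x, y = stmt
        if op == "addr":      # x = &y
            self.join(self.target(x), y)
        elif op == "copy":    # x = y
            self.join(self.target(x), self.target(y))
        elif op == "load":    # x = *y
            self.join(self.target(x), self.target(self.target(y)))
        elif op == "store":   # *x = y
            self.join(self.target(self.target(x)), self.target(y))
        else:
            raise ValueError(f"unknown statement form: {op}")


def analyze(stmts):
    pt = PointsTo()
    for s in stmts:
        pt.process(s)
    return pt


# Hypothetical program: p = &a; q = &b; r = p; r = q; s = &c
PROGRAM = [("addr", "p", "a"), ("addr", "q", "b"),
           ("copy", "r", "p"), ("copy", "r", "q"), ("addr", "s", "c")]


def facts(pt):
    """(a and b share a class, a and c share a class)"""
    return (pt.find("a") == pt.find("b"), pt.find("a") == pt.find("c"))


outcomes = {facts(analyze(order)) for order in itertools.permutations(PROGRAM)}
print(outcomes)
```

Because every statement contributes only equality constraints, each statement is visited once and every ordering of the sample program yields the same partition of locations.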
29. The method of claim 16, wherein the identifying step (a) comprises the step of identifying a form of a program statement describing the store usage, and
wherein the determining step (b) comprises the step of determining whether the types representing the locations for the identified store usage comply with a type rule specifying the typing constraint for the identified program statement form.
30. The method of claim 16, wherein the method performs the pointer analysis while compiling the program for execution by a data processing system.
31. The method of claim 16, wherein the method performs the pointer analysis for a program browser.
32. A memory for storing software for execution by a data processing system to perform a pointer analysis for a program, the memory comprising:
a) program code stored by the memory for identifying a store usage in the program accessing locations; and
b) program code stored by the memory for representing locations for the identified store usage with types having components describing access patterns for the locations for the identified store usage based on how the locations for the identified store usage are accessed in the program such that the types representing the locations for the identified store usage comply with a typing constraint.
33. The computer-readable medium of claim 32, wherein the program code (b) comprises program code for representing one of the locations for the identified store usage with a type describing the one location is accessed as a structured memory object.
34. The computer-readable medium of claim 33, wherein the program code (b) comprises program code for representing the one location with a type comprising location types describing locations of elements of the structured memory object.
35. The computer-readable medium of claim 32, wherein the program code (b) comprises program code for representing a location representing an element of a structured memory object with a type comprising a location type describing a location representing the structured memory object.
36. The computer-readable medium of claim 32, wherein the program code (b) comprises program code for representing one of the locations for the identified store usage with a type describing the one location is accessed inconsistently if the identified store usage and another store usage in the program define different access patterns for the one location.
37. The computer-readable medium of claim 32, wherein the program code (b) comprises program code for representing one of the locations for the identified store usage with a type describing the one location is accessed inconsistently if the identified store usage accesses the one location through an offset pointer.
38. The computer-readable medium of claim 32, wherein the program code (b) comprises program code for describing a location pointer for the identified store usage with a location type representing a pointed-to location and with an offset describing how the location pointer points to the pointed-to location relative to a beginning of the pointed-to location.
39. The computer-readable medium of claim 32, wherein the program code (b) comprises program code for representing one of the locations for the identified store usage with a type describing a size of the one location.
40. The computer-readable medium of claim 32, wherein the program code (b) comprises program code for describing a content of one of the locations for the identified store usage with a location type and a function type.
41. The computer-readable medium of claim 32, comprising program code for representing each type by a type variable and an associated type constructor.
42. The computer-readable medium of claim 32, wherein the program code (b) comprises program code for unifying different types representing values of locations for the identified store usage if a select one of the different types describes a potential pointer value.
43. The computer-readable medium of claim 32, comprising program code for compiling the program for execution by a data processing system.
44. The computer-readable medium of claim 32, comprising program code for a program browser.
45. The computer-readable medium of claim 32, wherein the program code (b) comprises:
i) program code for determining whether the types representing the locations for the identified store usage comply with the typing constraint, and
ii) program code for modifying types representing locations for the identified store usage to comply with the typing constraint if the types representing the locations for the identified store usage do not comply with the typing constraint.
46. The computer-readable medium of claim 45, wherein the program code (a) comprises program code for identifying a form of a program statement describing the store usage, and
wherein the program code (b)(i) comprises program code for determining whether the types representing the locations for the identified store usage comply with a type rule specifying the typing constraint for the identified program statement form.
47. The computer-readable medium of claim 32, wherein the program code (b) comprises:
i) program code for determining whether the types representing the locations for the identified store usage comply with the typing constraint,
ii) program code for identifying any potential constraints for types representing locations for the identified store usage,
iii) program code for modifying types representing locations for the identified store usage to comply with the typing constraint if the types representing the locations for the identified store usage do not comply with the typing constraint, and
iv) program code for modifying types representing locations for any identified potential constraints affected by the modification of types representing locations for the identified store usage.
48. The computer-readable medium of claim 47, wherein the program code (b)(ii) comprises program code for identifying a potential constraint in a pending set.
49. The computer-readable medium of claim 47, wherein the program code (b)(ii) comprises program code for identifying from the identified store usage an access pattern relationship for types representing locations for the identified store usage.
50. The computer-readable medium of claim 47, wherein the program code (b)(ii) comprises program code for identifying from the identified store usage a pointer offset relationship for types representing locations for the identified store usage.
51. The computer-readable medium of claim 47, wherein the program code (b)(ii) comprises program code for identifying from the identified store usage any potential points-to relationships for a type representing a non-pointer value.
52. The computer-readable medium of claim 47, wherein the program describes a plurality of store usages and wherein the program code of the computer-readable medium, when executed by a data processing system, analyzes each described store usage only one time in an order independent of program control flow.
53. A data processing system comprising:
a) a translator for translating a program in a first language into code in a second language;
b) a pointer analyzer for performing a pointer analysis for the program, the pointer analyzer for identifying in the program a store usage accessing locations and for representing locations for the identified store usage with a set of hierarchical types describing different access patterns for the locations for the identified store usage based on how the identified store usage accesses the locations such that the types representing the locations for the identified store usage comply with a typing constraint;
c) a store model for storing the types representing locations for the program; and
d) an optimizer for optimizing the code based on the store model.
54. The data processing system of claim 53, comprising a symbol table for identifying locations for the program.
55. The data processing system of claim 53, wherein the pointer analyzer represents one of the locations for the identified store usage with a type describing the one location is accessed as a structured memory object.
56. The data processing system of claim 53, wherein the pointer analyzer represents one of the locations for the identified store usage with a type describing the one location is accessed inconsistently if the identified store usage and another store usage in the program define different access patterns for the one location.
57. The data processing system of claim 53, wherein the pointer analyzer represents one of the locations for the identified store usage with a type describing the one location is accessed inconsistently if the identified store usage accesses the one location through an offset pointer.
58. The data processing system of claim 53, wherein the pointer analyzer describes a location pointer for the identified store usage with a location type representing a pointed-to location and with an offset describing how the location pointer points to the pointed-to location relative to a beginning of the pointed-to location.
59. The data processing system of claim 53, wherein the pointer analyzer represents one of the locations for the identified store usage with a type describing a size of the one location.
60. The data processing system of claim 53, wherein the pointer analyzer describes a content of one of the locations for the identified store usage with a location type and a function type.
61. The data processing system of claim 53, wherein each type is represented by a type variable and an associated type constructor.
62. The data processing system of claim 53, wherein the pointer analyzer determines whether the types representing the locations for the identified store usage comply with the typing constraint and modifies types representing locations for the identified store usage to comply with the typing constraint if the types representing the locations for the identified store usage do not comply with the typing constraint.
63. A method for performing a pointer analysis for a program with a data processing system, the method comprising:
(a) identifying in the program one or more store usages accessing locations; and
(b) generating a store model to approximate run-time store usage for the program, the store model comprising types representing locations for the identified store usage(s) such that the types describe access patterns for the locations for the identified store usage(s) based on how the identified store usage(s) access the locations wherein the types are comprised of components defining such access patterns.
64. The method of claim 63 wherein the types comprise a hierarchy of types.
65. The method of claim 64 wherein types may be promoted to types higher in the hierarchy based on constraints.
66. The method of claim 64 wherein one type describes locations whose content values are used only as a unit, and a further type describes locations whose content values are used only as structured objects.
67. The method of claim 64 wherein one type describes locations whose content values are used in inconsistent manners by use of components comprising a size, a parent type, a location pointer type, and a function type.
68. The method of claim 63 wherein access patterns for various types are described by components selected from the group consisting of size, parent type, location pointer type, function type, element mapping type, and an offset type.
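The type hierarchy and promotion described in claims 64 to 68 might be modelled along the following lines. This is a sketch under loudly stated assumptions: the kind names `BLANK`, `SIMPLE`, `STRUCT`, and `OBJECT`, their strictly linear ordering, and the `LocationType` record are all hypothetical choices for illustration; the claims only require that some hierarchy exists and that types can move upward in it. Only two of the components listed in claim 68 (size and parent) are modelled.

```python
from dataclasses import dataclass, replace
from enum import IntEnum
from typing import Optional


class Kind(IntEnum):
    """Hypothetical access-pattern kinds, ordered from lowest to highest."""
    BLANK = 0    # no access observed yet
    SIMPLE = 1   # content values used only as a unit
    STRUCT = 2   # content values used only as structured objects
    OBJECT = 3   # content values used in inconsistent manners


@dataclass(frozen=True)
class LocationType:
    kind: Kind
    size: Optional[int] = None     # size component
    parent: Optional[str] = None   # name of an enclosing location's type, if any


def promote(t: LocationType, required: Kind) -> LocationType:
    """Move t up the hierarchy if a constraint requires a higher kind; never demote."""
    if required > t.kind:
        return replace(t, kind=required)
    return t


t = LocationType(Kind.BLANK, size=4)
t = promote(t, Kind.SIMPLE)    # first accessed as a unit
t = promote(t, Kind.OBJECT)    # later accessed inconsistently
t = promote(t, Kind.STRUCT)    # a lower requirement does not demote the type
print(t.kind.name, t.size)
```

Since promotion only ever moves a type upward, repeated constraints cannot oscillate, which is what lets an analysis of this shape terminate.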
69. A method for identifying utilization of computer storage by a computer program, the method comprising:
identifying portions of the program that utilize storage; and
generating a store model by identifying each portion as a type from a hierarchy of types, each type identifying different access patterns based on how storage is accessed.
70. The method of claim 69 wherein the types comprise components selected from multiple components describing different access patterns.
71. A computer readable medium having instructions stored thereon for causing a computer to execute a method for performing a pointer analysis for a program with a data processing system, the method comprising:
(a) identifying in the program one or more store usages accessing locations; and
(b) generating a store model to approximate run-time store usage for the program, the store model comprising types representing locations for the identified store usage(s) such that the types describe access patterns for the locations for the identified store usage(s) based on how the identified store usage(s) access the locations wherein the types are comprised of components defining such access patterns.
US08/719,144 1996-09-24 1996-09-24 Pointer analysis by type inference for programs with structured memory objects and potentially inconsistent memory object accesses Expired - Lifetime US6202202B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US08/719,144 US6202202B1 (en) 1996-09-24 1996-09-24 Pointer analysis by type inference for programs with structured memory objects and potentially inconsistent memory object accesses

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US08/719,144 US6202202B1 (en) 1996-09-24 1996-09-24 Pointer analysis by type inference for programs with structured memory objects and potentially inconsistent memory object accesses

Publications (1)

Publication Number Publication Date
US6202202B1 true US6202202B1 (en) 2001-03-13

Family

ID=24888907

Family Applications (1)

Application Number Title Priority Date Filing Date
US08/719,144 Expired - Lifetime US6202202B1 (en) 1996-09-24 1996-09-24 Pointer analysis by type inference for programs with structured memory objects and potentially inconsistent memory object accesses

Country Status (1)

Country Link
US (1) US6202202B1 (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020026632A1 (en) * 2000-07-28 2002-02-28 Airsys Atm Sa Universal computer code generator
US20030237077A1 (en) * 2002-06-24 2003-12-25 Rakesh Ghiya Identifying pure pointers to disambiguate memory references
US20040003382A1 (en) * 2002-06-26 2004-01-01 Microsoft Corporation Unification-based points-to-analysis using multilevel typing
US6813761B1 (en) * 2000-06-30 2004-11-02 Microsoft Corporation Methods for enhancing flow analysis
US20050060691A1 (en) * 2003-09-15 2005-03-17 Manuvir Das System and method for performing path-sensitive value flow analysis on a program
US20060048095A1 (en) * 2004-08-31 2006-03-02 Microsoft Corporation Local type alias inference system and method
US7133993B1 (en) * 2004-01-06 2006-11-07 Altera Corporation Inferring size of a processor memory address based on pointer usage
US20070169036A1 (en) * 2005-10-31 2007-07-19 Dhi Technologies, Inc. Incremental type inferencing engine
US20070277164A1 (en) * 2006-05-26 2007-11-29 Chandan Bellur Nandakumaraiah Searching computer programs that use different semantics
US20080262992A1 (en) * 2007-04-20 2008-10-23 Microsoft Corporation Type inference for object-oriented languages
US20080301657A1 (en) * 2007-06-04 2008-12-04 Bowler Christopher E Method of diagnosing alias violations in memory access commands in source code
US7739676B1 (en) * 2001-07-25 2010-06-15 The Math Works, Inc. Function values in computer programming languages having dynamic types and overloading
US20100162219A1 (en) * 2007-06-04 2010-06-24 International Business Machines Corporation Diagnosing Aliasing Violations in a Partial Program View
US7770152B1 (en) * 2005-05-20 2010-08-03 Oracle America, Inc. Method and apparatus for coordinating state and execution context of interpreted languages
US20100313190A1 (en) * 2009-06-04 2010-12-09 Microsoft Corporation Determining target types for generic pointers in source code
US20120304158A1 (en) * 2011-05-26 2012-11-29 Oracle International Corporation Points-to analysis as value flow
US8332943B2 (en) 2004-02-17 2012-12-11 Microsoft Corporation Tiered object-related trust decisions
US20140047416A1 (en) * 2012-08-09 2014-02-13 Filip J. Pizlo Failure Profiling for Continued Code Optimization
US9411556B1 (en) * 2015-09-30 2016-08-09 Semmle Limited Template dependency inlining
US9703537B2 (en) * 2015-11-02 2017-07-11 International Business Machines Corporation Method for defining alias sets
US10423397B2 (en) * 2016-12-29 2019-09-24 Grammatech, Inc. Systems and/or methods for type inference from machine code
US10642582B2 (en) * 2017-08-24 2020-05-05 Google Llc System of type inference for tuple graph programs method of executing a tuple graph program across a network
US10887235B2 (en) 2017-08-24 2021-01-05 Google Llc Method of executing a tuple graph program across a network
US11249733B2 (en) * 2020-01-23 2022-02-15 Samsung Electronics Co., Ltd. Electronic apparatus and control method thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5689665A (en) * 1992-02-28 1997-11-18 International Business Machines Corporation Apparatus and method for displaying windows
US5790866A (en) 1995-02-13 1998-08-04 Kuck And Associates, Inc. Method of analyzing definitions and uses in programs with pointers and aggregates in an optimizing compiler

Cited By (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6813761B1 (en) * 2000-06-30 2004-11-02 Microsoft Corporation Methods for enhancing flow analysis
US20020026632A1 (en) * 2000-07-28 2002-02-28 Airsys Atm Sa Universal computer code generator
US7739676B1 (en) * 2001-07-25 2010-06-15 The Math Works, Inc. Function values in computer programming languages having dynamic types and overloading
US20030237077A1 (en) * 2002-06-24 2003-12-25 Rakesh Ghiya Identifying pure pointers to disambiguate memory references
US7127710B2 (en) 2002-06-24 2006-10-24 Intel Corporation Identifying pure pointers to disambiguate memory references
US20040003382A1 (en) * 2002-06-26 2004-01-01 Microsoft Corporation Unification-based points-to-analysis using multilevel typing
US7039908B2 (en) 2002-06-26 2006-05-02 Microsoft Corporation Unification-based points-to-analysis using multilevel typing
US20050060691A1 (en) * 2003-09-15 2005-03-17 Manuvir Das System and method for performing path-sensitive value flow analysis on a program
US7089537B2 (en) * 2003-09-15 2006-08-08 Microsoft Corporation System and method for performing path-sensitive value flow analysis on a program
US7133993B1 (en) * 2004-01-06 2006-11-07 Altera Corporation Inferring size of a processor memory address based on pointer usage
US10284576B2 (en) 2004-02-17 2019-05-07 Microsoft Technology Licensing, Llc Tiered object-related trust decisions
US8955126B2 (en) 2004-02-17 2015-02-10 Microsoft Corporation Tiered object-related trust decisions
US9208327B2 (en) 2004-02-17 2015-12-08 Microsoft Technology Licensing, Llc Tiered object-related trust decisions
US8468603B2 (en) 2004-02-17 2013-06-18 Microsoft Corporation Tiered object-related trust decisions
US8332943B2 (en) 2004-02-17 2012-12-11 Microsoft Corporation Tiered object-related trust decisions
US20060048095A1 (en) * 2004-08-31 2006-03-02 Microsoft Corporation Local type alias inference system and method
US7770152B1 (en) * 2005-05-20 2010-08-03 Oracle America, Inc. Method and apparatus for coordinating state and execution context of interpreted languages
US20070169036A1 (en) * 2005-10-31 2007-07-19 Dhi Technologies, Inc. Incremental type inferencing engine
US7827537B2 (en) * 2006-05-26 2010-11-02 Oracle America, Inc Searching computer programs that use different semantics
US20070277164A1 (en) * 2006-05-26 2007-11-29 Chandan Bellur Nandakumaraiah Searching computer programs that use different semantics
US7873592B2 (en) 2007-04-20 2011-01-18 Microsoft Corporation Type inference for object-oriented languages
US20080262992A1 (en) * 2007-04-20 2008-10-23 Microsoft Corporation Type inference for object-oriented languages
US8930927B2 (en) * 2007-06-04 2015-01-06 International Business Machines Corporation Diagnosing aliasing violations in a partial program view
US20100162219A1 (en) * 2007-06-04 2010-06-24 International Business Machines Corporation Diagnosing Aliasing Violations in a Partial Program View
US20080301657A1 (en) * 2007-06-04 2008-12-04 Bowler Christopher E Method of diagnosing alias violations in memory access commands in source code
US8839218B2 (en) 2007-06-04 2014-09-16 International Business Machines Corporation Diagnosing alias violations in memory access commands in source code
US9329845B2 (en) * 2009-06-04 2016-05-03 Microsoft Technology Licensing, Llc Determining target types for generic pointers in source code
US20100313190A1 (en) * 2009-06-04 2010-12-09 Microsoft Corporation Determining target types for generic pointers in source code
US8473927B2 (en) * 2011-05-26 2013-06-25 Oracle International Corporation Points-to analysis using value flow
US20120304158A1 (en) * 2011-05-26 2012-11-29 Oracle International Corporation Points-to analysis as value flow
US9256410B2 (en) * 2012-08-09 2016-02-09 Apple Inc. Failure profiling for continued code optimization
US20140047416A1 (en) * 2012-08-09 2014-02-13 Filip J. Pizlo Failure Profiling for Continued Code Optimization
US11016743B2 (en) 2012-08-09 2021-05-25 Apple Inc. Runtime state based code re-optimization
US9411556B1 (en) * 2015-09-30 2016-08-09 Semmle Limited Template dependency inlining
US10223088B2 (en) 2015-11-02 2019-03-05 International Business Machines Corporation Method for defining alias sets
US9720664B2 (en) * 2015-11-02 2017-08-01 International Business Machines Corporation Method for defining alias sets
US9703537B2 (en) * 2015-11-02 2017-07-11 International Business Machines Corporation Method for defining alias sets
US10423397B2 (en) * 2016-12-29 2019-09-24 Grammatech, Inc. Systems and/or methods for type inference from machine code
US10942718B2 (en) 2016-12-29 2021-03-09 Grammatech, Inc Systems and/or methods for type inference from machine code
US10642582B2 (en) * 2017-08-24 2020-05-05 Google Llc System of type inference for tuple graph programs method of executing a tuple graph program across a network
US10887235B2 (en) 2017-08-24 2021-01-05 Google Llc Method of executing a tuple graph program across a network
US11429355B2 (en) 2017-08-24 2022-08-30 Google Llc System of type inference for tuple graph programs
US11249733B2 (en) * 2020-01-23 2022-02-15 Samsung Electronics Co., Ltd. Electronic apparatus and control method thereof

Similar Documents

Publication Publication Date Title
US6202202B1 (en) Pointer analysis by type inference for programs with structured memory objects and potentially inconsistent memory object accesses
US6014518A (en) Terminating polymorphic type inference program analysis
US5956512A (en) Computer program debugging in the presence of compiler synthesized variables
EP0665493B1 (en) A typesafe framework for dynamically extensible objects
Johnson et al. TS: An optimizing compiler for Smalltalk
EP0643851B1 (en) Debugger program which includes correlation of computer program source code with optimized object code
US5175856A (en) Computer with integrated hierarchical representation (ihr) of program wherein ihr file is available for debugging and optimizing during target execution
US5577253A (en) Analyzing inductive expressions in a multilanguage optimizing compiler
USRE38104E1 (en) Method and apparatus for resolving data references in generated code
US6378126B2 (en) Compilation of embedded language statements in a source code program
JP3110040B2 (en) Method and apparatus for compiling a computer program with register allocation between procedures
US5448737A (en) System and method for optimizing computer code using a compact data flow representation
Brooks et al. A new approach to debugging optimized code
US6072950A (en) Pointer analysis by type inference combined with a non-pointer analysis
US5535394A (en) Methods for distinguishing dynamically allocated storage along different call paths and for providing a compact representation of alias analysis
US20080216061A1 (en) Inferring Function Calls In An Ambiguous Language Computer Program
US20080178149A1 (en) Inferencing types of variables in a dynamically typed language
US20060212847A1 (en) Type checker for a typed intermediate representation of object-oriented languages
NZ241694A (en) Compiling computer code: optimizing intermediate language flow graph using routine to fold constant expressions
IE920606A1 (en) Interface for representing effects in a multilanguage optimizing compiler
IL100990A (en) Multilanguage optimizing compiler using templates in multiple pass code generation
JPH0432416B2 (en)
AU658399B2 (en) Symbol table for intermediate compiler language
US7039908B2 (en) Unification-based points-to-analysis using multilevel typing
Garcia et al. Design and implementation of an efficient hybrid dynamic and static typing language

Legal Events

Date Code Title Description
AS Assignment

Owner name: MICROSOFT CORPORATION, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:STEENSGAARD, BJARNE;REEL/FRAME:008220/0122

Effective date: 19960923

STCF Information on status: patent grant

Free format text: PATENTED CASE

FPAY Fee payment

Year of fee payment: 4

FPAY Fee payment

Year of fee payment: 8

FPAY Fee payment

Year of fee payment: 12

AS Assignment

Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034541/0001

Effective date: 20141014