US20090319564A1 - Intentionally-Linked Entities: A General-Purpose Database System - Google Patents

Publication number
US20090319564A1
Authority
US
United States
Prior art keywords
entity
relationship
entities
database
relationships
Legal status
Abandoned
Application number
US12/490,270
Inventor
Vitit Kantabutra
Current Assignee
Individual
Original Assignee
Individual

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists

Definitions

  • In a general graph, an edge represents a binary relationship, that is, a relationship between two nodes, where the nodes commonly represent entities.
  • In ILE, relationships with arities greater than two are possible; they are convenient to create and are represented naturally.
  • ILE data structures are more powerful than general graphs.
  • Reference 10 is an entire ILE system, which can contain an arbitrary number of databases.
  • The idea is that, in ILE, the various databases in the system can communicate (share data) with each other.
  • The system is divided into database sets (references 20 and 20′) so that databases in the same set may communicate with each other, while communication across different sets can optionally be disallowed. A more complex tree of databases may prove useful, but that is outside the scope of this patent.
  • Within each database set, say reference 20′, there can be an arbitrary number of databases.
  • References 30 and 30′ are two databases in the same database set 20′. From now on we concentrate on one database, say reference 30′.
  • A database includes a data structure or object, such as a hash, that holds references to all the entity sets (reference 40). Entity sets are data structures or objects that represent sets of data entities, such that all the entities in each set are of the same kind. For example, in a university database all entities representing students could be in a single entity set.
  • A database also includes a data structure or object that holds references to all the relationship sets (reference 60), which are data structures or objects that represent sets of relationships of like kind. For example, all the relationships between two people of the form “is the father of” form one relationship set.
  • FIG. 2 shows the contents of a database object in more detail.
  • The data structures of entity sets and of relationship sets mentioned in the previous two paragraphs are shown as references 32 and 33 in FIG. 2.
  • Ref. 31 is the database name or a reference thereto.
  • Ref. 34 is optional information (or a link thereto), such as notes about the database.
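The database object described above (refs. 31 to 34) can be sketched roughly as follows. This is only an illustrative sketch, not the patent's implementation: the class names `Database`, `EntitySet`, and `RelationshipSet`, the method names, and the use of Python dictionaries as the "hash" are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class EntitySet:
    """Hypothetical placeholder for an entity set; details omitted."""
    name: str


@dataclass
class RelationshipSet:
    """Hypothetical placeholder for a relationship set; details omitted."""
    name: str


@dataclass
class Database:
    name: str                                                      # ref. 31
    entity_sets: Dict[str, EntitySet] = field(default_factory=dict)            # ref. 32
    relationship_sets: Dict[str, RelationshipSet] = field(default_factory=dict)  # ref. 33
    notes: Optional[str] = None                                    # ref. 34 (optional)

    def add_entity_set(self, name: str) -> EntitySet:
        # The set name is the hash key, so lookup by name is a dict access.
        entity_set = EntitySet(name)
        self.entity_sets[name] = entity_set
        return entity_set

    def add_relationship_set(self, name: str) -> RelationshipSet:
        relationship_set = RelationshipSet(name)
        self.relationship_sets[name] = relationship_set
        return relationship_set


university = Database("university")
students = university.add_entity_set("Student")
fatherhood = university.add_relationship_set("is_the_father_of")
```

Keying both hashes by set name mirrors the statement that the set name aids searches; any other associative container would serve equally well.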
  • Ref. 51 is the name of the entity set, or a reference thereto.
  • Ref. 52 is a reference to the database to which this entity set belongs. This is not necessary, but can make some operations more convenient.
  • Ref. 53 is an ordered data structure (such as an array, dynamically allocated) of the key attribute names and types.
  • Although ILE uses modern objects for its implementation, and is object-oriented in the sense that it can be embodied to permit objects as data entities, it is not an object-oriented database in the sense used by Ullman [2], as Network databases are. Instead it is value-oriented, like Relational databases. That is, ILE does not use storage location as a key, but uses key attribute values instead.
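A small sketch may help make "value-oriented" concrete. It assumes a hypothetical `EntitySet` class that keeps an ordered list of key attribute names (in the spirit of ref. 53) and indexes its entities by the tuple of their key attribute values. The class, its methods, and the sample attribute names are illustrative assumptions, not the patent's code.

```python
from typing import Any, Dict, List, Tuple


class EntitySet:
    def __init__(self, name: str, key_attr_names: List[str]):
        self.name = name
        self.key_attr_names = key_attr_names      # ordered key attribute names
        self._by_key: Dict[Tuple[Any, ...], dict] = {}

    def _key_of(self, attrs: Dict[str, Any]) -> Tuple[Any, ...]:
        # The key is built from attribute values only, never from id() or an address.
        return tuple(attrs[name] for name in self.key_attr_names)

    def insert(self, attrs: Dict[str, Any]) -> dict:
        key = self._key_of(attrs)
        if key in self._by_key:
            raise KeyError(f"duplicate key {key!r} in entity set {self.name!r}")
        entity = dict(attrs)
        self._by_key[key] = entity
        return entity

    def find(self, **key_values: Any) -> dict:
        # Value-comparison-based search: the caller supplies key values, not a pointer.
        return self._by_key[self._key_of(key_values)]


students = EntitySet("Student", ["last_name", "student_id"])
stored = students.insert({"last_name": "Harrison", "student_id": 1001, "major": "Math"})
found = students.find(last_name="Harrison", student_id=1001)
```

Here `found` is the same stored entity as `stored`, yet the caller never needed to know where the entity lives in memory.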
  • Ref. 60 is a data structure that holds (a reference to) all the relationship sets in the current ILE database.
  • Refs. 70 and 70′ are sample relationship sets.
  • FIG. 5 shows a relationship set as implemented in an embodiment.
  • Ref. 71 is the relationship set name or a reference thereto.
  • Ref. 72 is a reference to the current database.
  • Ref. 73 is an ordered composite data structure, such as an array, whose elements are themselves composite data structures, such as arrays of “entity set plus” objects or data structures. An “entity set plus” object or data structure comprises an entity set and the per-entity attributes pertaining to how each entity in the relevant entity set enters a relationship in this relationship set.
  • Ref. 74 contains names and types of relationship attributes, arranged into an ordered data structure such as a dynamically allocated array.
  • References 80, 80′, and 80″ are entity objects or data structures.
  • FIG. 4 shows the details of an embodiment of an entity as a data structure or object.
  • Ref. 81 is a reference to the entity set to which this entity belongs.
  • Ref. 82 is a data structure or object representing the key attributes of this entity, whereas ref. 83 represents the non-key attributes.
  • Ref. 84 comprises references to the relationships in which this entity is involved, including means for indicating which role this entity plays in each such relationship.
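The entity object of FIG. 4 (refs. 81 to 84) might look roughly like the sketch below. The class and field names are assumptions made for illustration; in particular, storing each participation as a `(relationship, role index)` pair is just one possible "means for indicating which role" the entity plays, and the sample names are invented.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass(eq=False)
class EntitySet:
    name: str
    entities: List["Entity"] = field(default_factory=list)


@dataclass(eq=False)
class Entity:
    entity_set: EntitySet                                         # ref. 81
    key_attrs: Dict[str, Any]                                     # ref. 82
    non_key_attrs: Dict[str, Any] = field(default_factory=dict)   # ref. 83
    # ref. 84: one (relationship, role index) pair per participation
    relationships: List[Tuple[object, int]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Register the new entity with its entity set.
        self.entity_set.entities.append(self)

    def roles_in(self, relationship: object) -> List[int]:
        """Return the role indices this entity plays in the given relationship."""
        return [role for rel, role in self.relationships if rel is relationship]


people = EntitySet("Person")
merchant = Entity(people, {"name": "Merchant A"}, {"town": "Town X"})
contract = object()                      # stand-in for a relationship object
merchant.relationships.append((contract, 0))
```

Because the entity carries its own list of participations, a query such as "which relationships involve this merchant?" does not require scanning the relationship sets.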
  • FIG. 7 details a relationship data structure or object.
  • In one embodiment, a relationship is represented by an array of arrays (all arrays are dynamically allocated). Each element of the inner arrays is shown as reference 95 and is called an “Entity-Plus” object; such objects are shown as references 88, 88′, 88″, and 88‴ in FIG. 1.
  • An Entity-Plus data structure or object comprises an entity data structure or object plus attributes pertaining to this entity as it enters into a particular relationship. These attributes are determined by both the particular entity and the particular relationship.
  • Simpler relationship objects are possible, wherein at most one entity plays each role. Instead of having an entire array of “entity plus” objects representing each role, we use only one such object. This simpler embodiment will be represented as a separate set of claims.
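The array-of-arrays embodiment, together with the reverse links from entities back to relationships, can be sketched as follows. Everything here is a hypothetical illustration: the names `Relationship`, `EntityPlus`, `add`, and `players` are assumptions, as are the role numbering and the sample per-relationship attribute.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass(eq=False)
class Entity:
    key: Tuple[Any, ...]
    # Back-references: one (relationship, role index) pair per participation.
    relationships: List[Tuple["Relationship", int]] = field(default_factory=list)


@dataclass(eq=False)
class EntityPlus:
    entity: Entity
    attrs: Dict[str, Any] = field(default_factory=dict)   # per-relationship attributes


class Relationship:
    def __init__(self, n_roles: int):
        # Outer list: one slot per role. Inner list: every entity playing that role.
        self.roles: List[List[EntityPlus]] = [[] for _ in range(n_roles)]

    def add(self, role: int, entity: Entity, **attrs: Any) -> None:
        self.roles[role].append(EntityPlus(entity, dict(attrs)))
        entity.relationships.append((self, role))          # the reverse link

    def players(self, role: int) -> List[Entity]:
        return [ep.entity for ep in self.roles[role]]


SERVER, CLIENT = 0, 1                  # hypothetical role numbering
merchant_a = Entity(("Merchant A",))
merchant_b = Entity(("Merchant B",))
contract = Relationship(n_roles=2)
contract.add(SERVER, merchant_a, share=0.5)
contract.add(SERVER, merchant_b, share=0.5)
```

In this sketch one contract has two servers and no clients, a case that needs no padding or simulation. The simpler embodiment with at most one entity per role would replace each inner list with a single optional `EntityPlus`.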
  • A database is object-oriented if the storage location of an entity can be used as the entity's key.
  • By contrast, a database is value-oriented if an entity is identified only by attribute values. Relational databases are value-oriented, and their success relative to Network databases is due in significant part to that fact. Learning from that success, ILE is meant to be value-oriented. It can be said that ILE has the Relational DBMS's advantage of value-orientedness as well as the Network DBMS's advantage of having links, although ILE's links are more direct than those of a Network DBMS.

Abstract

In accordance with one embodiment the subject of the patent is a method for storing a database comprising entity objects or data structures representing the data entities, and relationship objects or data structures representing the relationships amongst the entities. Each relationship object or data structure possesses links to the entity objects or data structures that play the various roles in the relationship. Where there is a link from a relationship to an entity, there is also a link from the entity to the relationship, facilitating queries and updates to the database system. It is possible and often desirable for an embodiment to permit not merely one, but possibly many (or zero) entities to play each role in a relationship. The database is value-oriented in the sense that the address of an entity is not part of the key, thus permitting value-comparison-based searches.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application claims the benefit of provisional patent application Ser. No. 61/075,189, filed Jun. 24, 2008 by the present inventor.
  • OTHER RELEVANT APPLICATIONS
      • U.S. Pat. No. 7,483,920 Jan. 27, 2009 Mori, et al.: Database management system, database management method, and program
      • U.S. Pat. No. 7,333,986 Feb. 19, 2008 Minamino, et al.: Hierarchical database management system, hierarchical database management method, and hierarchical database management program
      • U.S. Pat. No. 6,633,886 Oct. 14, 2003 Chong: Method of implementing an acyclic directed graph structure using a relational database
    OTHER REFERENCES
  • [1] C. J. Date and E. F. Codd, “The Relational and Network Approaches: Comparison of the Application Programming Interfaces,” in ACM SIGFIDET (now SIGMOD) workshop on Data description, access and control, 1974.
  • [2] J. D. Ullman, Principles of Database and Knowledge-Base Systems, Vol. 1, Computer Science Press, 1988.
  • [3] H. F. Korth and A. Silberschatz, Database System Concepts, second edition, McGraw-Hill, 1991.
  • [4] H. Garcia-Molina, J. D. Ullman, and J. Widom, Database Systems: The Complete Book, second edition, Prentice-Hall, 2009.
  • [5] University of Illinois, Urbana-Champaign database tutorial found at http://mias.uiuc.edu/files/tutorials/kcchang01.ppt
  • [6] Introductory information on database management systems at http://www.scribd.com/doc/4355522/dbms
  • FEDERALLY SPONSORED RESEARCH
  • Not applicable.
  • BACKGROUND
  • 1. Field of the Invention
  • This application relates to database systems, specifically to database systems in which there are data entities that have relationships amongst one another.
  • 2. Prior Art
  • All but the most trivial databases comprise data entities of various sorts and interrelationships among them. For example, databases of social or political networks, technical systems, movie production networks, university information, and virtually all other non-trivial database systems involve entities and their interrelationships. Naive users might store information in text files or spreadsheets. More sophisticated users might turn to Relational database systems. Heretofore the best known ways to store databases involving entities and their interrelationships have been (1) Relational Databases, (2) Object-Oriented Databases, including variations like Object-Relational Databases, (3) XML, and (4) Hierarchical and Network Databases, the last two types being considered mostly obsolete. None of these is really satisfactory for storing data about systems of any complexity in terms of the interrelationships amongst data entities. Relational Databases have a high level of built-in data redundancy that invites errors and inconsistencies. These shortcomings can make the design of a good database schema difficult, and can even make the simple act of data entry very annoying. Object-Oriented databases and the like directly support only simple relationships and can be hard to use, accounting for their lack of popularity. XML imposes a hierarchy on the entities, and allows only limited departures from the hierarchy with any degree of convenience. Additionally, in XML, pointers indicating the non-hierarchical relationships are represented as text rather than as true pointers.
  • Like XML, Hierarchical databases are oriented towards hierarchical relationships amongst data entities rather than more general relationships.
  • Network databases permit non-hierarchical relationships more natively, but are still hard to use because they are oriented towards binary, many-to-one relationships, so more general relationships are available only by simulation. Additionally, querying a Network database is an exercise in “manual” (programmed) physical navigation of the network, which means that a change to the database often requires a code update as well. The query language is procedural rather than declarative like SQL.
  • We need to pay more attention to Relational and Network databases in our discussion of prior art. Relational databases need to be examined in more detail because they are still the predominant type of database, and hence are so readily available that they are used even in inappropriate circumstances. Network databases, on the other hand, need to be examined in detail because they are the closest to what we are proposing to patent here, though also significantly different, as we will point out.
  • The Relational database scheme certainly has the advantage of simplicity over all the others except for the flat file database, which is not discussed here. As Codd, its inventor, stated, the only concept the user really has to know to begin using Relational databases is that of a two-dimensional table. Every data entity is represented merely by something that can be written into one or more table entries, such as a character string (a name, perhaps) or a number (I.D.), or a combination of a string and a number, for example. Often, such simplicity works very well in practice. However, real-world databases get complex very quickly, and often such simplicity doesn't work any more, as we will now see through a real example.
  • Our sample application is to store a database about a network of merchants and their clients in Mediaeval Spain. The source of our data is a set of notarized documents or contracts, each representing some kind of business transaction from the 1500's. In each business transaction, there were one or more merchants acting as servers or service providers and zero or more clients. In case there is exactly one server and one client, the transaction is easily represented as a row in a Relational database. However, even in this case there is a potential for errors due to mistyping. For instance, suppose each person's name (or an ID number) is used as a key, and it is misspelled or mistyped. Such an act of misspelling or mistyping amounts to the creation of a new person entity. This is an important flaw in Relational databases: it is caused by the fact that something as important as an entity (a person, no less!) is represented by a mere character string (or an ID number).
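The mistyping problem can be illustrated with a short sketch. The names and rows below are invented for the example and do not come from the historical documents; the `Person` class is a hypothetical stand-in for an entity object.

```python
# Relational style: a person is identified by whatever string was typed.
rows = [
    {"server": "Gonzalez", "client": "Lopez"},
    {"server": "Gonzales", "client": "Lopez"},   # typo for the same merchant
]
servers_by_string = {row["server"] for row in rows}


# Linked style: a person is an object, and a transaction refers to that object.
class Person:
    def __init__(self, name: str):
        self.name = name


gonzalez = Person("Gonzalez")
lopez = Person("Lopez")
linked_rows = [
    {"server": gonzalez, "client": lopez},
    {"server": gonzalez, "client": lopez},
]
servers_by_link = {id(row["server"]) for row in linked_rows}
```

In the first form the typo silently yields two distinct servers. In the second form each transaction refers to an existing entity, so the name is typed only once, when the entity is created.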
  • Another problem arises when we try to use a Relational database if the numbers of servers and clients can vary, as they do in the actual application under consideration. Relational database tables cannot have a variable number of columns. One solution is to provide enough columns to accommodate the maximum number of servers and clients we will ever need. This is a poor solution, because it leads to a large number of NULL entries.
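To see how quickly the fixed-width workaround fills with NULLs, consider this sketch. The column limits and the helper `to_fixed_row` are assumptions chosen for illustration, with Python's `None` standing in for NULL.

```python
from typing import List, Optional

MAX_SERVERS = 3   # hypothetical column budget for servers
MAX_CLIENTS = 3   # hypothetical column budget for clients


def to_fixed_row(servers: List[str], clients: List[str]) -> List[Optional[str]]:
    """Lay out one transaction as a fixed-width row, padding unused columns with None."""
    if len(servers) > MAX_SERVERS or len(clients) > MAX_CLIENTS:
        raise ValueError("more participants than the table has columns for")

    def pad(names: List[str], width: int) -> List[Optional[str]]:
        return list(names) + [None] * (width - len(names))

    return pad(servers, MAX_SERVERS) + pad(clients, MAX_CLIENTS)


row = to_fixed_row(["Merchant A"], [])
null_count = row.count(None)
```

A transaction with one server and no clients occupies six columns, five of which are empty, and a transaction with a fourth server cannot be stored at all without altering the table.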
  • Another well-known problem with Relational database tables is data redundancy, which often leads to data incoherence as well as errors caused by mistyping. Some of that can be removed by means of a process called normalization, which splits up a table into two or more tables. However, normalization can be quite complicated and hard to understand, defeating one major advantage, that of simplicity, touted by Relational DBMS' creator and proponents. In fact, in business practice few users even know much about database schema normalization techniques.
  • There is yet another cause for the complexity of Relational databases, belying the advertised simplicity. The proponents claim that there is nothing but values in tables, but in fact, for the sake of efficiency, pointers are needed, just as they are needed in other kinds of DBMS. For example, indexes are needed for efficient searches, and indexes require large numbers of pointers. Additionally, pointers are often required for the storage of data on media such as disks.
  • In summary, Relational databases are simple, but only to a casual user who does not intend to use them for a complex project where a great deal of efficiency or reliability is required.
  • We will now turn to examining the Network DBMS. The Network DBMS uses pointers, also called links or references, to represent binary, many-to-one relationships amongst entities, which in turn are represented by records. General relationships (many-to-many or those with arities greater than 2) can also be represented, but only by simulation.
  • As shown in FIG. 8, a many-to-one relationship is represented with a cyclical chain of pointers. All the records in the chain form a “set.” In that set, there is a unique entity in the “one” of the many-to-one relationship. This unique entity is called the “owner” of the set, whereas all the other entities, those of the “many,” are called the “members” of the set. There is a direct link from the owner of the set to only one of the members, and likewise only one of the members has a direct link to its owner in the set. This makes for inefficient searches. Note, though, that proponents of Network databases thought that Network databases were often more efficient than Relational databases because of the links in the former. But the links in a Network DBMS can be quite indirect, which means that searches in ILE, with its more direct links, should generally be more efficient than in either of the other two types of databases.
  • Searches in a Network database are done by means of pointer navigation or traversal. Such navigation is done by procedural, not declarative, code, and must be explicitly programmed by the application programmer. This is not only difficult because the application programmer has to know the exact structure of the database, but it also means that any change in the database could be bad news, because it frequently requires a code change!
  • Additionally, even simple queries may require traversing practically the entire network of records [6]. There is no automatic, easy-to-use search facility in Network databases.
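The cost of navigating a Network-database set can be seen in a minimal sketch of the cyclical chain. The `Record` class and helper functions are illustrative assumptions rather than any real Network DBMS interface, and the enrollment labels other than “MA235” are invented.

```python
class Record:
    def __init__(self, label: str, is_owner: bool = False):
        self.label = label
        self.is_owner = is_owner
        self.next = self          # a single record forms a ring by itself


def insert_member(owner: Record, member: Record) -> None:
    """Splice a member into the ring directly after the owner."""
    member.next = owner.next
    owner.next = member


def find_owner(record: Record):
    """Follow next pointers until the owner is reached; return it and the hop count."""
    hops = 0
    current = record
    while not current.is_owner:
        current = current.next
        hops += 1
    return current, hops


student = Record("Harrison", is_owner=True)
enrollments = [Record("MA235"), Record("CS101"), Record("PH110")]
for enrollment in enrollments:
    insert_member(student, enrollment)

owner, hops_from_last_inserted = find_owner(enrollments[2])
_, hops_from_first_inserted = find_owner(enrollments[0])
```

Only the member at the end of the chain reaches the owner in one hop; the others must walk through their neighbors, and the walk grows with the size of the set. The traversal itself is procedural code that the application programmer must write.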
  • DRAWINGS—FIGURES
  • FIG. 1 shows an embodiment of an entire Intentionally-Linked Entities (ILE) Database system.
  • FIG. 2 shows a data structure or object representing an ILE database according to an embodiment.
  • FIG. 3 shows a data structure or object representing an entity set in an ILE database according to an embodiment.
  • FIG. 4 shows a data structure or object representing an entity in an ILE database according to an embodiment.
  • FIG. 5 shows a data structure or object representing a relationship set in an ILE database.
  • FIG. 6 shows a data structure or object representing an “entity set plus” (ESP). An ESP data structure or object comprises the entity set as well as the names and types of per-relationship attributes of the entities in the entity set.
  • FIG. 7 shows a relationship data structure or object, as represented in an embodiment as an array of arrays of elements, where each element is a reference to an ESP object shown in the previous figure.
  • FIG. 8 (prior art) shows how a many-to-one relationship between entities is defined in a Network database. In a Network database, only many-to-one, binary relationships are implemented directly without simulation. Such a relationship is implemented with a circular linked list as shown in this figure. There is no direct link from the “Harrison” student record to the “MA235” enrollment record. Note that many real databases have many-to-many and/or non-binary relationships, making Network databases hard to use. Even this example, which is similar to one from a written source, is unrealistic as it stands because only one student can be enrolled in each course! (Note that, in fact, direct implementation of a relationship is only permitted if the “many” and the “one” records are of different types. That is, even a many-to-one, binary relationship must be simulated if the records of the “many” and the “one” are of the same type. This helps to make Network databases difficult to program.)
  • DRAWINGS—REFERENCE NUMERALS
      • 10 Set of all database sets.
      • 20, 20′ Database sets.
      • 30, 30′ Databases.
      • 31 Database name (or a reference thereto)
      • 32 A data structure of entity sets or a reference thereto. In an embodiment, such a structure is a hash of entity set references. The entity set name is used as the hash key to aid searches.
      • 33 A data structure of relationship sets or a reference thereto. In an embodiment, such a structure is a hash of relationship set references. The relationship set name is used as the hash key to aid searches.
      • 34 An optional data structure that may be used to store any useful information pertaining to the database.
      • 40 Set of all the entity sets in the database.
      • 50, 50′ Entity sets.
      • 51 Entity set name or a reference thereto.
      • 52 A reference to the database this entity set belongs to.
      • 53 An ordered data structure (can be embodied as a dynamically-allocated array) representing all the names (and optionally types) of key attributes. Note that in ILE there is no hidden key implied by the storage location of an object, as there was in Network databases. In this sense ILE is “value-oriented” rather than “object-oriented.”
      • 54 An ordered data structure (can be embodied as a dynamically-allocated array) representing all the names (and optionally types) of non-key attributes.
      • 55 A data structure or object containing all the entities belonging to this entity set, or a reference to such a data structure or object.
      • 56 References to all the relationship sets containing the relationships in which the entities belonging to this entity set are involved.
      • 57 Optional data structure that may be used to store any useful information pertaining to the entity set.
      • 58 “Entity Set Plus,” comprising an entity set data structure object as well as a data structure or object representing the names and types of attributes that pertain to both the entity and the relationship, also called the “per-relationship attributes” of an entity.
      • 59 Name and type of per-relationship attributes.
      • 60 Set of all relationship sets in the database.
      • 70 Relationship set
      • 71 A relationship set name or a reference thereto.
      • 72 A reference to the current database, that is, the database of which this relationship set is a part
      • 73 An ordered composite data structure, such as an array, whose elements are themselves composite data structures, such as arrays of “entity set plus” objects or data structures. An “entity set plus” object or data structure comprises an entity set and the per-entity attributes pertaining to how each entity in the relevant entity set enters a relationship in this relationship set.
      • 74 Names and types of relationship attributes, arranged into an ordered data structure such as a dynamically allocated array.
      • 80, 80′, 80″ Entities
      • 81 A reference to the entity set to which this entity belongs.
      • 82 A data structure or object representing the key attributes of this entity. A hash keyed by attribute names is used in an embodiment of ILE.
      • 83 A data structure or object representing the non-key attributes of this entity. A hash keyed by attribute names is used in an embodiment of ILE.
      • 84 References to the relationships in which this entity is involved. Included are means for indicating which role in each such relationship this entity plays.
      • 88 “entity plus,” comprising an entity data structure or object plus attributes pertaining to this entity as it enters into a particular relationship. These attributes are determined by both the particular entity and the particular relationship.
      • 90, 90′ Relationships.
      • 95 An element of an array of arrays in a relationship data structure or object according to one embodiment. This element contains a reference to an “entity-set plus” (ESP) object which is described above.
      • 100 A student record in a Network database.
      • 110, 110′ Enrollment records in a Network database.
      • R0, R5, R0′ array elements containing pointers to entities playing roles 0 and 5, respectively.
    DETAILED DESCRIPTION
  • The subject of this patent is a new kind of database management system called Intentionally-Linked Entities, or ILE. In ILE, relationships among entities will be represented directly as true links among them. Thus general graphs (as in Graph Theory), and in fact more (to be explained below), can be represented naturally. The data model will be similar to the Entity/Relationship data model, which was never implemented very well in the prior art partly due to the lack of good programming tools such as object-oriented languages and simple-to-use dynamic memory allocation. (The most valiant attempt in the past was the flopped Network Databases discussed earlier in this document.) However, at the present time sufficient tools and programming languages have been developed so that complex linked data structures are now in more widespread use. Complex linked data structures are used in operating system kernels, for example. Interestingly enough, complex linked data structures have not been used in the database field except in index structures. The main idea behind the ILE database system is to use modern linked data structures, dynamically-allocated arrays, hashes, and objects in general in the main arena of database storage to the fullest extent possible.
  • What was meant above by saying that we can represent more than just general graphs in ILE? In a graph, an edge represents a binary relationship, that is, a relationship between two nodes, where the nodes commonly represent entities. In ILE, relationships with arities greater than two are possible, and in fact are convenient to create and are represented naturally. Thus ILE data structures are more powerful than general graphs. In fact, in ILE, we can also store a new kind of attribute that pertains not to entities in a static way, but to the entities as they enter a specific relationship. These extra capabilities of ILE are important in the application of ILE to complex networks such as the ones to be referred to in the next paragraph.
  • We now turn to a more detailed description of ILE, as shown in FIG. 1. Reference 10 is an entire ILE system, which can contain an arbitrary number of databases. The idea is that in ILE, we can enable the various databases in the system to communicate (share data) with each other. The system is divided into database sets, references 20 and 20′, with the idea that it is possible to permit databases in the same set to communicate with each other, but that we could optionally disallow communications across different sets. In fact a more complex tree of databases may prove useful, but that is off topic for this patent. In each database set, say reference 20′, there can be an arbitrary number of databases. In FIG. 1, references 30 and 30′ are two databases in the same database set 20′. From now on we will just concentrate on one database, say reference 30′.
  • A database includes a data structure or object, such as a hash, that contains or holds references to all the entity sets (reference 40), which are data structures or objects that represent sets of data entities, such that all the entities in each such set are of the same kind. For example, in a university database all entities representing students could be in a single entity set.
  • A database also includes a data structure or object that contains or holds references to all the relationship sets (reference 60), which are data structures or objects that represent sets of relationships of like kind. For example, all the relationships between two people of the form “is the father of” form one relationship set.
  • FIG. 2 shows the contents of a database object in more detail. The data structures of entity sets and of relationship sets mentioned in the previous two paragraphs are shown as references 32 and 33 in FIG. 2. Ref. 31 is the database name or a reference thereto. Ref. 34 is optional information (or a link thereto), such as notes about the database.
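  • As an illustration only, the database object of FIG. 2 might be sketched in Python as follows. The class name `Database` and all field names are hypothetical and are not taken from any embodiment; ordinary dictionaries stand in for the hashes of refs. 32 and 33, and placeholder objects stand in for the entity sets and relationship sets themselves.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Database:
    """Hypothetical sketch of the database object of FIG. 2."""
    name: str                                                         # ref. 31
    entity_sets: Dict[str, Any] = field(default_factory=dict)        # ref. 32
    relationship_sets: Dict[str, Any] = field(default_factory=dict)  # ref. 33
    info: Optional[str] = None                                        # ref. 34, optional notes


# Placeholder objects stand in for real entity sets and relationship sets.
university = Database(name="university", info="illustrative example")
university.entity_sets["students"] = object()
university.relationship_sets["is_father_of"] = object()
```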
  • Now we look into an embodiment of a data structure or object that holds an entity set, which is shown as references 50 and 50′ in FIG. 1. Referring to FIG. 3, Ref. 51 is the name of the entity set, or a reference thereto. Ref. 52 is a reference to the database to which this entity set belongs. This reference is not necessary, but can make some operations more convenient. Ref. 53 is an ordered data structure (such as a dynamically allocated array) of the key attribute names and types.
  • Although ILE uses modern objects for its implementation, and is object-oriented in the sense that it can be embodied to permit objects as data entities, it is not an object-oriented database in the sense used in Ullman [2], as Network databases are; it is instead value-oriented, like Relational databases. That is, ILE does not use storage location as a key, but uses key attribute values instead.
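  • A minimal Python sketch of an entity set along the lines of FIG. 3 is given below, assuming that the collection of entities (ref. 55) is a hash indexed by the tuple of key attribute values. That indexing choice, the class name `EntitySet`, and the method names `add` and `find` are assumptions made for illustration. The point of the sketch is that an entity is found by the values of its key attributes and never by its storage location.

```python
class EntitySet:
    """Hypothetical sketch of the entity set of FIG. 3."""

    def __init__(self, name, key_attrs, nonkey_attrs, database=None):
        self.name = name                        # ref. 51
        self.database = database                # ref. 52, optional back-reference
        self.key_attrs = list(key_attrs)        # ref. 53, ordered
        self.nonkey_attrs = list(nonkey_attrs)  # ref. 54, ordered
        self.entities = {}                      # ref. 55, indexed by key values (assumption)

    def add(self, key, nonkey=None):
        """Store an entity under the tuple of its key attribute values."""
        if set(key) != set(self.key_attrs):
            raise ValueError("key attributes do not match the entity set")
        entity = {"key": dict(key), "nonkey": dict(nonkey or {})}
        self.entities[tuple(key[a] for a in self.key_attrs)] = entity
        return entity

    def find(self, **key):
        """Look an entity up by key attribute values, never by address."""
        return self.entities.get(tuple(key[a] for a in self.key_attrs))


students = EntitySet("students", ["student_id"], ["name"])
students.add({"student_id": 1001}, {"name": "Ada"})
```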
  • Returning to FIG. 1, Ref. 60 is a data structure that holds (a reference to) all the relationship sets in the current ILE database. Ref. 70 and 70′ are sample relationship sets. FIG. 5 shows a relationship set as implemented in an embodiment. Ref. 71 is the relationship set name or a reference thereto. Ref. 72 is a reference to the current database. Ref. 73 is an ordered composite data structure, such as an array, of composite data structures, each such as an array of “entity set plus” objects or data structures, where an “entity set plus” object or data structure comprises an entity set and the per-entity attributes pertaining to how each entity in the relevant entity set enters a relationship in this relationship set. Ref. 74 contains names and types of relationship attributes, arranged into an ordered data structure such as a dynamically allocated array.
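  • The relationship set of FIG. 5 could be sketched as below. The sketch assumes that the outer array of ref. 73 is indexed by role and that each inner array lists the “entity set plus” objects whose entities may fill that role; this reading, the class names, and the enrollment example are assumptions. Entity sets and the database are referred to by name for brevity, where an embodiment would hold references.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EntitySetPlus:
    """An entity set plus the names and types of its per-relationship attributes."""
    entity_set: str                                                        # name stands in for a reference
    per_relationship_attrs: Dict[str, type] = field(default_factory=dict)  # ref. 59


@dataclass
class RelationshipSet:
    """Hypothetical sketch of the relationship set of FIG. 5."""
    name: str                                                        # ref. 71
    database: str                                                    # ref. 72
    roles: List[List[EntitySetPlus]] = field(default_factory=list)  # ref. 73
    attrs: Dict[str, type] = field(default_factory=dict)            # ref. 74


# Role 0 accepts students and role 1 accepts courses. A grade belongs to a
# student only as enrolled in a course, so it is a per-relationship attribute.
enrollment = RelationshipSet(
    name="enrollment",
    database="university",
    roles=[
        [EntitySetPlus("students", {"grade": str})],
        [EntitySetPlus("courses")],
    ],
    attrs={"semester": str},
)
```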
  • Describing now samples of individual entities, we once again refer to FIG. 1. References 80, 80′, and 80″ are entity objects or data structures. FIG. 4 shows the details of an embodiment of an entity as a data structure or object. Ref. 81 is a reference to the entity set to which this entity belongs. Ref. 82 is a data structure or object representing the key attributes of this entity, whereas Ref. 83 represents the non-key attributes. Ref. 84 comprises references to the relationships in which this entity is involved. Included are means for indicating which role in each such relationship this entity plays.
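  • By way of a hedged example, an entity along the lines of FIG. 4 might look as follows in Python. The class name `Entity`, the methods `link` and `relationships_in_role`, and the use of a flat list of (relationship, role) pairs for ref. 84 are assumptions chosen for brevity. Strings stand in for the entity set and for relationship objects.

```python
class Entity:
    """Hypothetical sketch of the entity of FIG. 4."""

    def __init__(self, entity_set, key, nonkey=None):
        self.entity_set = entity_set      # ref. 81
        self.key = dict(key)              # ref. 82, hash keyed by attribute name
        self.nonkey = dict(nonkey or {})  # ref. 83, hash keyed by attribute name
        self.links = []                   # ref. 84, (relationship, role) pairs

    def link(self, relationship, role):
        """Record that this entity plays `role` in `relationship`."""
        self.links.append((relationship, role))

    def relationships_in_role(self, role):
        """Return the relationships in which this entity plays `role`."""
        return [rel for rel, r in self.links if r == role]


ada = Entity("students", {"student_id": 1001}, {"name": "Ada"})
ada.link("enrollment#1", 0)  # strings stand in for relationship objects
ada.link("advising#7", 1)
```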
  • Ref. 90, 90′ in FIG. 1 are relationship objects or data structures. FIG. 7 details a relationship data structure or object. A relationship, in one embodiment, is represented by an array of arrays (all arrays are dynamically allocated). Each element of this array is represented as reference 95, and is actually called an “Entity-Plus” object, shown as references 88, 88′, 88″′ and 88″′ in FIG. 1. An Entity-Plus data structure or object comprises an entity data structure or object plus attributes pertaining to this entity as it enters into a particular relationship. These attributes are determined by both the particular entity and the particular relationship.
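  • The array-of-arrays form of a relationship can be sketched as follows. The outer list is indexed by role and each inner list holds the Entity-Plus objects playing that role. The class names, the method `entities_in_role`, and the presentation example are hypothetical; plain strings stand in for entity objects. The example has arity three, which a single graph edge could not express.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class EntityPlus:
    """An entity together with attributes it has only within one relationship."""
    entity: Any
    per_relationship: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """Hypothetical sketch: roles[i] lists the Entity-Plus objects in role i."""
    roles: List[List[EntityPlus]]

    def entities_in_role(self, role: int) -> List[Any]:
        return [ep.entity for ep in self.roles[role]]


# A ternary relationship: two students (role 0) present one project (role 1)
# to one professor (role 2). The speaking time is a per-relationship attribute.
presentation = Relationship(roles=[
    [EntityPlus("Ada", {"minutes": 10}), EntityPlus("Bob", {"minutes": 5})],
    [EntityPlus("ILE demo")],
    [EntityPlus("Prof. Lee")],
])
```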
  • A simpler embodiment of relationship objects is possible, wherein at most one entity plays each role. Instead of having an entire array of “entity plus” objects representing each role, we use only one such object. This simpler embodiment will be represented as a separate set of claims.
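  • The simpler embodiment might be sketched as below, with one slot per role in place of an array per role. The class name `SimpleRelationship`, the method `fill`, and the decision to raise an error when a role is filled twice are assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional, Tuple

# Each role holds at most one (entity, per-relationship attributes) pair,
# or None while the role is unfilled.
RoleSlot = Optional[Tuple[Any, Dict[str, Any]]]


class SimpleRelationship:
    """Hypothetical sketch of a relationship with at most one entity per role."""

    def __init__(self, arity: int):
        self.roles: List[RoleSlot] = [None] * arity

    def fill(self, role: int, entity: Any, **per_relationship: Any) -> None:
        if self.roles[role] is not None:
            raise ValueError(f"role {role} is already filled")
        self.roles[role] = (entity, per_relationship)


fatherhood = SimpleRelationship(arity=2)
fatherhood.fill(0, "George")                 # role 0: the father
fatherhood.fill(1, "Harold", birth_order=1)  # role 1: the child
```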
  • Finally, note that the database is value-oriented, as opposed to object-oriented, in the sense that the address of an entity is not part of the key, thus permitting value-comparison-based searches. To understand this last point it is important to note that the phrase “object-oriented” once had a meaning different from the one currently used. See Ullman [2] in the “Other references” section. There, a database is object-oriented if the storage location of an entity can be used as the entity's key. The opposite of object-oriented is “value-oriented.” A database is value-oriented if an entity is identified only by attribute values. Relational databases are value-oriented, and their success relative to Network databases is due in significant part to that fact. Learning from that success, ILE is meant to be value-oriented. It can be said that ILE has the Relational DBMS's advantage of value-orientedness, as well as the Network DBMS's advantage of having links, although ILE's links are more direct than those of Network DBMSs.

Claims (11)

1. A method for storing a database involving data entities and relationships of any finite arity amongst said entities, comprising:
a. storing each said entity in a data structure, which could be an object, henceforth referred to as an entity object,
b. storing each said relationship amongst entities in a data structure, which could be an object, henceforth referred to as a relationship object,
c. for each said relationship, grouping zero, one, or more said entities that serve in each role of said relationship into a composite data structure such as a dynamically-allocated array,
d. linking with pointers or references each said relationship object with the appropriate members of said composite of entities involved in the relationship represented by said relationship object,
e. providing users or client programs with convenient and direct means of creating said entity objects and relationship objects without having to simulate or create said objects from other types of records.
2. The method of claim 1 wherein said entities of like kind are grouped together into entity sets, and said relationships of like kind are grouped together into relationship sets.
3. The method of claim 2 wherein, in each said entity, there exist links between the entity and all the relationships in which said entity is involved, and associated with each said link is a means by which the entity's role in the relationship can be identified.
4. The method of claim 3 wherein said means of role identification is implemented as a hash of hashes of dynamically-allocated arrays, where each hash element at the outer level represents links from said entity set to the relations in any one particular relationship set whose name would be the hash key.
5. The method of claim 4 wherein each hash element at the inner level represents a particular role played by said entity, and the role index would serve as the hash key.
6. The method of claim 5 wherein each array element represents one relationship, and in one embodiment these relationships are not sorted in any particular order in the array.
7. The method of claim 1 wherein a query language is provided as means for find (search) operations taking attributes and entity sets as parameters, such that it is not necessary to traverse the entire database or traverse entities in irrelevant entity sets to answer queries.
8. A method for storing a database involving data entities and relationships of any finite arity amongst said entities, comprising:
a. storing each said entity in a data structure or an object, henceforth referred to as an entity object,
b. storing each said relationship amongst entities in a data structure or an object, henceforth referred to as a relationship object,
c. linking with pointers or references in both directions each said relationship object with the entities involved in the relationship represented by said relationship object,
d. providing users or client programs with convenient and direct means of creating said entity objects and relationship objects without having to simulate or create said objects from other types of records.
9. The method of claim 8 wherein said entities of like kind are grouped together into entity sets, and said relationships of like kind are grouped together into relationship sets.
10. The method of claim 9 wherein, in each said entity, there exist links between the entity and all the relationships in which said entity is involved, and associated with each said link is a means by which the entity's role in the relationship can be identified.
11. The method of claim 8 wherein a query language is provided as means for find (search) operations taking attributes and entity sets as parameters, such that it is not necessary to traverse the entire database or traverse entities in irrelevant entity sets to answer queries.
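
For illustration, the role-identification structure recited in claims 4 to 6, a hash of hashes of dynamically-allocated arrays, could be sketched in Python as below, together with the two-way linking of claim 8. The class names and the method `add` are hypothetical, and `collections.defaultdict` is simply one convenient way to build the nested hashes; the sketch is not the claimed method itself.

```python
from collections import defaultdict


class Entity:
    def __init__(self, key):
        self.key = key
        # Outer hash: keyed by relationship set name (claim 4).
        # Inner hash: keyed by role index (claim 5).
        # Array: the relationships, in no particular order (claim 6).
        self.links = defaultdict(lambda: defaultdict(list))


class Relationship:
    def __init__(self, set_name, arity):
        self.set_name = set_name
        self.roles = [[] for _ in range(arity)]

    def add(self, role, entity):
        """Link in both directions: relationship to entity and back."""
        self.roles[role].append(entity)
        entity.links[self.set_name][role].append(self)


ada, cs101 = Entity("Ada"), Entity("CS101")
enrolled = Relationship("enrollment", arity=2)
enrolled.add(0, ada)
enrolled.add(1, cs101)
```

With this layout, finding every enrollment in which an entity plays role 0 is two hash lookups followed by an array read, with no traversal of unrelated relationship sets.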
US12/490,270 2008-06-24 2009-06-23 Intentionally-Linked Entities: A General-Purpose Database System Abandoned US20090319564A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US12/490,270 US20090319564A1 (en) 2008-06-24 2009-06-23 Intentionally-Linked Entities: A General-Purpose Database System

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US7518908P 2008-06-24 2008-06-24
US12/490,270 US20090319564A1 (en) 2008-06-24 2009-06-23 Intentionally-Linked Entities: A General-Purpose Database System

Publications (1)

Publication Number Publication Date
US20090319564A1 (en) 2009-12-24

Family

ID=41432335

Family Applications (1)

Application Number Title Priority Date Filing Date
US12/490,270 Abandoned US20090319564A1 (en) 2008-06-24 2009-06-23 Intentionally-Linked Entities: A General-Purpose Database System

Country Status (1)

Country Link
US (1) US20090319564A1 (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5740440A (en) * 1995-01-06 1998-04-14 Objective Software Technology Dynamic object visualization and browsing system
US5870751A (en) * 1995-06-19 1999-02-09 International Business Machines Corporation Database arranged as a semantic network
US6633886B1 (en) * 1998-03-06 2003-10-14 Kah Yuen John Francis Chong Method of implementing an acyclic directed graph structure using a relational data-base
US7039948B2 (en) * 2001-03-06 2006-05-02 Hewlett-Packard Development Company, L.P. Service control manager security manager lookup
US7882132B2 (en) * 2003-10-09 2011-02-01 Oracle International Corporation Support for RDBMS in LDAP system
US7333986B2 (en) * 2004-03-31 2008-02-19 Kabushiki Kaisha Toshiba Hierarchical database management system, hierarchical database management method, and hierarchical database management program
US7483920B2 (en) * 2004-11-12 2009-01-27 International Business Machines Corporation Database management system, database management method, and program
US20080092068A1 (en) * 2006-02-06 2008-04-17 Michael Norring Method for automating construction of the flow of data driven applications in an entity model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A. Burns. The relationship detector: Uncovering hidden relationships in object-oriented programs. Technical Report 550, ETH Zurich, October 2006. Masterthesis, pp 1-52. *
J. Rumbaugh. Relations as Semantic Constructs in an Object-Oriented Language, OOPSLA '87 Proceedings, 1987, pp. 466-481. *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120272225A1 (en) * 2011-04-25 2012-10-25 Microsoft Corporation Incremental upgrade of entity-relationship systems
US8607217B2 (en) * 2011-04-25 2013-12-10 Microsoft Corporation Incremental upgrade of entity-relationship systems

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION