WO2003107142A2 - Data movement platform - Google Patents

Publication number
WO2003107142A2
Authority
WO
WIPO (PCT)
Prior art keywords
data
data storage
movement
component
component assembly
Prior art date
Application number
PCT/US2003/019350
Other languages
French (fr)
Other versions
WO2003107142A3 (en)
Inventor
Dan Arnon
Amos Benninga
David Chase
Michael Condict
Mathieu GAGNÉ
Kaleb Keithley
Andrew Shultz
Rostislav Vavrick
Simon Zaslavsky
Original Assignee
Oryxa, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oryxa, Inc.
Priority to AU2003247560A (AU2003247560A1)
Publication of WO2003107142A2
Publication of WO2003107142A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/36 Software reuse
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F2003/0697 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers device management, e.g. handlers, drivers, I/O schedulers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems

Definitions

  • the present invention relates generally to data movement and, in particular, to a data movement language and a run-time execution environment for data movement.
  • Data movement is typically handled by specialized hardware devices such as memory chips, disk drives, direct memory access (DMA) controllers, and network and protocol driver chips. Access to these specialized hardware devices is typically made through an operating system (OS) and/or through firmware that is tailored to these devices. Networks and various protocol driver chips can also move data between different devices as long as the devices understand a common protocol.
  • the meaning of data is often not provided to the recipient (i.e., the data store).
  • the data store is treated like a "peripheral" of a system. For example, if a server is connected to a storage array, all of the data processing occurs on the server while none of the processing takes place on the storage array.
  • a storage array often has processing power, however, that meets or even surpasses the processing power of the data processing device (e.g., the server). Thus, the processing power of the storage array is frequently not utilized during data movement.
  • the code is developed for and loaded onto the specific piece of hardware.
  • the program may cause the hardware to stop executing if, for instance, the code has an error in it.
  • the company might, for example, bring in a consultant to analyze and fix the problem or send the hardware back to the manufacturer for repair. Either can result in increased costs and delay.
  • the present invention provides a data movement platform enabling development and run-time execution of data movement software.
  • This platform provides developers with a framework that enables reusability and provides distributed software components for data movement.
  • the software is extensible to any number of devices.
  • the software can also execute in a simulated environment (e.g., without devices), thereby enabling testing while significantly reducing the need for using expensive storage hardware.
  • the invention relates to a system for data movement.
  • the system includes a data storage application, a data storage network, multiple components, a component assembly, and a run-time execution environment. Each component implements data movement functionality.
  • the components are assembled into the component assembly to embody the data storage application.
  • the run-time execution environment executes the data storage application embodied in the component assembly by implementing the data movement functionality within the data storage network.
  • the data storage network includes a data storage array, a data storage switch, and/or a geographically dispersed network of switches and arrays.
  • the run-time execution environment executes on a data storage node or endpoint within the data storage network.
  • the run-time execution environment can also include a class repository for accepting classes, an administrative module for administering the component assembly, a fast path executor for executing input/output (I/O) requests, and/or a data path module to optimize data movement within the data storage network by optimizing the execution of the data storage application.
  • the run-time execution environment executes on multiple hardware platforms, so as to enable data movement across various kinds of hardware.
  • the run-time execution environment may also include a services module and a querying module.
  • the services module enables a data service for a component in the component assembly and/or the component assembly.
  • the data service includes logging, locking, a time service, an event service, and a shared data service.
  • the querying module can query a component in the component assembly and/or the component assembly.
  • the invention includes a method for executing a data storage application within a data storage network. The method includes implementing data movement functionality in components, assembling components into a component assembly that embodies the data storage application, and executing the data storage application by implementing the data movement functionality using a run-time execution environment.
  • the method also includes loading a component assembly, deploying a component assembly, staging a component assembly, and/or removing a component assembly.
  • the method may also include optimizing the execution of the data storage application within the data storage network.
  • the executing step can also include installing the component assembly on a data storage endpoint / node.
  • the invention includes a data movement class having data storage objects and a router.
  • the router is a declaration, such as an Extensible Markup Language (XML) declaration, that specifies movement of data between the data storage objects.
  • the router can specify, for example, a variable movement of data, a fixed movement of data, and a mixed fixed / variable movement of data.
  • the class can also include functional hooks that enable modification of the movement of data that is otherwise specified by the router.
  • the functional hooks can include one or more of event hooks, control hooks, and run-time execution environment hooks.
  • each data storage object is either an imported data storage object, an exported data storage object, or a private data storage object.
  • a private data storage object is a data storage object that only an administrator of the class can view / update.
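  • As an illustration only, the following minimal sketch shows how a router expressed as an XML declaration might be parsed and checked. The element names (`class`, `object`, `router`, `route`), the attribute names, and the object names are assumptions made for this sketch; the text above does not define a concrete XML syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical class declaration. The element and attribute names are
# assumptions made for this sketch, not syntax defined by the source.
ROUTER_XML = """
<class name="Mirror">
  <object name="primary" kind="imported"/>
  <object name="secondary" kind="imported"/>
  <object name="volume" kind="exported"/>
  <object name="journal" kind="private"/>
  <router>
    <route from="volume" to="primary"/>
    <route from="volume" to="secondary"/>
  </router>
</class>
"""

# Each data storage object is imported, exported, or private.
VALID_KINDS = {"imported", "exported", "private"}


def parse_class(xml_text):
    """Return (objects, routes) parsed from a class declaration."""
    root = ET.fromstring(xml_text)
    objects = {}
    for obj in root.findall("object"):
        kind = obj.get("kind")
        if kind not in VALID_KINDS:
            raise ValueError(f"unknown object kind: {kind!r}")
        objects[obj.get("name")] = kind
    routes = []
    for route in root.findall("router/route"):
        src, dst = route.get("from"), route.get("to")
        # A route may only connect objects declared by the class.
        if src not in objects or dst not in objects:
            raise ValueError(f"route refers to undeclared object: {src}->{dst}")
        routes.append((src, dst))
    return objects, routes


objects, routes = parse_class(ROUTER_XML)
```

  • In this sketch the router is purely declarative: it lists which declared objects data moves between, and the parser rejects any route that names an undeclared object.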
  • the invention includes a system for developing a component assembly.
  • the system includes data movement classes, components that each implement data movement functionality by referencing one or more data movement classes, and a development module that assembles the components into a component assembly embodying a data storage application.
  • each class can include data storage objects and a router specifying movement of data between the objects.
  • the specification can further include a matching between the objects.
  • the development module includes an acyclic graph of the data movement classes in a format that enables the assembly of the component assembly from components referring to the classes.
  • the acyclic graph can also include a locator map specifying an execution location of a class.
  • the system can also include a test harness to test the classes, components, and/or component assembly.
  • the invention includes a method for moving data that includes the steps of obtaining a data movement program written in a programmable data movement language and executing the data movement program to move data between a first and second data storage object.
  • the programmable data movement language is independent of the data storage objects and includes a router that maps the first data storage object to the second data storage object.
  • the data movement program includes instructions to move data through specific types of routers. These instructions may be the result of a scan of data storage objects.
  • the invention can also include a system for moving data having a first and second data storage object, a programmable data movement language that is independent of the first data storage object, the second data storage object, and the system, and a run-time execution environment enabling data movement between the first and second data storage objects using the programmable data movement language.
  • the system includes a data mover to move the data between the first data storage object and the second data storage object.
  • the system can include a data movement program written in the programmable data movement language specifying the data movement between the data storage objects.
  • the run-time execution environment can include a compiler to compile the data movement program and a control path module that enables dynamic data movement between the data storage objects.
  • the control path module uses hooks to dynamically vary the data movement and/or the data movement program.
  • Fig. 1 is a block diagram of an embodiment of a data storage network having a run-time execution environment on each of four nodes.
  • Fig. 2 is a block diagram of an embodiment of the data storage network of Fig. 1 having components assembled into a component assembly.
  • Fig. 3 is a flow diagram illustrating an embodiment of the steps performed to execute a data storage application in the data storage network of Fig. 1.
  • Fig. 4 is a more detailed block diagram of an embodiment of the component assembly of Fig. 2 and components of Fig. 2 referring to one or more data movement classes.
  • Fig. 5 is a block diagram of an embodiment of a development module having a directed, acyclic component graph of the data movement classes of Fig. 4.
  • Fig. 6 is a block diagram of an embodiment of a location tree of the data storage network of Fig. 1.
  • Fig. 7 is a more detailed block diagram of an embodiment of a data movement class of Fig. 4.
  • Fig. 8 is a more detailed block diagram of an embodiment of the run-time execution environment of Fig. 1.
  • Fig. 9 is a block diagram illustrating an embodiment of a data movement flow within the run-time execution environment of Fig. 8.
  • Fig. 10A is a flow diagram illustrating an embodiment of management tasks performed by the run-time execution environment of Fig. 9.
  • Fig. 10B is a flow diagram illustrating an embodiment of the steps performed by the run-time execution environment of Fig. 9 to execute client I/O requests.
  • Fig. 11 illustrates a more detailed block diagram of an embodiment of the techniques that the run-time execution environment uses to service I/O requests.

Detailed Description

  • a data storage network 104 includes a first node 108a, a second node 108b, a third node 108c, and a fourth node 108d (generally 108) connected via a fabric 112.
  • the data storage network 104 is a network that operates on data.
  • the data storage network 104 is a network that facilitates the transmission, processing, storage and/or retrieval of data between client applications (i.e., clients) and physical storage resources. Examples of client applications include databases, mail servers and web servers.
  • the client applications can execute on one or more devices (e.g., computers) in communication with the data storage network 104.
  • the client is a user of the data storage network 104 and/or node 108.
  • Examples of the data storage network 104 include a local area network (LAN), a wide area network (WAN), a Fibre Channel network, and the like.
  • the fabric 112 enables communication between the nodes 108 or some subset of the nodes 108 in the data storage network 104.
  • the fabric 112 is a switch with multiple ports that routes packets of information received by an input port to an output port.
  • the fabric 112 includes multiple switches that connect to one another through cables, such as optical cables.
  • the physical storage resources of the data storage network 104 are the physical locations at which the data resides.
  • the physical storage resources can include one or more nodes 108.
  • Examples of a node 108 include, but are not limited to, a hard disk, a storage controller (e.g., having one or more central processing units (CPUs) and memory), a server, a collection of disks and storage controllers (i.e., disk arrays), a fabric packaged together with a collection of storage controllers (i.e., a storage switch), and a collection of storage controllers without disks or fabrics (i.e., diskless storage appliance).
  • Each node 108 has an input and output port and connects to the fabric 112 via a respective cable 116a, 116b, 116c, 116d (generally 116) or other communications media.
  • one or more of the nodes 108 communicate with the fabric 112 via a wireless communication link.
  • the data storage network 104 communicates using one or more data transmission protocols, such as Ethernet, TCP/IP, SCSI, iSCSI, Fibre Channel, and Infiniband.
  • the nodes 108 can be connected to any number of fabrics 112.
  • the node 108 is a controller that includes a CPU or multiple CPUs. Moreover, the CPUs of the node 108 can be organized in a Symmetric Multi Processor (SMP) configuration. Such SMP configuration of a node 108 enables a collection of CPUs and memories to behave as a single CPU with memory.
  • the node 108 may also include a programmable chip that performs storage functionality on a storage controller or in the fabric 112.
  • the nodes 108 also include storage software that executes on the CPUs of one or more nodes 108 and provides storage functionality.
  • the storage software enables a node 108 (e.g., a storage controller) to present simulated disks to clients that are different than the disks that the controller actually controls.
  • An example of such software would be storage virtualization software.
  • each run-time execution environment 120 is an environment enabling the execution of a data storage application within the data storage network 104 (e.g., with its respective node 108).
  • the run-time execution environment 120 enables efficient data movement between data storage objects and enables the implementation of a programmable data storage language to generate a data movement program.
  • the run-time execution environments 120 on the nodes 108 within a group together form a distributed data storage network run-time execution environment 124 for the data storage network 104.
  • a number of nodes 108 can be viewed as one.
  • the description below for the run-time execution environment 120 on a node 108 also applies to the data storage network run-time execution environment 124.
  • the run-time execution environment 120 enables the execution of one or more data storage applications 208.
  • the data storage application 208 performs a data movement operation on one or more pieces of data using data movement functionality.
  • Exemplary data movement functions include mirroring, caching, striping, remote replication, point in time duplication, snapshot, and partitioning.
  • the run-time execution environment 120 uses this functionality to execute the data storage application 208.
  • the run-time execution environment 120 employs one or more components (e.g., a first component 212a, a second component 212b, and any number of additional components 212c (generally 212)).
  • each component 212 enables an independent data movement function by referring to a data movement class and instantiating the class.
  • the run-time execution environment 120 uses the reference to determine which class to instantiate for the purpose of deploying this component.
  • the data movement class may be loaded into the run-time execution environment 120 before the run-time execution environment 120 executes the data storage application 208 or during execution of the data storage application 208.
  • a group of components 212 act together to perform the data storage operation of the application 208.
  • This group of components is a component assembly 216 and embodies the data storage application 208.
  • the run-time execution environment 120 enables the building of the component assembly 216 from the components 212.
  • the component assembly 216 describes how the components 212 work together to achieve the data movement operation of the data storage application 208.
  • the description can also describe deployment characteristics of the components 212.
  • the component assembly 216 can describe how the components 212 in the component assembly 216 are deployed on different run-time execution environments 120 residing on different nodes 108.
  • the run-time execution environment 120 can load the component assembly 216 during run-time of the data storage application 208.
  • the run-time execution environment 120 instantiates and/or deploys the components 212 within a component assembly 216.
  • the run-time execution environment 120 can also initialize the components 212 in the component assembly 216. In one embodiment, the initialization depends on the component assembly 216.
  • the run-time execution environment 120 can analyze the component assembly 216, determine the data storage application 208 that the assembly 216 embodies, and adjust the initialization of one or more of the components 212 based on the component assembly 216.
  • the run-time execution environment 120 then executes the component assembly 216 to perform the data storage application 208 on one or more nodes 108.
  • the run-time execution environment 120 can also shut down (i.e., de-stage) one or more components 212. This may depend on the component assembly 216 or the node 108 on which the component assembly 216 executes. In one embodiment, the shutting down of a component 212 depends on the component assembly 216.
  • the component 212 has a public state and a private state.
  • the public state is available to the run-time execution environment 120 and includes the condition of the component. Examples of the condition of the component 212 include whether the component 212 is being loaded, deployed, staged, de-staged, and/or whether the component 212 can accept input/output (I/O).
  • the public state changes based on the run-time execution environment 120 and the component 212. For example, if the run-time execution environment 120 deploys the component 212, the public state of the component 212 changes to represent that the component 212 is deployed.
  • the private state includes the parameters of the component.
  • the parameters can include the content or function of the component or whether the component is on or off.
  • the private state can include a description that the component 212 performs remote mirroring or remote mirroring in-sync to local mirroring.
  • the developer of the component 212 (e.g., the administrator)
  • the component assembly 216 can also have a public and private state.
  • the private state of the component assembly 216 can include, for example, a description of the data storage application 208 that the component assembly 216 embodies.
  • the private state can also include a list of the components 212 that make up the component assembly 216 and the parameters of the components 212.
  • the state of the component assembly 216 can additionally include how the components 212 are connected to create the component assembly 216.
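  • The split between public and private state can be sketched as follows. This is a minimal illustration under stated assumptions: the names `PublicState`, `Component`, and `RuntimeEnvironment` are hypothetical, and the rule that only a staged component accepts I/O is an assumption of the sketch rather than something the text specifies.

```python
from dataclasses import dataclass, field
from enum import Enum


class PublicState(Enum):
    """Conditions of a component that the run-time environment can see."""
    LOADED = "loaded"
    DEPLOYED = "deployed"
    STAGED = "staged"
    DESTAGED = "de-staged"


@dataclass
class Component:
    name: str
    # Private state: parameters of the component (content, function, on/off).
    private_state: dict = field(default_factory=dict)
    # Public state: the condition visible to the run-time environment.
    public_state: PublicState = PublicState.LOADED

    def accepts_io(self):
        # Assumption for this sketch: only a staged component accepts I/O.
        return self.public_state is PublicState.STAGED


class RuntimeEnvironment:
    """Toy stand-in for the run-time execution environment 120."""

    def deploy(self, component):
        component.public_state = PublicState.DEPLOYED

    def stage(self, component):
        component.public_state = PublicState.STAGED

    def destage(self, component):
        component.public_state = PublicState.DESTAGED


mirror = Component("mirror-0", {"function": "remote mirroring", "enabled": True})
env = RuntimeEnvironment()
env.deploy(mirror)
```

  • Here the environment changes only the public state; the private parameters stay under the control of whoever constructed the component.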
  • the run-time execution environment 120 provides a robust environment for the execution of the component assembly 216, such as by ensuring optimal component assembly execution even if one or more nodes 108 are dysfunctional or damaged.
  • the run-time execution environment 120 enables the component assembly 216 to function at its optimal level whenever possible by redirecting work from the damaged node 108a to other nodes 108 that have the same components 212 deployed on them, if these nodes 108 exist.
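  • The redirection described above can be sketched as a lookup of the healthy nodes that already have the same component deployed. The function name, the dictionary layout, and the sample deployment are assumptions made for illustration.

```python
def redirect_work(deployments, failed_node, component):
    """Return the healthy nodes that can take over `component`.

    `deployments` maps a node name to the set of component names
    deployed on that node. The result is empty if no such node exists.
    """
    return sorted(
        node
        for node, components in deployments.items()
        if node != failed_node and component in components
    )


# Hypothetical deployment of two components across the four nodes 108.
deployments = {
    "108a": {"mirror", "cache"},
    "108b": {"mirror"},
    "108c": {"cache"},
    "108d": {"mirror", "cache"},
}
candidates = redirect_work(deployments, "108a", "mirror")
```

  • An empty result corresponds to the case in which no other node has the component deployed, so no redirection is possible.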
  • an administrator determines a particular data storage application 208 that the administrator wants to execute on one or more nodes 108 (step 304).
  • the administrator may want to deploy a data storage application 208 that provides synchronous remote mirroring between the first node 108a and a storage array consisting of the second node 108b, the third node 108c, and the fourth node 108d for ten logical volumes.
  • This data storage application 208 requires at least two pieces of data storage functionality - mirroring functionality and caching functionality.
  • the administrator develops the application 208 with particular components 212 that implement these data storage functions (step 308). For instance, the first component 212a provides the mirroring functionality and the second component 212b provides the caching functionality.
  • ten mirroring components 212a are needed (i.e., one for each volume) and two caching components are needed (i.e., one for the first node 108a and one for the storage array).
  • the administrator then uses these components 212a, 212b to generate the application 208. The administrator does this by connecting the components 212a, 212b in a fashion that is appropriate for the requisite data storage application 208.
  • the administrator assembles the components 212a, 212b into the component assembly 216 to embody the data storage application 208 (step 312).
  • the administrator can also describe the deployment characteristics of the components, such as to deploy each of the mirroring components 212a on the nodes 108b, 108c, and 108d of the storage array, deploy the first caching component 212b on the nodes 108b, 108c, and 108d of the storage array, and deploy the second caching component 212b on the first node 108a (i.e., the storage appliance).
  • the run-time execution environment 120 subsequently executes the data storage application 208 to perform the synchronous remote mirroring between the storage array and the storage appliance (step 316).
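  • The remote mirroring example above can be written down as a small assembly description. This is a sketch only: the dictionary keys, the component names, and the `build_assembly` helper are hypothetical, while the counts and deployment targets follow the example (one mirroring component per volume, one caching component for the storage array, and one for the first node 108a).

```python
ARRAY_NODES = ["108b", "108c", "108d"]   # nodes of the storage array
APPLIANCE_NODES = ["108a"]               # the first node (storage appliance)


def build_assembly(volume_count):
    """Describe the components of the example and where each is deployed."""
    components = []
    # One mirroring component per logical volume, deployed on the array.
    for volume in range(volume_count):
        components.append({
            "name": f"mirror-{volume}",
            "function": "mirroring",
            "deploy_on": ARRAY_NODES,
        })
    # One caching component for the array and one for the appliance.
    components.append({
        "name": "cache-array",
        "function": "caching",
        "deploy_on": ARRAY_NODES,
    })
    components.append({
        "name": "cache-appliance",
        "function": "caching",
        "deploy_on": APPLIANCE_NODES,
    })
    return {
        "application": "synchronous remote mirroring",
        "components": components,
    }


assembly = build_assembly(10)
```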
  • Each component 212 refers to one or more data movement classes (e.g., the first data movement class 404a, the second data movement class 404b, and/or any additional data movement classes 404c (generally 404)). Any number of classes 404 can be present.
  • the reference to a class 404 is a pointer to the class 404 and is shown as a corresponding reference arrow 408a, 408b, 408c, and 408d (generally 408).
  • one component (e.g., the first component 212a)
  • each component 212 can also refer to locations of the class 404.
  • the development module 412 creates the component assembly 216 from the components 212.
  • the development module 412 is a software module that resides on one or more computing devices.
  • the development module 412 can reside on a personal computer (e.g., 286, 386, 486, Pentium, Pentium II, Macintosh computer), Windows-based terminal, network computer, wireless device, information appliance, RISC Power PC, X-device, workstation, mini computer, main frame computer, personal digital assistant, node 108, or other computing device that can communicate with a node 108.
  • the development module 412 enables an administrator to use the data movement classes 404 to create the component assembly 216.
  • the development module 412 retrieves one or more data movement classes 404 (and/or components 212 / component assemblies 216) from a network (e.g., the World Wide Web or the Internet).
  • the development module 412 can provide data movement classes 404 (and/or components 212 / component assemblies 216) for other administrators to access over a network, such as via a web service.
  • each class 404 has one or more storage objects (e.g., a first storage object 416a, 416b, 416c (generally 416) and a second storage object 420a, 420b, 420c (generally 420)).
  • the storage object 416, 420 is an abstract repository of user data.
  • the storage object 416, 420 is a data structure, such as an array, a list, or a linked list.
  • the storage object 416, 420 can have a size (i.e., an amount of data that the storage object 416, 420 can store), a data type, the ability to store and retrieve data, the ability to report problems with the correctness or availability of the data, the ability to be managed (e.g., through a management interface), or any combination or subset of these characteristics.
  • the first storage object 416 is an imported storage object and the second storage object 420 is an exported storage object.
  • the imported storage object 416 can be a repository of data external to the class 404 that an administrator (e.g., a class developer) assumes to be given when developing a class.
  • the exported storage object 420 is a virtual repository of data that the class 404 creates by using external storage objects.
  • classes 404 other than the class 404 that created the exported storage object 420 can use the exported storage object 420.
  • the first class 404a creates an exported storage object 420a from an imported storage object 416.
  • the second class 404b can now use the exported storage object 420a as its repository of data.
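  • The way one class's exported storage object becomes another class's repository can be sketched with two toy classes. All class and method names here are hypothetical, the storage is an in-memory list, and in this simplification each class instance itself plays the role of its exported storage object.

```python
class StorageObject:
    """In-memory repository of blocks, standing in for a storage object."""

    def __init__(self, size):
        self.size = size
        self.blocks = [b""] * size

    def write(self, index, data):
        self.blocks[index] = data

    def read(self, index):
        return self.blocks[index]


class MirrorClass:
    """Exports one storage object backed by two imported storage objects."""

    def __init__(self, first, second):
        if first.size != second.size:
            raise ValueError("mirrored objects must have the same size")
        self.imported = (first, second)
        self.size = first.size

    def write(self, index, data):
        # Every write goes to both imported objects.
        for obj in self.imported:
            obj.write(index, data)

    def read(self, index):
        return self.imported[0].read(index)


class CacheClass:
    """Uses another class's exported object as its repository of data."""

    def __init__(self, backing):
        self.backing = backing
        self.size = backing.size
        self.cache = {}

    def write(self, index, data):
        self.cache[index] = data
        self.backing.write(index, data)  # write-through to the backing object

    def read(self, index):
        if index not in self.cache:
            self.cache[index] = self.backing.read(index)
        return self.cache[index]


disk_a, disk_b = StorageObject(4), StorageObject(4)
mirrored = MirrorClass(disk_a, disk_b)   # first class: creates the exported object
cached = CacheClass(mirrored)            # second class: consumes it
cached.write(2, b"data")
```

  • Because both toy classes expose the same `read`/`write`/`size` shape as a plain storage object, they can be stacked in any order, which is the property the imported/exported distinction is meant to provide.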
  • the development module 412 includes a directed, acyclic component graph 504 having one or more vertices 508a, 508b, 508c (generally 508). Each vertex 508 of the graph 504 describes a component 212.
  • the description of the component 212 includes instantiation information.
  • the instantiation information can include the class reference 408 and construct parameters for the class 404.
  • the run-time execution environment 120 uses instantiation information to determine that the run-time execution environment 120 has to instantiate the first data movement class 404a to construct the first component 212a.
  • the run-time execution environment 120 also determines, from the instantiation information, other necessary parameters to construct the first component 212a. In one embodiment, these parameters are specific to the component 212 and may vary from component 212 to component 212, even components 212 referring to the same class 404. Examples of these parameters include the size and type of the component 212.
  • the description can also include deployment information for one or more components 212.
  • the deployment information includes a location reference.
  • the run-time execution environment 120 uses the deployment information to determine that a component 212 requires a class 404 to be loaded on the nodes 108 that belong to a particular location.
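  • A vertex description carrying instantiation and deployment information could look like the following sketch. The repository dictionary, the key names (`class_ref`, `construct_params`, `location_ref`), and the sample location are assumptions made for illustration, not a format defined by the text.

```python
# Hypothetical class repository: class name -> class object.
CLASS_REPOSITORY = {}


def register(cls):
    """Add a data movement class to the (hypothetical) class repository."""
    CLASS_REPOSITORY[cls.__name__] = cls
    return cls


@register
class Mirror:
    def __init__(self, size, kind):
        self.size = size
        self.kind = kind


# Hypothetical location: a named set of nodes.
LOCATIONS = {"array": ["108b", "108c", "108d"]}

# Vertex description: instantiation information plus deployment information.
VERTEX = {
    "class_ref": "Mirror",                                   # class reference
    "construct_params": {"size": 1024, "kind": "block"},     # construct parameters
    "location_ref": "array",                                 # location reference
}


def instantiate(vertex):
    """Resolve the class reference and construct the component."""
    cls = CLASS_REPOSITORY[vertex["class_ref"]]
    return cls(**vertex["construct_params"])


def nodes_requiring_class(vertex, locations):
    """Return the nodes on which the referenced class must be loaded."""
    return list(locations[vertex["location_ref"]])


component = instantiate(VERTEX)
load_targets = nodes_requiring_class(VERTEX, LOCATIONS)
```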
  • the instantiation of a data storage object 416, 420 of a class 404 is a facet of the component 212.
  • the imported data storage objects 416 are instantiated into a corresponding imported facet 512a, 512b, 512c, 512d, 512e (generally 512).
  • the exported data storage objects 420 are instantiated into a corresponding exported facet 516a, 516b, 516c, 516d (generally 516).
  • the imported facets 512 represent the actual storage that the component 212 consumes.
  • the exported facets 516 represent the actual storage that the component provides.
  • each facet 512, 516 has particular characteristics. These characteristics can depend on the instantiation information of the component 212 and/or on properties of the storage objects 416, 420 of the class 404. Exemplary characteristics include size and type of the facet 512, 516.
  • the component 212 imports the exported facet 516 of another component 212.
  • the second component 212b (as shown by the second vertex 508b) includes the second exported facet 516b.
  • the second component 212b makes the second exported facet 516b available for use by other components 212.
  • the first component 212a (as shown by the first vertex 508a) imports the second exported facet 516b into the second imported facet 512b of the first component 212a. This importing from the second vertex 508b to the first vertex 508a is shown in the component graph 504 with a first importation arrow (or "directed edge") 520a.
  • the second component 212b provides data storage that the first component 212a uses.
  • the first component's importation of the third exported facet 516c of the third component 212c (as shown by the third vertex 508c) into the first imported facet 512a of the first component 212a is shown with a second importation arrow 520b.
  • the development module 412 also includes a verifier 524.
  • the verifier 524 can check that the importation arrows 520a, 520b (generally 520) only connect facets 512, 516 of identical characteristics (e.g., size and type).
  • if the verifier 524 determines that one or more arrows 520 connect facets 512, 516 having different characteristics, then the verifier 524 rejects the component assembly 216 as embodied in the graph 504. Although shown as being within the development module 412, the verifier 524 can be external to the development module 412.
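  • The verifier's check can be sketched as follows. The `Facet` type, the argument layout, and the sizes used are hypothetical. The facet-matching check follows the description above; the additional cycle check is an assumption of this sketch, added because the graph is described as acyclic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Facet:
    size: int
    kind: str


def verify(facets, owner, edges):
    """Return True if the component graph is acceptable.

    facets: facet name -> Facet (its characteristics)
    owner:  facet name -> name of the component that has the facet
    edges:  list of (exported facet name, imported facet name)
    """
    # Each importation arrow must connect facets of identical characteristics.
    for exported, imported in edges:
        if facets[exported] != facets[imported]:
            return False

    # Assumption of this sketch: also reject graphs that contain a cycle.
    graph = {}
    for exported, imported in edges:
        graph.setdefault(owner[exported], set()).add(owner[imported])

    visiting, done = set(), set()

    def has_cycle(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(has_cycle(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return not any(has_cycle(node) for node in list(graph))


# The two importation arrows of the example graph, with made-up sizes.
facets = {
    "516b": Facet(100, "block"), "512b": Facet(100, "block"),
    "516c": Facet(200, "block"), "512a": Facet(200, "block"),
}
owner = {"516b": "212b", "512b": "212a", "516c": "212c", "512a": "212a"}
edges = [("516b", "512b"), ("516c", "512a")]
accepted = verify(facets, owner, edges)
```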
  • Fig. 6 shows a block diagram of a location tree 604 having a hierarchy of locations (e.g., the first through seventh location 608a-608g (generally 608)). Each location 608 is a set of nodes 108 (i.e., one node 108 or multiple nodes 108). In one embodiment, the development module 412 creates the location tree 604 to describe the data storage network 104.
  • In one embodiment, the location tree 604 has leaves which correspond to the simple locations with one node 108 (i.e., the first location 608a, second location 608b, third location 608c, and fourth location 608d).
  • the location tree 604 also includes compound locations 608 having more than one node 108 (i.e., the fifth location 608e, sixth location 608f, and seventh location 608g).
  • each compound location 608 contains all of the nodes 108 that are in its child locations 608 (i.e., the leaves connected to that compound location 608).
  • the run-time execution environment 120 and/or the development module 412 can change which (and how many) nodes 108 are in a location 608. This change may be at a predetermined time or in response to an event or occurrence.
  • the run-time execution environment 120 may remove the first node 108a from a location 608 and then add the second node 108b to the location 608 at the same time or a later time.
  • a location 608 represents a cluster of nodes 108 (e.g., CPUs). This cluster of nodes 108 enables distributed computation.
  • the run-time execution environment 120 uses the location tree 604 to describe the locations of deployment and execution of the class code of the components 212 in a component assembly 216.
  • the use of locations 608 can, for example, free the component assembly 216 from being directly associated with the physical data storage network 104, which in turn allows the set of nodes 108 in a location 608 to change without changing the location 608 itself, consequently making the component assembly 216 portable.
  • the locations 608 also enable an administrator to describe fault tolerance parameters of the data storage network 104.
  • the run-time execution environment 120 can adjust the location 608 so that the data storage application 208 embodied by the component assembly 216 continues to operate with little or no functional degradation. In one embodiment, the run-time execution environment 120 diverts the function of the non-functioning node 108 to other nodes 108 in the location 608.
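As a minimal sketch of the location tree 604, the following models a location as a set of its own nodes plus child locations. The `Location` class, its method names, and the particular tree shape (which leaves belong to which compound location) are assumptions made for illustration; the text does not specify them.

```python
class Location:
    """A location is a set of nodes; a compound location also holds child locations."""

    def __init__(self, name, nodes=(), children=()):
        self.name = name
        self._own = set(nodes)
        self.children = list(children)

    def nodes(self):
        # A compound location contains every node found in its child locations.
        result = set(self._own)
        for child in self.children:
            result |= child.nodes()
        return result

    def replace_node(self, old, new):
        # Swap the membership of a leaf location without changing the location itself.
        self._own.discard(old)
        self._own.add(new)


# Assumed tree shape: four single-node leaves under two compound locations and a root.
loc_a = Location("608a", nodes={"node-1"})
loc_b = Location("608b", nodes={"node-2"})
loc_c = Location("608c", nodes={"node-3"})
loc_d = Location("608d", nodes={"node-4"})
loc_e = Location("608e", children=[loc_a, loc_b])
loc_f = Location("608f", children=[loc_c, loc_d])
loc_g = Location("608g", children=[loc_e, loc_f])

before = loc_g.nodes()
loc_a.replace_node("node-1", "node-5")   # e.g. diverting work away from a failed node
after = loc_g.nodes()
```

Because components would refer to a location object rather than to specific nodes, the node swap above changes the result of `nodes()` without any change to the referring component, which is the portability property described in the text.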
  • the data movement class 404 creates complex storage objects (e.g., an exported storage object) from any number of simpler storage objects (e.g., imported storage objects).
  • the data movement class 404 creates a complex storage object from a first imported data storage object 702a.
  • the class 404 uses the imported storage object 702a or objects (e.g., the first imported data storage object 702a, a second imported data storage object 702b, and/or a third imported data storage object 702c (generally 702)) as building blocks for creating the exported data storage object 704.
  • the data movement class 404 includes a declarative part that includes a declaration of storage objects 702, 704 and a declaration of one or more routers 706.
  • the data movement class 404 can also include a procedural part that includes one or more functional hooks 708 associated with the router 706.
  • the class 404 connects any number of data storage objects, such as the first imported data storage object 702a and the exported data storage object 704, with the router 706.
  • the router 706 is a mapping that describes how the data layout in the exported storage object 704 is related to the data layout in the first imported data storage object 702a.
  • the mapping, or description, 712 can be fixed, dynamic, or partly fixed and partly dynamic.
  • a fixed mapping 712 (or portion of mapping) states explicit relationships between sections of data in the exported storage objects 704 and sections of data in the imported storage objects 702.
  • a dynamic mapping 712 (or portion of mapping) states an algorithm for run-time whenever the mapping 712 from the exported storage objects' data layout to the imported storage objects' data layout needs to be determined for sections of data not covered by the fixed mapping 712.
  • the algorithm can depend on, for example, the type of router 706 used. Further, the algorithm can involve executing the functional hook 708 or hooks 708 that take data section descriptions as parameters.
  • the data movement class 404 can implement mirroring using the first imported data storage object 702a, the second imported data storage object 702b, and the exported data storage object 704.
  • These data storage objects 702, 704 are data storage objects of the same data structure, such as a one dimensional array of 1 million blocks.
  • the router 706 contains fixed mappings in the data movement description 712 for write requests into the exported data storage object 704 and dynamic mappings with a first functional hook 716 for read requests out of the exported data storage object 704.
  • the fixed mapping for write requests specifies, for instance, that the router 706 translates any write request to a first (e.g., predetermined) address in the exported data storage object 704 to a write request to the same address in both the first imported data storage object 702a and the second imported data storage object 702b.
  • the data written to the exported data storage object 704 is mirrored in both the first and second imported data storage objects 702a, 702b.
  • the dynamic mapping specifies, for instance, that any read request to read from the first address of the exported data storage object 704 is processed by an algorithm.
  • the algorithm calls the first functional hook ("F") 716 with the first address ("A") as a parameter of the function call.
  • the result of the algorithm F(A) is a bit that determines whether the data is read from the first address in the first imported data storage object 702a or from the first address in the second imported data storage object 702b. Therefore, the dynamic mapping results in a variable mirror service policy.
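The mirroring example can be sketched as follows, for illustration only. The `MirrorRouter` class, the use of Python lists as stand-ins for the storage objects 702a and 702b, and the parity-based hook are all hypothetical; the text only requires that writes go to both imported objects and that a hook F(A) returns a bit selecting the object to read from.

```python
class MirrorRouter:
    """Fixed mapping for writes (both mirrors); dynamic mapping for reads (hook F)."""

    def __init__(self, num_blocks, read_hook):
        self.first = [None] * num_blocks    # stands in for imported object 702a
        self.second = [None] * num_blocks   # stands in for imported object 702b
        self.read_hook = read_hook          # functional hook F, returns 0 or 1

    def write(self, address, data):
        # Fixed mapping: the same address is written in both imported objects.
        self.first[address] = data
        self.second[address] = data

    def read(self, address):
        # Dynamic mapping: F(A) yields a bit that selects the mirror to read from.
        bit = self.read_hook(address)
        source = self.second if bit else self.first
        return source[address]


def parity_hook(address):
    """Hypothetical mirror service policy: odd addresses are read from the second mirror."""
    return address % 2


router = MirrorRouter(num_blocks=16, read_hook=parity_hook)
router.write(7, b"payload")
value = router.read(7)
```

Replacing `parity_hook` with a different function changes the read policy without touching the write path, which is the sense in which the dynamic mapping yields a variable mirror service policy.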
  • the run-time execution environment 120 includes several modules that perform the functions of the run-time execution environment 120.
  • the run-time execution environment 120 includes an administrative module 804, a querying module 808, a fast path executor 812, a control path module 816, a class repository 820, a services module 824, a data path module 828, and a porting module 832. Although shown as including all of these modules 804-832, the run-time execution environment 120 can include any subset of these modules 804-832.
  • Although the modules 804-832 are illustrated as being within a single run-time execution environment 120, one or more of the modules 804-832 can be within one or more run-time execution environments 120. Additionally, the modules 804-832 can be written in any programming language and/or scripting language, such as, but not limited to, Java, Visual Basic, C++, and the like.
  • the administrative module 804 enables the management of the run-time execution environment 120.
  • the management of the run-time execution environment 120 can include loading, unloading, and managing the component assembly 216 in the run-time execution environment 120.
  • an administrator uses the administrative module 804 to manage the run-time execution environment 120, such as via a client computer communicating with the data storage network 104 over a network.
  • the administrative module 804 can provide an interface to the administrator, such as a display screen with selection options and buttons (e.g., enabling the administrator to select a component assembly 216).
  • the querying module 808 enables the management of the private states of the component 212 or component assembly 216.
  • an administrator or computer program can use the querying module 808 to query the component 212 or component assembly 216 and determine the parameters of the component 212, such as what function the component 212 performs.
  • the querying module 808 also enables the management of the public state of the component 212 or component assembly 216.
  • the fast path executor 812 is a gateway between the run-time execution environment 120 and the data storage clients that the data storage application serves.
  • the fast path executor 812 services requests received from data storage clients.
  • the fast path executor 812 also communicates with a requesting data storage client on the status of the request (e.g., successfully completed, failed attempt, or still processing).
  • the fast path executor 812 forwards the client request to the control path module 816.
  • the control path module 816 executes the client request. Further, the control path module 816 executes other I/O requests. In one embodiment, the functional hook 708 also initiates I/O requests. Moreover, the control path module 816 can also execute any other type of hook, such as a hook initiated by the run-time execution environment 120.
  • the class repository 820 is a storage repository (e.g., database) of data movement classes 404.
  • the instantiation information of a component 212 refers to a class 404 in the class repository 820.
  • When the administrative module 804 loads and/or deploys a component assembly 216, the administrative module 804 refers to the class repository 820 to instantiate the class 404 on a node 108 where the component 212 is deployed.
  • the services module 824 provides services that are made available to the functional hooks 708. Examples of these services include distributed locking and distributed data sharing, distributed logging, and timing services.
  • the data path module 828 provides the data path service to the functional hooks 708. This service enables a functional hook 708 to change its router's mapping at run-time.
  • the porting module 832 provides hardware specific services to particular hardware platforms.
  • the type of porting module 832 depends on the hardware in the node 108 to which the run-time execution environment 120 is ported.
  • the porting module 832 can also provide a porting interface enabling an administrator to select the hardware services that the module 832 provides.
  • an independent computer e.g., laptop
  • the run-time execution environment 120 performs management tasks relating to components 212 and/or component assemblies 216.
  • management tasks include the class repository 820 receiving one or more new classes 904 (step 1004).
  • an administrator loads the new classes 904 into the class repository 820 (step 1008).
  • the management tasks can also include a loading step, a deployment step, a staging step, and/or a de-staging step for component assemblies 216.
  • the administrative module 804 receives one or more component assembly updates 908 (step 1012).
  • component assembly updates 908 can be a new component assembly 216 or a run-time update to an existing, staged component assembly 216.
  • the administrative module 804 instantiates the classes 404 indicated by the components 212 at the locations 608 indicated by the components 212 (step 1016).
  • the administrative module 804 performs class instantiations 912 using the class repository 820.
  • the administrative module 804 creates a compiled data movement digest 920 (step 1024).
  • the digest 920 is a data movement program that details the steps that the run-time execution environment 120 has to perform to process future client I/O requests 916.
  • the administrative module 804 can create the digest 920 by analyzing the routers 706 of the various components 212 in the component assembly 216.
  • the administrative module 804 can follow the flow of data inside each component 212 (i.e., from each exported facet 516 to its corresponding imported facet 512), through the component's routers, and between components (i.e., through importation arrows 520). During this analysis, if the administrative module 804 determines that one or more of the client I/O requests 916 will not call a functional hook 708, the administrative module 804 transmits this information to the fast path executor 812 via the digest 920.
  • When the fast path executor 812 receives the digest 920 and determines that the digest 920 does not include a client I/O request 916 that calls a functional hook 708, the fast path executor 812 can delegate the execution of these types of client I/O requests 916. For example, the fast path executor 812 can forward these types of client I/O requests 916 to hardware acceleration elements, such as storage processors. This occurs when the fast path executor 812 issues a data path hardware service request 924. Therefore, in one embodiment, a CPU in a node 108 executing the run-time execution environment 120 does not have to service these client I/O requests 916.
  • the administrative module 804 performs a staging step.
  • the administrative module 804 transitions the component assembly 216 to an online state so that the fast path executor 812 can execute the component assembly 216 in order to serve the client I/O requests 916 (step 1028).
  • the administrative module 804 executes run-time execution environment hooks 928 to stage the component assembly 216 (step 1032).
  • the run-time execution environment hooks 928 are functional hooks that the data movement class 404 provides to assist in the staging of component assemblies 216.
  • the run-time execution environment 120 uses the run-time execution environment hooks 928 to determine the parameters of each component 212 (e.g., type and size of the facets 512, 516) as part of the staging process. Thus, during the staging process, the run-time execution environment 120 communicates the component parameters to each component 212 (i.e., the run-time execution environment 120 fills in the form / layout that the class 404 provides for the parameters) by invoking the run-time execution environment hook 928. Moreover, in one embodiment the run-time execution environment 120 uses the run-time execution environment hooks 928 to determine which components 212 connect to other components 212. Thus, the run-time execution environment hooks 928 perform load-time adjustments of the components 212. In one embodiment, the run-time execution environment 120 executes the run-time execution environment hooks 928 starting with the components 212 at the leaves of the component graph 504 and traveling up the component graph 504.
  • the administrative module 804 can also de-stage a component assembly 216.
  • the administrative module 804 de-stages a component assembly 216 when the data storage network 104 or node 108 is shut-down or when the administrative module 804 receives a component assembly update 908 that replaces a part of the component assembly 216.
  • the administrative module 804 executes the run-time execution environment hooks 928 to de-stage a component assembly 216 (step 1036).
  • the run-time execution environment 120 executes the run-time execution environment hooks 928 beginning with the roots of the component graph 504 (e.g., the first vertex 508a) and traveling down the component graph 504 (i.e., opposite of the staging process).
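The two hook execution orders can be sketched as a post-order traversal of the component graph 504 and its reverse. This is an illustrative sketch only: the dictionary representation of the graph, the function names, and the assumption that the first vertex 508a imports from the second and third vertices 508b and 508c are not taken from the disclosure.

```python
def staging_order(graph):
    """Post-order traversal: a component is staged only after everything it imports from."""
    visited, order = set(), []

    def visit(vertex):
        if vertex in visited:
            return
        visited.add(vertex)
        for child in graph[vertex]:
            visit(child)
        order.append(vertex)

    for vertex in graph:
        visit(vertex)
    return order


def destaging_order(graph):
    """De-staging runs in the opposite direction, from the roots down to the leaves."""
    return list(reversed(staging_order(graph)))


# Assumed graph: each key maps to the vertices whose exported facets it imports.
component_graph = {"508a": ["508b", "508c"], "508b": [], "508c": []}

stage = staging_order(component_graph)
destage = destaging_order(component_graph)
```

For an acyclic graph, reversing a post-order traversal always places a component ahead of the components it imports from, so one traversal serves both staging and de-staging.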
  • the run-time execution environment 120 also executes the client I/O requests 916.
  • the fast path executor 812 determines if the client I/O request 916 calls any functional hooks (step 1040). If the request 916 does not call a hook, the fast path executor 812 then determines if the hardware associated with the I/O request 916 can execute the request 916 (step 1044). If the hardware can execute the request 916, then the fast path executor 812 delegates client I/O requests 916 (that do not call functional hooks 708) to the hardware (e.g., hardware acceleration elements) (step 1048).
  • the hardware e.g., hardware acceleration elements
  • the fast path executor 812 executes the request 916 (step 1052). If the digest 920 indicates that component functional hooks 708 have to be invoked as part of the execution, the fast path executor 812 communicates a hook execution request 932 to the control path module 816 to invoke the hook (step 1056).
  • the control path module 816 executes all data movement class hooks (step 1060).
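The decision sequence of steps 1040 through 1060 can be summarized in a short sketch. The function name and its boolean parameters are hypothetical. The branch taken when the hardware cannot execute a hook-free request is not fully recoverable from the text above, so the fallback to step 1052 in that branch is an assumption and is marked as such in the code.

```python
def dispatch(calls_hook, hardware_can_execute):
    """Return the sequence of step numbers taken for one client I/O request."""
    steps = [1040]                  # does the request call a functional hook?
    if not calls_hook:
        steps.append(1044)          # can the associated hardware execute the request?
        if hardware_can_execute:
            steps.append(1048)      # delegate to hardware acceleration elements
            return steps
        steps.append(1052)          # ASSUMPTION: the fast path executor runs it itself
        return steps
    steps.append(1052)              # fast path executor executes the request
    steps.append(1056)              # hook execution request sent to the control path module
    steps.append(1060)              # control path module executes the hook
    return steps


hook_free_on_hardware = dispatch(calls_hook=False, hardware_can_execute=True)
hook_free_in_software = dispatch(calls_hook=False, hardware_can_execute=False)
with_hook = dispatch(calls_hook=True, hardware_can_execute=True)
```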
  • the hooks of a class 404 include both the run-time execution environment hooks 928 and router hooks 708. Further, the hooks of a class 404 include code that can create hook initiated I/O requests 936 to and from imported storage objects 416 of the class 404.
  • the router hooks 708 can update the data movement description 712. To achieve this update, the control path module 816 calls the data path module 828 with a router update request 940. Further, the router hooks 708 and/or the run-time execution environment hooks 928 can invoke services such as data locking and/or data logging that are not data-path specific.
  • the control path module 816 can transmit a non-data request 942 to the services module 824. In one embodiment, the services module 824 requests these services from the porting module 832 by sending services module hardware services requests 944.
  • the porting module 832 contains non-portable code and therefore may need to be written separately for each hardware environment and operating system environment to which the run-time execution environment 120 is ported.
  • the porting module 832 can have a portable interface through which other modules can request services that are implemented differently on different hardware and different operating systems.
  • the services module 824 provides some or all of these services 944, such as locking and data sharing, through a portable service module hardware services interface.
  • the control path module 816 can also provide I/O services to component hooks that read, write, or copy data from / to / between imported facets 512.
  • the porting module 832 provides control path hardware services to the control path module 816 after receiving a control path hardware services request 948.
  • the fast path executor 812 and/or the data path module 828 load and modify the compiled data movement digests 920 onto hardware acceleration tools, such as storage processors, by communicating a data path module hardware services request 952 to the porting module 832.
  • Fig. 11 illustrates a more detailed block diagram of the technique that the run-time execution environment 120 (i.e., the control path module 816) uses to service client I/O requests 916 and hook-initiated I/O requests.
  • Clients transmit requests to one or more exported facets 516 of a component 212. These facets are exported by components 212 that form a part of a component assembly 216.
  • Each component 212 includes a corresponding router 706 that connects the exported facets 516 to imported facets 512 through a mixture of fixed router mappings (e.g., a first, second, and third fixed router mapping 1104a, 1104b, 1104c, respectively (generally 1104)) and variable router mappings (e.g., a first, second, and third variable router mapping 1108a, 1108b, 1108c, respectively (generally 1108)) (i.e., functional hooks).
  • the component assembly 216 describes how imported facets 512 of one component 212 are identified with exported facets 516 of other components 212.
  • the run-time execution environment 120 collects the information that the components 212 provide (i.e., the fixed mapping 1104 and the variable mapping 1108), together with the connections between the components 212, to create a data movement program 1112.
  • the data movement program 1112 can include a sequence of instructions that perform simple manipulations on data descriptions 712.
  • a data description 712 describes a subset of the storage embodied in the first exported facet 516a and the run-time execution environment 120 executes the data movement program 1112
  • the run-time execution environment 120 generates a description of the subset of the physical storage resources (e.g., the first physical storage resource 1116a, the second physical storage resource 1116b, and the third physical storage resource 1116c (generally 1116)) where the data actually resides.
  • the physical storage resources 1116 can be, for instance, a hard drive, a floppy disk, a memory board, etc.
  • the description is an output of the data movement program 1112.
  • the first exported facet 516a represents one million blocks of storage, with each block being 512 bytes long.
  • a description of a subset of that storage is a list of block extents within the one-million block range.
  • a client can issue a write request 916 for the range of blocks extending from block 50 to block 100, inclusive.
  • the data description can be: [516a: 50-100].
  • the first router 706a connecting the first exported facet 516a to the first imported facet 512a and second imported facet 512b can have both fixed and dynamic mapping rules.
  • the first router 706a includes the fixed router mapping 1104a and the variable mapping 1108a.
  • variable mapping 1108a is null. Therefore, the mapping of the first router 706a is entirely fixed via the fixed router mapping 1104a.
  • the fixed mapping 1104a can map the storage represented by the first exported facet 516a onto the first and second imported facets 512a, 512b, respectively. In the data movement program 1112, this correlates to a duplication operation that translates the range [516a: 50-100] to the pair of ranges [512a: 50-100] and [512b: 50-100].
  • In one embodiment, the next step in the data movement program 1112 relates to the structure of the component assembly 216.
  • the data movement program 1112 translates the pair [512a: 50-100] and [512b: 50-100] to the new pair [516b: 50-100] and [516c: 50-100] because the component assembly 216 connects the first imported facet 512a with the second exported facet 516b of a second router 706b and connects the second imported facet 512b with the third exported facet 516c of a third router 706c.
  • the second router 706b connects the second exported facet 516b to the third imported facet 512c and fourth imported facet 512d. [0097]
  • the second router 706b has the second fixed router mapping 1104b and the second variable mapping 1108b.
  • the fixed mapping 1104b states that the range of blocks given for the second exported facet 516b does not change but the identity of the imported facet to which the range is directed is determined by the variable mapping 1108b (i.e., a functional hook).
  • the data movement program 1112 includes instructions that take the data description [516b: 50-100] and apply the functional hook 1108b to the description to obtain a result X.
  • X can either be the third imported facet 512c or the fourth imported facet 512d. Therefore, the resulting data description is [X: 50-100], where X is either the third or fourth imported facet 512c, 512d, respectively, depending on the value returned by the functional hook 1108b.
  • the data movement program 1112 ends up at a description consisting of data intervals only within the storage resources 1116.
  • the data movement program invokes a data mover 1124 to move the data to the storage resources 1116 (e.g., node 108).
  • the data mover 1124 takes the results of the data movement program 1112 (i.e., the description) and moves the data directly into the correct addresses in the storage resources 1116.
  • the run-time execution environment 120 follows the data movement descriptions generated by each component 212 separately and the component assembly 216 as a whole, and serves read, write, and copy requests from clients by moving the data from its starting point (e.g., the first exported facet 516a) to its ultimate end point (e.g., the first storage resource 1116a) without using intermediate copies. Therefore, although described above as moving data to the storage resource 1116 (write operation), the data mover 1124 can also move data from one or more of the storage resources 1116 (read operation) or between the storage resources 1116 (copy operation). The run-time execution environment 120 therefore enables extremely efficient data transfers because of the direct movement of data between a starting point of the first component 212 and an end point.
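The worked example above (a write to blocks 50 through 100 of the first exported facet 516a) can be traced with a small sketch. Data descriptions are modeled as `(facet, first_block, last_block)` tuples; the function names, the wiring table, and the parity-based policy inside `hook_1108b` are hypothetical. The mapping of the third router 706c is not described in the text, so the sketch leaves the [516c: 50-100] description untranslated rather than guess at it.

```python
BLOCK_SIZE = 512  # bytes per block, as in the one-million-block example


def extent_bytes(description):
    """Size in bytes of an inclusive block extent."""
    _, first, last = description
    return (last - first + 1) * BLOCK_SIZE


def duplicate(description, targets):
    """Fixed mapping 1104a: copy the same range onto each imported facet."""
    _, first, last = description
    return [(target, first, last) for target in targets]


# Component assembly wiring: imported facet -> exported facet of another component.
WIRING = {"512a": "516b", "512b": "516c"}


def rewire(description):
    facet, first, last = description
    return (WIRING[facet], first, last)


def hook_1108b(description):
    """Hypothetical functional hook: pick imported facet 512c or 512d for the range."""
    _, first, _ = description
    return "512c" if first % 2 == 0 else "512d"


def run_program(request):
    step1 = duplicate(request, ["512a", "512b"])       # router 706a
    step2 = [rewire(d) for d in step1]                 # assembly connections
    result = []
    for facet, first, last in step2:
        if facet == "516b":                            # router 706b: range kept, target X
            result.append((hook_1108b((facet, first, last)), first, last))
        else:                                          # router 706c: not described
            result.append((facet, first, last))
    return result


request = ("516a", 50, 100)
final = run_program(request)
size = extent_bytes(request)
```

Each step only rewrites descriptions; no data is copied until the final description is handed to a data mover, which is the property that lets the platform avoid intermediate copies.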

Abstract

The invention relates to systems and methods for data movement. The system can include a data storage application, a data storage network, multiple components, a component assembly, and a run-time execution environment. Each component implements data movement functionality, and the components are assembled into the component assembly to embody the data storage application. The run-time execution environment executes the data storage application embodied in the component assembly by implementing the data movement functionality within the data storage network. Moreover, the system can also include a programmable data movement language that enables the creation of a data movement program.

Description

DATA MOVEMENT PLATFORM
Cross-Reference to Related Applications
[0001] This application claims priority to U.S. provisional patent application serial number 60/390,043, filed June 18, 2002. The provisional application serial number 60/390,043 is incorporated by reference herein.
Field of the Invention
[0002] The present invention relates generally to data movement and, in particular, to a data movement language and run-time execution environment for data movement.
Background of the Invention
[0003] Data movement is typically handled by specialized hardware devices such as memory chips, disk drives, direct memory access (DMA) controllers, and network and protocol driver chips. Access to these specialized hardware devices is typically made through an operating system (OS) and/or through firmware that is tailored to these devices. Networks and various protocol driver chips can also move data between different devices as long as the devices understand a common protocol.
[0004] Traditionally, when two devices exchange data, the meaning of data is often not provided to the recipient (i.e., the data store). The data store is treated like a "peripheral" of a system. For example, if a server is connected to a storage array, all of the data processing occurs on the server while none of the processing takes place on the storage array. A storage array often has processing power, however, that meets or even surpasses the processing power of the data processing device (e.g., the server). Thus, the processing power of the storage array is frequently not utilized during data movement.
[0005] Another result of this processing mismatch is that data movement firmware is typically proprietary to a data storage device. This dependence on the hardware typically results in a lack of reusability of the firmware, which in turn leads to decreased functionality and usability. Moreover, third parties often cannot develop, maintain, or upgrade the firmware for the data storage device because of the proprietary nature of the firmware. Additionally, firmware developed for one class of devices, such as personal computers, typically is not available or compatible with other classes of devices, such as handheld computers or switches / routers.
[0006] The proprietary nature of data movement firmware can slow development of data storage products. For example, if a company wants to test a data movement program for a particular piece of hardware (e.g., an expensive, high-end data storage array), the code is developed for and loaded onto the specific piece of hardware. Upon execution, the program may cause the hardware to stop executing if, for instance, the code has an error in it. To fix this particular piece of hardware, the company might, for example, bring in a consultant to analyze and fix the problem or send the hardware back to the manufacturer for repair. Either can result in increased costs and delay.
Summary of the Invention
[0007] Thus, there is a need to enable the movement of data between devices while providing reusability, enabling third parties to develop, maintain, and upgrade data movement software, enabling compatibility of data movement code between multiple classes of devices, and facilitating remote testing of data storage applications / programs without burdening the hardware that may execute the data storage application / program.
[0008] The present invention provides a data movement platform enabling development and run-time execution of data movement software. This platform provides developers with a framework that enables reusability and provides distributed software components for data movement. Further, the software is extensible to any number of devices. The software can also execute in a simulated environment (e.g., without devices), thereby enabling testing while significantly reducing the need for using expensive storage hardware.
[0009] In one aspect, the invention relates to a system for data movement. The system includes a data storage application, a data storage network, multiple components, a component assembly, and a run-time execution environment. Each component implements data movement functionality. The components are assembled into the component assembly to embody the data storage application. The run-time execution environment executes the data storage application embodied in the component assembly by implementing the data movement functionality within the data storage network.
[0010] In one embodiment, the data storage network includes a data storage array, a data storage switch, and/or a geographically dispersed network of switches and arrays. Further, the run-time execution environment executes on a data storage node or endpoint within the data storage network. The run-time execution environment can also include a class repository for accepting classes, an administrative module for administering the component assembly, a fast path executor for executing input/output (I/O) requests, and/or a data path module to optimize data movement within the data storage network by optimizing the execution of the data storage application. In one embodiment, the run-time execution environment executes on multiple hardware platforms, so as to enable data movement across various kinds of hardware.
[0011] The run-time execution environment may also include a services module and a querying module. The services module enables a data service for a component in the component assembly and/or the component assembly. In one embodiment, the data service includes logging, locking, a time service, an event service, and a shared data service. The querying module can query a component in the component assembly and/or the component assembly.
[0012] In another aspect, the invention includes a method for executing a data storage application within a data storage network. The method includes implementing data movement functionality in components, assembling components into a component assembly that embodies the data storage application, and executing the data storage application by implementing the data movement functionality using a run-time execution environment.
[0013] In some embodiments, the method also includes loading a component assembly, deploying a component assembly, staging a component assembly, and/or removing a component assembly. The method may also include optimizing the execution of the data storage application within the data storage network. The executing step can also include installing the component assembly on a data storage endpoint / node.
[0014] In yet another aspect, the invention includes a data movement class having data storage objects and a router. The router is a declaration, such as an Extensible Markup Language (XML) declaration, that specifies movement of data between the data storage objects. The router can specify, for example, a variable movement of data, a fixed movement of data, and a mixed fixed / variable movement of data. The class can also include functional hooks that enable modification of the movement of data that is otherwise specified by the router. The functional hooks can include one or more of event hooks, control hooks, and run-time execution environment hooks.
[0015] In one embodiment, each data storage object is either an imported data storage object, an exported data storage object, or a private data storage object. In one embodiment, a private data storage object is a data storage object that only an administrator of the class can view / update.
[0016] In another aspect, the invention includes a system for developing a component assembly. The system includes data movement classes, components that each implement data movement functionality by referencing one or more data movement classes, and a development module that assembles the components into a component assembly embodying a data storage application. As described above, each class can include data storage objects and a router specifying movement of data between the objects. The specification can further include a matching between the objects.
[0017] In one embodiment, the development module includes an acyclic graph of the data movement classes in a format that enables the assembly of the component assembly from components referring to the classes. The acyclic graph can also include a locator map specifying an execution location of a class. Further, the system can also include a test harness to test the classes, components, and/or component assembly.
[0018] In yet another aspect, the invention includes a method for moving data that includes the steps of obtaining a data movement program written in a programmable data movement language and executing the data movement program to move data between a first and second data storage object. The programmable data movement language is independent of the data storage objects and includes a router that maps the first data storage object to the second data storage object. In one embodiment, the data movement program includes instructions to move data through specific types of routers. These instructions may be the result of a scan of data storage objects.
[0019] In another aspect, the invention can also include a system for moving data having a first and second data storage object, a programmable data movement language that is independent of the first data storage object, the second data storage object, and the system, and a run-time execution environment enabling data movement between the first and second data storage objects using the programmable data movement language.
[0020] In one embodiment, the system includes a data mover to move the data between the first data storage object and the second data storage object. Further, the system can include a data movement program written in the programmable data movement language specifying the data movement between the data storage objects. Moreover, the run-time execution environment can include a compiler to compile the data movement program and a control path module that enables dynamic data movement between the data storage objects. In some embodiments, the control path module uses hooks to dynamically vary the data movement and/or the data movement program.
Brief Description of the Drawings
[0021] In the drawings, like reference characters generally refer to the same parts throughout the different views. Also, the drawings are not necessarily to scale, emphasis instead generally being placed on illustrating the principles of the technology.
[0022] Fig. 1 is a block diagram of an embodiment of a data storage network having a run-time execution environment on each of four nodes.
[0023] Fig. 2 is a block diagram of an embodiment of the data storage network of Fig. 1 having components assembled into a component assembly.
[0024] Fig. 3 is a flow diagram illustrating an embodiment of the steps performed to execute a data storage application in the data storage network of Fig. 1.
[0025] Fig. 4 is a more detailed block diagram of an embodiment of the component assembly of Fig. 2 and components of Fig. 2 referring to one or more data movement classes.
[0026] Fig. 5 is a block diagram of an embodiment of a development module having a directed, acyclic component graph of the data movement classes of Fig. 4.
[0027] Fig. 6 is a block diagram of an embodiment of a location tree of the data storage network of Fig. 1.
[0028] Fig. 7 is a more detailed block diagram of an embodiment of a data movement class of Fig. 4.
[0029] Fig. 8 is a more detailed block diagram of an embodiment of the run-time execution environment of Fig. 1.
[0030] Fig. 9 is a block diagram illustrating an embodiment of a data movement flow within the run-time execution environment of Fig. 8.
[0031] Fig. 10A is a flow diagram illustrating an embodiment of management tasks performed by the run-time execution environment of Fig. 9.
[0032] Fig. 10B is a flow diagram illustrating an embodiment of the steps performed by the run-time execution environment of Fig. 9 to execute client I/O requests.
[0033] Fig. 11 illustrates a more detailed block diagram of an embodiment of the techniques that the run-time execution environment uses to service I/O requests.
Detailed Description
[0034] Referring to Fig. 1, a data storage network 104 includes a first node 108a, a second node 108b, a third node 108c, and a fourth node 108d (generally 108) connected via a fabric 112. The data storage network 104 is a network that operates on data. In one embodiment, the data storage network 104 is a network that facilitates the transmission, processing, storage and/or retrieval of data between client applications (i.e., clients) and physical storage resources. Examples of client applications include databases, mail servers and web servers. The client applications can execute on one or more devices (e.g., computers) in communication with the data storage network 104. In another embodiment, the client is a user of the data storage network 104 and/or node 108. Examples of the data storage network 104 include a local area network (LAN), a wide area network (WAN), a Fibre Channel network, and the like.
[0035] The fabric 112 enables communication between the nodes 108 or some subset of the nodes 108 in the data storage network 104. In one embodiment, the fabric 112 is a switch with multiple ports that routes packets of information received by an input port to an output port. Alternatively, the fabric 112 includes multiple switches that connect to one another through cables, such as optical cables.
[0036] The physical storage resources of the data storage network 104 are the physical locations at which the data resides. The physical storage resources can include one or more nodes 108. Examples of a node 108 include, but are not limited to, a hard disk, a storage controller (e.g., having one or more central processing units (CPUs) and memory), a server, a collection of disks and storage controllers (i.e., disk arrays), a fabric packaged together with a collection of storage controllers (i.e., a storage switch), and a collection of storage controllers without disks or fabrics (i.e., diskless storage appliance). Each node 108 has an input and output port and connects to the fabric 112 via a respective cable 116a, 116b, 116c, 116d (generally 116) or other communications media. For example, in one embodiment one or more of the nodes 108 communicate with the fabric 112 via a wireless communication link. The data storage network 104 communicates using one or more data transmission protocols, such as Ethernet, TCP/IP, SCSI, iSCSI, Fibre Channel, and Infiniband. Although shown with one fabric 112, there can be, and the nodes 108 can be connected to, any number of fabrics 112.
[0037] In one embodiment, the node 108 is a controller that includes a CPU or multiple CPUs. Moreover, the CPUs of the node 108 can be organized in a Symmetric Multi Processor (SMP) configuration. Such SMP configuration of a node 108 enables a collection of CPUs and memories to behave as a single CPU with memory. The node 108 may also include a programmable chip that performs storage functionality on a storage controller or in the fabric 112.
[0038] In one embodiment, the nodes 108 also include storage software that executes on the CPUs of one or more nodes 108 and provides storage functionality. For instance, the storage software enables a node 108 (e.g., a storage controller) to present simulated disks to clients that are different than the disks that the controller actually controls. An example of such software would be storage virtualization software.
[0039] In one embodiment, the storage software on each node 108 includes a respective run-time execution environment 120a, 120b, 120c, 120d (generally 120). As described in more detail below, each run-time execution environment 120 is an environment enabling the execution of a data storage application within the data storage network 104 (e.g., with its respective node 108). The run-time execution environment 120 enables efficient data movement between data storage objects and enables the implementation of a programmable data storage language to generate a data movement program.
[0040] In one embodiment, the run-time execution environments 120 on the nodes 108 within a group together form a distributed, data storage network run-time execution environment 124 for the data storage network 104. In this way, a number of nodes 108 can be viewed as one. The description below for the run-time execution environment 120 on a node 108 also applies to the data storage network run-time execution environment 124.
[0041] Referring to Fig. 2, the run-time execution environment 120 enables the execution of one or more data storage applications 208. The data storage application 208 performs a data movement operation on one or more pieces of data using data movement functionality. Exemplary data movement functions include mirroring, caching, striping, remote replication, point in time duplication, snapshot, and partitioning.
[0042] The run-time execution environment 120 uses this functionality to execute the data storage application 208. To execute the data movement functionality (e.g., mirroring) associated with the data storage application 208, the run-time execution environment 120 employs one or more components (e.g., a first component 212a, a second component 212b, and any number of additional components 212c (generally 212)). In one embodiment, each component 212 enables an independent data movement function by referring to a data movement class and instantiating the class. In one embodiment and as described in more detail below, the run-time execution environment 120 uses the reference to determine which class to instantiate for the purpose of deploying this component. Further, the data movement class may be loaded into the run-time execution environment 120 before the run-time execution environment 120 executes the data storage application 208 or during execution of the data storage application 208.
[0043] To execute the data storage application 208, a group of components 212 act together to perform the data storage operation of the application 208. This group of components is a component assembly 216 and embodies the data storage application 208. The run-time execution environment 120 enables the building of the component assembly 216 from the components 212.
[0044] In particular and in one embodiment, the component assembly 216 describes how the components 212 work together to achieve the data movement operation of the data storage application 208. The description can also describe deployment characteristics of the components 212. For example, the component assembly 216 can describe how the components 212 in the component assembly 216 are deployed on different run-time execution environments 120 residing on different nodes 108.
[0045] The run-time execution environment 120 can load the component assembly 216 during run-time of the data storage application 208. In some embodiments, the run-time execution environment 120 instantiates and/or deploys the components 212 within a component assembly 216. The run-time execution environment 120 can also initialize the components 212 in the component assembly 216. In one embodiment, the initialization depends on the component assembly 216. Thus, the run-time execution environment 120 can analyze the component assembly 216, determine the data storage application 208 that the assembly 216 embodies, and adjust the initialization of one or more of the components 212 based on the component assembly 216. The run-time execution environment 120 then executes the component assembly 216 to perform the data storage application 208 on one or more nodes 108.
[0046] As described in more detail with respect to Fig. 9, the run-time execution environment 120 can also shut down (i.e., de-stage) one or more components 212. This may depend on the component assembly 216 or the node 108 on which the component assembly 216 executes. In one embodiment, the shutting down of a component 212 depends on the component assembly 216.
[0047] In one embodiment, the component 212 has a public state and a private state. The public state is available to the run-time execution environment 120 and includes the condition of the component. Examples of the condition of the component 212 include whether the component 212 is being loaded, deployed, staged, de-staged, and/or whether the component 212 can accept input/output (I/O). In one embodiment, the public state changes based on the run-time execution environment 120 and the component 212. For example, if the run-time execution environment 120 deploys the component 212, the public state of the component 212 changes to represent that the component 212 is deployed.
[0048] The private state includes the parameters of the component. The parameters can include the content or function of the component or whether the component is on or off. For example, the private state can include a description that the component 212 performs remote mirroring or remote mirroring in-sync to local mirroring. In one embodiment, the developer of the component 212 (e.g., the administrator) can view and/or change the private state of the component 212.
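The split between public and private state described above can be illustrated with a minimal Python sketch. The names used here (PublicState, Component, RuntimeEnvironment, accepts_io) are hypothetical and do not appear in the specification, and the rule that only a staged component accepts I/O is an assumption made purely for illustration.

```python
from enum import Enum, auto


class PublicState(Enum):
    """Hypothetical conditions a component reports to the environment."""
    LOADED = auto()
    DEPLOYED = auto()
    STAGED = auto()
    DE_STAGED = auto()


class Component:
    def __init__(self, name, parameters):
        self.name = name
        # Public state: visible to the run-time execution environment.
        self.public_state = PublicState.LOADED
        # Private state: parameters of the component (e.g. its function).
        self._private_state = dict(parameters)

    def accepts_io(self):
        # Assumption for this sketch: only a staged component accepts I/O.
        return self.public_state is PublicState.STAGED


class RuntimeEnvironment:
    """Changes a component's public state as it manages the component."""

    def deploy(self, component):
        component.public_state = PublicState.DEPLOYED

    def stage(self, component):
        component.public_state = PublicState.STAGED


component = Component("mirror", {"function": "remote mirroring"})
environment = RuntimeEnvironment()
environment.deploy(component)
```

In this sketch the environment only ever touches the public state; the private parameters stay with the component and would be read or changed by its developer.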
[0049] The component assembly 216 can also have a public and private state. The private state of the component assembly 216 can include, for example, a description of the data storage application 208 that the component assembly 216 embodies. The private state can also include a list of the components 212 that make up the component assembly 216 and the parameters of the components 212. Moreover, the state of the component assembly 216 can additionally include how the components 212 are connected to create the component assembly 216.
[0050] The run-time execution environment 120 provides a robust environment for the execution of the component assembly 216, such as by ensuring optimal component assembly execution even if one or more nodes 108 are dysfunctional or damaged. For example, if the first node 108a loses communications with the rest of the nodes 108 in the data storage network 104, the run-time execution environment 120 enables the component assembly 216 to function at its optimal level whenever possible by redirecting work from the damaged node 108a to other nodes 108 that have the same components 212 deployed on them, if these nodes 108 exist.
[0051] Also referring now to Fig. 3, an administrator determines a particular data storage application 208 that the administrator wants to execute on one or more nodes 108 (step 304). For example, the administrator may want to deploy a data storage application 208 that provides synchronous remote mirroring between the first node 108a and a storage array consisting of the second node 108b, the third node 108c, and the fourth node 108d for ten logical volumes. This data storage application 208 requires at least two pieces of data storage functionality - mirroring functionality and caching functionality. In one embodiment, the administrator develops the application 208 with particular components 212 that implement these data storage functions (step 308). For instance, the first component 212a provides the mirroring functionality and the second component 212b provides the caching functionality. For this data storage application 208, ten mirroring components 212a are needed (i.e., one for each volume) and two caching components are needed (i.e., one for the first node 108a and one for the storage array).
[0052] The administrator then uses these components 212a, 212b to generate the application 208. The administrator does this by connecting the components 212a, 212b in a fashion that is appropriate for the requisite data storage application 208. In one embodiment, the administrator assembles the components 212a, 212b into the component assembly 216 to embody the data storage application 208 (step 312). The administrator can also describe the deployment characteristics of the components, such as to deploy each of the mirroring components 212a on the nodes 108b, 108c, and 108d of the storage array, deploy the first caching component 212b on the nodes 108b, 108c, and 108d of the storage array, and deploy the second caching component 212b on the first node 108a (i.e., the storage appliance). The run-time execution environment 120 subsequently executes the data storage application 208 to perform the synchronous remote mirroring between the storage array and the storage appliance (step 316). Although many of the steps are described above as being performed by an administrator, any combination of the steps can be performed by an administrator and/or by the run-time execution environment 120.
[0053] Fig. 4 shows a block diagram of an embodiment of the components 212 and the component assembly 216. Each component 212 refers to one or more data movement classes (e.g., the first data movement class 404a, the second data movement class 404b, and/or any additional data movement classes 404c (generally 404)). Any number of classes 404 can be present. The reference to a class 404 is a pointer to the class 404 and is shown as a corresponding reference arrow 408a, 408b, 408c, and 408d (generally 408). In some embodiments, one component (e.g., the first component 212a) refers to multiple classes 404 (e.g., the first class 404a and the second class 404b). As described in more detail below, each component 212 can also refer to locations of the class 404.
[0054] In one embodiment, the development module 412 creates the component assembly 216 from the components 212. The development module 412 is a software module that resides on one or more computing devices. For example, the development module 412 can reside on a personal computer (e.g., 286, 386, 486, Pentium, Pentium II, Macintosh computer), Windows-based terminal, network computer, wireless device, information appliance, RISC Power PC, X-device, workstation, mini computer, main frame computer, personal digital assistant, node 108, or other computing device that can communicate with a node 108.
[0055] The development module 412 enables an administrator to use the data movement classes 404 to create the component assembly 216. In one embodiment, the development module 412 retrieves one or more data movement classes 404 (and/or components 212 / component assemblies 216) from a network (e.g., the World Wide Web or the Internet). Moreover, the development module 412 can provide data movement classes 404 (and/or components 212 / component assemblies 216) for other administrators to access over a network, such as via a web service.
[0056] As described further below, each class 404 has one or more storage objects (e.g., a first storage object 416a, 416b, 416c (generally 416) and a second storage object 420a, 420b, 420c (generally 420)). The storage object 416, 420 is an abstract repository of user data. In one embodiment, the storage object 416, 420 is a data structure, such as an array, a list, or a linked list. The storage object 416, 420 can have a size (i.e., an amount of data that the storage object 416, 420 can store), a data type, the ability to store and retrieve data, the ability to report problems with the correctness or availability of the data, the ability to be managed (e.g., through a management interface), or any combination or subset of these characteristics.
[0057] In one embodiment, the first storage object 416 is an imported storage object and the second storage object 420 is an exported storage object. As described further with respect to Fig. 7, the imported storage object 416 can be a repository of data external to the class 404 that an administrator (e.g., a class developer) assumes to be given when developing a class. The exported storage object 420 is a virtual repository of data that the class 404 creates by using external storage objects. In some embodiments, classes 404 other than the class 404 that created the exported storage object 420 can use the exported storage object 420. For example, the first class 404a creates an exported storage object 420a from an imported storage object 416. The second class 404b can now use the exported storage object 420a as its repository of data.
[0058] In one embodiment and also referring to Fig. 5, the development module 412 includes a directed, acyclic component graph 504 having one or more vertices 508a, 508b, 508c (generally 508). Each vertex 508 of the graph 504 describes a component 212. The description of the component 212 includes instantiation information. The instantiation information can include the class reference 408 and construct parameters for the class 404. For example, the run-time execution environment 120 uses instantiation information to determine that the run-time execution environment 120 has to instantiate the first data movement class 404a to construct the first component 212a. The run-time execution environment 120 also determines, from the instantiation information, other necessary parameters to construct the first component 212a. In one embodiment, these parameters are specific to the component 212 and may vary from component 212 to component 212, even components 212 referring to the same class 404. Examples of these parameters include the size and type of the component 212.
[0059] In addition to the instantiation information, the description can also include deployment information for one or more components 212. In one embodiment, the deployment information includes a location reference. For example, the run-time execution environment 120 uses the deployment information to determine that a component 212 requires a class 404 to be loaded on the nodes 108 that belong to a particular location.
[0060] The instantiation of a data storage object 416, 420 of a class 404 is a facet of the component 212. In particular, the imported data storage objects 416 are instantiated into a corresponding imported facet 512a, 512b, 512c, 512d, 512e (generally 512). Likewise, the exported data storage objects 420 are instantiated into a corresponding exported facet 516a, 516b, 516c, 516d (generally 516). In one embodiment, the imported facets 512 represent the actual storage that the component 212 consumes. The exported facets 516 represent the actual storage that the component provides.
[0061] Further, each facet 512, 516 has particular characteristics. These characteristics can depend on the instantiation information of the component 212 and/or on properties of the storage objects 416, 420 of the class 404. Exemplary characteristics include size and type of the facet 512, 516.
[0062] In one embodiment, the component 212 imports the exported facet 516 of another component 212. For example, the second component 212b (as shown by the second vertex 508b) includes the second exported facet 516b. As stated above, the second component 212b makes the second exported facet 516b available for use by other components 212. Thus, the first component 212a (as shown by the first vertex 508a) imports the second exported facet 516b into the second imported facet 512b of the first component 212a. This importing from the second vertex 508b to the first vertex 508a is shown in the component graph 504 with a first importation arrow (or "directed edge") 520a. Thus, the second component 212b provides data storage that the first component 212a uses. The first component's importation of the third exported facet 516c of the third component 212c (as shown by the third vertex 508c) into the first imported facet 512a of the first component 212a is shown with a second importation arrow 520b.
[0063] In some embodiments, the development module 412 also includes a verifier 524. The verifier 524 can check that the importation arrows 520a, 520b (generally 520) only connect facets 512, 516 of identical characteristics (e.g., size and type). If the verifier 524 determines that one or more arrows 520 connect facets 512, 516 having different characteristics, then the verifier 524 rejects the component assembly 216 as embodied in the graph 504. Although shown as being within the development module 412, the verifier 524 can be external to the development module 412.
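The check performed by the verifier 524 can be sketched in Python as follows. This is a minimal illustration only: the Facet and ImportEdge types, the field names size and kind, and the verify function are hypothetical names chosen for this sketch, and the facet sizes are made-up values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Facet:
    component: str   # owning component (hypothetical label)
    name: str        # facet label
    size: int        # e.g. number of blocks
    kind: str        # e.g. "block-array"


@dataclass(frozen=True)
class ImportEdge:
    """A directed edge: an exported facet imported by another component."""
    exported: Facet
    imported: Facet


def verify(edges):
    """Return the edges whose facets differ in size or kind.

    An empty result means the assembly passes; any entry means the
    assembly would be rejected.
    """
    return [
        edge for edge in edges
        if (edge.exported.size, edge.exported.kind)
        != (edge.imported.size, edge.imported.kind)
    ]


exp_b = Facet("component_b", "exported_2", 1_000_000, "block-array")
imp_a2 = Facet("component_a", "imported_2", 1_000_000, "block-array")
exp_c = Facet("component_c", "exported_3", 1_000_000, "block-array")
imp_a1 = Facet("component_a", "imported_1", 500_000, "block-array")

edges = [ImportEdge(exp_b, imp_a2), ImportEdge(exp_c, imp_a1)]
rejected = verify(edges)  # only the size-mismatched edge is reported
```

Returning the offending edges rather than a bare pass/fail flag is a design choice of this sketch; it lets a development tool report which connection caused the rejection.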
[0064] Fig. 6 shows a block diagram of a location tree 604 having a hierarchy of locations (e.g., the first through seventh location 608a-608g (generally 608)). Each location 608 is a set of nodes 108 (i.e., one node 108 or multiple nodes 108). In one embodiment, the development module 412 creates the location tree 604 to describe the data storage network 104.
[0065] In one embodiment, the location tree 604 has leaves which correspond to the simple locations with one node 108 (i.e., the first location 608a, second location 608b, third location 608c, and fourth location 608d). The location tree 604 also includes compound locations 608 having more than one node 108 (i.e., the fifth location 608e, sixth location 608f, and seventh location 608g). In one embodiment, each compound location 608 contains all of the nodes 108 that are in its child locations 608 (i.e., the leaves connected to that compound location 608).
[0066] The run-time execution environment 120 (and/or the development module 412) can change which (and how many) nodes 108 are in a location 608. This change may be at a predetermined time or in response to an event or occurrence. For example, the run-time execution environment 120 may remove the first node 108a from a location 608 and then add the second node 108b to the location 608 at the same time or a later time. As the nodes 108 communicate via the fabric 112, a location 608 represents a cluster of nodes 108 (e.g., CPUs). This cluster of nodes 108 enables distributed computation.
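A location tree of this kind can be sketched in Python as shown below. The Location class and its methods are hypothetical names, and the particular tree shape (the fifth location grouping the first two leaves, the sixth grouping the last two, and the seventh as the root) is an assumption made for illustration, since the text does not state how the compound locations are nested.

```python
class Location:
    """A set of nodes: a leaf holds nodes directly, a compound location
    contains every node found in its child locations."""

    def __init__(self, name, nodes=(), children=()):
        self.name = name
        self._own_nodes = set(nodes)
        self.children = list(children)

    def nodes(self):
        # Union of this location's own nodes and all descendants' nodes.
        result = set(self._own_nodes)
        for child in self.children:
            result |= child.nodes()
        return result

    def replace_node(self, old, new):
        # The set of nodes in a location can change over time.
        self._own_nodes.discard(old)
        self._own_nodes.add(new)


# Assumed tree shape, for illustration only.
loc_a = Location("608a", nodes={"108a"})
loc_b = Location("608b", nodes={"108b"})
loc_c = Location("608c", nodes={"108c"})
loc_d = Location("608d", nodes={"108d"})
loc_e = Location("608e", children=[loc_a, loc_b])
loc_f = Location("608f", children=[loc_c, loc_d])
loc_g = Location("608g", children=[loc_e, loc_f])
```

Because a compound location computes its node set from its children on demand, swapping a node in a leaf is immediately reflected in every enclosing location.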
[0067] In one embodiment, the run-time execution environment 120 uses the location tree 604 to describe the locations of deployment and execution of the class code of the components 212 in a component assembly 216. The use of locations 608 can, for example, free the component assembly 216 from being directly associated with the physical data storage network 104, which in turn allows the set of nodes 108 in a location 608 to change without changing the location 608 itself, consequently making the component assembly 216 portable. The locations 608 also enable an administrator to describe fault tolerance parameters of the data storage network 104. For example, if a failure occurs in one node 108 of a set of nodes 108 associated with a location 608, and if the other nodes 108 in the set are functioning properly, the run-time execution environment 120 can adjust the location 608 so that the data storage application 208 embodied by the component assembly 216 continues to operate with little or no functional degradation. In one embodiment, the run-time execution environment 120 diverts the function of the non-functioning node 108 to other nodes 108 in the location 608.
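The diversion of work away from a non-functioning node can be sketched as a small Python function. The function name redirect_work, its parameters, and the policy of choosing the first functioning node in sorted order are all assumptions made for this sketch; the text does not specify how a replacement node is chosen.

```python
def redirect_work(location_nodes, functioning, preferred):
    """Pick the node in a location that should handle a piece of work.

    location_nodes: set of node names belonging to the location
    functioning:    set of node names currently operating properly
    preferred:      node that would normally handle the work
    """
    if preferred in location_nodes and preferred in functioning:
        return preferred
    # Divert to another functioning node of the same location.
    candidates = sorted(n for n in location_nodes if n in functioning)
    if not candidates:
        raise RuntimeError("no functioning node left in this location")
    return candidates[0]


storage_array = {"108b", "108c", "108d"}
target = redirect_work(storage_array, functioning={"108c", "108d"},
                       preferred="108b")
```

The component assembly refers only to the location, so the caller does not need to know that the work moved to a different node.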
[0068] In more detail about the data movement classes 404, and referring to Fig. 7, the data movement class 404 creates complex storage objects (e.g., an exported storage object) from any number of simpler storage objects (e.g., imported storage objects). For example, the data movement class 404 creates a complex storage object from a first imported data storage object 702a. The class 404 uses the imported storage object 702a or objects (e.g., the first imported data storage object 702a, a second imported data storage object 702b, and/or a third imported data storage object 702c (generally 702)) as building blocks for creating the exported data storage object 704.
[0069] In one embodiment, the data movement class 404 includes a declarative part that includes a declaration of storage objects 702, 704 and a declaration of one or more routers 706. The data movement class 404 can also include a procedural part that includes one or more functional hooks 708 associated with the router 706.
[0070] In particular, the class 404 connects any number of data storage objects, such as the first imported data storage object 702a and the exported data storage object 704, with the router 706. The router 706 is a mapping that describes how the data layout in the exported storage object 704 is related to the data layout in the first imported data storage object 702a. The mapping, or description, 712 can be fixed, dynamic, or partly fixed and partly dynamic. In one embodiment, a fixed mapping 712 (or portion of mapping) states explicit relationships between sections of data in the exported storage objects 704 and sections of data in the imported storage objects 702. Further, a dynamic mapping 712 (or portion of mapping) states an algorithm for run-time whenever the mapping 712 from the exported storage objects' data layout to the imported storage objects' data layout needs to be determined for sections of data not covered by the fixed mapping 712. The algorithm can depend on, for example, the type of router 706 used. Further, the algorithm can involve executing the functional hook 708 or hooks 708 that take data section descriptions as parameters.
[0071] For example, the data movement class 404 can implement mirroring using the first imported data storage object 702a, the second imported data storage object 702b, and the exported data storage object 704. These data storage objects 702, 704 are data storage objects of the same data structure, such as a one dimensional array of 1 million blocks. The router 706 contains fixed mappings in the data movement description 712 for write requests into the exported data storage object 704 and dynamic mappings with a first functional hook 716 for read requests out of the exported data storage object 704. The fixed mapping for write requests specifies, for instance, that the router 706 translates any write request to a first (e.g., predetermined) address in the exported data storage object 704 to a write request to the same address in both the first imported data storage object 702a and the second imported data storage object 702b. Thus, the data written to the exported data storage object 704 is mirrored in both the first and second imported data storage objects 702a, 702b.
[0072] The dynamic mapping specifies, for instance, that any read request to read from the first address of the exported data storage object 704 is processed by an algorithm. In one embodiment, the algorithm calls the first functional hook ("F") 716 with the first address ("A") as a parameter of the function call. The result of the algorithm F(A) is a bit that determines whether the data is read from the first address in the first imported data storage object 702a or from the first address in the second imported data storage object 702b. Therefore, the dynamic mapping results in a variable mirror service policy.
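The mirroring example can be sketched in Python as follows. This is an illustrative model rather than the declarative router format of the specification: the StorageObject and MirrorRouter classes are hypothetical, the storage objects are reduced to 16 blocks to keep the example small, and the alternating read policy used for the hook F is an assumption.

```python
class StorageObject:
    """A one dimensional array of blocks (16 here instead of 1 million)."""

    def __init__(self, n_blocks=16):
        self.blocks = [None] * n_blocks

    def write(self, address, data):
        self.blocks[address] = data

    def read(self, address):
        return self.blocks[address]


class MirrorRouter:
    """Maps an exported object onto two imported objects."""

    def __init__(self, first_import, second_import, read_hook):
        self.imports = (first_import, second_import)
        self.read_hook = read_hook  # F(A): returns the bit 0 or 1

    def write(self, address, data):
        # Fixed mapping: a write to address A goes to A in both imports.
        for storage in self.imports:
            storage.write(address, data)

    def read(self, address):
        # Dynamic mapping: the hook's result selects which import to read.
        bit = self.read_hook(address)
        return self.imports[bit].read(address)


def alternate_by_address(address):
    # Assumed policy: even addresses read from the first import,
    # odd addresses from the second.
    return address % 2


first, second = StorageObject(), StorageObject()
exported = MirrorRouter(first, second, read_hook=alternate_by_address)
exported.write(5, b"payload")
data = exported.read(5)
```

Swapping in a different hook changes the mirror service policy without touching the fixed write mapping, which is the point of separating the two kinds of mapping.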
[0073] Referring to Fig. 8, and in an embodiment of the run-time execution environment 120, the run-time execution environment 120 includes several modules that perform the functions of the run-time execution environment 120. In one embodiment, the run-time execution environment 120 includes an administrative module 804, a querying module 808, a fast path executor 812, a control path module 816, a class repository 820, a services module 824, a data path module 828, and a porting module 832. Although shown as including all of these modules 804-832, the run-time execution environment 120 can include any subset of these modules 804-832. Moreover, although the modules 804-832 are illustrated as being within a single run-time execution environment 120, one or more of the modules 804-832 can be within one or more run-time execution environments 120. Additionally, the modules 804-832 can be written in any programming language and/or scripting languages, such as, but not limited to, Java, Visual Basic, C++, and the like.
[0074] The administrative module 804 enables the management of the run-time execution environment 120. The management of the run-time execution environment 120 can include loading, unloading, and managing the component assembly 216 in the run-time execution environment 120. In one embodiment, an administrator uses the administrative module 804 to manage the run-time execution environment 120, such as via a client computer communicating with the data storage network 104 over a network. The administrative module 804 can provide an interface to the administrator, such as a display screen with selection options and buttons (e.g., enabling the administrator to select a component assembly 216).
[0075] The querying module 808 enables the management of the private states of the component 212 or component assembly 216. Thus, an administrator or computer program (e.g., data storage application 208) can use the querying module 808 to query the component 212 or component assembly 216 and determine the parameters of the component 212, such as what function the component 212 performs. In additional embodiments, the querying module 808 also enables the management of the public state of the component 212 or component assembly 216.
[0076] The fast path executor 812 is a gateway between the run-time execution environment 120 and the data storage clients that the data storage application serves. The fast path executor 812 services requests received from data storage clients. In one embodiment, the fast path executor 812 also communicates with a requesting data storage client on the status of the request (e.g., successfully completed, failed attempt, or still processing).
[0077] In one embodiment and upon receipt of a client request, the fast path executor 812 forwards the client request to the control path module 816. The control path module 816 executes the client request. Further, the control path module 816 executes other I/O requests. In one embodiment, the functional hook 708 also initiates I/O requests. Moreover, the control path module 816 can also execute any other type of hook, such as a hook initiated by the run-time execution environment 120.
[0078] The class repository 820 is a storage repository (e.g., database) of data movement classes 404. In particular, the instantiation information of a component 212 refers to a class 404 in the class repository 820. Thus, when the administrative module 804 loads and/or deploys a component assembly 216, the administrative module 804 refers to the class repository 820 to instantiate the class 404 on a node 108 where the component 212 is deployed.
[0079] The services module 824 provides services that are made available to the functional hooks 708. Examples of these services include distributed locking, distributed data sharing, distributed logging, and timing services. The data path module 828 provides the data path service to the functional hooks 708. This service enables a functional hook 708 to change its router's mapping at run-time.
[0080] The porting module 832 provides hardware specific services to particular hardware platforms. In one embodiment, the type of porting module 832 depends on the hardware in the node 108 to which the run-time execution environment 120 is ported. The porting module 832 can also provide a porting interface enabling an administrator to select the hardware services that the module 832 provides. In one embodiment, an independent computer (e.g., laptop) supplies the porting module 832 in a simulated form, thus creating a test harness enabling testing and test execution of the run-time execution environment 120 and component assembly 216.
[0081] Referring to Figs. 9 and 10A, in one embodiment, the run-time execution environment 120 performs management tasks relating to components 212 and/or component assemblies 216. These management tasks include the class repository 820 receiving one or more new classes 904 (step 1004). In one embodiment, an administrator loads the new classes 904 into the class repository 820 (step 1008).
[0082] The management tasks can also include a loading step, a deployment step, a staging step, and/or a de-staging step for component assemblies 216. As part of loading a component assembly 216, the administrative module 804 receives one or more component assembly updates 908 (step 1012). These component assembly updates 908 can be a new component assembly 216 or a run-time update to an existing, staged component assembly 216.
[0083] During the deployment step of a component assembly 216, the administrative module 804 instantiates the classes 404 indicated by the components 212 at the locations 608 indicated by the components 212 (step 1016). The administrative module 804 performs class instantiations 912 using the class repository 820. In one embodiment, the administrative module 804 creates a compiled data movement digest 920 (step 1024). The digest 920 is a data movement program that details the steps that the run-time execution environment 120 has to perform to process future client I/O requests 916. The administrative module 804 can create the digest 920 by analyzing the routers 706 of the various components 212 in the component assembly 216. For example, the administrative module 804 can follow the flow of data inside each component 212 (i.e., from each exported facet 516 to its corresponding imported facet 512), through the component's routers, and between components (i.e., through importation arrows 520). During this analysis, if the administrative module 804 determines that one or more of the client I/O requests 916 will not call a functional hook 708, the administrative module 804 transmits this information to the fast path executor 812 via the digest 920.
[0084] When the fast path executor 812 receives the digest 920 and determines from it that a client I/O request 916 does not call a functional hook 708, the fast path executor 812 can delegate the execution of these types of client I/O requests 916. For example, the fast path executor 812 can forward these types of client I/O requests 916 to hardware acceleration elements, such as storage processors. This occurs when the fast path executor 812 issues a data path hardware service request 924. Therefore, in one embodiment, a CPU in a node 108 executing the run-time execution environment 120 does not have to service these client I/O requests 916.
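The router analysis described above can be illustrated with a minimal Python sketch. It is not the compiler of the described platform: the `Router` and `Digest` types, the `build_digest` function, the facet name `512e`, and the placeholder hook are all hypothetical, and the sketch assumes the component graph is acyclic. For each client-facing exported facet it follows the routers and the connections between components and records whether any functional hook lies on the path.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Router:
    """Hypothetical stand-in for a router: one exported facet mapped to imported facets."""
    exported: str
    imported: List[str]
    hook: Optional[Callable[..., str]] = None  # set when the mapping is variable


@dataclass
class Digest:
    """Hypothetical stand-in for a compiled data movement digest."""
    steps: Dict[str, List[str]] = field(default_factory=dict)   # entry facet -> facets visited
    hook_free: Dict[str, bool] = field(default_factory=dict)    # entry facet -> no hook on path


def build_digest(routers: List[Router],
                 connections: Dict[str, str],
                 entry_facets: List[str]) -> Digest:
    """Follow data flow from each entry facet and note whether a hook is reachable."""
    by_exported = {r.exported: r for r in routers}
    digest = Digest()
    for entry in entry_facets:
        visited: List[str] = []
        calls_hook = False
        stack = [entry]
        while stack:
            facet = stack.pop()
            visited.append(facet)
            router = by_exported.get(facet)
            if router is None:
                continue  # no router exports this facet; treat it as a storage end point
            if router.hook is not None:
                calls_hook = True
            for imported in router.imported:
                visited.append(imported)
                next_facet = connections.get(imported)  # connection to another component
                if next_facet is not None:
                    stack.append(next_facet)
        digest.steps[entry] = visited
        digest.hook_free[entry] = not calls_hook
    return digest


# Assembly loosely modeled on the Fig. 11 discussion; the hook body is a placeholder.
routers = [
    Router("516a", ["512a", "512b"]),
    Router("516b", ["512c", "512d"], hook=lambda extent: "512c"),
    Router("516c", ["512e"]),
]
connections = {"512a": "516b", "512b": "516c"}
digest = build_digest(routers, connections, ["516a", "516c"])
```

In this sketch, requests entering at `516a` reach a hook and would stay on the software path, while requests entering at `516c` are marked hook-free and would be candidates for delegation to hardware.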
[0085] After the deployment step, the administrative module 804 performs a staging step. In this step, the administrative module 804 transitions the component assembly 216 to an online state so that the fast path executor 812 can execute the component assembly 216 in order to serve the client I/O requests 916 (step 1028). In one embodiment, the administrative module 804 executes run-time execution environment hooks 928 to stage the component assembly 216 (step 1032). The run-time execution environment hooks 928 are functional hooks that the data movement class 404 provides to assist in the staging of component assemblies 216. In one embodiment, the run-time execution environment 120 uses the run-time execution environment hooks 928 to determine the parameters of each component 212 (e.g., type and size of the facets 512, 516) as part of the staging process. Thus, during the staging process, the run-time execution environment 120 communicates the component parameters to each component 212 (i.e., the run-time execution environment 120 fills in the form / layout that the class 404 provides for the parameters) by invoking the run-time execution environment hook 928. Moreover, in one embodiment the run-time execution environment 120 uses the run-time execution environment hooks 928 to determine which components 212 connect to other components 212. Thus, the run-time execution environment hooks 928 perform load-time adjustments of the components 212. In one embodiment, the run-time execution environment 120 executes the run-time execution environment hooks 928 starting with the components 212 at the leaves of the component graph 504 and traveling up the component graph 504.
[0086] The administrative module 804 can also de-stage a component assembly 216. The administrative module 804 de-stages a component assembly 216 when the data storage network 104 or node 108 is shut down or when the administrative module 804 receives a component assembly update 908 that replaces a part of the component assembly 216. In one embodiment, the administrative module 804 executes the run-time execution environment hooks 928 to de-stage a component assembly 216 (step 1036). In one embodiment, the run-time execution environment 120 executes the run-time execution environment hooks 928 beginning with the roots of the component graph 504 (e.g., the first vertex 508a) and traveling down the component graph 504 (i.e., opposite of the staging process).
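The two traversal orders can be sketched as follows. This is an illustration only: the component names and the dictionary representation of the component graph are hypothetical, and a post-order depth-first walk is just one way to obtain a leaves-first order on an acyclic graph.

```python
from typing import Dict, List


def staging_order(graph: Dict[str, List[str]], roots: List[str]) -> List[str]:
    """Post-order walk: each component appears after every component it imports from."""
    order: List[str] = []
    seen = set()

    def visit(node: str) -> None:
        if node in seen:
            return
        seen.add(node)
        for child in graph.get(node, []):
            visit(child)
        order.append(node)  # emitted only after all of its children

    for root in roots:
        visit(root)
    return order


def destaging_order(graph: Dict[str, List[str]], roots: List[str]) -> List[str]:
    """Roots first: the exact reverse of the staging order."""
    return list(reversed(staging_order(graph, roots)))


# Hypothetical assembly: a mirror component importing from two lower components
# that share one leaf component.
graph = {
    "mirror": ["stripe", "cache"],
    "stripe": ["disk_a"],
    "cache": ["disk_a"],
}
stage = staging_order(graph, ["mirror"])
destage = destaging_order(graph, ["mirror"])
```

Reversing the staging order guarantees that a component is de-staged before anything it depends on, which matches the root-to-leaf direction described for step 1036.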
[0087] In addition to the management tasks and also referring to Fig. 10B, the run-time execution environment 120 also executes the client I/O requests 916. As described above, the fast path executor 812 determines if the client I/O request 916 calls any functional hooks (step 1040). If the request 916 does not call a hook, the fast path executor 812 then determines if the hardware associated with the I/O request 916 can execute the request 916 (step 1044). If the hardware can execute the request 916, then the fast path executor 812 delegates client I/O requests 916 (that do not call functional hooks 708) to the hardware (e.g., hardware acceleration elements) (step 1048). If the client I/O request 916 cannot be executed in hardware, the fast path executor 812 executes the request 916 (step 1052). If the digest 920 indicates that component functional hooks 708 have to be invoked as part of the execution, the fast path executor 812 communicates a hook execution request 932 to the control path module 816 to invoke the hook (step 1056).
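The decision sequence of steps 1040 through 1056 can be summarized in a short sketch. The `Route` enumeration and the `dispatch` function are hypothetical names introduced for illustration; in the described system the two inputs would come from the digest 920 and from the capabilities of the hardware associated with the request.

```python
from enum import Enum


class Route(Enum):
    HARDWARE = "delegate to hardware acceleration elements (step 1048)"
    FAST_PATH = "execute in the fast path executor (step 1052)"
    CONTROL_PATH = "send hook execution request to the control path module (step 1056)"


def dispatch(calls_hook: bool, hardware_capable: bool) -> Route:
    """Choose where a client I/O request is served."""
    if calls_hook:            # step 1040: the request needs a functional hook
        return Route.CONTROL_PATH
    if hardware_capable:      # step 1044: hook-free and the hardware can execute it
        return Route.HARDWARE
    return Route.FAST_PATH    # hook-free, but must run in software
```

For example, `dispatch(calls_hook=False, hardware_capable=True)` selects `Route.HARDWARE`, so the node's CPU does not need to service that request.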
[0088] The control path module 816 executes all data movement class hooks (step 1060). In one embodiment, the hooks of a class 404 include both the run-time execution environment hooks 928 and router hooks 708. Further, the hooks of a class 404 include code that can create hook-initiated I/O requests 936 to and from imported storage objects 416 of the class 404.
[0089] As described above, the router hooks 708 can update the data movement description 712. To achieve this update, the control path module 816 calls the data path module 828 with a router update request 940. Further, the router hooks 708 and/or the run-time execution environment hooks 928 can invoke services such as data locking and/or data logging that are not data-path specific. For example, the control path module 816 can transmit a non-data request 942 to the services module 824. In one embodiment, the services module 824 requests these services from the porting module 832 by sending services module hardware services requests 944.
[0090] As stated above, the porting module 832 contains non-portable code and therefore may need to be written separately for each hardware environment and operating system environment to which the run-time execution environment 120 is ported. The porting module 832 can have a portable interface through which other modules can request services that are implemented differently on different hardware and different operating systems. In one embodiment, the services module 824 provides some or all of these services 944, such as locking and data sharing, through a portable service module hardware services interface.
[0091] The control path module 816 can also provide I/O services to component hooks that read, write, or copy data from / to / between imported facets 512. In one embodiment, the porting module 832 provides control path hardware services to the control path module 816 after receiving a control path hardware services request 948. Further, the fast path executor 812 and/or the data path module 828 load the compiled data movement digests 920 onto hardware acceleration elements, such as storage processors, and modify them there, by communicating a data path module hardware services request 952 to the porting module 832.
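The split between a portable interface and platform-specific implementations can be sketched as an abstract base class with a simulated implementation, in the spirit of the simulated porting module used as a test harness. The class names, method names, and the 512-byte block size are assumptions made for this sketch and are not an interface defined by the described platform.

```python
from abc import ABC, abstractmethod
from typing import Dict

BLOCK_SIZE = 512  # bytes per block; an assumption for this sketch


class PortingModule(ABC):
    """Portable interface; each hardware or operating system port implements it."""

    @abstractmethod
    def write_blocks(self, resource: str, first_block: int, data: bytes) -> None:
        ...

    @abstractmethod
    def read_blocks(self, resource: str, first_block: int, count: int) -> bytes:
        ...


class SimulatedPortingModule(PortingModule):
    """In-memory stand-in for real hardware, usable on a development machine."""

    def __init__(self) -> None:
        self._blocks: Dict[str, Dict[int, bytes]] = {}

    def write_blocks(self, resource: str, first_block: int, data: bytes) -> None:
        if len(data) % BLOCK_SIZE != 0:
            raise ValueError("data must be a whole number of blocks")
        store = self._blocks.setdefault(resource, {})
        for i in range(len(data) // BLOCK_SIZE):
            store[first_block + i] = data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]

    def read_blocks(self, resource: str, first_block: int, count: int) -> bytes:
        store = self._blocks.get(resource, {})
        zero = bytes(BLOCK_SIZE)  # blocks never written read back as zeros
        return b"".join(store.get(first_block + i, zero) for i in range(count))


port = SimulatedPortingModule()
port.write_blocks("1116a", 50, b"\x01" * (2 * BLOCK_SIZE))
```

Code written against `PortingModule` does not change when the simulated implementation is replaced by one that talks to a real storage processor or operating system service.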
[0092] Fig. 11 illustrates a more detailed block diagram of the technique that the run-time execution environment 120 (i.e., the control path module 816) uses to service client I/O requests 916 and hook-initiated I/O requests. Clients transmit requests to one or more exported facets 516 of a component 212. These facets are exported by components 212 that form a part of a component assembly 216. Each component 212 includes a corresponding router 706 that connects the exported facets 516 to imported facets 512 through a mixture of fixed router mappings (e.g., a first, second, and third fixed router mapping 1104a, 1104b, 1104c, respectively (generally 1104)) and variable router mappings (e.g., a first, second, and third variable router mapping 1108a, 1108b, 1108c, respectively (generally 1108)) (i.e., functional hooks). In one embodiment, the component assembly 216 describes how imported facets 512 of one component 212 are identified with exported facets 516 of other components 212.
[0093] The run-time execution environment 120 collects the information that the components 212 provide (i.e., the fixed mapping 1104 and the variable mapping 1108), together with the connections between the components 212, to create a data movement program 1112. The data movement program 1112 can include a sequence of instructions that perform simple manipulations on data descriptions 712. For instance, if a data description 712 describes a subset of the storage embodied in the first exported facet 516a and the run-time execution environment 120 executes the data movement program 1112, the run-time execution environment 120 generates a description of the subset of the physical storage resources (e.g., the first physical storage resource 1116a, the second physical storage resource 1116b, and the third physical storage resource 1116c (generally 1116)) where the data actually resides. The physical storage resources 1116 can be, for instance, a hard drive, a floppy disk, a memory board, etc. In one embodiment, the description is an output of the data movement program 1112.
[0094] For example, the first exported facet 516a represents one million blocks of storage, with each block being 512 bytes long. In one embodiment, a description of a subset of that storage is a list of block extents within the one-million block range. To write data to that storage, a client can issue a write request 916 for the range of blocks extending from block 50 to block 100, inclusive. Thus, the data description can be: [516a: 50-100].
[0095] The first router 706a connecting the first exported facet 516a to the first imported facet 512a and second imported facet 512b can have both fixed and dynamic mapping rules. In particular, the first router 706a includes the fixed router mapping 1104a and the variable mapping 1108a. In one embodiment, the variable mapping 1108a is null. Therefore, the mapping of the first router 706a is entirely fixed via the fixed router mapping 1104a. The fixed mapping 1104a can map the storage represented by the first exported facet 516a onto the first and second imported facets 512a, 512b, respectively. In the data movement program 1112, this correlates to a duplication operation that translates the range [516a: 50-100] to the pair of ranges [512a: 50-100] and [512b: 50-100].
[0096] In one embodiment, the next step in the data movement program 1112 relates to the structure of the component assembly 216. The data movement program 1112 translates the pair [512a: 50-100] and [512b: 50-100] to the new pair [516b: 50-100] and [516c: 50-100] because the component assembly 216 connects the first imported facet 512a with the second exported facet 516b of a second router 706b and connects the second imported facet 512b with the third exported facet 516c of a third router 706c. The second router 706b connects the second exported facet 516b to the third imported facet 512c and fourth imported facet 512d.
[0097] In one embodiment, the second router 706b has the second fixed router mapping 1104b and the second variable mapping 1108b. In this example, the fixed mapping 1104b states that the range of blocks given for the second exported facet 516b does not change, but the identity of the imported facet to which the range is directed is determined by the variable mapping 1108b (i.e., a functional hook). The data movement program 1112 includes instructions that take the data description [516b: 50-100] and apply the functional hook 1108b to the description to obtain a result X. X can be either the third imported facet 512c or the fourth imported facet 512d. Therefore, the resulting data description is [X: 50-100], where X is either the third or fourth imported facet 512c, 512d, respectively, depending on the value returned by the functional hook 1108b.
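The worked example of paragraphs [0094] through [0097] can be traced with a few lines of Python. The tuple representation of a data description, the three helper functions, and the selection policy inside `select_target` are hypothetical; the text does not specify how the functional hook 1108b chooses between the imported facets 512c and 512d.

```python
from typing import Callable, Dict, List, Tuple

# (facet or resource identifier, first block, last block), both ends inclusive
Extent = Tuple[str, int, int]


def duplicate(extent: Extent, targets: List[str]) -> List[Extent]:
    """Fixed mapping: send the same block range to every target facet."""
    _, first, last = extent
    return [(target, first, last) for target in targets]


def follow_connections(extents: List[Extent], connections: Dict[str, str]) -> List[Extent]:
    """Assembly structure: rename each imported facet to the exported facet it connects to."""
    return [(connections.get(facet, facet), first, last) for facet, first, last in extents]


def apply_hook(extent: Extent, hook: Callable[[Extent], str]) -> Extent:
    """Variable mapping: the hook picks the target facet; the block range is unchanged."""
    _, first, last = extent
    return (hook(extent), first, last)


def select_target(extent: Extent) -> str:
    """Placeholder policy standing in for the functional hook (an assumption)."""
    return "512c" if extent[1] < 500_000 else "512d"


request: Extent = ("516a", 50, 100)                      # [516a: 50-100]
step1 = duplicate(request, ["512a", "512b"])             # router 706a, fixed mapping
step2 = follow_connections(step1, {"512a": "516b", "512b": "516c"})
step3 = apply_hook(step2[0], select_target)              # router 706b, variable mapping
```

Each step only rewrites descriptions; no data is copied until the description refers solely to physical storage resources.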
[0098] The data movement program 1112 ends up at a description consisting of data intervals only within the storage resources 1116. When reaching this point, the data movement program invokes a data mover 1124 to move the data to the storage resources 1116 (e.g., node 108). Thus, the data mover 1124 takes the results of the data movement program 1112 (i.e., the description) and moves the data directly into the correct addresses in the storage resources 1116. Thus, the run-time execution environment 120 follows the data movement descriptions generated by each component 212 separately and the component assembly 216 as a whole, and serves read, write, and copy requests from clients by moving the data from its starting point (e.g., the first exported facet 516a) to its ultimate end point (e.g., the first storage resource 1116a) without using intermediate copies. Therefore, although described above as moving data to the storage resource 1116 (write operation), the data mover 1124 can also move data from one or more of the storage resources 1116 (read operation) or between the storage resources 1116 (copy operation). The run-time execution environment 120 therefore enables extremely efficient data transfers because of the direct movement of data between a starting point of the first component 212 and an end point.
[0099] Having described certain embodiments of the invention, it will now become apparent to one of skill in the art that other embodiments incorporating the concepts of the invention may be used. Therefore, the invention should not be limited to certain embodiments, but rather should be limited only by the spirit and scope of the following claims.
What is claimed is:

Claims

1. A system for data movement, comprising:
(a) a data storage application;
(b) a data storage network;
(c) a plurality of components, each component implementing data movement functionality;
(d) a component assembly assembled from the plurality of components to embody the data storage application within the data storage network; and
(e) a run-time execution environment within the data storage network for executing the data storage application embodied in the component assembly by implementing the data movement functionality within the data storage network.
2. The system of claim 1 wherein the data storage network further comprises at least one of a data storage array, a data storage switch, and a geographically dispersed network of switches and arrays.
3. The system of claim 1 wherein the run-time execution environment executes on a data storage endpoint within the data storage network.
4. The system of claim 1 wherein each component in the plurality of components further comprises an instantiation of a class.
5. The system of claim 4 wherein the run-time execution environment further comprises at least one of:
(d-a) a class repository accepting another class for the creation of another component;
(d-b) an administrative module administering the component assembly;
(d-c) a fast path executor executing input / output requests to at least one of the component assembly, a component in the plurality of components, and a data storage endpoint; and
(d-d) a data path module optimizing data movement within the data storage network by optimizing the execution of the data storage application.
6. The system of claim 5 wherein the administrative module administering the component assembly further comprises at least one of loading a component assembly, removing a component assembly, and deploying a component assembly within the run-time execution environment.
7. The system of claim 5 wherein the fast path executor communicates with at least one of the data path module and the porting module when executing the input/output requests.
8. The system of claim 1 wherein the run-time execution environment further comprises at least one of:
(d-a) a services module enabling a data service for at least one of a component in the plurality of components and the component assembly; and
(d-b) a querying module querying at least one of a component in the plurality of components and the component assembly.
9. The system of claim 8 wherein the data service further comprises at least one of logging, locking, a time service, an event service, and a shared data service.
10. The system of claim 3 wherein the run-time execution environment further comprises a porting module to install the component assembly onto the data storage endpoint.
11. A method for executing a data storage application within a data storage network, comprising:
(a) implementing data movement functionality in a plurality of components;
(b) assembling the plurality of components into a component assembly embodying the data storage application; and
(c) executing the data storage application embodied in the component assembly by implementing the data movement functionality using a run-time execution environment within the data storage network.
12. The method of claim 11 further comprising executing the run-time execution environment on a data storage endpoint within the data storage network.
13. The method of claim 12 wherein the executing step further comprises:
(c-a) accepting a class for the creation of another component;
(c-b) administering the component assembly;
(c-c) executing input / output requests to at least one of the component assembly, a component in the plurality of components, and the data storage endpoint; and
(c-d) optimizing data movement within the data storage network.
14. The method of claim 13 wherein the administering of the component assembly further comprises at least one of loading a component assembly, removing a component assembly, deploying a component assembly, staging a component assembly, and de-staging a component assembly within the run-time execution environment.
15. The method of claim 13 wherein the optimization step further comprises optimizing the execution of the data storage application within the data storage network.
16. The method of claim 11 wherein the executing step further comprises:
(c-a) enabling a data service for at least one of a component in the plurality of components and the component assembly; and
(c-b) querying at least one of a component in the plurality of components and the component assembly.
17. The method of claim 12 wherein the executing step further comprises installing the component assembly on the data storage endpoint.
18. The method of claim 17 wherein the enabling of a data service further comprises performing at least one of logging, locking, a time service, an event service, and a shared data service.
19. A data movement class, comprising:
(a) a plurality of data storage objects; and
(b) a router specifying movement of data between the plurality of data storage objects.
20. The data movement class of claim 19 further comprising functional hooks in the router enabling modification of the movement of data otherwise specified by the router.
21. The system of claim 19 wherein the router further comprises a plurality of routers.
22. The system of claim 20 wherein the functional hooks enable at least one of logging, event notification, and administrative interaction.
23. The system of claim 19 wherein the plurality of data storage objects further comprises at least one of an imported data storage object, an exported data storage object, and a private data storage object.
24. The system of claim 19 wherein the router specifies at least one of a variable movement of data, a fixed movement of data, and a mixed fixed and variable movement of data.
25. The system of claim 20 wherein the functional hooks further comprise at least one of event hooks, control hooks, and run-time execution environment hooks.
26. A system for developing a component assembly comprising:
(a) a plurality of data movement classes;
(b) a plurality of components, each component in the plurality of components implementing data movement functionality by referencing at least one class in the plurality of data movement classes; and
(c) a development module assembling the plurality of components into a component assembly embodying a data storage application.
27. The system of claim 26 wherein each class in the plurality of data movement classes further comprises a plurality of data storage objects and a router specifying movement of data between the plurality of data storage objects.
28. The system of claim 27 wherein the specification of movement of data between the plurality of data storage objects further comprises a matching between the objects.
29. The system of claim 26 wherein the development module further comprises a user of the system.
30. The system of claim 26 wherein the development module further comprises an acyclic graph of the plurality of components.
31. The system of claim 30 wherein the acyclic graph enables the assembly of the component assembly from the plurality of components.
32. The system of claim 30 wherein the acyclic graph further comprises a locator map specifying an execution location of a component in the plurality of components.
33. The system of claim 32 wherein the locator map comprises distributed locations.
34. The system of claim 33 wherein the distributed locations further comprise a plurality of processing elements.
35. The system of claim 34 wherein the locator map enables replication of at least one component in the plurality of components.
36. The system of claim 34 wherein the locator map enables fault tolerance in the presence of a failure of at least one processing element in the plurality of processing elements.
37. The system of claim 26 further comprising a test harness enabling testing of at least one class in the plurality of data movement classes.
38. A method for moving data, comprising:
(a) obtaining a data movement program written in a programmable data movement language; and
(b) executing the data movement program to move data between a first data storage object and a second data storage object.
39. The method of claim 38 wherein the obtaining of the data movement program further comprises building the data movement program.
40. The method of claim 38 wherein the programmable data movement language is independent of the first data storage object and the second data storage object.
41. The method of claim 38 wherein the data movement language further comprises a router mapping the first data storage object to the second data storage object.
42. The method of claim 41 wherein the data movement program further comprises instructions to move data through specific types of routers.
43. The method of claim 42 further comprising constructing the instructions from a scan of a plurality of data storage objects connected by a plurality of routers.
44. The method of claim 41 wherein the router further comprises at least one of a fixed mapping and a variable mapping.
45. The method of claim 38 further comprising receiving an I/O request to move data at least one of to the first data storage object and from the first data storage object.
46. The method of claim 45 further comprising determining that the data requested for movement at least one of to the first data storage object and from the first data storage object is directly moved at least one of from the second data storage object and to the second data storage object.
47. The method of claim 46 further comprising building the data movement program to directly move the data at least one of to the second data storage object and from the second data storage object.
48. The method of claim 38 further comprising generating a data movement digest written in the programmable data movement language that details the steps performed to process the I/O request.
49. The method of claim 45 further comprising changing the data movement program in response to the I/O request.
50. A system for moving data, comprising:
(a) a first data storage object;
(b) a second data storage object;
(c) a programmable data movement language independent of the first data storage object, the second data storage object, and the system for moving the data; and
(d) a run-time execution environment enabling data movement between the first data storage object and the second data storage object using the programmable data movement language.
51. The system of claim 50 further comprising a router specifying a mapping of data between the first data storage object and the second data storage object.
52. The system of claim 50 further comprising a plurality of data storage objects.
53. The system of claim 50 further comprising a data mover to move the data between the first data storage object and the second data storage object.
54. The system of claim 50 further comprising a data movement program written in the programmable data movement language specifying the data movement between the first data storage object and the second data storage object.
55. The system of claim 54 wherein the run-time execution environment further comprises a compiler to compile the data movement program.
56. The system of claim 55 wherein the run-time execution environment further comprises a control path module enabling dynamic data movement between the first data storage object and the second data storage object.
57. The system of claim 56 wherein the control path module uses hooks to dynamically vary the data movement.
58. The system of claim 57 wherein the hooks vary the data movement program.
PCT/US2003/019350 2002-06-18 2003-06-18 Data movement platform WO2003107142A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2003247560A AU2003247560A1 (en) 2002-06-18 2003-06-18 Data movement platform

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US39004302P 2002-06-18 2002-06-18
US60/390,043 2002-06-18

Publications (2)

Publication Number Publication Date
WO2003107142A2 true WO2003107142A2 (en) 2003-12-24
WO2003107142A3 WO2003107142A3 (en) 2004-06-17

Family

ID=29736690

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2003/019350 WO2003107142A2 (en) 2002-06-18 2003-06-18 Data movement platform

Country Status (2)

Country Link
AU (1) AU2003247560A1 (en)
WO (1) WO2003107142A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10942665B2 (en) 2019-05-10 2021-03-09 International Business Machines Corporation Efficient move and copy

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5745758A (en) * 1991-09-20 1998-04-28 Shaw; Venson M. System for regulating multicomputer data transfer by allocating time slot to designated processing task according to communication bandwidth capabilities and modifying time slots when bandwidth change
US5960200A (en) * 1996-05-03 1999-09-28 I-Cube System to transition an enterprise to a distributed infrastructure
US6157955A (en) * 1998-06-15 2000-12-05 Intel Corporation Packet processing system including a policy engine having a classification unit
US6256676B1 (en) * 1998-11-18 2001-07-03 Saga Software, Inc. Agent-adapter architecture for use in enterprise application integration systems
US6330717B1 (en) * 1998-03-27 2001-12-11 Sony Corporation Of Japan Process and system for developing an application program for a distributed adaptive run-time platform
US6393386B1 (en) * 1998-03-26 2002-05-21 Visual Networks Technologies, Inc. Dynamic modeling of complex networks and prediction of impacts of faults therein

Also Published As

Publication number Publication date
AU2003247560A1 (en) 2003-12-31
AU2003247560A8 (en) 2003-12-31
WO2003107142A3 (en) 2004-06-17

Similar Documents

Publication Publication Date Title
US7676806B2 (en) Deployment, maintenance and configuration of complex hardware and software systems
US7603669B2 (en) Upgrade and downgrade of data resource components
US7062516B2 (en) Methods, systems, and articles of manufacture for implementing a runtime logging service storage infrastructure
US20030005090A1 (en) System and method for integrating network services
US8122106B2 (en) Integrating design, deployment, and management phases for systems
US6334158B1 (en) User-interactive system and method for integrating applications
US8621419B2 (en) Automating the life cycle of a distributed computing application
US7684964B2 (en) Model and system state synchronization
US8949364B2 (en) Apparatus, method and system for rapid delivery of distributed applications
US20020188941A1 (en) Efficient installation of software packages
US20030200350A1 (en) Class dependency graph-based class loading and reloading
US20020144256A1 (en) Method of deployment for concurrent execution of multiple versions of an integration model on an integration server
US20030018964A1 (en) Object model and framework for installation of software packages using a distributed directory
US20020078255A1 (en) Pluggable instantiable distributed objects
US20020178439A1 (en) Method and system for providing a programming interface for loading and saving archives in enterprise applications
US7596720B2 (en) Application health checks
US20080288622A1 (en) Managing Server Farms
US20170364844A1 (en) Automated-application-release-management subsystem that supports insertion of advice-based crosscutting functionality into pipelines
US20030140126A1 (en) Method of deployment for concurrent execution of multiple versions of an integration model
AU1010099A (en) Automated web interface generation for software coded applications
US20080040631A1 (en) Enabling High Availability and Load Balancing for JMX MBeans
US7434041B2 (en) Infrastructure for verifying configuration and health of a multi-node computer system
US5797006A (en) Application integration architecture for a data processing platform
WO2003107142A2 (en) Data movement platform
US7756691B2 (en) Establishing relationships between components in simulation systems

Legal Events

Date Code Title Description
AK Designated states; Kind code of ref document: A2; Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW
AL Designated countries for regional patents; Kind code of ref document: A2; Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase in: Ref country code: JP
WWW WIPO information: withdrawn in national office; Country of ref document: JP