CN102193777A - Loop control flow diversion - Google Patents

Loop control flow diversion

Publication number
CN102193777A
Authority
CN
China
Prior art keywords
circulation
address
thread
loop
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2011100658760A
Other languages
Chinese (zh)
Other versions
CN102193777B (en)
Inventor
S·莫热
M·M·马格鲁德
F·V·佩斯彻-盖里
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corp
Publication of CN102193777A
Application granted
Publication of CN102193777B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 - Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/32 - Address formation of the next instruction, e.g. by incrementing the instruction counter
    • G06F9/322 - Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address
    • G06F9/325 - Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address for loops, e.g. loop detection or loop counter
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 - Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/32 - Address formation of the next instruction, e.g. by incrementing the instruction counter
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 - Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 - Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836 - Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3851 - Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 - Arrangements for executing specific programs
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 - Arrangements for software engineering
    • G06F8/40 - Transformation of program code
    • G06F8/41 - Compilation
    • G06F8/45 - Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
    • G06F8/451 - Code distribution
    • G06F8/452 - Loops

Abstract

Loop control flow diversion supports thread synchronization, garbage collection, and other situations involving suspension of long-running loops. Divertible loops have a loop body, a loop top, an indirection cell containing a loop top address, and a loop jump instruction sequence which references the indirection cell. In normal execution, control flows through the indirection cell to the loop top. After the indirection cell is altered, however, execution flow is diverted to a point away from the loop top. Operations such as garbage collection are performed while the loop (and hence the thread(s) using the loop) is thus diverted. The kernel or another thread then restores the loop top address into the indirection cell, and execution flow again continues through the restored indirection cell to the loop top.

Description

Loop control flow diversion
Technical field
The present invention relates to loop control flow, and in particular to loop control flow diversion.
Background
Within software, control flow mechanisms help specify the order in which statements, instructions, calls, and other items are executed or evaluated. At the source code level, and sometimes at lower levels, different programming languages and different runtime environments may provide different control flow mechanisms.
At the machine or assembly language level, control flow instructions usually work by altering the program counter. Within source code, a given language may provide control flow statements that: continue execution at a location other than the following statement (e.g., unconditional branch, jump, goto); execute a specified statement only if some condition is met (e.g., conditional branch); execute a specified statement zero or more times depending on some specified condition (e.g., loop); execute a set of distant statements and then return to the place control left (e.g., routines, functions, methods, coroutines); and/or unconditionally halt execution.
In many cases, the effect that a given control flow mechanism has on a program's data can be obtained using any of several equivalent control flow mechanisms. A routine body can be inlined, for example, without changing the program's behavior with respect to the data output for a given input (although execution speed and/or memory requirements may change). Likewise, a loop that has its exit condition in the middle or at the end of the loop body may have a functionally equivalent instruction sequence in which the exit condition is tested at the top of the loop body. Control flow structures that have (at least conceptually) one entry and one exit are generally easier to understand than those that do not, because they can be used anywhere in a program as if they were a single statement, without making the control flow more complicated. Such control flow structures are called "composable". Many loops, for example, are composable structures.
Summary
In some computing situations, it is helpful or even necessary to synchronously suspend some or all program threads. For example, synchronizing threads may be part of garbage collection (reclaiming unused memory), part of the timely execution of high-priority code in a real-time system, and/or part of a synchronization that lets processes agree on or commit to a specific series of actions. However, some threads include long-running loops, such as loops that contain no routine calls, and some approaches to synchronizing such threads insert program state annotations or global state checks that complicate the program code and reduce execution performance.
Some embodiments discussed herein provide an alternative approach that uses loop control flow diversion for thread synchronization or other purposes. For example, assume that an executable module managed by a virtual execution system or other kernel includes a loop having a loop body and a loop top. An indirection cell contains a loop top address, that is, an address pointing to the loop top. A loop jump instruction sequence references the indirection cell. In this context, a first thread executes an iteration of the loop body. Execution loads the address contained in the indirection cell, and the execution flow of the first thread continues through the address specified in the indirection cell to the loop top. A second thread of execution alters the content of the indirection cell so that the cell contains an address other than the loop top address, e.g., the address of a kernel synchronization routine; this alteration may be performed atomically. In any case, the execution flow of the first thread is diverted through the altered indirection cell to a point away from the loop top. Operations such as garbage collection are performed while the loop (and hence the thread using the loop) is thus diverted. The kernel or another thread then restores the loop top address into the indirection cell, and the execution flow of the first thread again continues through the restored indirection cell to the loop top.
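The sequence in the paragraph above can be illustrated with a small simulation. This is only a sketch under stated assumptions: Python callables stand in for machine addresses, a one-element list stands in for the indirection cell, and names such as `run_demo`, `loop_top`, and `sync_routine` are hypothetical and do not come from the patent. A real implementation would emit an indirect jump in generated machine code.

```python
import threading
import time


def run_demo():
    state = {"i": 0}             # loop-carried state of the first thread
    stop = threading.Event()     # the loop's own exit condition
    parked = threading.Event()   # set by the first thread once it has been diverted
    resume = threading.Event()   # set by the second thread when its work is done

    def loop_top():
        """Normal target: nothing to do, the next iteration simply begins."""

    def sync_routine():
        """Diverted target: rendezvous with the second thread and wait."""
        parked.set()
        resume.wait()

    cell = [loop_top]            # indirection cell; slot 0 holds the "address"

    def first_thread():
        while not stop.is_set():
            state["i"] += 1      # loop body
            cell[0]()            # loop jump sequence: load the cell, go where it points

    worker = threading.Thread(target=first_thread)
    worker.start()

    cell[0] = sync_routine       # second thread alters the cell with a single store
    parked.wait()                # the loop is now diverted away from the loop top
    before = state["i"]
    time.sleep(0.01)             # stand-in for garbage collection or other work
    frozen = state["i"] == before

    cell[0] = loop_top           # restore the loop top address
    resume.set()
    while state["i"] == before:  # wait until the loop iterates again
        time.sleep(0.001)
    stop.set()
    worker.join()
    return {"frozen": frozen, "before": before, "after": state["i"],
            "restored": cell[0] is loop_top}
```

Note that the loop body contains no check for a pending suspension; the only added cost on the normal path is the extra load through the cell.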
Loop diversion can be performed without restricting which registers remain live across the transition from the bottom of the loop body to the loop top. In some variations, multiple loops and/or multiple threads are diverted. Operating system support for forcibly altering the execution context of one thread from another thread need not be available and need not be used.
In some embodiments, a code generator provides a loop having a loop body and a loop top. The code generator associates an indirection cell with the loop top, such that the loop jump instruction sequence includes a jump to the address contained in the indirection cell. The code generator also designates an original value storage location, which is sized to hold a representation of the loop top address, that is, the address itself or an encoded version of the address, such as a compressed version. The code generator emits redirection target code which, when executed, prepares an identifier corresponding to the respective indirection cell and passes control to a redirection handling routine. In short, the code generator generates code for effecting loop control flow diversion as discussed herein. The code generator, and/or the code so generated, may reside in a memory of a computer system in operable communication with a logical processor of the system.
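The items a code generator produces for each loop can be sketched as plain data. The following is a minimal illustration, not the patented generator: addresses are integers, the "compressed" representation is assumed to be an offset from a module base (the text only says an encoded version may be used), and the names `GeneratedModule`, `generate`, and `decode` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class GeneratedModule:
    base: int                                              # assumed module base address
    cells: list = field(default_factory=list)              # indirection cells
    original_values: list = field(default_factory=list)    # encoded loop top addresses
    redirect_targets: list = field(default_factory=list)   # redirection target code


def generate(base, loop_top_addresses, handler):
    """Emit one cell, one saved value, and one redirection stub per loop."""
    module = GeneratedModule(base)
    for cell_id, top in enumerate(loop_top_addresses):
        module.cells.append(top)                   # cell initially holds the loop top
        module.original_values.append(top - base)  # assumed encoding: offset from base

        def target(cell_id=cell_id):
            # Redirection target code: prepare the identifier, pass control on.
            return handler(cell_id)

        module.redirect_targets.append(target)
    return module


def decode(module, cell_id):
    """Recover the loop top address from its stored representation."""
    return module.base + module.original_values[cell_id]
```

Storing an offset rather than a full address is one way the storage location could be smaller than a pointer; other encodings would serve equally well.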
The examples given are merely illustrative. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Rather, this Summary is provided to introduce, in a simplified form, some concepts that are further described below in the Detailed Description. The invention is defined by the claims, and to the extent this Summary conflicts with the claims, the claims should prevail.
Description of the drawings
A more particular description will be given with reference to the attached drawings. These drawings illustrate only selected aspects and thus do not fully determine coverage or scope.
Fig. 1 is a block diagram illustrating a computer system in an operating environment which may be present on multiple network nodes, the computer system having at least one processor, at least one memory, at least one loop, a kernel, and other executable items, and also illustrating configured storage medium embodiments;
Fig. 2 is a block diagram illustrating an indirection cell, an original value storage location, and other loop diversion items in an example architecture;
Fig. 3 is a flow chart illustrating steps of some process and configured storage medium embodiments;
Fig. 4 is a control flow diagram illustrating normal loop execution, that is, execution in which a divertible loop is not diverted; and
Fig. 5 is a control flow diagram illustrating "hijacked" loop execution, that is, execution in which a divertible loop is diverted.
Detailed description
Overview
In order to perform some system tasks, such as garbage collection, a virtual execution system may need to synchronously suspend all program threads. In order to suspend long-running threads (e.g., threads executing loops without calls) in a timely manner, a compiler may insert program state annotations (such as variable liveness information) or global state checks into the program code. However, this extra information and logic tends to complicate the program code and reduce its execution performance.
One approach to suspending long-running threads is so-called fully interruptible code, e.g., as implemented by the Microsoft .NET runtime environment (Microsoft and ".NET" are marks of Microsoft Corporation). In this approach, program methods that contain loops without calls are annotated with information describing the liveness of garbage collection pointers at every single machine instruction. This annotation allows the virtual execution system to forcibly suspend a thread running such code at an arbitrary point in the method and to redirect its execution to a safe point. However, this is possible only when the operating system supports particular thread state management functionality (such as the ability of one thread to forcibly alter the context of another thread).
Another approach to suspending long-running threads involves state checks inserted in long-running loops, e.g., as implemented by the Microsoft Silverlight solution for Mac OS software (Silverlight is a mark of Microsoft Corporation, and Mac OS is a mark of Apple). This approach can be used on operating systems, such as Mac OS, that lack the support needed for fully interruptible code. However, this approach involves adding extra code to loops to explicitly check whether there is a need to synchronize with the garbage collector.
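For contrast with the diversion approach, the inserted state check described above might look like the following sketch. The flag `gc_requested` and the function `sum_with_polling` are hypothetical names chosen for illustration; they are not taken from any Microsoft runtime.

```python
import threading

gc_requested = threading.Event()   # hypothetical global "suspension needed" flag


def sum_with_polling(values, on_suspend):
    """Sum values, explicitly polling the global flag on every iteration."""
    total = 0
    for v in values:
        total += v                  # the actual loop body
        if gc_requested.is_set():   # extra check inserted by the compiler
            on_suspend()            # rendezvous with the garbage collector
    return total
```

The check executes on every iteration whether or not a suspension is ever requested, which is the overhead this approach adds to loop code.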
Some embodiments described herein provide a different approach. In some embodiments, for example, a virtual execution system (VES) uses a standard technique known as return address hijacking (redirecting a return address) to gain control of program threads at call return sites, and uses the loop control flow diversion technique described herein to gain control of program threads that execute long-running loops lacking calls. With loop diversion, code generation for long-running loops is altered so that control flows through an indirection cell when it jumps back to the beginning of the loop body. At run time, when the garbage collector (for example) needs the thread to stop, the VES modifies the content of the indirection cell to point to a loop-specific assembly thunk, which causes the thread to eventually rendezvous with the garbage collector at the loop's next iteration.
More generally, aspects of the loop diversion presented herein include using indirection cells as a mechanism for hijacking threads managed by a VES, thereby providing an implementation in which garbage collector liveness information is needed at only one extra point in each candidate loop. As part of this mechanism, the indirection cell is written by another VES thread in order to gain control of the thread in the loop. On x86-class processor architectures, which have an indirect jump instruction, all registers remain available to the code generator for the body of the loop code. The described loop diversion approach is portable to all major operating systems known to the inventors, because loop diversion does not need special operating system support for forcing thread context changes. Other aspects of loop diversion are described below.
Reference will now be made to exemplary embodiments such as those illustrated in the drawings, and specific language will be used herein to describe them. However, alterations and further modifications of the features illustrated herein, and additional applications of the principles illustrated herein, which would occur to one skilled in the relevant art(s) and having possession of this disclosure, should be considered within the scope of the claims.
The meaning of terms is clarified in this disclosure, so the claims should be read with careful attention to these clarifications. Specific examples are given, but those of skill in the relevant art(s) will understand that other examples may also fall within the meaning of the terms used, and within the scope of one or more claims. Terms do not necessarily have the same meaning here that they have in general usage, in the usage of a particular industry, or in a particular dictionary or set of dictionaries. Reference numerals may be used with various phrasings to help show the breadth of a term. Omission of a reference numeral from a given piece of text does not necessarily mean that the content of a figure is not being discussed by the text. The inventors assert and exercise their right to their own lexicography. Terms may be defined, either explicitly or implicitly, elsewhere in the Detailed Description and/or in the application file.
As used herein, a "computer system" may include, for example, one or more servers, motherboards, processing nodes, personal computers (portable or not), personal digital assistants, cell or mobile phones, and/or other devices providing one or more processors controlled at least in part by instructions. The instructions may be in the form of software in memory and/or specialized circuitry. In particular, although many embodiments may be expected to run on workstations or laptop computers, other embodiments may run on other computing devices, and any one or more such devices may be part of a given embodiment.
A "multithreaded" computer system is a computer system which supports multiple execution threads. The term "thread" should be understood to include any code capable of or subject to synchronization, and may also be known by another name, such as "task", "process", or "coroutine". Threads may run in parallel, in sequence, or in a combination of parallel execution (e.g., multiprocessing) and sequential execution (e.g., time-slicing). Multithreaded environments have been designed in various configurations. Execution threads may run in parallel, or threads may be organized for parallel execution but actually take turns executing in sequence. Multithreading may be implemented, for example, by running different threads on different cores in a multiprocessing environment, by time-slicing different threads on a single processor core, or by some combination of time-sliced and multiprocessor threading. Thread context switches may be initiated, for example, by a kernel's thread scheduler, by user-space signals, or by a combination of user-space and kernel operations. Threads may take turns operating on shared data, or each thread may operate on its own data, for example.
A "logical processor" or "processor" is a single independent hardware thread-processing unit. For example, a hyperthreaded quad-core chip running two threads per core has eight logical processors. Processors may be general purpose, or they may be tailored for specific uses such as graphics processing, signal processing, floating-point arithmetic processing, encryption, I/O processing, and so on.
A "multiprocessor" computer system is a computer system which has multiple logical processors. Multiprocessor environments occur in various configurations. In a given configuration, all of the processors may be functionally equal, whereas in another configuration some processors may differ from other processors by virtue of having different hardware capabilities, different software assignments, or both. Depending on the configuration, processors may be tightly coupled to each other on a single bus, or they may be loosely coupled. In some configurations the processors share a central memory, in some they each have their own local memory, and in some configurations both shared and local memories are present.
As used herein, "kernels" include virtual execution systems, operating systems, hypervisors, virtual machines, and similar hardware interface software.
"Code" means processor instructions, data (which includes constants, variables, and data structures), or both instructions and data.
"Automatically" means by use of automation (e.g., general purpose computing hardware configured by software for the specific operations discussed herein), as opposed to without automation. In particular, steps performed "automatically" are not performed by hand on paper or in a person's mind; they are performed with a machine.
Throughout this document, use of the optional plural "(s)" means that one or more of the indicated feature is present. For example, "loop(s)" means "one or more loops" or, equivalently, "at least one loop".
Whenever reference is made to data or instructions, it is understood that these items configure a computer-readable memory, thereby transforming it into a particular article, as opposed to simply existing on paper, in a person's mind, or as a transitory signal on a wire, for example.
Operating environment
With reference to Fig. 1, an operating environment 100 for an embodiment may include a computer system 102. The computer system 102 may or may not be a multiprocessor computer system. An operating environment may include one or more machines in a given computer system, which may be clustered, client-server networked, and/or peer-to-peer networked.
Human users 104 may interact with the computer system 102 by using displays, keyboards, and other peripherals 106. System administrators, developers, engineers, and end users are each a particular type of user 104. Automated agents acting on behalf of one or more people may also be users 104. Storage devices and/or networking devices may be considered peripheral equipment in some embodiments. Other computer systems not shown in Fig. 1 may interact with the computer system 102, or with another system embodiment, using one or more connections to a network 108 via network interface equipment, for example.
The computer system 102 includes at least one logical processor 110. The computer system 102, like other suitable systems, also includes one or more computer-readable non-transitory storage media 112. Media 112 may be of different physical types. The media 112 may be volatile memory, non-volatile memory, fixed-in-place media, removable media, magnetic media, optical media, and/or other types of non-transitory media (as opposed to transitory media such as a wire that merely propagates a signal). In particular, a configured medium 114 such as a CD, DVD, memory stick, or other removable non-volatile memory medium may become functionally part of the computer system when inserted or otherwise installed, making its content accessible for use by the processor 110. The removable configured medium 114 is an example of a computer-readable storage medium 112. Some other examples of computer-readable storage media 112 include built-in RAM, ROM, hard disks, and other storage devices which are not readily removable by users 104.
The medium 114 is configured with instructions 116 that are executable by a processor 110; "executable" is used herein in a broad sense to include machine code, interpretable code, and code that runs on a virtual machine, for example. The medium 114 is also configured with data 118 which is created, modified, referenced, and/or otherwise used by execution of the instructions 116. The instructions 116 and the data 118 configure the medium 114 in which they reside; when that memory is a functional part of a given computer system, the instructions 116 and data 118 also configure that computer system. In some embodiments, a portion of the data 118 is representative of real-world items such as product characteristics, inventories, physical measurements, settings, images, readings, targets, volumes, and so forth. Such data is also transformed as discussed herein, e.g., by alteration, indirection, diversion, restoration, binding, deployment, execution, modification, display, creation, loading, and/or other operations.
Executable items 120 include software application components, such as module(s) 122, thread(s) 124, and loop(s) 126. A development/runtime environment 128 also includes software development components, such as linker(s), binder(s), interpreter(s), compiler(s), and other code generator(s) 130, as well as software runtime components, such as garbage collector(s) 132 and kernel(s) 134 that include various low-level routine(s) 136. A virtual execution system 138, which provides an environment for executing managed code and supports the Microsoft Common Intermediate Language instruction set, is one example of a kernel. The classification of a component as primarily or wholly an application component, a development component, and/or a runtime component is merely for convenience of discussion; a given component may be classified differently by different users and/or in different circumstances.
A given operating environment 100 may include an Integrated Development Environment (IDE) 140 which provides a developer with a set of coordinated software development tools. In particular, some of the suitable operating environments for some embodiments include or help create a Microsoft Visual Studio development environment (marks of Microsoft Corporation) configured to support program development. Some suitable operating environments include Java environments (mark of Sun Microsystems, Inc.), and some include environments which utilize languages such as C++ or C# ("C-Sharp"), but the teachings herein are applicable to a wide variety of programming languages, programming models, and programs, as well as to endeavors outside the field of software development per se that use threads, long-running loops (loops that lack calls in their body), or both.
Software components and other items shown in the figures may reside partially or entirely within one or more media 112, thereby configuring those media. In addition to memory and processor(s), an operating environment may also include other hardware, such as buses, power supplies, and accelerometers.
Some items are shown in outline form in Fig. 1 to emphasize that they are not necessarily part of the illustrated operating environment, but may interoperate with items in the operating environment as discussed herein. It does not follow that items not in outline form are necessarily required, in any figure or any embodiment.
System
Fig. 2 illustrates an architecture which is suitable for use with some embodiments. An ordinary loop 126 is transformed into a divertible loop 200 as discussed herein. Each divertible loop 200 has a body 202, a top 204, a top address 206, and a jump sequence 208. The loop body 202 includes one or more statements or instructions; a statement may be implemented by one or more instructions. The loop top 204 may be the first statement/instruction of the loop body, or the loop top may be a test which is performed just before control passes to the top of the loop body. The loop top address 206 may be the memory address of the loop top 204, that is, a value which directs the processor 110 to the loop top during execution. The jump sequence 208 instructs the processor to jump to the loop top. In a standard loop, the jump sequence points directly to the loop top. In a divertible loop, however, the jump sequence 208 points to an indirection cell 210, discussed below, which in turn points either to the loop top 204 (normal execution) or to another point (diverted execution).
More generally, as used herein, a "loop" includes a loop laid out using the following syntactic instruction sequence format, with optional items shown in square brackets:
[Figure: syntactic instruction sequence format of a loop; image not reproduced in this text]
As used herein, "loop" also includes the loops given herein as pseudocode or illustrated examples, and any loop which is functionally equivalent to a loop having the above syntactic instruction sequence format and/or functionally equivalent to a loop in the pseudocode or illustrated examples given herein. Although terms such as "top" and "back" are sometimes used herein, the general direction of a jump to or through an indirection cell 210 is not limited to backward jumps or to jumps to a location at the top of an instruction sequence. Divertible loops can also be implemented using jumps in other directions.
As some pseudocode examples of loops, and without excluding other examples of non-sequential code sequences that can be diverted using the indirection cells 210 and other mechanisms provided herein, each of the following pseudocode items represents a loop that can be diverted as discussed herein:
[Figure: pseudocode examples of divertible loops; image not reproduced in this text]
In a given embodiment, the processor 110 may have a load-store architecture, a complex instruction set computer architecture, or some other architecture. In particular, the use herein of terms such as "reference" and "instruction sequence" does not rule out a load that is separate from the branch when the loop is executed.
When a divertible loop 200 is diverted, the content of the indirection cell 210 is altered. The original content of the cell 210, that is, the loop top address 206, may have been stored previously, possibly in encoded form, in an original value storage location 212. The indirection cell 210 is altered to point to redirection target code 214, which uses identifier(s) 224 to mark the alteration and then passes control to a diversion destination 216, such as a synchronization point 218 at which thread(s) 124 synchronize with the kernel 134. At the diversion destination, control may be given to a redirection handling routine 220; for instance, control may be given to a kernel garbage collection 132 routine 136 that relies on garbage collector liveness information 226. The redirection handling routine 220 restores the original address(es) from the storage location(s) 212 into the indirection cell(s) 210, and normal loop execution is allowed to proceed. Redirection target code 214 may be arranged in bundles 222 to reduce code size.
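The order of events in the paragraph above can be traced with a deterministic, single-threaded model. This is a sketch under stated assumptions: integer constants stand in for addresses, the `Machine` class and its method names are hypothetical, and the diversion is triggered from inside the loop only to keep the trace reproducible (in the described system another thread alters the cell).

```python
LOOP_TOP = 100       # hypothetical address of the loop top 204
REDIRECT_STUB = 900  # hypothetical address of the redirection target code 214


class Machine:
    def __init__(self):
        self.cells = {0: LOOP_TOP}   # indirection cell 210, keyed by identifier
        self.saved = {0: LOOP_TOP}   # original value storage location 212
        self.trace = []              # order in which diversion code ran

    def divert(self, cell_id):
        # A single store redirects the next jump through this cell.
        self.cells[cell_id] = REDIRECT_STUB

    def redirect_stub(self, cell_id):
        # Redirection target code: prepare the identifier, then pass control on.
        self.trace.append(("stub", cell_id))
        return self.handle_redirect(cell_id)

    def handle_redirect(self, cell_id):
        # Redirection handling routine: do the work, then restore the cell.
        self.trace.append(("handler", cell_id))
        self.cells[cell_id] = self.saved[cell_id]
        return self.cells[cell_id]

    def run_loop(self, iterations, divert_at):
        reached = []
        for i in range(iterations):
            if i == divert_at:
                self.divert(0)
            target = self.cells[0]           # load the cell content
            if target == REDIRECT_STUB:      # the jump lands on the stub
                target = self.redirect_stub(0)
            reached.append(target)           # address where the iteration resumes
        return reached
```

Every iteration ends up at the loop top; the diverted iteration simply takes a detour through the stub and the handler first.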
With reference to Figs. 1 and 2, some embodiments provide a computer system 102 with a logical processor 110 and a storage medium 112 configured by circuitry, firmware, and/or software to transform loop structures by installing indirect units and otherwise controlling loop transfer operations as described herein. The mechanism can be used with any finite number of loops and any finite number of threads.
For example, one embodiment relating to the structure of generated code includes a computer system 102 having a logical processor 110 and a memory in operable communication with the logical processor. An executable module 122 resides in the memory and thereby configures it. The executable module contains a plurality of loops 200, each having a respective loop body 202 and a respective loop top 204. The module 122 also contains a plurality of indirect units 210, each of which at some time contains a respective loop top address 206, that is, an address pointing to the respective loop top 204. Each loop 200 also includes a respective loop jump instruction sequence 208, which includes a jump to the address contained in the respective indirect unit 210. A plurality of respective original value storage locations 212 is also present in the system. Each original value storage location 212 is sized to hold a representation of the respective loop top address 206. The representation may be an encoding of the address (for example, a compressed version), or it may be the unencoded address. The system also includes a plurality of respective pieces of redirection target code 214, each of which upon execution passes control to a single shared redirection processing routine 220. That is, all of the redirection target code passes control to the same redirection processing routine 220. For example, all loop transfers can lead to the same garbage collector 132.
In some embodiments, loop control transfer is transparent to the loop, providing a result similar to thread suspension. The loop's code can be generated as if the indirect unit 210 always points to the original (loop top) address. As a result, the redirection target code 214 and/or redirection processing routine 220 to which control is transferred preserves the processor state (execution register values) in a way that allows the VES or other kernel to reconstruct the state of the thread at the transfer point, and re-establishes the original state (or a semantically equivalent one) before the loop resumes. Thus, some embodiments include code which, when executed, tracks/marks for the virtual execution system 138 which indirect units 210 are to be modified. In some embodiments, the virtual execution system 138 tracks which indirect units are modified. For example, indirect unit modification tracking may be done by a virtual execution system 138 thread that is attempting to gain control of other threads.
In some embodiments, threads are a matter of execution rather than of the structure of code in memory, in the sense that the structure of the generated code does not control how many threads can execute that code. Threads 124 and loops 200 need not be mapped one to one. Each thread may be in the process of executing zero or more loops, and each loop may be executing on zero or more threads. The transfer mechanisms discussed herein can be used with any number of threads and any number of loops.
In some embodiments, loop transfer includes synchronizing thread(s) 124 with the kernel 134 in the absence of fully interruptible loop code. That is, the VES does not have enough information to establish a thread's complete managed execution state at every instruction inside the loop. Instead, the VES only has enough information to establish the thread's complete execution state when the thread executes the "jump through indirect unit" instruction. Accordingly, in some embodiments loop transfer includes synchronizing a thread with the entity controlling the execution environment at a location that allows the execution environment to establish the thread's complete execution state.
The mechanism can be used to control/transfer any number of threads, including threads within a managed execution scope, such as VES threads. In some embodiments, the virtual execution system 138 includes code that can use one thread to modify indirect units 210 and thereby transfer loops 200 of other threads to a redirection processing routine 220 specified by the virtual execution system.
In some embodiments, the redirection processing routine 220 includes code which upon execution performs a garbage collection synchronization operation. However, loop transfer can also be performed for purposes other than, or in addition to, garbage collection, such as loop synchronization, pausing execution to inspect and/or archive memory contents, collecting software usage telemetry, and so on.
In some embodiments, assembly/intermediate language thunks are examples of redirection target code 214. Thunks can have a specific compact structure of instruction sequences. One arrangement of thunks, designed to maximize use of the small "push imm8" and "jmp rel8" instructions in bundles 222, is discussed in detail below in the part titled "Other examples".
In some embodiments, the redirection target thunk itself loads the identifier 224 of the corresponding indirect unit 210, such as an indirect unit array index and a module ID. The thunk then jumps to a common routine 220. The common routine can synchronize with the garbage collector 132, wait for the garbage collector to finish, use the indirect unit index to look up the loop top address 206, and then jump to that address. The garbage collector code itself restores the indirect units 210, consulting a bitmap or other tracking structure to see which units 210 it needs to restore to their original addresses.
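The division of labor between a thunk and the common routine can be sketched as follows. This is a minimal model under stated assumptions: addresses are strings, thunks are Python closures, and the names MODULE_ID, original_targets, make_thunk, common_routine, modify_all_units, and restore_all_units are hypothetical. Synchronization with the garbage collector is omitted, and the restoration that the text assigns to the garbage collector code is folded into the common routine for brevity.

```python
MODULE_ID = 1  # hypothetical module identifier

# Original value storage locations 212, keyed by module ID, indexed by unit.
original_targets = {MODULE_ID: ["top_A", "top_B", "top_C"]}

# Indirect units 210: initially each holds its loop top address 206.
indirect_units = list(original_targets[MODULE_ID])


def restore_all_units(module_id):
    """Put every loop top address back into its indirect unit."""
    indirect_units[:] = original_targets[module_id]


def common_routine(module_id, unit_index):
    """Stand-in for the common routine 220 shared by all thunks."""
    # Synchronizing with, and waiting for, the garbage collector is omitted.
    # In the text the collector restores the units; it is folded in here.
    restore_all_units(module_id)
    # Look up where the transferred loop should resume.
    return original_targets[module_id][unit_index]


def make_thunk(module_id, unit_index):
    """Build a thunk that carries only its identifier 224."""
    def thunk():
        return common_routine(module_id, unit_index)
    return thunk


def modify_all_units(module_id):
    """Point every indirect unit at its own redirection target thunk."""
    for i in range(len(indirect_units)):
        indirect_units[i] = make_thunk(module_id, i)


modify_all_units(MODULE_ID)
resume_address = indirect_units[1]()  # a thread's back-jump hits unit 1
```

Because each thunk carries nothing but an identifier, all of the shared work stays in one routine, which is consistent with keeping the per-loop code small.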
In some embodiments, code creation is split between two tools, namely a compiler and a binder. The compiler produces loops 126 having loop bodies 202, and the binder modifies the loops 126 to create transferable loops 200 having an indirect jump through a unit 210. The binder need not impose requirements on the compiler's generation of loop body code. That is, the loop redirection technique described can be transparent to the code generator 130 that produces the loop body code. In some embodiments, neither the compiler nor the binder imposes loop alignment requirements; in other embodiments, loop alignment may be enforced by an intermediate language code generator, the binder, and/or another tool.
In some embodiments, the code that restores the indirect units 210 is internal to the runtime environment, and thus is not generated by the compiler as part of the application. The compiler generates redirection target code 214, which does not restore the indirect unit but instead indicates where to jump back to (the loop top) for this particular thread.
In some embodiments, including some consistent with Figs. 4 and 5, the compiler generates a modified back-jump for loops in methods 402 that it identifies as potentially long-running. The modified back-jump code loads the indirect unit 210 and jumps to the code address specified in the indirect unit. The x86 processor architecture has a single indirect jump instruction that can be used for this purpose. These are the only changes to the body of the code in the loop. The code changes are smaller than in a "check global state" scheme, but larger than in a "fully interruptible code" scheme, which requires no change to the loop body.
In these particular embodiments, the compiler emits garbage collector liveness information 226 for the loop target location (that is, the first instruction in the loop body). As a result, loop execution performance is slightly affected by the additional code, but far less additional garbage collector liveness information is required than with fully interruptible code (which affects overall virtual execution system performance). Moreover, the code change allows a thread to rendezvous with the VES on its own, whereas fully interruptible code needs special operating system support to allow one thread to forcibly redirect the execution of another thread. This makes fully interruptible code difficult to use on operating systems that lack such support.
In these particular embodiments, the compiler also emits, for each modified loop, an additional redirection target thunk 502, which serves as the target address of the indirect unit when the VES attempts to gain control of a thread executing this code. The thunk 502 is an example of redirection target code 214; it marks which loop the original loop target address corresponds to and then rendezvouses with the VES.
The VES is assumed to be able to identify which method 402 a given thread 124 is executing in. Once it determines that a given thread is executing in one or more methods containing such loops, it changes the indirect unit(s) 210 associated with those methods from pointing to their respective loop tops 204 to pointing to their respective redirection target thunks 502. The VES may, as it sees fit, change the indirect unit(s) for only one loop 200 in a method 402, for all loops 200 in the method 402, or for all loops 200 in the module 122, for example. In a fully interruptible scheme, by contrast, the VES uses special operating system support to forcibly suspend and modify the context state of another thread. Such support is not available in all operating systems, for example Apple's Mac OS operating system.
In these embodiments, before resuming normal execution, the VES restores the original content of every indirect unit it changed. The VES keeps track of which indirect units it changed during thread suspension, and then uses the indexes of those units to look up their original values in a lookup table or in other locations 212 stored in the module by the compiler.
The redirection target thunks 502 push their corresponding loop index onto the stack and then jump to a common rendezvous routine 220 provided by the VES. The rendezvous routine saves the thread's register state in a place where the garbage collector 132 can find it and then waits for the garbage collector to finish. Before resuming execution, the routine 220 looks up the original loop target address (corresponding to the "loop index" pushed by the thunk), restores the register state, and then jumps back to the top of the loop body.
In some embodiments, peripherals 106 such as human user I/O devices (screen, keyboard, mouse, tablet, microphone, speaker, motion sensor, etc.) will be present in operable communication with one or more processors 110 and memory. However, an embodiment may also be deeply embedded in a system, such that no human user 104 interacts directly with the embodiment. Software processes may be users 104.
In some embodiments, the system includes multiple computers connected by a network. Networking interface equipment can provide access to networks 108 using components such as a packet-switched network interface card, a wireless transceiver, or a telephone network interface, for example, which will be present in a computer system. However, an embodiment may also communicate through direct memory access, removable nonvolatile media, or other information storage-retrieval and/or transmission approaches, or an embodiment in a computer system may operate without communicating with other computer systems.
Process
Fig. 3 illustrates some process embodiments in a flowchart 300. Processes shown in the figures may in some embodiments be performed automatically, for example by a code generator 130, kernel 134, loops 200, and/or loop transfer mechanisms as shown in Fig. 1 and/or Fig. 2, under the control of a script that requires little or no user input. Processes may also be performed in part automatically and in part manually, unless otherwise indicated. In a given embodiment, zero or more of the illustrated steps of a process may be repeated, perhaps with different parameters or data to operate on. Steps in an embodiment may also be done in a different order than the top-to-bottom order laid out in Fig. 3. Steps may be performed serially, in a partially overlapping manner, or fully in parallel. The order in which flowchart 300 is traversed to indicate the steps performed during a process may vary from one performance of the process to another. The flowchart traversal order may also vary from one process embodiment to another. Steps may also be omitted, combined, renamed, regrouped, or otherwise depart from the illustrated flow, provided that the process performed is operable and conforms to at least one claim.
Each example provided herein helps illustrate aspects of the technology, but the examples given herein do not describe all possible embodiments. Embodiments are not limited to the specific implementations, arrangements, displays, features, approaches, or scenarios provided herein. A given embodiment may include additional or different features, mechanisms, and/or data structures, for instance, and may otherwise depart from the examples provided herein.
During a loop providing step 302, an embodiment provides a loop from which a transferable loop 200 will be formed. For example, a loop 126 may be generated by a compiler, or a loop may be read from a network 108 connection or from a local storage medium 112, to accomplish step 302.
During an associating step 304, an embodiment associates an indirect unit 210 with the loop provided during step 302. For example, after a loop has been identified as long-running, the binder may modify the jump sequence 208 so that it passes control through a pointer initialized with the loop top address; that pointer serves as the indirect unit associated with the loop by virtue of the loop top address stored in it. Associating 304 may include allocating storage, for example inside a data region of the module or inside the runtime environment.
During a designating step 306, an embodiment designates a location 212 in which the original value (loop top address) of the associated indirect unit can be stored, to allow a later return to normal execution after the loop has been transferred. The storage location may be the same size as the indirect unit, or it may be smaller than the indirect unit if the loop top address is compressed. Designating 306 may include allocating storage, for example inside a data region of the module or inside the runtime environment.
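To illustrate how an original value storage location can be smaller than the indirect unit, the following sketch stores a 64-bit loop top address as a 4-byte offset from a module base address. This encoding is an assumption chosen for illustration only; it is not the compressed format used by any embodiment, and MODULE_BASE, encode_loop_top, and decode_loop_top are hypothetical names.

```python
import struct

MODULE_BASE = 0x7FF600000000  # hypothetical module load address


def encode_loop_top(address):
    """Encode a 64-bit loop top address 206 as a 4-byte offset."""
    # struct.pack raises struct.error if the offset does not fit in 32 bits.
    return struct.pack("<I", address - MODULE_BASE)


def decode_loop_top(stored):
    """Recover the full address from the 4-byte stored representation."""
    (offset,) = struct.unpack("<I", stored)
    return MODULE_BASE + offset


loop_top = MODULE_BASE + 0x1234
stored = encode_loop_top(loop_top)
```

Under this assumption the storage location 212 occupies half the space of an 8-byte indirect unit, at the cost of one addition when the address is restored.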
During a code emitting step 308, an embodiment emits redirection target code 214. In particular, a code generator may emit 308 code 214 which upon execution passes an indirect unit identifier 224 to a designated kernel routine 220, so that the original value of the unit 210 can be restored when normal execution of the loop is again desired.
During a garbage collection information output step 310, an embodiment outputs garbage collector 132 liveness information, for example for the loop top or for the redirection target code.
Before continuing the tour of the steps shown in Fig. 3, consider now some embodiments that use the steps described above, and the relationships among those steps. Some embodiments provide a process, performed by a code generator such as a compiler, a binder, or a combination of both, which facilitates loop control flow transfer. The process includes providing 302 a loop having a loop body and a loop top, as discussed above. An indirect unit 210 is associated 304 with the loop top, such that the loop's jump instruction sequence 208 includes a jump to the address contained in the indirect unit. An original value storage location 212 is designated 306, sized to hold a representation of the loop top address. Redirection target code 214 is emitted 308 by the code generator. Upon execution, the redirection target code 214 will determine an identifier corresponding to the respective indirect unit and then pass control to the redirection processing routine 220.
In some embodiments, the code generator also outputs 310 familiar garbage collector liveness information 226 for the loop top.
Although some of the above description is in terms of one loop with one indirect unit, in some embodiments the code generator associates 304 multiple indirect units with multiple respective loop tops, designates 306 multiple respective original value storage locations, and emits 308 multiple respective pieces of redirection target code. In some embodiments, designating 306 an original value storage location sized to hold a representation of the loop top address involves the code generator designating, for at least some of the loop top addresses, an original value storage location 212 that is smaller than the indirect unit 210, because the address will be compressed.
Returning to the tour of Fig. 3 steps, during an obtaining step 312 an embodiment obtains an executable module 122 containing transferable loop(s) 200, for example by receiving the module over a network link or by loading it from local storage.
During a loop body executing step 314, an embodiment executes the body of a loop 200; that is, the loop body controls the behavior of the processor 110.
During a continuing step 316, execution continues through the indirect unit 210 to the top 204 of the loop 200. Steps 314 and 316 together provide an example of normal execution of a transferable loop.
During an address changing step 318, the content of the indirect unit 210 associated with a loop 200 is changed so that the indirect unit no longer points to the top of that loop 200.
During a transferring step 320, execution is transferred through the changed 318 indirect unit to a location that is not the loop top.
During a restoring step 322, the content of the indirect unit is restored so that the indirect unit again points to the top of the loop 200.
Leaving the tour of Fig. 3 once more, the above steps and their relationships are discussed in more detail below in connection with various embodiments. The remaining steps in Fig. 3 are also discussed.
Some embodiments provide a process for loop control flow transfer. The process uses a device having at least one logical processor 110 in operable communication with at least one storage medium 112. The process includes obtaining 312 in memory an executable module 122 containing a loop 200 having a loop body 202 and a loop top 204. The module 122 also contains an indirect unit 210, which contains a loop top address 206, that is, an address pointing to the loop top 204. The loop 200 also includes a loop jump instruction sequence 208 which references the indirect unit; as noted, a "reference" to the indirect unit can be implemented in different ways on different processor architectures.
After the module is obtained, the process includes a first thread 124 executing 314 one iteration of the loop body. The address contained in the indirect unit is loaded into the processor 110, and the flow of execution continues 316 to the loop top 204 through the address specified in the indirect unit 210. Thus far execution is normal, except that, unlike a standard loop 126, control is subject to the content of the indirect unit.
At some point, however, possibly after further executions of the loop body, a second thread changes 318 the content of the indirect unit so that the indirect unit contains an address that is not the loop top address. The indirect unit can be filled with, for example, the address of separate redirection target code 214 or the address of the redirection processing routine 220. This change transfers 320 the flow of execution of the first thread, through the modified indirect unit, to a point away from the loop top, generally referred to herein as the transfer destination 216. A synchronization point 218 is an example of a transfer destination 216, but as noted, transfers can be made for reasons other than synchronization. In some embodiments, the first thread and the second thread are each managed by a virtual execution system 138.
Subsequently, the redirection processing routine 220 restores 322 the loop top address into the indirect unit. Execution of the first thread again continues 316 to the loop top through the restored indirect unit, thereby resuming normal execution.
An embodiment may use 324 more than one loop 200, and/or use 326 more than one thread. In some embodiments, for example, one of the thread(s) executes multiple loops 200. The indirect units are modified 318 to hijack all of the loops, but it may not be known which loop will be the one that actually sends the thread to the transfer destination. That is, an embodiment sets all the indirect units, one of them is used by the thread, and then all of the indirect units are restored 322. In particular, suppose the loop introduced above is a first loop 200 and the first thread also executes a second loop 200. The second loop has a respective second loop body and a respective second loop top, the module also contains a second indirect unit containing a second loop top address pointing to the second loop top, and the second loop also includes a second loop jump instruction sequence which references the second indirect unit. The process transfers 320 the flow of execution of the first thread, through the modified indirect unit of one of the loops, to a point away from the top of that loop.
In some embodiments that use 326 more than one thread 124, all threads executing a loop 200 use the same indirect unit 210. That is, indirect units are associated 304 on a per-loop basis, not on a per-thread basis. Multiple threads execute the loop body, and the process transfers 320 the flow of execution of all threads executing the loop body, through the modified indirect unit, to a point 216 away from the loop top.
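The per-loop association can be illustrated with a small threading sketch in which one store to a shared indirect unit transfers every thread running the loop. This is a simplified model, not an embodiment: the unit is a dictionary entry, the transfer destination is a barrier, and the names unit, loop_thread, arrived, all_arrived, and resume are hypothetical. For the demonstration each loop thread exits after being resumed, whereas a real loop would continue at the loop top.

```python
import threading

unit = {"target": "loop_top"}        # one indirect unit 210 shared by all threads
arrived = []                         # threads that reached the transfer destination
arrived_lock = threading.Lock()
all_arrived = threading.Barrier(3)   # two loop threads plus the controlling thread
resume = threading.Event()


def loop_thread(name):
    while True:
        # ... loop body 202 would run here ...
        if unit["target"] != "loop_top":   # back-jump through the shared unit
            with arrived_lock:
                arrived.append(name)
            all_arrived.wait(timeout=10)   # synchronization point 218
            resume.wait(timeout=10)
            return  # a real loop would continue at the loop top instead


threads = [threading.Thread(target=loop_thread, args=(n,)) for n in ("t1", "t2")]
for t in threads:
    t.start()

unit["target"] = "sync_point"        # change 318: one store transfers both threads
all_arrived.wait(timeout=10)         # both threads are now at the sync point
unit["target"] = "loop_top"          # restore 322
resume.set()
for t in threads:
    t.join()
```

The controlling thread never suspends or inspects the loop threads directly; each loop thread arrives at the synchronization point through its own back-jump.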
In some embodiments that use 324 more than one loop, the memory contains a collection (for example, a table, array, or block) of indirect units 210 allocated during code generation for a respective plurality of loops 200, and the process includes marking 332 which indirect unit(s) are modified 318. Note that in some embodiments, as discussed below in connection with bitmaps and address groups, marking the modified units allows the original unit values to be restored more efficiently, but restoring only the modified units is not necessarily the most efficient approach.
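One way to mark modified units is a bitmap with one bit per indirect unit, sketched below. The addresses, THUNK_BASE, and the function names are hypothetical values chosen for illustration, and a real tracking structure may differ.

```python
original_values = [0x1000, 0x1040, 0x10A0, 0x1100]  # hypothetical loop top addresses
indirect_units = list(original_values)              # indirect units 210
THUNK_BASE = 0x9000                                 # hypothetical thunk area
modified_bitmap = 0                                 # bit i set means unit i was modified


def modify_unit(index):
    """Redirect one unit and mark 332 it in the bitmap."""
    global modified_bitmap
    indirect_units[index] = THUNK_BASE + index
    modified_bitmap |= 1 << index


def restore_modified_units():
    """Restore only the marked units; return how many were restored."""
    global modified_bitmap
    restored = 0
    for index in range(len(indirect_units)):
        if modified_bitmap & (1 << index):
            indirect_units[index] = original_values[index]
            restored += 1
    modified_bitmap = 0
    return restored


modify_unit(1)
modify_unit(3)
restored_count = restore_modified_units()
```

When most units in a table have been modified, rewriting the whole table in one pass may be cheaper than testing each bit, which is one reason restoring only marked units is not always the most efficient choice.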
In some embodiments, the step of changing 318 the indirect unit (so that it contains an address that is not the loop top address) is atomic with respect to all threads executing on the device.
In some embodiments, the transferring step transfers 320 the flow of execution through the modified indirect unit to a synchronization point 218. In some cases, the process synchronizes 328 the transferred thread with the kernel 134 at the synchronization point. In some embodiments, the process performs 330 a garbage collection synchronization operation. In some embodiments, synchronizing 328 a thread with the kernel implicitly avoids 334 reliance on operating system support for forcibly modifying one thread's execution context from another thread, because the loop transfer occurs in the absence of such support.
In some embodiments, processor register 338 usage is preserved, based on the following considerations. Assume the loop body has a bottom. In these embodiments, the steps of transferring 320 the flow of execution through the modified indirect unit to a point away from the loop top, and of restoring 322 the loop top address into the indirect unit, are performed without any restriction on which registers 338 remain valid across the transition from the loop body bottom to the loop top. That is, these embodiments avoid 336 placing restrictions on register usage in order to accommodate transfers.
Configured medium
Some embodiments include a configured computer-readable storage medium 112. The medium 112 may include disks (magnetic, optical, or otherwise), RAM, EEPROMs or other ROMs, and/or other configurable memory, including in particular non-transitory computer-readable media (as opposed to wires and other propagated signal media). The configured storage medium may in particular be a removable storage medium 114 such as a CD, DVD, or flash memory. A general-purpose memory, which may be removable or not and may be volatile or not, can be configured into an embodiment using items such as indirect units 210, transferable loop jump sequences 208, and redirection target code 214, in the form of data 118 and instructions 116 read from a removable medium 114 and/or another source such as a network connection, to form a configured medium. The configured medium 112 is capable of causing a computer system to perform process steps for transforming data through loop transfer as disclosed herein. Figs. 1 through 3 thus help illustrate configured storage medium embodiments and process embodiments, as well as system and process embodiments. In particular, any of the process steps illustrated in Fig. 3, or otherwise taught herein, may be used to help configure a storage medium to form a configured medium embodiment.
Other examples
The following subsections provide additional details and design considerations. As with the other examples herein, the features described may be used individually and/or in combination, or not used at all, in a given embodiment.
Those of skill will understand that implementation details may pertain to specific code, such as specific APIs and specific sample programs, and thus need not appear in every embodiment. Those of skill will also understand that program identifiers and some other terminology used in discussing details are implementation-specific and thus need not pertain to every embodiment. Nonetheless, although they are not necessarily required to be present here, these details are provided because they may help some readers by providing context and/or may illustrate a few of the many possible implementations of the technology discussed herein.
In some embodiments, loop transfer can be viewed as a form of loop hijacking, similar in some respects to thread hijacking but differing from it in mechanism, assumptions, alternatives, and other respects.
With the foregoing in mind, consider now a brief overview of the problem addressed by loop hijacking, and a brief description of the Microsoft Common Language Runtime solution to that problem and of alternatives. This description is not intended to be a complete history or a complete analysis of the problem space. Readers familiar with Common Language Runtime (CLR) garbage collector information may skip ahead to the loop hijacking overview below.
Brief overview of garbage collector (GC) suspension
Like the CLR, some target systems stop all threads before performing a garbage collection. Such systems face the same thorny sub-problem the CLR must deal with, namely, how to make tight loops interruptible in a timely manner. If this problem is not solved, a tight loop running on one thread can starve other threads by preventing further allocation on those threads.
In this discussion, it is taken for granted that return address hijacking is possible and that exception handling honors those hijacks. With return address hijacking, every call return becomes a point at which we can gain control of the thread. A thread cannot recurse indefinitely without overflowing its stack, which leaves call-free (also called "tight" or "long-running") loops as the places where a thread can run for a long time without our having any way to interrupt it. Naively, we could simply insert into these loops code that checks whether the GC wants to run and, if so, stops the thread's execution. But because most call-free loops turn out to be very small, this check ends up being costly relative to the size of the loop. Most of the time, because the GC runs only occasionally, all of this checking whether the GC wants to run is wasted effort. In other words, this scheme has the undesirable characteristic of being costly when it is not needed.
Ideally, we would not change the code in the loop at all, but would arrange some way to describe "side" data for such loops, so that we can gain control of the loop and still have complete knowledge of all live object references (especially enregistered ones) that the loop may be manipulating. Thus was born "fully interruptible code", the scheme used by the CLR.
Fully interruptible code
As used herein, "fully interruptible code" refers to code for which we record, at each instruction offset, the locations that hold object references. This record is called fully interruptible GC information. In theory it has the right characteristics: the information is needed only for loops identified as call-free, and normal execution of the loop is not affected by the GC's need to stop it. Fully interruptible GC information is inherently very verbose, but it only needs to be consulted at GC time.
Loop hijacking overview
In some respects, this approach resembles the naive polling check described above. It adds code to the loops of interest, and that code allows the GC to gain control of those loops. It therefore still has the undesirable characteristic of adding cost to normal execution (all the time) in order to handle the rare case of a GC suspension during the loop. But the cost of the proposed code change can be small. As a bonus, the approach has better image size characteristics and better GC-time performance than the fully interruptible code of the current CLR.
The basic idea is to allow hijacking of the loop's back-jump, in a manner reminiscent of the way we hijack return addresses on the stack. We transform the back-jump in this way:
[Figure BSA00000454397900201: the back-jump transformation (image not reproduced)]
This transformation adds 6 bytes of code in the worst case. Each jump target has its own indirection cell 210 in the writable data section of the module 122 image and an entry in a jump target table referenced from the read-only section of the image. These cells 210 are sorted by target address 206, so that when the kernel runtime environment sees that a thread is executing in a particular method 402, it can use the address range of that method to find all the jump targets within the method and set them to the address of some code 214, 502 that will cause the thread to synchronize with the GC.
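The address-range lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the runtime's implementation: the function names `cells_in_method` and `hijack_method`, the use of Python lists to stand in for the cell array and the sorted target table, and all example addresses are hypothetical.

```python
from bisect import bisect_left


def cells_in_method(sorted_targets, method_start, method_end):
    """Return the half-open index range [lo, hi) of cells whose original
    target address lies in [method_start, method_end).

    sorted_targets must be in ascending order, mirroring the statement
    that the cells are sorted by target address.
    """
    lo = bisect_left(sorted_targets, method_start)
    hi = bisect_left(sorted_targets, method_end)
    return lo, hi


def hijack_method(cells, sorted_targets, method_start, method_end, sync_address):
    """Point every cell belonging to one method at sync_address.

    Returns the list of cell indices that were altered.
    """
    lo, hi = cells_in_method(sorted_targets, method_start, method_end)
    for i in range(lo, hi):
        cells[i] = sync_address
    return list(range(lo, hi))


# Hypothetical data: five loop tops; the method of interest spans 0x2000-0x2FFF.
targets = [0x1000, 0x1040, 0x2010, 0x2080, 0x3000]
cells = list(targets)  # each cell initially holds its own loop top address
touched = hijack_method(cells, targets, 0x2000, 0x3000, 0xF000)
```

Because the table is sorted, the lookup is two binary searches rather than a scan of every cell in the module.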
As with return address hijacking, a thread may call another function and never hit the loop hijack, so the GC thread can continue to move these hijacks deeper into the call stack in order to catch the thread sooner, in the same manner as it does with return address hijacking. This is discussed in detail in the GC suspension synchronization section.
A target indirection cell should be able to return to its original jump target after it has been hijacked. However, loop hijacking is not as easy as return address hijacking, because we do not know which loop the thread is running in. We therefore hijack (divert 320) all loops in the method, and the thread can hit any one of them. Consequently, the thread itself determines, through its own execution, where it was headed before the hijack.
We also insert some thunks 502 to serve as the slow-path targets of the loop back-jumps. These thunks compute data that we can use to restore 322 the original jump target. In some embodiments, the thunks are grouped together in bundles 222, as described in the redirected target thunk code section.
The indirection cell 210 index computed by the redirected target thunk is used in the identifier 224 to look up the original indirection cell value in the original target table (an example of an original value storage location 212). In some embodiments, the original target table contains compressed addresses 206, that is, encodings that represent addresses 206. This compressed format is described in the original target table format section.
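The interplay of indirection cells, thunks, and the original target table can be illustrated with a toy simulation. This is only a sketch under stated assumptions: addresses are modeled as plain integers, `THUNK_BASE`, `Module`, and `back_jump` are hypothetical names, and a thunk for cell i is assumed to live at `THUNK_BASE + i` so that it can identify its own cell.

```python
THUNK_BASE = 1_000_000  # hypothetical: thunk "addresses" start above all loop tops


class Module:
    """Toy model of a module's loop indirection cells (not the real layout)."""

    def __init__(self, loop_tops):
        # Stand-in for the read-only original target table.
        self.original_targets = tuple(loop_tops)
        # Writable indirection cells; each initially holds its loop top address.
        self.cells = list(loop_tops)

    def hijack(self, indices):
        # Redirect each listed cell to its own thunk.
        for i in indices:
            self.cells[i] = THUNK_BASE + i

    def restore_all(self):
        # Put every original loop top address back into its cell.
        self.cells = list(self.original_targets)


def back_jump(module, cell_index, sync_log):
    """Simulate one loop back-jump through an indirection cell.

    Returns the address at which the thread resumes execution.
    """
    target = module.cells[cell_index]
    if target < THUNK_BASE:
        return target                      # fast path: cell still holds the loop top
    thunk_index = target - THUNK_BASE      # the thunk identifies its own cell
    sync_log.append(thunk_index)           # stand-in for synchronizing with the GC
    return module.original_targets[thunk_index]
```

In this model the thread does not need to be told which loop it was in; the thunk it lands on supplies the index, and the original target table supplies the address to resume at.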
In some embodiments, familiar GC liveness information 226 is computed and stored for jump target addresses. In some embodiments that use an intermediate language (in which offsets and addresses are represented symbolically in the compiler-generated code and are later resolved to numeric values by a binder that generates executable code), a GC_PROBE intermediate-language pseudo-instruction notifies the binder that this back-jump transformation should be performed at the jump target.
A downside of this mechanism is the introduction of private pages for the modified jump target indirection cells 210. However, with the cells tightly packed, the cost is considered acceptable. If these private pages become a problem, further techniques can be pursued, such as reducing the number of these hijackable jumps in the generated code.
Redirected target thunk code
The process of hijacking a loop 200 involves setting (altering 318) the indirection cell 210 that corresponds to the loop to a new value. The new value is the address of a thunk 502, which passes an identifier 224 (in the form of an index and a module handle) to a common runtime routine 220 that waits for the GC to complete. In some embodiments, these thunks are grouped together in bundles 222, and four bundles are then grouped to make a chunk. This bundling arrangement allows us to use "push imm8" and "jmp rel8" instructions in the per-indirection-cell code, which allows the thunks to be very small.
In some embodiments, the redirected target code 214 format accordingly looks like the following code. Labels have been shortened here to promote legibility while conforming with patent document requirements; "cc" means "chunk_common", "ci0" means "chunk_index_0", "b0c" means "bundle0_common", and so on:
Chunk code
cc:   68 xx xx xx xx     push imm32   (chunk start index)
      68 xx xx xx xx     push imm32   (module identifier)
      FF 25 xx xx xx xx  jmp [_imp_CommonLoopHijackHelper]
Bundle 0
ci0:  6A 00              push 0
      EB 7C              jmp b0c
ci1:  6A 01              push 1
      EB 78              jmp b0c
...
ci31: 6A 1F              push 31
      EB 00              jmp b0c
b0c:  E9 xx xx xx xx     jmp cc
ci32: 6A 20              push 32
      EB F9              jmp b0c
ci33: 6A 21              push 33
      EB F5              jmp b0c
...
ci62: 6A 3E              push 62
      EB 81              jmp b0c
Bundle 1
[chunk indexes 63-125]
Bundle 2
[chunk indexes 126-188]
Bundle 3
[chunk indexes 189-251]
This bundling arrangement allows us to pack 252 thunks into 1033 bytes. CommonLoopHijackHelper combines the chunk-specific index with the chunk start index to obtain the index into the indirection cell array of the given module. This index, in turn, allows the common code routine 220 to return the thread to the loop it came from after the GC completes.
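The index bookkeeping implied by this layout can be sketched as follows. The sketch rests on assumptions that the text does not state: that the helper "combines" the two pushed values by simple addition, and that chunk k starts at cell index 252 * k. The function names `cell_index` and `locate_thunk` are hypothetical.

```python
THUNKS_PER_BUNDLE = 63   # chunk indexes 0-62 in bundle 0, 63-125 in bundle 1, ...
BUNDLES_PER_CHUNK = 4
THUNKS_PER_CHUNK = THUNKS_PER_BUNDLE * BUNDLES_PER_CHUNK  # 252


def cell_index(chunk_start_index, chunk_local_index):
    """Combine the two values a thunk pushes into a module-wide cell index.

    Addition is an assumption; the text only says the values are combined.
    """
    if not 0 <= chunk_local_index < THUNKS_PER_CHUNK:
        raise ValueError("chunk-local index must be in 0..251")
    return chunk_start_index + chunk_local_index


def locate_thunk(index):
    """Map a module-wide cell index to (chunk, bundle, slot within bundle),
    assuming chunks are laid out back to back starting at index 0."""
    chunk, local = divmod(index, THUNKS_PER_CHUNK)
    bundle, slot = divmod(local, THUNKS_PER_BUNDLE)
    return chunk, bundle, slot
```

For example, under these assumptions a thunk in the second chunk (start index 252) that pushes 62 identifies cell 314, which is the last slot of bundle 0 in chunk 1.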
GC suspension synchronization
As mentioned previously, it is possible for a thread to continue nesting calls while thread suspension is attempting to gain control of it. Some embodiments therefore chase the thread down its call stack, continuing to apply loop hijacks (diversions) to the new methods the thread enters, until the thread yields control.
However, complications arise from the fact that loop hijacks are applied to shared memory locations; loop indirection cells 210 are shared among all threads 124 that may run the associated loop code. In this respect loop hijacking differs from return address hijacking, which modifies thread-local storage (that is, the thread's stack). With return address hijacking, it is efficient to move the hijack from one stack location to another while chasing the thread, but this is difficult to accomplish with loop hijacking, because multiple threads can be executing in the same method while we chase only one of them deeper into the stack.
Some embodiments therefore restrict the suspension phase to turning loop hijacks on only. That is, some embodiments do not disable loop hijacks by restoring 322 indirection cell contents until all threads are synchronized with the runtime environment. All of the cells 210 are then restored at once.
One way to accomplish this uses a hijacked-cell bitmap, which is generated in the data section of the module. A given bit represents a sizeable group of loop indirection cells 210, which reduces the size of the bitmap. Whenever we wish to hijack a cell, we hijack all the cells in that cell's group and then set the corresponding bit in the bitmap to 1. We later consult the bitmap to find which cells 210 need to be reset (restored 322).
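A hijacked-cell bitmap of this kind might be sketched as follows. This is an illustrative model only: the class name `HijackedCellBitmap`, the group size, and the use of Python lists for the cells and the original targets are assumptions, and real indirection cells would be altered with atomic stores.

```python
class HijackedCellBitmap:
    """One bit per group of indirection cells; a set bit means every cell
    in that group has been redirected and must be restored later."""

    def __init__(self, cell_count, group_size):
        self.cell_count = cell_count
        self.group_size = group_size
        group_count = -(-cell_count // group_size)  # ceiling division
        self.bits = bytearray((group_count + 7) // 8)

    def _group_range(self, group):
        # Indices of the cells in one group (the last group may be short).
        start = group * self.group_size
        return range(start, min(start + self.group_size, self.cell_count))

    def hijack(self, cells, cell_index, sync_address):
        """Hijack the whole group containing cell_index and mark its bit."""
        group = cell_index // self.group_size
        for i in self._group_range(group):
            cells[i] = sync_address
        self.bits[group // 8] |= 1 << (group % 8)

    def restore_all(self, cells, original_targets):
        """Restore every cell in every marked group, then clear the bitmap."""
        for group in range(len(self.bits) * 8):
            if self.bits[group // 8] & (1 << (group % 8)):
                for i in self._group_range(group):
                    cells[i] = original_targets[i]
        for b in range(len(self.bits)):
            self.bits[b] = 0
```

The trade-off is visible in the model: a larger group size shrinks the bitmap but causes more cells than strictly necessary to be hijacked and later restored.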
Original target table format
In addition to reducing the size of the hijacked-cell bitmap, grouping the indirection cells 210 facilitates a compact encoding of the original target table (original value storage locations 212). This table format makes use of our existing compressed integer encoding. The first part of the table contains a series of variable-length unsigned integers, each representing the offset from the start of the table to the corresponding group descriptor, which holds the encoded original jump target addresses 206 of each indirection cell 210 in the group. The second part of the table is the group descriptors themselves. A group descriptor begins with a variable-length unsigned integer that is the offset from the start of the code section to the first branch target in the group. The remaining variable-length unsigned integers are delta values which, when accumulated, give the original code offset of each indirection cell in the group. For example, if a group descriptor has the series {2000, 20, 10, 30, 5, 10}, the original targets for this group are {2000, 2020, 2030, 2060, 2065, 2075}. Because of the nature of variable-length unsigned integers, smaller numbers occupy less space, so this accumulation leads to a more compact encoding.
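The delta accumulation in the example above can be reproduced with a short sketch. The functions `decode_group`, `encode_group`, and `varint_size` are hypothetical names, and the 7-bits-per-byte variable-length encoding used to estimate sizes is an assumption; the text does not specify the actual compressed integer format.

```python
from itertools import accumulate


def decode_group(descriptor):
    """Turn [first_offset, delta, delta, ...] into absolute code offsets."""
    return list(accumulate(descriptor))


def encode_group(targets):
    """Inverse of decode_group for a non-empty, ascending list of offsets."""
    return [targets[0]] + [b - a for a, b in zip(targets, targets[1:])]


def varint_size(value):
    """Bytes needed for an unsigned value at 7 payload bits per byte.

    This LEB128-style scheme is assumed for illustration only.
    """
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


descriptor = [2000, 20, 10, 30, 5, 10]
targets = decode_group(descriptor)

# Compare the encoded size of the deltas with that of the absolute offsets.
delta_bytes = sum(varint_size(v) for v in descriptor)
absolute_bytes = sum(varint_size(v) for v in targets)
```

Under the assumed encoding, the delta form of this group takes 7 bytes, whereas storing the six absolute offsets would take 12.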
Conclusion
Although particular embodiments are expressly illustrated and described herein as processes, as configured media, or as systems, it will be appreciated that discussion of one type of embodiment also generally extends to other embodiment types. For instance, the descriptions of processes in connection with Fig. 3 also help describe configured media, and help describe the operation of systems and articles of manufacture like those discussed in connection with the other figures. It does not follow that limitations from one embodiment are necessarily read into another. In particular, processes are not necessarily limited to the data structures and arrangements presented while discussing systems or articles of manufacture such as configured memories.
Not every item shown in the figures need be present in every embodiment. Conversely, an embodiment may contain items not shown expressly in the figures. Although some possibilities are illustrated here in text and drawings by specific examples, embodiments may depart from these examples. For instance, specific features of an example may be omitted, renamed, grouped differently, repeated, instantiated in hardware and/or software differently, or be a mix of features appearing in two or more of the examples. Functionality shown at one location may also be provided at a different location in some embodiments.
Reference has been made to the figures throughout by reference numerals. Any apparent inconsistencies in the phrasing associated with a given reference numeral, in the figures or in the text, should be understood as simply broadening the scope of what is referenced by that numeral.
As used herein, terms such as "a" and "the" are inclusive of one or more of the indicated item or step. In particular, in the claims a reference to an item generally means that at least one such item is present, and a reference to a step means that at least one instance of the step is performed.
Headings are for convenience only; information on a given topic may be found outside the section whose heading indicates that topic.
All claims and the abstract, as filed, are part of the specification.
While exemplary embodiments have been shown in the drawings and described above, it will be apparent to those of ordinary skill in the art that numerous modifications can be made without departing from the principles and concepts set forth in the claims. Although the subject matter is described in language specific to structural features and/or procedural acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. It is not necessary for every means or aspect identified in a given definition or example to be present or to be utilized in every embodiment. Rather, the specific features and acts described are disclosed as examples for consideration when implementing the claims.
All changes which come within the meaning and range of equivalency of the claims are to be embraced within the scope of the claims.

Claims (15)

1. A process for loop control flow diversion, the process utilizing a device which has at least one logical processor in operable communication with at least one memory, the logical processor having at least one register, the process comprising the steps of:
obtaining (312) in the memory an executable module, the executable module containing a loop having a loop body and a loop top, the module also containing an indirection cell which contains a loop top address, namely, an address pointing to the loop top, the loop also containing a loop jump instruction sequence which references the indirection cell;
a first thread of execution executing (314) an iteration of the loop body;
continuing (316) execution flow of the first thread through the address specified in the indirection cell to the loop top;
a second thread of execution altering (318) the content of the indirection cell so that the indirection cell contains an address that is not the loop top address;
diverting (320) execution flow of the first thread through the altered indirection cell to a point away from the loop top;
restoring (322) the loop top address into the indirection cell; and
again continuing execution flow of the first thread through the restored indirection cell to the loop top.
2. The process of claim 1, characterized in that at least one of the following is satisfied:
multiple threads (124) execute the loop body, and the process diverts execution flow of all threads which are executing the loop body through the altered indirection cell to a point away from the loop top;
the loop is a first loop (126), the first thread also executes a second loop, the second loop has a corresponding second loop body and a corresponding second loop top, the module also contains a second indirection cell which contains a second loop top address pointing to the second loop top, the second loop also contains a second loop jump instruction sequence which references the second indirection cell, and the process comprises diverting execution flow of the first thread through an altered indirection cell of one of the loops to a point away from the top of that loop.
3. The process of claim 1, characterized in that the step of altering (318) the indirection cell (so that the indirection cell contains an address that is not the loop top address) is atomic with respect to all threads of execution in the device.
4. The process of claim 1, characterized in that the diverting step diverts execution flow through the altered indirection cell to a synchronization point, and the process further comprises synchronizing (328) the thread with a kernel at the synchronization point.
5. The process of claim 1, characterized in that the diverting step diverts execution flow through the altered indirection cell to a synchronization point, and the process further comprises then performing (330) a garbage collection synchronization operation.
6. The process of claim 1, characterized in that the memory contains a plurality of indirection cells which were allocated during code generation for a corresponding plurality of loops, and the process further comprises marking (332) which indirection cell or cells are altered.
7. The process of claim 1, characterized in that the loop body has a bottom, and the steps of diverting execution flow through the altered indirection cell to a point away from the loop top and restoring the loop top address into the indirection cell are performed without restricting (336) which registers remain live across a continuation transition from the loop body bottom to the loop top.
8. A computer-readable non-transitory storage medium configured with data and with instructions that, when executed by at least one processor, cause the at least one processor to perform a process for facilitating loop control flow diversion, the process comprising the following steps of a code generator:
providing (302) a loop having a loop body and a loop top;
associating (304) an indirection cell with the loop top, such that a loop jump instruction sequence includes a jump to an address contained in the indirection cell;
designating (306) an original value storage location which is sized to hold a representation of the loop top address; and
emitting (308) redirected target code which upon execution will determine an identifier corresponding to the respective indirection cell and pass control to a redirect processing routine.
9. The configured medium of claim 8, characterized in that the process comprises the code generator associating (304) a plurality of indirection cells (210) with a plurality of respective loop tops, designating (306) a plurality of respective original value storage locations (212), and emitting (308) a plurality of respective redirected target codes (214).
10. The configured medium of claim 8, characterized in that the step of designating an original value storage location which is sized to hold a representation of the loop top address comprises the code generator designating (306), for at least some of the loop top addresses (206), an original value storage location which is smaller than the indirection cell.
11. A computer system comprising:
a logical processor (110);
a memory (112) in operable communication with the logical processor;
an executable module (120) residing in the memory, the executable module containing a plurality of loops (126) each having a respective loop body and a respective loop top, the module also containing a plurality of indirection cells each containing a respective loop top address, namely, an address pointing to the respective loop top, each loop also containing a respective loop jump instruction sequence that includes a jump to an address contained in the respective indirection cell;
a plurality of respective original value storage locations (212) residing in the memory, each original value storage location sized to hold a representation of a respective loop top address; and
a plurality of respective redirected target codes (214) residing in the memory, each redirected target code upon execution passing control to a single shared redirect processing routine.
12. The system of claim 11, characterized in that the system further comprises a virtual execution system (138), the virtual execution system including code which can use a thread to alter indirection cells and thereby divert loops of other threads to a redirect processing routine specified by the virtual execution system.
13. The system of claim 11, characterized in that the redirect processing routine (220) comprises code which upon execution will perform a garbage collection synchronization operation.
14. The system of claim 11, characterized in that the system further comprises code (332) which upon execution will track for a virtual execution system which indirection cells have been altered.
15. The system of claim 11, characterized in that the redirected target codes are grouped together in bundles (222).
CN201110065876.0A 2010-03-10 2011-03-09 Loop control flow diversion Active CN102193777B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US12/720,788 2010-03-10
US12/720,788 US8887142B2 (en) 2010-03-10 2010-03-10 Loop control flow diversion

Publications (2)

Publication Number Publication Date
CN102193777A (en) 2011-09-21
CN102193777B (en) 2016-01-20

Family

ID=44560944

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110065876.0A Active CN102193777B (en) 2010-03-10 2011-03-09 Loop control flow diversion

Country Status (2)

Country Link
US (1) US8887142B2 (en)
CN (1) CN102193777B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2510641A (en) * 2013-02-12 2014-08-13 F Secure Corp Detecting suspicious code injected into a process if function call return address points to suspicious memory area
US20220212100A1 (en) * 2021-01-04 2022-07-07 Microsoft Technology Licensing, Llc Systems and methods for streaming interactive applications

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6178499B1 (en) * 1997-12-31 2001-01-23 Texas Instruments Incorporated Interruptable multiple execution unit processing during operations utilizing multiple assignment of registers
US20020112227A1 (en) * 1998-11-16 2002-08-15 Insignia Solutions, Plc. Dynamic compiler and method of compiling code to generate dominant path and to handle exceptions
CN101373427A (en) * 2007-08-24 2009-02-25 松下电器产业株式会社 Program execution control device
US20090172263A1 (en) * 2007-12-27 2009-07-02 Pliant Technology, Inc. Flash storage controller execute loop

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6151703A (en) 1996-05-20 2000-11-21 Inprise Corporation Development system with methods for just-in-time compilation of programs
US5842016A (en) 1997-05-29 1998-11-24 Microsoft Corporation Thread synchronization in a garbage-collected system using execution barriers
US5995754A (en) 1997-10-06 1999-11-30 Sun Microsystems, Inc. Method and apparatus for dynamically optimizing byte-coded programs
US6851109B1 (en) 1999-05-06 2005-02-01 International Business Machines Corporation Process and system for dynamically compiling a partially interpreted method
US6993754B2 (en) 2001-11-13 2006-01-31 Hewlett-Packard Development Company, L.P. Annotations to executable images for improved dynamic optimization functions
US7395530B2 (en) 2004-08-30 2008-07-01 International Business Machines Corporation Method for implementing single threaded optimizations in a potentially multi-threaded environment
CN101046755B (en) 2006-03-28 2011-06-15 郭明南 System and method of computer automatic memory management
US8291393B2 (en) 2007-08-20 2012-10-16 International Business Machines Corporation Just-in-time compiler support for interruptible code
US8276131B2 (en) 2007-08-30 2012-09-25 International Business Machines Corporation Method and system for dynamic loop transfer by populating split variables
US20100122066A1 (en) * 2008-11-12 2010-05-13 Freescale Semiconductor, Inc. Instruction method for facilitating efficient coding and instruction fetch of loop construct

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104102971A (en) * 2013-04-11 2014-10-15 波音公司 Managing model having object cycle
CN104102971B (en) * 2013-04-11 2020-04-07 波音公司 Managing models with object circulation
CN105765521A (en) * 2013-11-25 2016-07-13 马维尔国际贸易有限公司 Systems and methods for loop suspension in java programming
CN105765521B (en) * 2013-11-25 2019-03-19 马维尔国际贸易有限公司 The system and method stopped for the circulation in Java programming
CN108604192A (en) * 2016-02-08 2018-09-28 微软技术许可有限责任公司 Daily record is waited for call the thread transfer returned
CN108604192B (en) * 2016-02-08 2021-11-19 微软技术许可有限责任公司 System, method, and medium for performing one or more tasks while waiting for an event to be recorded
CN107179935A (en) * 2016-03-11 2017-09-19 华为技术有限公司 A kind of instruction executing method and virtual machine
CN107179935B (en) * 2016-03-11 2021-01-29 华为技术有限公司 Instruction execution method and virtual machine

Also Published As

Publication number Publication date
US20110225213A1 (en) 2011-09-15
US8887142B2 (en) 2014-11-11
CN102193777B (en) 2016-01-20


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
ASS Succession or assignment of patent right

Owner name: MICROSOFT TECHNOLOGY LICENSING LLC

Free format text: FORMER OWNER: MICROSOFT CORP.

Effective date: 20150727

C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20150727

Address after: Washington State

Applicant after: Micro soft technique license Co., Ltd

Address before: Washington State

Applicant before: Microsoft Corp.

C14 Grant of patent or utility model
GR01 Patent grant