WO1990001738A1 - Machine process for translating programs in binary machine language into another binary machine language - Google Patents


Info

Publication number
WO1990001738A1
WO1990001738A1 PCT/US1989/002994 US8902994W WO9001738A1 WO 1990001738 A1 WO1990001738 A1 WO 1990001738A1 US 8902994 W US8902994 W US 8902994W WO 9001738 A1 WO9001738 A1 WO 9001738A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
program
computer
flow analysis
memory
Prior art date
Application number
PCT/US1989/002994
Other languages
French (fr)
Inventor
Colin B. Hunter
John P. Banning
Hans Pufal
Original Assignee
Hunter Systems Software, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunter Systems Software, Inc.
Publication of WO1990001738A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/40 Transformation of program code
    • G06F8/52 Binary to binary

Definitions

  • This invention relates to machine processes for translating computer programs from one computer language into another computer language and, more particularly, to a method for translating computer programs from one binary machine language into another binary machine language, or from one assembly language into another assembly language.
  • Compilers are well-known in the art. They translate programs written in a high-level language, such as C, Fortran, or Pascal, into either assembly language or binary machine language. Likewise, assemblers are well-known in the art; they translate assembly language into binary machine language.
  • compilers translate single lines of a human-readable, high-level language ("statements") into several lines of assembly language or several binary machine instructions.
  • Assemblers, on the other hand, generally translate one assembly language line into one machine instruction (neglecting comments and assembler directives).
  • Interpreters are similar to compilers, but instead of simply translating the source program into machine language, an interpreter translates each statement, then executes the translated code, and then translates and executes the next statement, and so on. Because an interpreter deals with only one statement at a time, it can be simpler in design than a compiler for the same language, but the scope for optimization is much less. Consequently, interpreted programs tend to execute much more slowly than compiled programs. Other forms of translators have also been developed from time to time. Various high-level translators (e.g., Pascal to C) have been around for almost as long as there have been high-level languages. Assembly language translators (e.g., 8080 assembly code to 8086 assembly code) have also been reported, but seldom actually seen. It appears that compiler optimization techniques have not been applied to such translators.
  • phase problem is caused because the binary instruction formats of most computers have a varying length. It is, therefore, sometimes difficult to know where one instruction ends and another begins. It is especially difficult to know whether the disassembly process has been started correctly at the beginning of an instruction, or has begun in the middle of an instruction. In the latter case, all subsequent disassembled instructions will generally be wrong.
  • Disassemblers have a difficult time determining whether a particular pattern of bits is really an instruction or just several bytes of data. And of course the data problem exacerbates the phase problem, since the disassembler must correctly determine the length of the data area before it can resume disassembling at the correct place.
  • simulators deal with binary machine language source files. Simulators, however, are similar to interpreters in that they simultaneously translate and execute the source file. When they run, they have an effect as if the source binary program were executing on another computer with a different machine language. Simulators achieve this effect by simulating in software on the target computer the exact behavior of the original computer. Simulators have had very limited success, largely because of one well-known problem: they are very slow. Quite often hundreds of simulator instructions must be executed for every instruction in the source program, and even the very best simulators need ten to twenty simulator instructions per source instruction.
  • This invention provides an effective machine process for translating with high efficiency computer programs from one binary machine language into another binary machine language.
  • This machine process will be referred to as a "binary compiler"; it can be realized in a program for a digital computer.
  • the technique can also be used to translate from one assembly language into another.
  • a binary compiler has the same relationship to a simulator that a compiler has to an interpreter. And just as a compiler produces code that executes faster than interpreted code, application programs converted with a binary compiler execute faster than they do with simulators.
  • the binary compiling process of the present invention involves disassembling the source binary program, analyzing the binary program to produce "global flow analysis" data, using this global flow analysis data to complete the disassembly process, and producing a translated binary machine language version of the source binary program, using global flow analysis data to generate optimized binary code.
  • the invention is used to translate from one assembly language into another, the disassembly stage is omitted, but the global flow analysis is still performed.
  • the output uses the global flow analysis data to produce optimized assembly code instead of optimized binary code.
  • FIG. 1 shows a generalized flow diagram representing the illustrative machine algorithm for practicing data processing in accordance with the present invention
  • FIG. 2 shows a more detailed flow diagram of the PROCESS PROCEDURES portion of the algorithm of FIG. 1;
  • FIG. 3 shows a more detailed flow diagram of the PROCESS A PROCEDURE portion of the algorithm of FIG. 2;
  • FIG. 4 shows a more detailed flow diagram of the BUILD BASIC BLOCKS portion of the algorithm of FIG. 3;
  • FIG. 5 shows a more detailed flow diagram of the FORWARD FLOW ANALYSIS portion of the algorithm of FIG. 3;
  • FIG. 6 shows a more detailed flow diagram of the BACKWARD FLOW ANALYSIS portion of the algorithm of FIG. 3;
  • FIG. 7 shows a more detailed flow diagram of the UNKNOWNS ANALYSIS portion of the algorithm of FIG. 3;
  • FIG. 8 shows a more detailed flow diagram of the ANALYZE COMPLETED FLOWGRAPH portion of the algorithm of FIG. 1;
  • FIG. 9 shows a more detailed flow diagram of the LIVE/DEAD ANALYSIS portion of the algorithm of FIG. 8;
  • FIG. 10 shows a more detailed flow diagram of the TRANSLATE INSTRUCTIONS portion of the algorithm of FIG. 1.
  • the first steps of the algorithm represented by the flow chart of FIG. 1 are to read the input data 104 necessary to carry out the represented process.
  • This data includes the source program 108 in 8086 binary machine language and any application specific data (asd) 112 associated with the source binary program.
  • Read data input is represented by block 104 of FIG. 1.
  • the object in this PROCESS PROCEDURES 120 is to analyze the source binary program into its component instructions, which are grouped into "basic blocks" of sequential instructions terminated by a change of control (call, jump, or return).
  • the PROCESS PROCEDURES 120 also builds a "flow graph", which is a data structure representing the flow of control among the basic blocks. Associated with each basic block are data structures containing information about the use of registers, flags, the stack, and memory within the block, along with a list of all instructions in the block.
  • the basic blocks are grouped into "procedures", which are entered from a call instruction and terminate with a return.
  • the ANALYZE COMPLETED FLOWGRAPH 124 is to analyze the data structures built by PROCESS PROCEDURES 120 in several different ways and pass the results of this analysis to the TRANSLATE INSTRUCTIONS block 132.
  • the ANALYZE COMPLETED FLOWGRAPH 124 performs five different types of analysis: "call-return analysis", "live-dead analysis" on registers, "live-dead analysis" on flags, "byte orientation analysis", and "alignment analysis". The results of this analysis are used by TRANSLATE INSTRUCTIONS 132 to generate optimized translated code.
  • the process enters the TRANSLATE INSTRUCTIONS algorithm block 132.
  • the object in the TRANSLATE INSTRUCTIONS 132 is to translate the analyzed instructions in the instruction lists of the basic blocks into equivalent instructions in 68020 binary machine language, using the data developed by the ANALYZE COMPLETED FLOWGRAPH 124 to make the translated code sequences optimally short.
  • FIG. 2 shows an overview of the PROCESS PROCEDURES process 120.
  • the process manipulates data structures called Procedure Blocks (PBs), which can be linked onto up to three different queues: the new procedure queue, the upward procedure queue, and the downward procedure queue.
  • One PB exists for every procedure in the program being analyzed.
  • the first step of the PROCESS PROCEDURES process 120 is to build an empty PB on the new procedure queue. This step is represented by block 204 of FIG. 2.
  • the next step, represented by block 208, is to determine whether any PBs exist on the new procedure queue. (The first time through the loop the answer is, of course, yes.) If a PB is on the queue, the process moves to block 210, which is responsible for removing the empty PB from the new procedure queue and initializing it, that is, filling in the starting values in the PB. After the step represented by block 210, the process moves to block 220, which represents the PROCESS A PROCEDURE algorithm.
  • This algorithm 220 performs as much processing on the procedure represented by the PB as can be done at the current point in the analysis. As many basic blocks are built as can be found in this procedure.
  • One result of this algorithm's operation may be to cause new PBs to be put on the new procedure queue, or existing PBs (including the present one) to be put on either the downward procedure queue or the upward procedure queue.
  • This step determines again if any PBs exist on the new procedure queue (some may have been created by the PROCESS A PROCEDURE algorithm). Again, the process moves to block 210 and thence to 220 and back to 208 if a PB is found on the queue, with this loop executing until no new PBs remain on the new procedure queue.
  • the process moves on to the decision step represented by block 212, which determines if any PBs are on the downward procedure queue. If any are found, the process moves to block 214, which removes the PB from the queue for processing, and on to block 220 (PROCESS A PROCEDURE).
  • the result of this step may be to create new PBs, so the process moves back to 208, and this cycle continues until all PBs have been removed from both the new procedure queue and the downward procedure queue.
  • the process moves to the step represented by block 216 and decides if there are any PBs on the upward procedure queue. If there are, they are processed just like the PBs on the downward queue, until no PBs remain on any of the three queues, then the whole PROCESS PROCEDURES algorithm exits.
  • FIG. 3 shows the details of the PROCESS A PROCEDURE block 220 shown in FIG. 2.
  • the binary compiler manipulates four queues of data structures called Basic Blocks (BBs).
  • the four queues are: new queue, config queue, unknowns queue, and uses queue.
  • One BB is associated with each basic block in the procedure.
  • the first step in this process is to build all BBs that can be identified in the procedure (304).
  • FORWARD FLOW ANALYSIS is performed (block 308), then BACKWARD FLOW ANALYSIS (312), and finally UNKNOWNS ANALYSIS (316). If any BBs remain on any of the four queues after these steps (see block 320), then the process repeats steps 304 through 316, and this loop is iterated until no BBs remain on any queues.
  • FIG. 4 shows the details of the BUILD BASIC BLOCKS
  • the first step (402) is to take the first available BB off the new queue (one of the four BB queues mentioned above), and then sequentially disassemble instructions beginning at the starting address in the BB, until a jump, call, return, or interrupt instruction (known as the termination instruction) is reached.
  • An encoded representation of the disassembled instructions is stored in a data structure called the Instruction List associated with the BB.
  • the next step (404) is to link the current BB with the BB whose code begins immediately after the current BB's termination instruction (this BB is called the immediate successor). If no immediate successor BB has yet been created, the process now creates a new BB with this address, links it to the current BB, and also puts it on the new queue. If an immediate successor already exists, it is put on the config queue (another of the four BB queues).
  • the next step (408) is to perform forward flow analysis within the current BB.
  • the BBs that precede the current BB are all called its predecessors; they are either the immediate predecessors in the sense of the previous paragraph, or they are BBs whose termination instruction resulted in a transfer of control to the current BB.
  • the result of this operation is the forward data of the current BB, which is stored in a data structure associated with the BB.
  • next steps depend on whether the termination instruction of the current BB is a call, a computed jump, or an interrupt (block 412). If it is (420), then the BB is put on the unknowns queue (another of the four BB queues). If it is not (i.e., if the termination instruction is a simple jump), then the BB is linked to the BB associated with the code at the jump's target address. A new BB is created and put on the new queue, if no BB exists for that address. An existing BB is put on the config queue. (See 416.)
  • the current BB is put on the uses queue (the last of the four BB queues). Then a check is made to see if the new queue is empty (428). If it is not, the process performs steps 402 through 428 again repeatedly until the new queue is empty.
  • FIG. 5 shows the details of the FORWARD FLOW ANALYSIS algorithm of FIG. 3, represented by block 308, which performs the forward flow analysis on all the BBs in a procedure.
  • the first step (504) is to take a BB off the config queue. Then the forward data from all predecessors to the BB is propagated through the BB and stored in its forward data structure (blocks 508 and 512). If the immediate predecessor ends with a call instruction, the data is first propagated through the called procedure before it is propagated through the BB. Then a check is made (516) to determine if the current BB's forward data has been modified. If so, all successor BBs are put on the config queue (520). In either case, a check is made to see if any BBs are left on the config queue (524), and if there are, the whole loop from 504 to 524 is performed again repeatedly until no BBs remain on the config queue.
  • FIG. 6 shows the details of the BACKWARD FLOW ANALYSIS algorithm of FIG. 3, represented by block 312, which performs the backward flow analysis on all the BBs in a procedure.
  • the first step (604) is to take a BB off the uses queue. Then the backward data (expression lists) from all successors to the BB is propagated back through the BB and stored in its backward data structure (blocks 608 and 612). If any successor ends in a call instruction, the data is first propagated through the called procedure before it is propagated through the BB. Then a check is made (616) to determine if the current BB's backward data has been modified. If so, all predecessor BBs are put on the uses queue (620). In either case, a check is made to see if any BBs are left on the uses queue (624), and if there are, the whole loop from 604 to 624 is performed again repeatedly until no BBs remain on the uses queue.
  • FIG. 7 shows the details of the UNKNOWNS ANALYSIS algorithm of FIG. 3, represented by block 316, which performs the analysis on the unknown BBs in a procedure.
  • the first step (704) is to take a BB off the unknowns queue. Then an attempt is made to calculate the computed jump or call address, using data in the asd file, if necessary (706). The process then splits (712) depending on the result of this attempt. If the computation successfully determined the target address, a link is built to the target BB. A new BB is created on the new queue, if no BB exists for that address; an existing BB is put on the config queue.
  • the BB itself is put on the uses queue, and that entry is removed from the unknown list of the procedure's PB (724). If the attempt to compute the target address was unsuccessful (720), the BB is put back on the unknowns queue and an entry is made in the unknown list of the procedure's PB. In either case, a check (728) is made to see if any BBs remain on the unknowns queue whose unknown list entry has not been processed. If any remain, the entire loop from 704 to 728 is performed again repeatedly until no unprocessed BBs remain on the unknowns queue.
  • FIG. 8 shows the details of the ANALYZE COMPLETED FLOWGRAPH algorithm of the FIG. 1 process, represented by block 124, which performs the global flow analysis on the completed flowgraph in preparation for the code generation stage.
  • the first step 804 is the sorting stage where the complete set of basic blocks are sorted into order by increasing address, using a standard sort algorithm.
  • the sort algorithm used is a version of the insertion sort algorithm described in D.E. Knuth, The Art of Computer Programming, Vol. 3, Sorting and Searching, Addison-Wesley, 1973, Reading, Massachusetts, pages 80-102.
  • the process moves to an optional stage (808) that optimizes the use of jump, call, and return addresses for the particular case of translating programs from 8086 code into 68020 code. This stage will not be present in the general case.
  • The next step (812) of the ANALYZE COMPLETED FLOWGRAPH algorithm 124 performs a global flow analysis of the completed flowgraph, computing "live-dead" data for registers and flags. This data specifies whether source machine condition flags (e.g., CARRY or OVERFLOW) and registers are used by subsequent instructions ("live") or not used ("dead"). This information is used by the subsequent TRANSLATE INSTRUCTIONS algorithm 132 in FIG. 1 to generate optimized target code, by only generating instructions to preserve or simulate live condition flag values and by not preserving data in dead registers. Then the process moves to another optional stage (816), which performs various "peep-hole" optimizations that are dependent on the particular case of translating programs from 8086 code into 68020 code.
  • FIG. 9 shows the details of the stage represented by block 812 in FIG. 8, that is, the LIVE-DEAD ANALYSIS algorithm.
  • the first step (904) is to put the PBs for all non-returning procedures (that is, procedures that do not return to another procedure) on the downward queue.
  • the next step (908) is to check whether there are any PBs on the downward queue. If not, the process checks (912) to see if there are any PBs on the upward queue.
  • step 916 computes live-dead data and the transfer function (that is, the effect the procedure has on live-dead data coming from other procedures) for the procedure. This step terminates when the procedure is completely analyzed or when the process encounters a call instruction whose called procedure has not been previously analyzed. Then the process continues to step 920, which determines whether step 916 terminated because the procedure was completely analyzed or because a call to an unanalyzed procedure was reached. If the procedure was completed, the process moves to step 924, which puts the PBs for all procedures that call the completed one on the upward queue. If step 916 terminated because a call to an unanalyzed procedure was reached, the unanalyzed procedure's PB is put on the downward queue in step 928. In either case, the process continues back to step 908 and iterates until all procedures have been fully analyzed.
  • FIG. 10 shows the details of the TRANSLATE INSTRUCTIONS algorithm of the FIG. 1 process, represented by block 132, which performs the 68020 code generation.
  • the first step (1204) is to get a BB from the chain of BBs arranged in increasing address order that is produced by step 804.
  • an instruction's encoded opcode, addressing mode, and attributes are read from the instruction list associated with the BB (1208).
  • These encoded values are used as indices to select a short segment of translated 68020 binary code (often just one instruction) from a table (1212).
  • the 68020 code is saved in a data structure (1216), then the process proceeds to convert the next instruction by performing steps 1208 through 1220 repeatedly until no more instructions remain in the BB's instruction list. At this point the 68020 instructions representing the BB are output to a file (1222). Then the process moves to the next BB and performs steps 1204 through 1224 until no BBs remain to be translated.
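
The table lookup described for TRANSLATE INSTRUCTIONS can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the table layout, the `flags_live` attribute, and the placeholder strings (which stand in for target code segments and are not real 68020 encodings). The sketch only shows the idea of indexing a table by opcode, addressing mode, and attributes, and of choosing a shorter segment when the live-dead data says the flags are dead.

```python
# Hypothetical table: (opcode, addressing mode, flags_live) -> target code segment.
# The strings are placeholders for translated code, not real 68020 instructions.
TABLE = {
    ("ADD", "reg,reg", False): ["tgt_add"],
    ("ADD", "reg,reg", True): ["tgt_add", "tgt_save_flags"],
    ("MOV", "reg,imm", False): ["tgt_move_imm"],
    ("MOV", "reg,imm", True): ["tgt_move_imm"],
}


def translate_block(instruction_list):
    """Translate one basic block by table lookup, one source instruction at a time."""
    out = []
    for opcode, mode, flags_live in instruction_list:
        # The encoded values act as indices into the table.
        out.extend(TABLE[(opcode, mode, flags_live)])
    return out


# A toy instruction list: flags are dead after the first ADD, live after the second.
block = [
    ("MOV", "reg,imm", False),
    ("ADD", "reg,reg", False),
    ("ADD", "reg,reg", True),
]
translated = translate_block(block)
```

In this toy case the first ADD needs only one target segment because nothing later reads its flags, while the second ADD carries an extra flag-preserving segment.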

Abstract

A machine process is disclosed in which a first program in one binary language is translated into a second program in another binary machine language. The process disassembles the first binary program (120), analyzes the first program to produce global flow analysis data (124), uses this global flow analysis data to complete the disassembly, and produces a translated binary machine language version of the first program, using the global flow analysis data to generate the second program (140).

Description

MACHINE PROCESS FOR TRANSLATING PROGRAMS IN BINARY MACHINE LANGUAGE INTO ANOTHER BINARY MACHINE LANGUAGE
Background of the Invention
This invention relates to machine processes for translating computer programs from one computer language into another computer language and, more particularly, to a method for translating computer programs from one binary machine language into another binary machine language, or from one assembly language into another assembly language.
Description of the Prior Art
The art relating to machine processes for translating computer programs from one computer language to another ("translators") is well-developed with an extensive amount of literature. The following text briefly describes the relevant technologies.
Compilers are well-known in the art. They translate programs written in a high-level language, such as C, Fortran, or Pascal, into either assembly language or binary machine language. Likewise, assemblers are well-known in the art; they translate assembly language into binary machine language.
In general, compilers translate single lines of a human-readable, high-level language ("statements") into several lines of assembly language or several binary machine instructions. Assemblers, on the other hand, generally translate one assembly language line into one machine instruction (neglecting comments and assembler directives).
There is, therefore, scope for optimization with compilers that is not present with assemblers. A good, i.e., optimizing, compiler is usually one that generates fewer machine instructions for a particular sequence of statements than does an average compiler. A vast array of techniques, well-known in the art, have been developed to optimize compiler code generation, including "global flow analysis". The standard reference for compiler design is Compilers: Principles, Techniques, and Tools by A. Aho, R. Sethi, and J. Ullman (Addison-Wesley, 1986); note especially Chapter 10 on optimization techniques.
Interpreters are similar to compilers, but instead of simply translating the source program into machine language, an interpreter translates each statement, then executes the translated code, and then translates and executes the next statement, and so on. Because an interpreter deals with only one statement at a time, it can be simpler in design than a compiler for the same language, but the scope for optimization is much less. Consequently, interpreted programs tend to execute much more slowly than compiled programs.
Other forms of translators have also been developed from time to time. Various high-level translators (e.g., Pascal to C) have been around for almost as long as there have been high-level languages. Assembly language translators (e.g., 8080 assembly code to 8086 assembly code) have also been reported, but seldom actually seen. It appears that compiler optimization techniques have not been applied to such translators.
Considering now binary machine language source files, disassemblers have been standard features of debugging tools for years and are well-known in the art. They translate sections of binary machine language into an equivalent set of assembly language instructions. Their use has been restricted because of several well-known problems, in particular the "phase problem" and the "data problem".
The phase problem is caused because the binary instruction formats of most computers have a varying length. It is, therefore, sometimes difficult to know where one instruction ends and another begins. It is especially difficult to know whether the disassembly process has been started correctly at the beginning of an instruction, or has begun in the middle of an instruction. In the latter case, all subsequent disassembled instructions will generally be wrong.
The data problem arises because many programs contain bytes or words of data interspersed with the instructions.
Disassemblers have a difficult time determining whether a particular pattern of bits is really an instruction or just several bytes of data. And of course the data problem exacerbates the phase problem, since the disassembler must correctly determine the length of the data area before it can resume disassembling at the correct place.
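
The phase problem can be made concrete with a small sketch. The instruction set below is a hypothetical toy (three opcodes of one, two, and three bytes), not the 8086 format, and `disassemble` is an illustrative linear sweep rather than the process disclosed here. Starting the sweep one byte late causes an operand byte to be decoded as an opcode.

```python
# Hypothetical toy instruction set: opcode byte -> (mnemonic, total length in bytes).
TOY_ISA = {0x01: ("NOP", 1), 0x02: ("LOAD", 2), 0x03: ("JMP", 3)}


def disassemble(code, start=0):
    """Linear sweep from `start`; undecodable bytes are emitted as data (DB)."""
    out, pc = [], start
    while pc < len(code):
        name, length = TOY_ISA.get(code[pc], (None, 1))
        if name is None or pc + length > len(code):
            out.append(f"{pc:02x}: DB 0x{code[pc]:02x}")
            pc += 1
            continue
        operands = code[pc + 1 : pc + length].hex()
        out.append(f"{pc:02x}: {name} {operands}".rstrip())
        pc += length
    return out


# LOAD 03; NOP; NOP; JMP 0010
program = bytes([0x02, 0x03, 0x01, 0x01, 0x03, 0x00, 0x10])
in_phase = disassemble(program, 0)      # starts on an instruction boundary
out_of_phase = disassemble(program, 1)  # starts in the middle of LOAD
```

Started at offset 1, the sweep reads the LOAD operand `03` as a JMP opcode and swallows both NOPs as its address. In this toy program the sweep happens to fall back into phase at the final JMP; with other byte sequences it may not.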
Like disassemblers, simulators deal with binary machine language source files. Simulators, however, are similar to interpreters in that they simultaneously translate and execute the source file. When they run, they have an effect as if the source binary program were executing on another computer with a different machine language. Simulators achieve this effect by simulating in software on the target computer the exact behavior of the original computer. Simulators have had very limited success, largely because of one well-known problem: they are very slow. Quite often hundreds of simulator instructions must be executed for every instruction in the source program, and even the very best simulators need ten to twenty simulator instructions per source instruction.
Because of the above-described problems with disassemblers, there appear to be no examples of binary-to-binary optimizing translators; that is, of programs that translate one binary machine language into another binary machine language with high efficiency. Thus, there are no binary-to-binary equivalents of compilers, as simulators are binary-to-binary equivalents of interpreters.
Summary of the Invention
This invention provides an effective machine process for translating with high efficiency computer programs from one binary machine language into another binary machine language. This machine process will be referred to as a "binary compiler"; it can be realized in a program for a digital computer. The technique can also be used to translate from one assembly language into another. A binary compiler has the same relationship to a simulator that a compiler has to an interpreter. And just as a compiler produces code that executes faster than interpreted code, application programs converted with a binary compiler execute faster than they do with simulators.
The binary compiling process of the present invention involves disassembling the source binary program, analyzing the binary program to produce "global flow analysis" data, using this global flow analysis data to complete the disassembly process, and producing a translated binary machine language version of the source binary program, using global flow analysis data to generate optimized binary code. When the invention is used to translate from one assembly language into another, the disassembly stage is omitted, but the global flow analysis is still performed. The output uses the global flow analysis data to produce optimized assembly code instead of optimized binary code.
Brief Description of the Drawings and Listings
A complete understanding of the present invention and of the above and other advantages thereof may be gained from a consideration of the following detailed description of an illustrative embodiment thereof, which translates programs from the binary machine language of the Intel 8086 microprocessor ("8086 code") into the binary machine language of the Motorola 68020 microprocessor ("68020 code") .
FIG. 1 shows a generalized flow diagram representing the illustrative machine algorithm for practicing data processing in accordance with the present invention;
FIG. 2 shows a more detailed flow diagram of the PROCESS PROCEDURES portion of the algorithm of FIG. 1;
FIG. 3 shows a more detailed flow diagram of the PROCESS A PROCEDURE portion of the algorithm of FIG. 2;
FIG. 4 shows a more detailed flow diagram of the BUILD BASIC BLOCKS portion of the algorithm of FIG. 3;
FIG. 5 shows a more detailed flow diagram of the FORWARD FLOW ANALYSIS portion of the algorithm of FIG. 3;
FIG. 6 shows a more detailed flow diagram of the BACKWARD FLOW ANALYSIS portion of the algorithm of FIG. 3;
FIG. 7 shows a more detailed flow diagram of the UNKNOWNS ANALYSIS portion of the algorithm of FIG. 3;
FIG. 8 shows a more detailed flow diagram of the ANALYZE COMPLETED FLOWGRAPH portion of the algorithm of FIG. 1;
FIG. 9 shows a more detailed flow diagram of the LIVE/DEAD ANALYSIS portion of the algorithm of FIG. 8;
FIG. 10 shows a more detailed flow diagram of the TRANSLATE INSTRUCTIONS portion of the algorithm of FIG. 1.
Detailed Description
The first steps of the algorithm represented by the flow chart of FIG. 1 are to read the input data 104 necessary to carry out the represented process. This data includes the source program 108 in 8086 binary machine language and any application specific data (asd) 112 associated with the source binary program. Read data input is represented by block 104 of FIG. 1.
Following input of data, the process enters the PROCESS PROCEDURES algorithm block 120 of FIG. 1. The object in this PROCESS PROCEDURES 120 is to analyze the source binary program into its component instructions, which are grouped into "basic blocks" of sequential instructions terminated by a change of control (call, jump, or return). The PROCESS PROCEDURES 120 also builds a "flow graph", which is a data structure representing the flow of control among the basic blocks. Associated with each basic block are data structures containing information about the use of registers, flags, the stack, and memory within the block, along with a list of all instructions in the block. The basic blocks are grouped into "procedures", which are entered from a call instruction and terminate with a return.
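
As a rough illustration of basic blocks and a flow graph, the following sketch discovers blocks in a hypothetical toy program by following control flow from an entry address. The program encoding, the mnemonics, and `build_basic_blocks` are all assumptions made for this example. It also assumes that every instruction occupies one address unit, that a call simply falls through to the next instruction, and that no jump targets the middle of an already built block (a real implementation would have to split such blocks).

```python
from collections import deque

# Hypothetical toy program: address -> (mnemonic, jump target or None).
PROGRAM = {
    0: ("mov", None),
    1: ("jcc", 4),   # conditional jump: falls through to 2 or goes to 4
    2: ("add", None),
    3: ("jmp", 6),
    4: ("sub", None),
    5: ("jmp", 6),
    6: ("ret", None),
}
TERMINATORS = {"jmp", "jcc", "call", "ret"}


def build_basic_blocks(program, entry):
    """Return {block start: (instruction addresses, successor block starts)}."""
    blocks, new_queue = {}, deque([entry])
    while new_queue:
        start = new_queue.popleft()
        if start in blocks or start not in program:
            continue
        addrs, pc = [], start
        while True:
            addrs.append(pc)
            op, target = program[pc]
            if op in TERMINATORS or pc + 1 not in program:
                break
            pc += 1
        succs = []
        if op in ("jcc", "call"):
            succs.append(pc + 1)   # immediate successor (fall-through)
        if op in ("jmp", "jcc") and target is not None:
            succs.append(target)   # block at the jump's target address
        blocks[start] = (addrs, succs)
        new_queue.extend(succs)    # newly found block starts await processing
    return blocks


blocks = build_basic_blocks(PROGRAM, 0)
```

The returned dictionary is a small flow graph: each entry lists the instructions of one block and the blocks that control can reach from it.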
After the PROCESS PROCEDURES algorithm 120 has been run, the process enters the ANALYZE COMPLETED FLOWGRAPH algorithm represented by block 124 of FIG. 1. The object of ANALYZE COMPLETED FLOWGRAPH 124 is to analyze the data structures built by PROCESS PROCEDURES 120 in several different ways and pass the results of this analysis to the TRANSLATE INSTRUCTIONS block 132. ANALYZE COMPLETED FLOWGRAPH 124 performs five different types of analysis: "call-return analysis", "live-dead analysis" on registers, "live-dead analysis" on flags, "byte orientation analysis", and "alignment analysis". The results of this analysis are used by TRANSLATE INSTRUCTIONS 132 to generate optimized translated code.
After the ANALYZE COMPLETED FLOWGRAPH algorithm 124 has been run, the process enters the TRANSLATE INSTRUCTIONS algorithm block 132. The object of TRANSLATE INSTRUCTIONS 132 is to translate the analyzed instructions in the instruction lists of the basic blocks into equivalent instructions in 68020 binary machine language, using the data developed by ANALYZE COMPLETED FLOWGRAPH 124 to make the translated code sequences optimally short. The result of applying TRANSLATE INSTRUCTIONS 132 is to produce the translated binary program 140.
The PROCESS PROCEDURES 120, ANALYZE COMPLETED FLOWGRAPH 124, and TRANSLATE INSTRUCTIONS 132 algorithms of FIG. 1 will now be described in detail.
FIG. 2 shows an overview of the PROCESS PROCEDURES process 120. During the course of its operation, the process manipulates data structures called Procedure Blocks (PBs), which can be linked onto up to three different queues: the new procedure queue, the upward procedure queue, and the downward procedure queue. One PB exists for every procedure in the program being analyzed.
As indicated in FIG. 2, the first step of the PROCESS PROCEDURES process 120 is to build an empty PB on the new procedure queue. This step is represented by block 204 of FIG. 2. The next step, represented by block 208, is to determine whether any PBs exist on the new procedure queue. (The first time through the loop the answer is, of course, yes.) If a PB is on the queue, the process moves to block 210, which is responsible for removing the empty PB from the new procedure queue and initializing it, that is, filling in the starting values in the PB. After the step represented by block 210, the process moves to block 220, which represents the PROCESS A PROCEDURE algorithm. This algorithm 220 performs as much processing on the procedure represented by the PB as can be done at the current point in the analysis. As many basic blocks are built as can be found in this procedure. One result of this algorithm's operation may be to cause new PBs to be put on the new procedure queue, or existing PBs (including the present one) to be put on either the downward procedure queue or the upward procedure queue.
After this algorithm completes, the process moves back to the decision step represented by block 208. This step determines again if any PBs exist on the new procedure queue (some may have been created by the PROCESS A PROCEDURE algorithm). Again, the process moves to block 210 and thence to 220 and back to 208 if a PB is found on the queue, with this loop executing until no PBs remain on the new procedure queue.
At this point, the process moves on to the decision step represented by block 212, which determines if any PBs are on the downward procedure queue. If any are found, the process moves to block 214, which removes the PB from the queue for processing, and on to block 220 (PROCESS A PROCEDURE). The result of this step may be to create new PBs, so the process moves back to 208, and this cycle continues until all PBs have been removed from both the new procedure queue and the downward procedure queue.
Then the process moves to the step represented by block 216 and decides if there are any PBs on the upward procedure queue. If there are, they are processed just like the PBs on the downward queue. When no PBs remain on any of the three queues, the whole PROCESS PROCEDURES algorithm exits.
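The queue discipline of FIG. 2 can be summarized in a short sketch. It assumes that control returns to decision block 208 after every pass through PROCESS A PROCEDURE, so the new procedure queue is always drained first, then the downward queue, then the upward queue. The function name and the callback signature are hypothetical, and the initialization of block 210 is omitted.

```python
from collections import deque

def process_procedures(process_a_procedure, first_pb):
    """Drive the three PB queues until all of them are empty.

    process_a_procedure(pb, new_q, down_q, up_q) stands in for block 220
    and may append PBs to any of the three queues.
    Returns the order in which PBs were processed.
    """
    new_q, down_q, up_q = deque([first_pb]), deque(), deque()
    order = []
    while True:
        if new_q:                 # decision block 208
            pb = new_q.popleft()
        elif down_q:              # decision block 212
            pb = down_q.popleft()
        elif up_q:                # decision block 216
            pb = up_q.popleft()
        else:
            return order          # nothing left on any queue
        order.append(pb)
        process_a_procedure(pb, new_q, down_q, up_q)
```

A PB may be processed more than once, since a procedure can be re-queued on the upward or downward queue after its first pass.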
FIG. 3 shows the details of the PROCESS A PROCEDURE block 220 shown in FIG. 2. During the course of these steps, the binary compiler manipulates four queues of data structures called Basic Blocks (BBs). As will be discussed, the four queues are: new queue, config queue, unknowns queue, and uses queue. One BB is associated with each basic block in the procedure. As indicated in FIG. 3, the first step in this process is to build all BBs that can be identified in the procedure (304). Then FORWARD FLOW ANALYSIS is performed (block 308), then BACKWARD FLOW ANALYSIS (312), and finally UNKNOWNS ANALYSIS (316). If any BBs remain on any of the four queues after these steps (see block 320), then the process repeats steps 304 through 316, and this loop is iterated until no BBs remain on any queues.
Then the process tests to see if any of the backward flow data for the procedure's own PB has been modified (324). If it has been modified, then the PBs for all procedures that call this procedure are put on the upward procedure queue, which would be processed subsequently by blocks 216 and 218 shown in FIG. 2.
FIG. 4 shows the details of the BUILD BASIC BLOCKS algorithm 304 of FIG. 3, which builds all BBs that can be identified in the procedure. The first step (402) is to take the first available BB off the new queue (one of the four BB queues mentioned above), and then sequentially disassemble instructions beginning at the starting address in the BB, until a jump, call, return, or interrupt instruction (known as the termination instruction) is reached. An encoded representation of the disassembled instructions is stored in a data structure called the Instruction List associated with the BB.
The next step (404) is to link the current BB with the BB whose code begins immediately after the current BB's termination instruction (this BB is called the immediate successor). If no immediate successor BB has yet been created, the process now creates a new BB with this address, links it to the current BB, and also puts it on the new queue. If an immediate successor already exists, it is put on the config queue (another of the four BB queues).
The next step (408) is to perform forward flow analysis within the current BB. This means to take the forward data (register values, stack values, flag values, memory values) stored in all BBs that logically precede the current BB and propagate these values through the BB to its end, performing all transformations on the data that are done by the BB's instructions. The BBs that precede the current BB are all called its predecessors; they are either the immediate predecessors in the sense of the previous paragraph, or they are BBs whose termination instruction resulted in a transfer of control to the current BB. The result of this operation is the forward data of the current BB, which is stored in a data structure associated with the BB.
The next steps depend on whether the termination instruction of the current BB is a call, a computed jump, or an interrupt (block 412). If it is (420), then the BB is put on the unknowns queue (another of the four BB queues). If it is not (i.e., if the termination instruction is a simple jump), then the BB is linked to the BB associated with the code at the jump's target address (see 416). If no BB exists for that address, a new BB is created and put on the new queue; an existing BB is put on the config queue.
Finally, in all cases (424), the current BB is put on the uses queue (the last of the four BB queues). Then a check is made to see if the new queue is empty (428). If it is not, the process repeats steps 402 through 428 until the new queue is empty.
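The block-building loop of FIG. 4 can be sketched on a toy instruction stream. In this sketch the program is assumed to be already decoded into a mapping from address to (mnemonic, length, target), which is an illustrative stand-in for the 8086 disassembler; the forward flow step (408), the config queue, and the uses queue are omitted, and a jump into the middle of an existing block is not handled.

```python
TERMINATORS = {"jmp", "call", "ret", "int"}

def build_basic_blocks(program, entry):
    """Return (blocks, unknowns) for a toy program.

    program: address -> (mnemonic, length, target or None)
    blocks:  start address -> list of (address, mnemonic)
    unknowns: start addresses of blocks ending in a call, a computed
              jump (target None), or an interrupt
    """
    blocks, unknowns, new_queue = {}, [], [entry]
    while new_queue:                          # check 428
        start = new_queue.pop(0)              # step 402
        if start in blocks or start not in program:
            continue
        addr, insns, last = start, [], None
        while addr in program:                # disassemble sequentially
            mnemonic, length, target = program[addr]
            insns.append((addr, mnemonic))
            addr += length
            last = (mnemonic, target)
            if mnemonic in TERMINATORS:       # termination instruction reached
                break
        blocks[start] = insns
        new_queue.append(addr)                # immediate successor (step 404)
        mnemonic, target = last
        if mnemonic == "jmp" and target is not None:
            new_queue.append(target)          # simple jump: follow the target (416)
        elif mnemonic in TERMINATORS and mnemonic != "ret":
            unknowns.append(start)            # call, computed jump, interrupt (420)
    return blocks, unknowns
```

For example, a six-instruction program whose first block ends in a jump to a block that starts with a call yields four blocks, one of which is left on the unknowns list.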
FIG. 5 shows the details of the FORWARD FLOW ANALYSIS algorithm of FIG. 3, represented by block 308, which performs the forward flow analysis on all the BBs in a procedure. The first step (504) is to take a BB off the config queue. Then the forward data from all predecessors to the BB is propagated through the BB and stored in its forward data structure (blocks 508 and 512). If the immediate predecessor ends with a call instruction, the data is first propagated through the called procedure before it is propagated through the BB. Then a check is made (516) to determine if the current BB's forward data has been modified. If so, all successor BBs are put on the config queue (520). In either case, a check is made to see if any BBs are left on the config queue (524), and if there are, the whole loop from 504 to 524 is repeated until no BBs remain on the config queue.
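The loop of FIG. 5 is a worklist iteration to a fixed point, and a minimal sketch of that idea follows. Several simplifications are assumptions made here for illustration: forward data is reduced to a dictionary of known register values, predecessor data is merged by keeping only the values on which all analyzed predecessors agree, each block's effect is supplied as a Python function, and propagation through called procedures is left out.

```python
from collections import deque

def forward_flow(blocks, preds, succs, transfer, entry, entry_state):
    """Propagate known register values forward until nothing changes.

    transfer[b] maps the state on entry to block b to the state at its end.
    Returns a dict: block -> forward data at the end of the block.
    """
    out = {b: None for b in blocks}            # None means "not yet analyzed"
    config = deque(blocks)                     # the config queue
    while config:                              # check 524
        b = config.popleft()                   # step 504
        states = [out[p] for p in preds[b] if out[p] is not None]
        if b == entry:
            states.append(entry_state)
        merged = {}
        if states:
            # keep a value only if every analyzed predecessor agrees on it
            merged = {r: v for r, v in states[0].items()
                      if all(s.get(r) == v for s in states[1:])}
        new_out = transfer[b](dict(merged))    # steps 508 and 512
        if new_out != out[b]:                  # check 516
            out[b] = new_out
            config.extend([s for s in succs[b] if s not in config])  # step 520
    return out
```

In a diamond-shaped flow graph where the two arms leave different values in one register, that register is dropped from the forward data of the join block while registers untouched by both arms remain known.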
FIG. 6 shows the details of the BACKWARD FLOW ANALYSIS algorithm of FIG. 3, represented by block 312, which performs the backward flow analysis on all the BBs in a procedure. The first step (604) is to take a BB off the uses queue. Then the backward data (expression lists) from all successors to the BB is propagated back through the BB and stored in its backward data structure (blocks 608 and 612). If any successor ends in a call instruction, the data is first propagated through the called procedure before it is propagated through the BB. Then a check is made (616) to determine if the current BB's backward data has been modified. If so, all predecessor BBs are put on the uses queue (620). In either case, a check is made to see if any BBs are left on the uses queue (624), and if there are, the whole loop from 604 to 624 is repeated until no BBs remain on the uses queue.
FIG. 7 shows the details of the UNKNOWNS ANALYSIS algorithm of FIG. 3, represented by block 316, which performs the analysis on the unknown BBs in a procedure. The first step (704) is to take a BB off the unknowns queue. Then an attempt is made to calculate the computed jump or call address, using data in the asd file, if necessary (706). The process then splits (712) depending on the result of this attempt. If the computation successfully determined the target address, a link is built to the target BB. A new BB is created on the new queue, if no BB exists for that address; an existing BB is put on the config queue. (See block 724.) Then the BB itself is put on the uses queue, and that entry is removed from the unknown list of the procedure's PB (724). If the attempt to compute the target address was unsuccessful (720), the BB is put back on the unknowns queue and an entry is made in the unknown list of the procedure's PB. In either case, a check (728) is made to see whether any BBs remain on the unknowns queue whose unknown list entry has not been processed. If any remain, the entire loop from 704 to 728 is repeated until no unprocessed BBs remain on the unknowns queue.
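One pass of this resolution step might look like the following sketch. The representation is an assumption made for illustration: each unknown is a pair (block address, register holding the target), forward data is a per-block dictionary of known register values, and the application specific data is reduced to a mapping from block address to target address. The actual format of the asd file is not given in this passage.

```python
def unknowns_pass(unknowns, forward_data, asd_hints):
    """Try to resolve each computed jump or call target once.

    Returns (resolved, still_unknown), where resolved maps a block
    address to its target and still_unknown lists the entries to retry.
    """
    resolved, still_unknown = {}, []
    for addr, reg in unknowns:
        # first try the register value known from forward flow analysis
        target = forward_data.get(addr, {}).get(reg)
        if target is None:
            # fall back to application specific data, if any
            target = asd_hints.get(addr)
        if target is None:
            still_unknown.append((addr, reg))   # stays on the unknown list
        else:
            resolved[addr] = target             # a link to the target BB can be built
    return resolved, still_unknown
```

Entries left in still_unknown would be retried on a later iteration, after further forward flow analysis has had a chance to supply more register values.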
FIG. 8 shows the details of the ANALYZE COMPLETED FLOWGRAPH algorithm of the FIG. 1 process, represented by block 124, which performs the global flow analysis on the completed flowgraph in preparation for the code generation stage. The first step 804 is the sorting stage, where the complete set of basic blocks is sorted into order by increasing address, using a standard sort algorithm. In this particular implementation, the sort algorithm used is a version of the insertion sort algorithm described in D.E. Knuth, The Art of Computer Programming, Vol. 3, Sorting and Searching, Addison-Wesley, 1973, Reading, Massachusetts, pages 80 - 102. After completion of the sorting stage, the process moves to an optional stage (808) that optimizes the use of jump, call, and return addresses for the particular case of translating programs from 8086 code into 68020 code. This stage will not be present in the general case.
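For illustration, a straight insertion sort over basic blocks keyed by starting address is sketched below. The specification says only that "a version of" insertion sort is used, so this plain form is an assumption, as is the representation of a block as a dictionary with a "start" field.

```python
def insertion_sort_by_address(blocks):
    """Return the blocks in order of increasing start address."""
    ordered = []
    for block in blocks:
        i = len(ordered)
        # scan from the high end to find where this block belongs
        while i > 0 and ordered[i - 1]["start"] > block["start"]:
            i -= 1
        ordered.insert(i, block)
    return ordered
```

Scanning from the high end suits input that is already nearly in address order, in which case each block is placed after very few comparisons.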
After the step represented by block 808, the process moves to block 812, which represents the LIVE-DEAD ANALYSIS algorithm. This step is the heart of the optimizations performed by the ANALYZE COMPLETED FLOWGRAPH algorithm 124. This step performs a global flow analysis of the completed flowgraph, computing "live-dead" data for registers and flags. This data specifies whether source machine condition flags (e.g., CARRY or OVERFLOW) and registers are used by subsequent instructions ("live") or not used ("dead"). This information is used by the subsequent TRANSLATE INSTRUCTIONS algorithm 132 in FIG. 1 to generate optimized target code, by only generating instructions to preserve or simulate live condition flag values and by not preserving data in dead registers. Then the process moves to another optional stage (816), which performs various "peep-hole" optimizations that are dependent on the particular case of translating programs from 8086 code into 68020 code.
FIG. 9 shows the details of the stage represented by block 812 in FIG. 8, that is, the LIVE-DEAD ANALYSIS algorithm. During the course of these steps two queues of Procedure Blocks (PBs), the upward queue and the downward queue, and one queue of Basic Blocks (BBs) are completed. The first step (904) is to put the PBs for all non-returning procedures (that is, procedures that do not return to another procedure) on the downward queue. The next step (908) is to check whether there are any PBs on the downward queue. If not, the process checks (912) to see if there are any PBs on the upward queue. If there are any PBs on either queue, the process continues to step 916, which computes live-dead data and the transfer function (that is, the effect the procedure has on live-dead data coming from other procedures) for the procedure. This step terminates when the procedure is completely analyzed or when the process encounters a call instruction whose called procedure has not been previously analyzed. Then the process continues to step 920, which determines whether step 916 terminated because the procedure was completely analyzed or because a call to an unanalyzed procedure was reached. If the procedure was completed, the process moves to step 924, which puts the PBs for all procedures that call the completed one on the upward queue. If step 916 terminated because a call to an unanalyzed procedure was reached, the unanalyzed procedure's PB is put on the downward queue in step 928. In either case, the process continues back to step 908 and iterates until all procedures have been fully analyzed.
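The live-dead computation for flags within a single procedure can be sketched as a backward data-flow iteration. This is a simplified illustration, not the interprocedural scheme of FIG. 9: calls and transfer functions are ignored, each instruction is reduced to an assumed pair (flags read, flags written), and the function and variable names are hypothetical.

```python
def live_flags(blocks, succs):
    """Compute the set of flags live on entry to each basic block.

    blocks: name -> list of (uses, defs) pairs, one per instruction
    succs:  name -> list of successor block names
    """
    live_in = {b: set() for b in blocks}
    changed = True
    while changed:                         # iterate to a fixed point
        changed = False
        for b in blocks:
            # flags live at the end of b: union over all successors
            live = set().union(*(live_in[s] for s in succs[b]))
            # walk the block backwards: a write kills a flag, a read revives it
            for uses, defs in reversed(blocks[b]):
                live = (live - defs) | uses
            if live != live_in[b]:
                live_in[b], changed = live, True
    return live_in
```

If a block ending in an arithmetic instruction that writes CARRY and OVERFLOW is followed by a block that tests only CARRY, then OVERFLOW is dead at the end of the first block, and the translator need not generate code to simulate it.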
FIG. 10 shows the details of the TRANSLATE INSTRUCTIONS algorithm of the FIG. 1 process, represented by block 132, which performs the 68020 code generation. The first step (1204) is to get a BB from the chain of BBs arranged in increasing address order that is produced by step 804. Then, an instruction's encoded opcode, addressing mode, and attributes are read from the instruction list associated with the BB (1208). These encoded values are used as indices to select a short segment of translated 68020 binary code (often just one instruction) from a table (1212). The 68020 code is saved in a data structure (1216), then the process proceeds to convert the next instruction by performing steps 1208 through 1220 repeatedly until no more instructions remain in the BB's instruction list. At this point the 68020 instructions representing the BB are output to a file (1222). Then the process moves to the next BB and performs steps 1204 through 1224 until no BBs remain to be translated.
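The table lookup at the core of this stage can be sketched as follows. The table contents, the register mapping, and the use of assembly text in place of binary code segments are all assumptions made for readability; the actual tables and encodings are not reproduced in this passage, and instruction attributes such as live-dead data are left out of the lookup key.

```python
# Hypothetical translation table: (opcode, addressing mode) -> 68020 templates.
TABLE = {
    ("mov", "reg,reg"): ["move.w {src},{dst}"],
    ("add", "reg,reg"): ["add.w {src},{dst}"],
}

# Hypothetical mapping of 8086 registers to 68020 data registers.
REGMAP = {"ax": "d0", "bx": "d1"}

def translate_block(instruction_list):
    """Translate one BB's instruction list into 68020 assembly text."""
    out = []
    for opcode, mode, dst, src in instruction_list:   # step 1208
        for template in TABLE[(opcode, mode)]:        # table lookup (1212)
            out.append(template.format(src=REGMAP[src], dst=REGMAP[dst]))
    return out                                        # saved per BB (1216)
```

Because each table entry is a list, one source instruction may expand to several target instructions, which matches the description of a "short segment" of translated code.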
It is to be understood that the above-described embodiments and program implementations are only illustrative of the application of the principles of the present invention. Numerous modifications may be devised by those skilled in the art without departing from the spirit and scope of the invention.

Claims

WHAT IS CLAIMED IS:
1. A machine process for translating a first computer program in one binary machine language having one or more basic blocks, into a second computer program in another binary machine language, by use of a programmed digital computer, having stored in its internal memory a program enabling the computer to perform the following steps: a. disassembling one of said basic blocks of said first computer program; b. analyzing one basic block to produce global flow analysis data; c. continuing the steps of (a) and (b) until all of said basic blocks of said first computer program have been disassembled and analyzed; and d. generating said second computer program, using said global flow analysis data.
2. The process of Claim 1 wherein the disassembling step further comprises: a. disassembling the instructions of said one basic block continuously until a branch, call, or return instruction is reached; and b. preserving a representation of the opcode and addressing modes of each disassembled instruction.
3. The process of Claim 2 wherein said analyzing step further comprising: a. computing and saving the target address of the branch, call, or return instruction; and b. saving the next sequential address an unconditional branch instruction.
4. The process of Claim 3 wherein said continuing step further comprises: a. continuing disassembling at a saved address (either a target address of a branch, call, or return, or the next sequential address after a conditional branch); and b. performing the steps in Claims 2-3 repeatedly until either step (a) of this claim encounters only code that has already been disassembled, or the source program ends.
5. The process of Claim 4 wherein said generating step further comprises: a. sorting the sections of disassembled code into order by their address; and b. generating the binary machine language translation for each disassembled instruction.
6. The process of Claim 3 wherein the computing step further comprises: a. continuously computing and updating the currently known contents of the first computer program's stacks to produce "global flow analysis stack data"; and b. using said global flow analysis stack data to compute target addresses for branches, calls, and returns.
7. The process of Claim 3 wherein the computing step further comprises: a. continuously computing and updating the program's registers to produce "global flow analysis register data"; and b. using said global flow analysis register data to compute target addresses for branches, calls and returns.
8. The process of Claim 3 wherein the computing step further comprises: a. continuously computing and updating the currently known contents of the first computer program's key memory location to produce "global flow analysis memory data"; and b. using said global flow analysis memory data to compute target addresses for branches, calls and returns.
9. The process of Claim 3 wherein the computing step further comprises: a. using application specific data to compute branches, calls or return target addresses that cannot be computed by the method of Claims 6, 7, or 8.
10. A process as in Claim 9 wherein said application specific data is used to determine the contents of the stack, registers, or memory locations.
11. The process of Claim 5 wherein said global flow analysis data is used to compute whether source machine condition flags (e.g. , CARRY or OVERFLOW) are used by subsequent instructions ("live") or not used ("dead") , and this information is then used to generate optimized target code, by only generating instructions to preserve or simulate live condition flag values.
12. The process of Claim 11 wherein said global flow analysis data is used to compute whether source machine registers are used by subsequent instructions ("live") or not used ("dead") , and this information is then used to generate optimized code, by not preserving data in dead registers.
13. The process of Claim 5 wherein said global flow analysis data (stack, register, or memory) is used to compute whether source machine registers are used by subsequent instructions ("live") or not used ("dead"), and this information is then used to generate optimized code, by using the target machine equivalents of dead registers to hold temporary variables needed by the translated instructions.
14. The process as in Claim 5 wherein said global flow analysis data is used to compute whether data in memory of the source machine is referenced by two or more instructions that operate on different data type lengths (e.g., 16 bit quantities and 8 bit quantities), and this information is used to generate optimized code, in cases where the target machine language and the source machine language differ in the ordering of bytes within a half-word, and half-words within a word.
15. The process of Claim 5 wherein said global flow analysis data is used to compute whether data in memory of the source machine is aligned on addresses that are a multiple of the data length (e.g., 2 bytes or 4 bytes) . Then for those target machines that require such alignment, generating a single memory reference to access the data if the data is aligned, and generating multiple references only when the data is not aligned or when its alignment cannot be determined.
16. The process of Claim 7 wherein said global flow analysis register data is used to identify operating system calls embedded in the binary machine language, by virtue of the register contents current when the call is made.
17. A machine process for translating a first computer program in a computer assembly language having an entry point into second computer assembly language by use of a programmed digital computer having stored in its internal memory a program enabling the computer to perform the following steps: a. starting at said entry point, computing global flow analysis data (stack, register, and/or memory) continuously until the first program ends; b. using said global flow analysis data (stack, register, or memory) for computing whether the first program condition flags (e.g., CARRY or OVERFLOW) are used by subsequent instructions ("live") or not used ("dead") ; and c. generating optimized assembly code using this information, by only generating instructions to preserve or simulate live condition flag values.
18. The process as in Claim 17 wherein said step (a) further comprises: a. reading the first program in address order, b. starting at the entry point, computing global flow analysis data (stack, register, and/or memory) continuously until a branch, call or return instruction is reached, c. computing and saving the target address of the branch, call, or return instruction, d. saving the next sequential address an unconditional branch instruction, e. continuing computing global flow analysis data at a saved address (either a target address of a branch, call, or return, or the next sequential address after a conditional branch), f. performing steps (b) through (e) of this claim repeatedly until either step (e) encounters only code that has already been completely analyzed, or the source program ends.
19. A machine process for translating a first computer program in a computer assembly language having an entry point into second computer assembly language by use of a programmed digital computer having stored in its internal memory a program enabling the computer to perform the following steps: a. starting at said entry point, computing global flow analysis data (stack, register, and/or memory) continuously until the first program ends; b. using said global flow analysis data (stack, register, or memory) for computing whether the first program registers are used by subsequent instructions ("live") or not used ("dead") ; and c. generating optimized assembly code using this information, by not preserving data in dead registers.
20. A machine process for translating a first computer program in a computer assembly language having an entry point into second computer assembly language by use of a programmed digital computer having stored in its internal memory a program enabling the computer to perform the following steps: a. starting at said entry point, computing global flow analysis data (stack, register, and/or memory) continuously until the first program ends; b. using said global flow analysis data (stack, register, or memory) for computing whether the first program registers are used by subsequent instructions ("live") or not used ("dead") ; and c. generating optimized assembly code using this information, by using the second program's equivalents of dead registers to hold temporary variables needed by the translated instructions.
21. A machine process for translating a first computer program in a computer assembly language having an entry point into second computer assembly language by use of a programmed digital computer having stored in its internal memory a program enabling the computer to perform the following steps: a. starting at said entry point, computing global flow analysis data (stack, register, and/or memory) continuously until the first program ends; b. using said global flow analysis data (stack, register, or memory) for computing whether data in memory of the source machine is referenced by two or more instructions that operate on different data type lengths (e.g., 16 bit quantities and 8 bit quantities) ; and c. generating optimized code using this information, in cases where the second program's machine language and the first program language differ in the ordering of bytes within a half-word, and half-words within a word.
22. A machine process for translating a first computer program in a computer assembly language having an entry point into second computer assembly language by use of a programmed digital computer having stored in its internal memory a program enabling the computer to perform the following steps: a. starting at said entry point, computing global flow analysis data (stack, register, and/or memory) continuously until the first program ends; b. using said global flow analysis data (stack, register, or memory) for computing whether data in memory of the source machine is aligned on addresses that are a multiple of the data length (e.g., 2 bytes or 4 bytes); and c. generating a single memory reference to access the data if the data is aligned, and generating multiple references only when the data is not aligned or when its alignment cannot be determined for those second language that require such alignment.
PCT/US1989/002994 1988-07-29 1989-07-10 Machine process for translating programs in binary machine language into another binary machine language WO1990001738A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US22607888A 1988-07-29 1988-07-29
US226,078 1988-07-29

Publications (1)

Publication Number Publication Date
WO1990001738A1 (en) 1990-02-22

Family

ID=22847458

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1989/002994 WO1990001738A1 (en) 1988-07-29 1989-07-10 Machine process for translating programs in binary machine language into another binary machine language

Country Status (3)

Country Link
EP (1) EP0428560A4 (en)
JP (1) JPH04500132A (en)
WO (1) WO1990001738A1 (en)

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0372835A2 (en) * 1988-12-06 1990-06-13 AT&T Corp. Translation technique
WO1992015938A1 (en) * 1991-03-07 1992-09-17 Digital Equipment Corporation Branch resolution via backward symbolic execution
WO1992015937A1 (en) * 1991-03-07 1992-09-17 Digital Equipment Corporation Automatic flowgraph generation for program analysis and translation
WO1992015939A1 (en) * 1991-03-07 1992-09-17 Digital Equipment Corporation Method and apparatus for computer code processing in a code translator
WO1992015932A2 (en) * 1991-03-07 1992-09-17 Digital Equipment Corporation Cross-image referencing of program code
US5287490A (en) * 1991-03-07 1994-02-15 Digital Equipment Corporation Identifying plausible variable length machine code of selecting address in numerical sequence, decoding code strings, and following execution transfer paths
US5301325A (en) * 1991-03-07 1994-04-05 Digital Equipment Corporation Use of stack depth to identify architechture and calling standard dependencies in machine code
US5307492A (en) * 1991-03-07 1994-04-26 Digital Equipment Corporation Mapping assembly language argument list references in translating code for different machine architectures
US5339238A (en) * 1991-03-07 1994-08-16 Benson Thomas R Register usage tracking in translating code for different machine architectures by forward and reverse tracing through the program flow graph
US5432795A (en) * 1991-03-07 1995-07-11 Digital Equipment Corporation System for reporting errors of a translated program and using a boundry instruction bitmap to determine the corresponding instruction address in a source program
US5450575A (en) * 1991-03-07 1995-09-12 Digital Equipment Corporation Use of stack depth to identify machine code mistakes
US5548717A (en) * 1991-03-07 1996-08-20 Digital Equipment Corporation Software debugging system and method especially adapted for code debugging within a multi-architecture environment
US5598560A (en) * 1991-03-07 1997-01-28 Digital Equipment Corporation Tracking condition codes in translation code for different machine architectures
EP0703532A3 (en) * 1994-09-22 1997-01-29 Sun Microsystems Inc Embedded program flow information for object code manipulation
US5652869A (en) * 1991-03-07 1997-07-29 Digital Equipment Corporation System for executing and debugging multiple codes in a multi-architecture environment using jacketing means for jacketing the cross-domain calls
US5784552A (en) * 1993-07-28 1998-07-21 Digital Equipment Corporation Debugging a computer program by simulating execution forwards and backwards in a main history log and alternative history logs
EP0905617A2 (en) * 1997-09-30 1999-03-31 Sun Microsystems, Inc. Method for generating a java bytecode data flow graph
WO2000022519A1 (en) * 1998-10-14 2000-04-20 Alcatel Usa Sourcing, L.P. Assembly language translator
WO2000034861A1 (en) * 1998-12-11 2000-06-15 Incert Software Corporation A method for determining program control flow
WO2000045255A2 (en) * 1999-01-29 2000-08-03 Unisys Corporation Determining destinations of a dynamic branch
EP1118933A2 (en) * 2000-01-14 2001-07-25 International Business Machines Corporation Method, system, program, and data structures for transforming an instruction in a first bit architecture to an instruction in a second bit architecture
US6305010B2 (en) 1997-12-04 2001-10-16 Incert Software Corporation Test, protection, and repair through binary code augmentation
US6353924B1 (en) 1999-02-08 2002-03-05 Incert Software Corporation Method for back tracing program execution
US6745383B1 (en) 1999-12-29 2004-06-01 Veritas Operating Corporation Early warning mechanism for enhancing enterprise availability
US6748584B1 (en) 1999-12-29 2004-06-08 Veritas Operating Corporation Method for determining the degree to which changed code has been exercised
US6804814B1 (en) 1999-12-29 2004-10-12 Veritas Operating Corporation Method for simulating back program execution from a traceback sequence
US7246267B2 (en) * 2001-10-01 2007-07-17 Tektronix, Inc. Logic analyzer having a disassembler employing symbol table information for identifying op-codes
GB2448225A (en) * 2007-04-03 2008-10-08 Toshiba Kk Computer program for converting or translating code executable on a first processor to code executable on a second processor.
US7673293B2 (en) 2004-04-20 2010-03-02 Hewlett-Packard Development Company, L.P. Method and apparatus for generating code for scheduling the execution of binary code
CN101271398B (en) * 2007-03-23 2010-06-09 北京大学 Recognition method of multi-path branch structure
US7802299B2 (en) 2007-04-09 2010-09-21 Microsoft Corporation Binary function database system
CN103235724A (en) * 2013-05-10 2013-08-07 中国人民解放军信息工程大学 Atomic operation semantic description based integrated translation method for multisource binary codes
US8869109B2 (en) 2008-03-17 2014-10-21 Microsoft Corporation Disassembling an executable binary

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4667290A (en) * 1984-09-10 1987-05-19 501 Philon, Inc. Compilers using a universal intermediate language

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5313614A (en) * 1988-12-06 1994-05-17 At&T Bell Laboratories Method and apparatus for direct conversion of programs in object code form between different hardware architecture computer systems

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP0428560A4 *

Cited By (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0372835A2 (en) * 1988-12-06 1990-06-13 AT&T Corp. Translation technique
EP0372835A3 (en) * 1988-12-06 1992-04-01 AT&T Corp. Translation technique
WO1992015938A1 (en) * 1991-03-07 1992-09-17 Digital Equipment Corporation Branch resolution via backward symbolic execution
WO1992015937A1 (en) * 1991-03-07 1992-09-17 Digital Equipment Corporation Automatic flowgraph generation for program analysis and translation
WO1992015939A1 (en) * 1991-03-07 1992-09-17 Digital Equipment Corporation Method and apparatus for computer code processing in a code translator
WO1992015932A2 (en) * 1991-03-07 1992-09-17 Digital Equipment Corporation Cross-image referencing of program code
WO1992015932A3 (en) * 1991-03-07 1992-10-15 Digital Equipment Corp Cross-image referencing of program code
US5287490A (en) * 1991-03-07 1994-02-15 Digital Equipment Corporation Identifying plausible variable length machine code of selecting address in numerical sequence, decoding code strings, and following execution transfer paths
US5301325A (en) * 1991-03-07 1994-04-05 Digital Equipment Corporation Use of stack depth to identify architechture and calling standard dependencies in machine code
US5307492A (en) * 1991-03-07 1994-04-26 Digital Equipment Corporation Mapping assembly language argument list references in translating code for different machine architectures
US5339238A (en) * 1991-03-07 1994-08-16 Benson Thomas R Register usage tracking in translating code for different machine architectures by forward and reverse tracing through the program flow graph
AU653626B2 (en) * 1991-03-07 1994-10-06 Digital Equipment Corporation Cross-image referencing of program code
AU656577B2 (en) * 1991-03-07 1995-02-09 Digital Equipment Corporation Branch resolution via backward symbolic execution
AU656964B2 (en) * 1991-03-07 1995-02-23 Digital Equipment Corporation Automatic flowgraph generation for program analysis and translation
US5428786A (en) * 1991-03-07 1995-06-27 Digital Equipment Corporation Branch resolution via backward symbolic execution
US5432795A (en) * 1991-03-07 1995-07-11 Digital Equipment Corporation System for reporting errors of a translated program and using a boundry instruction bitmap to determine the corresponding instruction address in a source program
US5450575A (en) * 1991-03-07 1995-09-12 Digital Equipment Corporation Use of stack depth to identify machine code mistakes
US5507030A (en) * 1991-03-07 1996-04-09 Digitial Equipment Corporation Successive translation, execution and interpretation of computer program having code at unknown locations due to execution transfer instructions having computed destination addresses
US5548717A (en) * 1991-03-07 1996-08-20 Digital Equipment Corporation Software debugging system and method especially adapted for code debugging within a multi-architecture environment
EP0731410A1 (en) * 1991-03-07 1996-09-11 Digital Equipment Corporation Method and processing for computer code processing in a code translator
EP0731409A1 (en) * 1991-03-07 1996-09-11 Digital Equipment Corporation Method and apparatus for computer code processing in a code translator
EP0735464A1 (en) * 1991-03-07 1996-10-02 Digital Equipment Corporation Method and apparatus for computer code processing in a code translator
US5598560A (en) * 1991-03-07 1997-01-28 Digital Equipment Corporation Tracking condition codes in translation code for different machine architectures
US5649203A (en) * 1991-03-07 1997-07-15 Digital Equipment Corporation Translating, executing, and re-translating a computer program for finding and translating program code at unknown program addresses
US5652889A (en) * 1991-03-07 1997-07-29 Digital Equipment Corporation Alternate execution and interpretation of computer program having code at unknown locations due to transfer instructions having computed destination addresses
US5652869A (en) * 1991-03-07 1997-07-29 Digital Equipment Corporation System for executing and debugging multiple codes in a multi-architecture environment using jacketing means for jacketing the cross-domain calls
US5784552A (en) * 1993-07-28 1998-07-21 Digital Equipment Corporation Debugging a computer program by simulating execution forwards and backwards in a main history log and alternative history logs
EP0703532A3 (en) * 1994-09-22 1997-01-29 Sun Microsystems Inc Embedded program flow information for object code manipulation
US5926639A (en) * 1994-09-22 1999-07-20 Sun Microsystems, Inc. Embedded flow information for binary manipulation
EP0905617A2 (en) * 1997-09-30 1999-03-31 Sun Microsystems, Inc. Method for generating a java bytecode data flow graph
EP0905617A3 (en) * 1997-09-30 2002-08-14 Sun Microsystems, Inc. Method for generating a java bytecode data flow graph
US6305010B2 (en) 1997-12-04 2001-10-16 Incert Software Corporation Test, protection, and repair through binary code augmentation
WO2000022519A1 (en) * 1998-10-14 2000-04-20 Alcatel Usa Sourcing, L.P. Assembly language translator
WO2000034861A1 (en) * 1998-12-11 2000-06-15 Incert Software Corporation A method for determining program control flow
US6308321B1 (en) 1998-12-11 2001-10-23 Incert Software Corporation Method for determining program control flow
WO2000045255A2 (en) * 1999-01-29 2000-08-03 Unisys Corporation Determining destinations of a dynamic branch
WO2000045255A3 (en) * 1999-01-29 2001-02-15 Unisys Corp Determining destinations of a dynamic branch
US6662354B1 (en) 1999-01-29 2003-12-09 Unisys Corporation Determining destinations of a dynamic branch
US6353924B1 (en) 1999-02-08 2002-03-05 Incert Software Corporation Method for back tracing program execution
US7296261B2 (en) 1999-12-29 2007-11-13 Veritas Operating Corporation Method for determining the degree to which changed code has been exercised
US7823134B2 (en) 1999-12-29 2010-10-26 Symantec Operating Corporation Early warning mechanism for enhancing enterprise availability
US6745383B1 (en) 1999-12-29 2004-06-01 Veritas Operating Corporation Early warning mechanism for enhancing enterprise availability
US6748584B1 (en) 1999-12-29 2004-06-08 Veritas Operating Corporation Method for determining the degree to which changed code has been exercised
US6804814B1 (en) 1999-12-29 2004-10-12 Veritas Operating Corporation Method for simulating back program execution from a traceback sequence
EP1118933A2 (en) * 2000-01-14 2001-07-25 International Business Machines Corporation Method, system, program, and data structures for transforming an instruction in a first bit architecture to an instruction in a second bit architecture
EP1118933A3 (en) * 2000-01-14 2004-04-07 International Business Machines Corporation Method, system, program, and data structures for transforming an instruction in a first bit architecture to an instruction in a second bit architecture
US7246267B2 (en) * 2001-10-01 2007-07-17 Tektronix, Inc. Logic analyzer having a disassembler employing symbol table information for identifying op-codes
US7673293B2 (en) 2004-04-20 2010-03-02 Hewlett-Packard Development Company, L.P. Method and apparatus for generating code for scheduling the execution of binary code
CN101271398B (en) * 2007-03-23 2010-06-09 北京大学 Recognition method of multi-path branch structure
GB2448225A (en) * 2007-04-03 2008-10-08 Toshiba Kk Computer program for converting or translating code executable on a first processor to code executable on a second processor.
US7802299B2 (en) 2007-04-09 2010-09-21 Microsoft Corporation Binary function database system
US8869109B2 (en) 2008-03-17 2014-10-21 Microsoft Corporation Disassembling an executable binary
CN103235724A (en) * 2013-05-10 2013-08-07 中国人民解放军信息工程大学 Atomic operation semantic description based integrated translation method for multisource binary codes

Also Published As

Publication number Publication date
EP0428560A1 (en) 1991-05-29
JPH04500132A (en) 1992-01-09
EP0428560A4 (en) 1992-04-01

Legal Events

Date Code Title Description
AK Designated states Kind code of ref document: A1; Designated state(s): JP
AL Designated countries for regional patents Kind code of ref document: A1; Designated state(s): AT BE CH DE FR GB IT LU NL SE
WWE WIPO information: entry into national phase Ref document number: 1989908667; Country of ref document: EP
WWP WIPO information: published in national office Ref document number: 1989908667; Country of ref document: EP
WWW WIPO information: withdrawn in national office Ref document number: 1989908667; Country of ref document: EP