WO1999004337A1 - Code measure tool - Google Patents

Code measure tool

Info

Publication number
WO1999004337A1
Authority
WO
WIPO (PCT)
Prior art keywords
software
rules
maintainability
rule
code
Prior art date
Application number
PCT/GB1998/002050
Other languages
French (fr)
Inventor
Gerald Anthony Robinson
Original Assignee
British Telecommunications Public Limited Company
Priority date
Filing date
Publication date
Application filed by British Telecommunications Public Limited Company filed Critical British Telecommunications Public Limited Company
Priority to AU82352/98A priority Critical patent/AU8235298A/en
Publication of WO1999004337A1 publication Critical patent/WO1999004337A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3604 Software analysis for verifying properties of programs
    • G06F11/3616 Software analysis for verifying properties of programs using software metrics

Definitions

  • CMT can then be used to make change predictions of other source code.
  • the system is, as shown in Figure 2, provided with a graphical user interface.
  • the file change prediction results for the set of source files measured are presented in a "File Change Predictions" window, as shown in Figure 4.
  • the window is composed of three scrolled lists 25, 26 and 27.
  • the top most list 25, which is red, contains all the files which have been given a FAIL rating, i.e. those files which are most likely to change in the future.
  • the green bottom most list 27 contains all of the files that have been given a PASS rating, i.e. those files which are least likely to change in the future.
  • the yellow Reasonable Maintainability list 31 shows files scoring between 65 and 85. These files are likely to be quite maintainable, though they may have some difficult areas.
  • the green Good Maintainability list 32 shows files scoring over 85. These files are likely to be easy to maintain.
  • the prediction of likelihood of change can be used together with the maintainability measure to identify code that would be a good candidate for improvement. For example if a particular source file is likely to change a lot in the future and it is unmaintainable then it would be a good investment to improve the quality of the code so that future changes are easier to perform and involve less risk.
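The combination described in the last bullet, flagging files that are both likely to change and poorly maintainable, might be sketched as follows. The function name and input shapes are illustrative; 65 is the HP poor-maintainability boundary given earlier in the text:

```python
def improvement_candidates(predictions, maintainability):
    """Files that are both likely to change (a FAIL change prediction)
    and poorly maintainable (HP index below 65) are the best candidates
    for quality improvement investment."""
    return sorted(
        f for f in predictions
        if predictions[f] == "FAIL" and maintainability[f] < 65
    )
```

A file that fails the change prediction but scores well on maintainability would not be selected, since future changes to it are expected to be low risk.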

Abstract

The present invention concerns a code measurement system and method for determining the maintainability of software. The system comprises first means for deriving from a fault record of known software and a metric database of the known software a set of rules, and second means for comparing a metric database of second software not having a fault record to derive a signal indicating the maintainability of the second software.

Description

CODE MEASURE TOOL
The present invention is concerned with computer systems and in particular with a system for analysing computer software so as to establish potential problems in the software. There is now an immense quantity of software code which has been written in languages such as C or Cobol, and the amount of computer code in current use is increasing at a very substantial rate.
It is thus very important that users have the capability of identifying where maintenance or reliability problems are likely to occur, not only in existing code but also in code which has been written to interlink with existing code and in entirely new code. Such information is useful both when a user is concerned with establishing the quality of new code after it has been commissioned or generated in-house, and when assessing the demands of maintaining existing code. It can also be used at intervals to determine what has happened during the working life of software and whether the quality of the software has improved or not.
Thus the present invention is particularly concerned with a management system capable of identifying or predicting where in computer systems problems are likely to occur, and additionally with identifying code for updating or investment and assessing the impact of changing code.
As a result of an appreciation of these problems relating to software code quality a number of metrics have been devised by means of which a measure can be given to the reliability of a particular set of software. These metrics exist at three levels. The lowest of these levels is known as the function level, and these metrics deal with functions within a software file. The next level deals with files within a software system, and the third level is that which deals with system metrics. There are substantial problems in obtaining sufficient data to carry out software analysis at the file-within-a-system level, and even more so at the system level, so that the present specification is mainly concerned with metrics at the function level, although it will be appreciated that a similar approach and a similar system could be devised for operating at the file and system metric levels.
An example of file level metrics which are already known are shown in the tables marked "Annex A" at the end of this specification.
A known maintainability metric is that called the Hewlett Packard (HP) maintainability index. The HP index is calculated on a file and system level and is mainly a size measure and has been validated by industrial use. In accordance with the HP index source files scoring less than 65 are considered to have poor maintainability, and files scoring over 85 good maintainability. Thus it is already known to provide systems which use rules which when applied to software parameters give an indication of the maintainability of code. However, as will be described later, these rules are often contradictory.
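The HP index banding just described can be expressed as a small classifier. A minimal sketch, assuming the index value has already been computed; the function name and band labels are illustrative, not from the patent:

```python
def hp_maintainability_band(hp_index: float) -> str:
    """Classify a source file by its HP maintainability index.

    Thresholds follow the text: below 65 indicates poor maintainability,
    above 85 good maintainability, and scores in between indicate
    reasonable maintainability.
    """
    if hp_index < 65:
        return "poor"
    if hp_index > 85:
        return "good"
    return "reasonable"
```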
A concern of the present invention is to provide a system which can reliably assess the maintainability of computer software, and in particular at the file level.
In accordance with one aspect of the present invention there is provided a code measurement system for determining the maintainability of software, comprising first means for deriving from a fault record of known software and a metric database of the known software a set of rules, and second means for comparing a metric database of second software not having a fault record to derive a signal indicating the maintainability of the second software.
In accordance with a second aspect of the present invention there is provided a method of measuring code for determining the maintainability of software, the method comprising a step of deriving from a fault record of known software and a metric database of the known software a set of rules, and a step of comparing a metric database of second software not having a fault record to derive a signal indicating the maintainability of the second software.
In order that the present invention may be more readily understood an embodiment thereof will now be described, by way of example, with reference to the accompanying drawings.
Figure 1 shows a general overview of a known system;
Figures 2A and 2B show a general overview of a system in accordance with the present invention;
Figure 3 is a flow diagram showing the operation of the system of Figure 2;
Figures 4 and 5 show typical graphical displays.
In accordance with the present invention a prediction system using rules acquired by data mining techniques builds on knowledge of changes in the past in order to carry out the necessary prediction of quality or maintainability. In such a system it is possible to define a maintainability index for the code under consideration together with the amount it will have to be changed. These two parameters indicate the impact of potential developments. However, the very nature of the data mining techniques and the rules identified by these techniques means that the parameters predicting possible problems have to be expressed as probabilities, and frequently one rule when applied will lead to one set of probabilities whilst another rule will lead to a set of contradicting probabilities. Thus the simple application of a set of rules is insufficient to give an unambiguous prediction.
Referring now to Figure 1 of the drawings, this shows a general purpose computer 1 which acts as a measurement tool with regard to a set of files 2 which together comprise the software of a complex computer system. In the present embodiment the files 2 are coded in C, but of course they can exist in any suitable language, such as COBOL.
In the arrangement of Figure 1 a set of change records 3 is generated for each of the files in the set of files 2. These can be generated by computer 1 or by another processor (not shown). The results of such an analysis of the files of C code are shown in Table A, where the file names are shown on the left and the measures taken are shown under the measures heading, together with the number of detected changes and a pass or fail criterion. The criterion threshold can be set arbitrarily but in general will be set so as to catch the worst 25% of the files in the system. In this context worst means most heavily changed. The results of the analysis will be referred to as a metric database, which is indicated in Figure 1 at 20.
TABLE A

FILE NAME   MEASURES                       NO. OF CHANGES   PASS OR FAIL
            M1     M2     M3   ...  Mn
ABC.C       21     3.4    29.2              5                F
ABD.C       16     0.2    8.3               2                P
ABE.C       21     29.0   72.0              0                P
ABF.C
XXX.C       16     7.0    16.0              32               F
It will be appreciated that this Table A is the result of investigating a known software system and the present invention is concerned with providing a prediction with regard to the software files which constitute an unknown system.
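The worst-25% criterion described above amounts to ranking the known files by change count and failing the top quartile. A minimal sketch, with illustrative names (the patent does not prescribe an implementation):

```python
def label_pass_fail(change_counts, worst_fraction=0.25):
    """Mark the most heavily changed fraction of files 'F', the rest 'P'.

    Mirrors the criterion in the text: the threshold is set to catch
    the worst (most heavily changed) 25% of files in the known system.
    """
    ranked = sorted(change_counts, key=change_counts.get, reverse=True)
    n_fail = max(1, round(len(ranked) * worst_fraction))
    return {name: ("F" if name in ranked[:n_fail] else "P") for name in ranked}
```

Applied to the change counts of Table A, the single most heavily changed file of the four with data (XXX.C, 32 changes) would be failed.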
Figure 2A shows part of a system in accordance with an aspect of the present invention. In Figure 2A data mining software is used in a computer 1 to analyse the metric database 20 to produce a set 21 of appropriate rules, which will be defined in greater detail hereinafter. In Figure 2B the number 10 indicates a set of new files of C code, the maintainability of which is to be ascertained, and which are measured by the measurement tools 11 in the same way as the known files were measured in Figure 1, so as to generate a metric database 12 similar to the already described Table A. The set of data mining rules 21 derived from a known set of files as described with respect to Figure 2A are then applied to the metric database 12 using a general purpose computer 22 to generate predictions with regard to each of the files in the newly generated table. Naturally the various steps in the process of generating the metric database and the set of rules can be carried out by the same computer. The predictions can be displayed on the screen 23 of the computer or stored at 24 for subsequent analysis.
A rule will be generated by taking a combination of two or more of the measures shown in Table A together with the pass or fail criterion so as to generate at least two inequalities, so that the rule applies when the two or more inequalities are satisfied by the measures of the unknown file. In a simple example a rule can, using the measures shown in Table A, consist of the two inequalities 2 < Nm < 17 and 16 < M2. If these inequalities are satisfied the rule will indicate that the file is a fail. The confidence factor associated with the rule will be dependent on the number of records in the database or source data to which the rule applies.
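A mined rule of this kind, a conjunction of inequalities on the measures with an attached verdict and confidence factor, might be represented as follows. The representation (a closure over bound tuples) and all names are illustrative, not taken from the patent:

```python
def make_rule(conditions, verdict, confidence):
    """Build a mined rule as a conjunction of inequalities on measures.

    conditions is a list of (measure_name, low, high) bounds; either
    bound may be None for a one-sided inequality.
    """
    def applies(measures):
        for name, low, high in conditions:
            value = measures[name]
            if low is not None and not low < value:
                return False
            if high is not None and not value < high:
                return False
        return True
    return applies, verdict, confidence

# The example rule from the text: 2 < Nm < 17 and 16 < M2 implies fail.
applies, verdict, cf = make_rule([("Nm", 2, 17), ("M2", 16, None)], "FAIL", 0.5)
```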
The overall basic procedure for deriving maintainability values for the files of an unknown software system is shown in the flowchart of Figure 3.
Step S10 of this flowchart represents the files of a known software system. At step S11 a record is made, as described with regard to Figure 1, of the file changes in the software system over a period of time. At step S12 a metric database is set up in which selected key parameters of the known code are extracted. At step S13 the file change record and the metric database are combined, and at step S14 data mining is applied to the combined metric database so as to generate a set of rules such as the rule previously discussed. Once the data mining has established the rules, as shown at step S15, the unknown code, that is the code the maintainability or quality of which is to be assessed, has a metric database generated from it which corresponds to the metric database of step S12. Naturally, the code from which this second metric database is generated will not have a fault record. The measurement of the unknown code is shown at step S16 and the generation of the metric database is shown at step S17. At step S18 the rules derived by the data mining step S14 are applied in the prediction system shown in Figure 2 so as to generate and display, or otherwise make available, a series of predictions from which the maintainability or quality of the unknown code can be assessed. This is shown at step S19. There is, of course, no absolute necessity for the measurement of the known code to be carried out before the measurement of the unknown code, with the proviso that it has to be done before step S18 can be carried out.
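The Figure 3 flow can be sketched end to end. The mining step below is a deliberately simple stand-in (a single-threshold rule on one measure, chosen to best separate pass from fail files); the actual system derives many rules with a data mining engine, and all names and data here are illustrative:

```python
def mine_threshold_rule(metric_db, labels, measure="M1"):
    """Steps S13-S14 stand-in: derive one threshold rule from labelled metrics."""
    best = None
    for t in sorted({m[measure] for m in metric_db.values()}):
        correct = sum(
            1 for f, m in metric_db.items()
            if (labels[f] == "F") == (m[measure] >= t)
        )
        if best is None or correct > best[1]:
            best = (t, correct)
    threshold = best[0]
    return lambda m: "F" if m[measure] >= threshold else "P"

# Steps S10-S12: measurements and change-derived labels of known files.
known = {"ABC.C": {"M1": 21}, "ABD.C": {"M1": 16}, "XXX.C": {"M1": 30}}
labels = {"ABC.C": "F", "ABD.C": "P", "XXX.C": "F"}
rule = mine_threshold_rule(known, labels)

# Steps S16-S19: measure the unknown files and apply the mined rule.
unknown = {"NEW.C": {"M1": 25}}
predictions = {f: rule(m) for f, m in unknown.items()}
```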
The calculations just described will in total give a reasonable prediction as to the maintainability of a file. However the nature of the techniques employed will tend to provide a bias which should be compensated for. Thus if the rules used have been generated from heavily maintained code, that is code which has already been changed a lot, then the prediction will be biased in favour of failing files.
The result of the examination of a new file on the basis of the rules is shown in Table B.

TABLE B
PASS RULES TRIGGERED          FAIL RULES TRIGGERED
RULE 1    CF 87%              RULE 12   CF 50%
RULE 13   CF 16%              RULE 3    CF 26%
RULE 15   CF 54%              RULE 18   CF 14%
RULE 28   CF 32%              RULE 74   CF 12%
RULE 96   CF 24%              RULE 200  CF 65%
RULE 134  CF 16%
This table shows a selection of rules under which the file passed and a selection of those rules under which the file failed. Naturally many more rules are examined in actual operation of the system. The table also shows the confidence factor (CF), expressed as a percentage, with which each prediction of pass or fail has been given. This confidence factor is an estimate of the probability that an individual rule will be correct in relation to any individual case. However, Table B does not give a clear indication as to whether the new file is a pass or a fail, because under some rules the file is a pass and under others it is a fail. Additionally the confidence factors vary for each of the predicted results. It is thus necessary to make a further judgement from Table B, taking into account these varying factors.
A number of different methods have been proposed to rationalise the results as expressed in Table B so as to reach a conclusion about the maintainability of the file.
One simple method is to compare the number of pass rules with the number of fail rules, so that the file passes if the number of pass rules is greater than the number of fail rules. However, because of the presence of the confidence factors, it may well be that the system predicted fails with more confidence than it predicted passes. Thus a second method would be to sum the confidence factors (CFs) of the passes and compare this with the sum of the CFs of the fails. In the given table the sum of the CFs for pass is 229% and for fail 167%; on this calculation the file is a pass. Yet another approach is to take into account only those rules where the CF is greater than the chance value.
Yet another method is simply to state that the best rule wins. The highest CF is for rule 1, and using this method the file passes. This procedure can be modified by adjusting the best rule to take into account the fact that, as a greater percentage (70%) of the files in the source data were pass files, it is statistically more likely that the best rule would be for a pass file. On this basis the new file is a fail, as 87 - 70 = 17 and 65 - 30 = 35. However, none of these approaches provides a reliable outcome as to the likelihood of the file passing or failing.
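The three arbitration methods just discussed can be checked directly against the Table B figures:

```python
# Confidence factors of the triggered rules, taken from Table B (in %).
pass_cfs = [87, 16, 54, 32, 24, 16]
fail_cfs = [50, 26, 14, 12, 65]

# Method 1: majority of triggered rules, 6 pass rules against 5 fail rules.
majority = "PASS" if len(pass_cfs) > len(fail_cfs) else "FAIL"

# Method 2: compare summed confidence factors, 229 against 167, a pass.
cf_sum = "PASS" if sum(pass_cfs) > sum(fail_cfs) else "FAIL"

# Method 3: best rule wins, adjusted by the prior mix of the source data
# (70% pass files, 30% fail files): 87 - 70 = 17 against 65 - 30 = 35.
best_pass = max(pass_cfs) - 70
best_fail = max(fail_cfs) - 30
adjusted_best = "PASS" if best_pass > best_fail else "FAIL"
```

The three methods disagree (pass, pass, fail), which is exactly the unreliability the text goes on to address.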
In accordance with another aspect of the present invention the system is adapted to carry out a calculation using the following factors:
CFx = Confidence factor associated with Rule x.
tp = Number of pass rules triggered.
tf = Number of fail rules triggered.
p = % of pass files in "KNOWN".
f = % of fail files in "KNOWN".
m = Number of pass rules in the rule database.
n = Number of fail rules in the rule database.
[Four intermediate calculations, reproduced in the original only as figure images, are derived from the factors above.]
As a result of these four calculations, pass and fail scores are determined as follows:
PASS SCORE = (ADJUSTMENT_pass + S_p/tp - S_f/tf) / m

FAIL SCORE = (ADJUSTMENT_fail + S_f/tf - S_p/tp) / n

(where S_p and S_f denote the pass and fail quantities produced by the four calculations above)
The adjustment factors are based on the software system from which the rules were derived: on the known history of the software system about which the prediction is to be made, and also on the required sensitivity level of the prediction tool. The two adjustment factors should always sum to 100. The default values used for the two adjustment factors are 50 and 50. These values would be suitable if the system being analysed is newly developed code and if it were equally important to avoid false fails and false passes.
If the system which is being analysed has already been heavily maintained then the prediction technique will generally produce more false fails and fewer false passes. This can be prevented by increasing the pass adjustment factor (to 60) and reducing the fail adjustment factor (to 40).
If the intention is to minimize the number of false passes, this can be achieved by reducing the pass adjustment factor and increasing the fail adjustment factor.
If the pass score is 10% or more greater than the fail score, the file is rated PASS; otherwise it is rated FAIL.
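The defining equations for the scores are not fully legible in this copy, so the sketch below rests on an assumption: that each score is its adjustment factor plus the mean CF of the triggered rules for that verdict, minus the mean CF of those against, divided by the number of such rules (m or n) in the rule database. The function names are illustrative:

```python
def score_file(pass_cfs, fail_cfs, m, n, adj_pass=50, adj_fail=50):
    """Adjusted pass and fail scores for one file.

    ASSUMED form (the original equations appear only as images):
    (adjustment + own mean CF - opposing mean CF) / rules-in-database.
    m and n are the numbers of pass and fail rules in the rule database.
    """
    tp, tf = len(pass_cfs), len(fail_cfs)
    mean_pass = sum(pass_cfs) / tp if tp else 0.0
    mean_fail = sum(fail_cfs) / tf if tf else 0.0
    pass_score = (adj_pass + mean_pass - mean_fail) / m
    fail_score = (adj_fail + mean_fail - mean_pass) / n
    return pass_score, fail_score

def rate(pass_score, fail_score):
    """The decision rule from the text: PASS only if the pass score is
    10% or more greater than the fail score."""
    return "PASS" if pass_score >= 1.1 * fail_score else "FAIL"
```

The default adjustment factors of 50 and 50 can be shifted (for example to 60/40 for heavily maintained systems) exactly as described in the text.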
TEST RESULTS
[Test-result tables, reproduced in the original only as figure images.]
EFFECTS OF ADJUSTMENTS
[Table of adjustment effects, reproduced in the original only as a figure image.]
The expert system just described can also be used in combination with a neural network to evaluate software. The resulting system is known as a Code Measurement Toolkit (CMT). This provides an integrated environment for the code analysis and maintainability assessment of C and COBOL code. CMT can be used to:
  • Predict the likelihood of future change to a file.
  • Use two Maintainability Indices to assess the maintainability of code.
  • Use a comprehensive set of industry leading software metrics, including data flow measures, to assess the complexity, testability and readability of code.
  • Export a set of metrics for use by the Code Monitor successive release monitoring tool.
CMT consists of three main components:
1. Code measurement.
2. File change prediction capability.
3. Graphical user interface.
The Code Measurement Component utilises X-RAY, a code parsing and metrics tool developed by South Bank University, as its code parser. Further code metrics are calculated from X-RAY's output using the Qualms tool and other appropriate algorithms. In total 17 system level, 58 source file level and 44 function level measures are made (function in this context is taken as being synonymous with perform and procedure). These metrics include measurements of the following aspects of source code:
size
processing complexity
data complexity
information flow between source code files and between functions
testability
maintainability
Some of the file level measures appear in a number of statistically derived forms: max, mean, weighted mean, and average density (the average density of a metric for a portion of source code is calculated by dividing the metric value by the number of non-comment, non-blank lines of code).
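As an illustration of the average-density form, a minimal sketch (the line-counting here recognises only "//" and "#" line comments; the real parser handles each language's full comment syntax, and the function names are illustrative):

```python
def non_comment_loc(source: str) -> int:
    """Count non-comment, non-blank lines.  Only '//' and '#' line
    comments are recognised here; the real code parser does far more."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith(("//", "#"))
    )

def average_density(metric_value: float, source: str) -> float:
    """Average density of a metric = metric value divided by the
    number of non-comment, non-blank lines of code."""
    return metric_value / non_comment_loc(source)
```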
The range of measures made is very broad, which enables a very good understanding of various attributes of the code to be obtained. Current commercially available tools do not provide such a broad range of measures.
The File Change Prediction Capability, which predicts the likelihood of a file changing in the future, is based on the analysis of the measures produced by CMT. Two expert systems are used to analyse the measurements independently: a neural net and a rules-based expert system. The results from the two expert systems are then compared to allow a judgement to be made.
To enable the two expert systems to analyse the code measurements, they first have to be trained on a set of source code which has a known change history. The source code is measured, and the measurements and change history are analysed by a data mining engine, to extract the rules for the rules-based expert system, and by a neural network retraining tool. In one embodiment the code and change history of release 32 of four sub-systems were used for training CMT. The sub-systems were of varying size and contained in total approximately 1 million lines of source code. The file change predictions were validated by checking the predictions of change against the amount of change that had already occurred to the code used for training; the predictions were found to be more than 90% accurate. The prediction accuracy was also tested with a C source code system of some 90 source files, for which the source code for the first and each subsequent release was available. The source code for the first release was measured. The predictions of change showed all the files as "Least Likely to Change"; this was compared with the number of times that each file had actually been changed. The source code was of generally good quality and the predictions made by CMT were correct.
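The rule-extraction step can be illustrated with a toy stand-in for the data mining engine. This is a sketch under assumptions: the actual engine and the form of its rules are not detailed here, so the single-threshold rule shape, the min_support cut-off and the function name are invented for illustration. It does, however, show how a rule's confidence factor can be derived from the records that support it:

```python
def mine_threshold_rules(records, min_support=5):
    """Toy stand-in for the data mining engine: for each metric, try a
    threshold at every observed value and keep the rule 'metric > t
    implies likely to change' with the highest training confidence.
    records is a list of (metrics_dict, changed_bool) pairs;
    min_support discards rules backed by too few files."""
    rules = []
    for m in records[0][0]:
        best = None
        for t in sorted({mets[m] for mets, _ in records}):
            # Files whose metric exceeds the candidate threshold.
            hits = [changed for mets, changed in records if mets[m] > t]
            if len(hits) < min_support:
                continue
            cf = 100.0 * sum(hits) / len(hits)  # confidence factor (%)
            if best is None or cf > best[1]:
                best = (t, cf, len(hits))
        if best is not None:
            rules.append({"metric": m, "threshold": best[0],
                          "confidence": best[1], "support": best[2]})
    return rules
```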
Once training is complete CMT can then be used to make change predictions for other source code. The system is, as shown in Figure 2, provided with a graphical user interface. The file change prediction results for the set of source files measured are presented in a "File Change Predictions" window, as shown in Figure 4. The window is composed of three scrolled lists 20, 21 and 22.
The topmost list 25, which is red, contains all the files that have been given a FAIL rating, i.e. those files which are most likely to change in the future.
The middle list 26, which is amber, contains all of the files which have not been given a rating. This occurs if the two expert systems give conflicting results for a file, e.g. the neural net passes a file which the rules-based expert system fails.
The bottommost list 27, which is green, contains all of the files that have been given a PASS rating, i.e. those files which are least likely to change in the future.
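The three-way listing above reduces to a small combination rule, sketched here (function and label names are illustrative):

```python
def classify(neural_verdict: str, rules_verdict: str) -> str:
    """Place a file in the red/amber/green lists from the verdicts of
    the two expert systems."""
    if neural_verdict == rules_verdict == "FAIL":
        return "red"    # both agree: most likely to change
    if neural_verdict == rules_verdict == "PASS":
        return "green"  # both agree: least likely to change
    return "amber"      # the two expert systems disagree: no rating
```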
Naturally, colour coding is only available on a colour display.
Code maintainability is measured using the Hewlett-Packard Maintainability Index. The results are presented in a window of similar format to the file change predictions; an example is given in Figure 5. In Figure 5 the red Poor Maintainability list shown at 30 shows files scoring less than 65. These files are very likely to be difficult to maintain in the future.
The yellow Reasonable Maintainability list 31 shows files scoring between 65 and 85. These files are likely to be quite maintainable, though they may have some difficult areas.
The green Good Maintainability list 32 shows files scoring over 85. These files are likely to be easy to maintain.
The prediction of likelihood of change can be used together with the maintainability measure to identify code that would be a good candidate for improvement. For example, if a particular source file is likely to change a lot in the future and it is unmaintainable, then it would be a good investment to improve the quality of the code so that future changes are easier to perform and involve less risk.
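Banding files by the index and cross-referencing the bands with the change predictions can be sketched as follows (the band boundaries are taken from the text above; the function names and tuple format are assumptions):

```python
def mi_band(score: float) -> str:
    """Hewlett-Packard Maintainability Index bands: below 65 is poor,
    65 to 85 reasonable, over 85 good."""
    if score < 65:
        return "poor"
    if score <= 85:
        return "reasonable"
    return "good"

def improvement_candidates(files):
    """files: iterable of (name, change_prediction, mi_score) tuples.
    A file that is both likely to change (FAIL) and poorly maintainable
    is a good candidate for investment in code quality."""
    return [name for name, prediction, mi in files
            if prediction == "FAIL" and mi_band(mi) == "poor"]
```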

Claims

1. A code measurement system for determining the maintainability of software, comprising first means for deriving, from a fault record of known software and a metric database of the known software, a set of rules, and second means for comparing a metric database of second software not having a fault record with said set of rules to derive a signal indicating the maintainability of the second software.
2. A system according to claim 1, wherein said second means are adapted for each file of the second software to generate and display a series of indications as to whether the file passes or fails a maintainability criterion.
3. A system according to claim 2, wherein each indication includes a confidence factor as to the correctness of the pass or fail indication by the rule.
4. A system according to claim 3, wherein the confidence factor associated with the rule is determined by the number of records in the source data from which the rule was generated.
5. A system according to any one of claims 2 to 4 and including means for determining from said plurality of indications a maintainability criterion based on the following calculations:
Zp = Σ_{x=1}^{tp} √(CF_x² − p²)

Yp = Σ_{x=1}^{tf} √((100 − CF_x)² − (100 − p)²)

Zf = Σ_{x=1}^{tf} √(CF_x² − f²)

Yf = Σ_{x=1}^{tp} √((100 − CF_x)² − (100 − f)²)
wherein:
CF_x = Confidence factor associated with rule x.
tp = Number of pass rules triggered.
tf = Number of fail rules triggered.
p = % of pass files in "KNOWN".
f = % of fail files in "KNOWN".
m = Number of pass rules in the rule database.
n = Number of fail rules in the rule database.
6. A method of measuring code for determining the maintainability of software, comprising a step of deriving, from a fault record of known software and a metric database of the known software, a set of rules, and a step of comparing a metric database of second software not having a fault record with said set of rules to derive a signal indicating the maintainability of the second software.
7. A method according to claim 6, wherein in said second step there is generated and displayed for each file of the second software a series of indications as to whether the file passes or fails a maintainability criterion.
8. A method according to claim 7, wherein each indication includes a confidence factor as to the correctness of the pass or fail indication by the rule.
9. A method according to claim 8, wherein the confidence factor associated with the rule is determined by the number of records in the source data from which the rule was generated.
10. A method according to any one of claims 7 to 9 and including the step of determining from said plurality of indications a maintainability criterion based on the following calculations:
Zp = Σ_{x=1}^{tp} √(CF_x² − p²)

Yp = Σ_{x=1}^{tf} √((100 − CF_x)² − (100 − p)²)

Zf = Σ_{x=1}^{tf} √(CF_x² − f²)

Yf = Σ_{x=1}^{tp} √((100 − CF_x)² − (100 − f)²)
wherein:
CF_x = Confidence factor associated with rule x.
tp = Number of pass rules triggered.
tf = Number of fail rules triggered.
p = % of pass files in "KNOWN".
f = % of fail files in "KNOWN".
m = Number of pass rules in the rule database.
n = Number of fail rules in the rule database.
PCT/GB1998/002050 1997-07-17 1998-07-13 Code measure tool WO1999004337A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU82352/98A AU8235298A (en) 1997-07-17 1998-07-13 Code measure tool

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB9715061A GB9715061D0 (en) 1997-07-17 1997-07-17 Code measure tool
GB9715061.9 1997-07-17

Publications (1)

Publication Number Publication Date
WO1999004337A1 true WO1999004337A1 (en) 1999-01-28

Family

ID=10816002

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB1998/002050 WO1999004337A1 (en) 1997-07-17 1998-07-13 Code measure tool

Country Status (3)

Country Link
AU (1) AU8235298A (en)
GB (1) GB9715061D0 (en)
WO (1) WO1999004337A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6535870B1 (en) * 2000-02-09 2003-03-18 International Business Machines Corporation Method of estimating an amount of changed data over plurality of intervals of time measurements
EP2007070A1 (en) * 2007-06-18 2008-12-24 Avaya GmbH & Co. KG Method for displaying process information for a data processing facility and data processing system
US7725881B2 (en) 2006-06-09 2010-05-25 Microsoft Corporation Automatically extracting coupling metrics from compiled code
RU2643045C2 (en) * 2013-02-01 2018-01-30 Кембридж Консалтантс Лимитед Foam supply device

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997002528A1 (en) * 1995-07-06 1997-01-23 Bell Communications Research, Inc. Method and system for an architecture based analysis of software quality


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PAUL OMAN ET AL.: "Metrics for Assessing a Software System's Maintainability", PROCEEDINGS OF THE CONFERENCE ON SOFTWARE MAINTENANCE 1992, 9 November 1992 (1992-11-09) - 12 November 1992 (1992-11-12), ORLANDO, FLORIDA, pages 337 - 340, XP000366358 *


Also Published As

Publication number Publication date
GB9715061D0 (en) 1997-09-24
AU8235298A (en) 1999-02-10


Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 09125757

Country of ref document: US

AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref country code: JP

Ref document number: 1999506706

Format of ref document f/p: F

NENP Non-entry into the national phase

Ref country code: CA

122 Ep: pct application non-entry in european phase