Developing Techniques for Using Software Documents: A Series of Empirical Studies
Ph.D. Thesis Proposal
Forrest J. Shull
University of Maryland
Abstract
This proposal presents an empirical method for developing “reading techniques” that give effective,
procedural guidance to software practitioners. Each reading technique is tailored so that it can be used to
accomplish a specific software-related task (e.g. defect detection) using a specific kind of software
document (e.g. requirements). This empirical method can be followed to create and then continually
improve reading techniques, since it explicitly builds and tests underlying models of how developers use a
particular software document in a particular environment. That is, our approach for developing reading
techniques checks the underlying assumptions about:
· Specific tasks for which the document will be used
· Aspects of the document that must be understood to support those tasks
· Aspects of the environment affecting tasks using the document
This empirical approach avoids the mistake of applying assumptions that are true in other environments to
a context in which they have not been tested.
We illustrate the development method by discussing how it was used to create two reading
techniques (one for finding defects in requirements, and a second for reusing design and code from an
Object-Oriented framework in a new system). We describe a series of studies by which we and other
researchers have assessed the effectiveness of these techniques in practical situations. These studies show
that these reading techniques are useful and effective guidelines for accomplishing real software
engineering tasks, and thus demonstrate the effectiveness of the empirical development method.
Finally, this proposal discusses how indications from these studies have been used to begin to
formulate higher-level hypotheses about reading techniques. We discuss the results from these earlier
studies as well as important questions that have not yet been addressed. We propose a further study that is
designed to test potential improvements to one of our reading techniques and, in the process, shed some
light on the unanswered questions from our earlier studies.
Table of Contents
1. Introduction
1.1. Problem Statement
1.2. Proposed Solution
1.2.1. Definitions
1.2.2. Focusing on Software Documents
1.2.3. An Empirical Approach
1.2.4. A Process for Developing Reading Techniques
1.2.5. An Example
1.3. Validation Plan
1.3.1. Evaluating Reading Techniques
1.3.2. Evaluating the Process for Developing Reading Techniques
2. Completed Work
2.1. Investigating a Technique for Analysis of Requirements: PBR
2.1.1. Introduction
2.1.2. Related Work
2.1.3. Modeling
2.1.4. Mapping Models to Procedure
2.1.5. Description of the Study
2.1.6. Evaluation of Effectiveness in the Environment
2.1.7. Evaluation of Level of Specificity
2.1.8. Evaluation of Measures of Subject Experience
2.1.9. Evaluation of Tailoring to Other Environments
2.1.10. Conclusions and Indications for Future Research into PBR
2.2. Investigating a Technique for Reuse of Code and Design: SBR
2.2.1. Introduction
2.2.2. Related Work
2.2.3. Modeling
2.2.4. Mapping Models to Procedure
2.2.5. Description of the Study and Analysis
2.2.6. Evaluation of Effectiveness in the Environment
2.2.7. Evaluation of Level of Specificity
2.2.8. Evaluation of Measures of Subject Experience
2.2.9. Evaluation of Tailoring to Other Environments
2.2.10. Conclusions and Indications for Future Research into SBR
2.3. Summary of Completed Work
3. Proposed Work
3.1. Introduction
3.2. Modeling
3.3. Mapping Models to Procedure
3.4. Description of the Study
3.5. Evaluating the Reading Techniques
4. Summary
References
Appendices
A. Sample requirements
B. PBR procedures
C. SBR procedures
D. New versions of PBR, for proposed work
E. Data collection forms and questionnaires, for proposed work
1 Introduction
This proposal presents a body of research that makes two main contributions to software engineering:
1. It presents an iterative process for creating and improving reading techniques, which are procedural
techniques that give effective guidance for accomplishing a specific software-related task. We refer to
this process simply as the Reading Technique Development (RTD) process. As will be explained in
greater detail in the rest of this chapter, reading techniques created by the RTD process provide an
effective way to use a given software document (such as requirements) for a particular task (for
example, defect detection).
2. It presents two reading techniques that have been tailored to different documents and tasks. Aside
from providing a demonstration of the effectiveness of the RTD process, these techniques also
represent useful and effective procedures that can aid in accomplishing some common tasks in
software engineering.
This proposal also discusses a series of studies that provide an empirical evaluation of particular
reading techniques, and uses the experimental results to begin building a body of knowledge about this type
of technique and its application. As part of this discussion, we demonstrate that our understanding of the
experimental methods and results can be packaged in a way that allows others to learn from and extend our
findings.
This proposal is organized as follows: Section 1.1 explains why the RTD process is useful and
necessary by discussing the specific problem it addresses. Section 1.2 presents the RTD process as our
proposed solution, provides the relevant definitions, and defines the scope of this work. Section 1.3
outlines how we intend to provide an empirical validation of our proposed solution.
The remaining chapters of this proposal are concerned with studying the effects of our proposed
solution in practice. Chapter 2 provides an overview of work we have already undertaken in this area,
identifying common themes and explaining the most important open research questions. Finally, chapter 3
proposes a new study to continue this work.
1.1 PROBLEM STATEMENT
Software practitioners are interested in using effective software development processes, and for this they
have many tools and processes from which to choose. The field of software engineering has produced
many such options, ranging from large sweeping process changes (e.g. moving to an Object-Oriented
programming paradigm) to detailed refinements of a particular aspect of the software lifecycle (e.g. using a
formal notation rather than natural language to record system requirements). Attempts to quantify this
phenomenon have counted “literally hundreds” of different work practices that have appeared since the
1970s and have been claimed to somehow improve software development [Law93].
Relying on standards bodies to identify effective processes does not greatly simplify the situation, since
over 250 “standard practices” have been published in software engineering [Fenton94].
The dilemma, of course, is how the practitioner is to know which tools and processes to invest in,
when almost all of them come with claims of enhanced productivity. All too often, the wrong decision is
made and both software practitioners and researchers invest time and money in tools and processes that
never see significant use. Tools and processes can fail to find use for several reasons:
· they are too difficult to integrate into the standard way of doing things for an organization;
· they do not pay adequate attention to the needs of the user;
· they require the developer to invest too much effort for too little gain;
· they incorporate techniques or technologies that are not effective in the particular environment;
· the limits of their effectiveness are simply not understood.
In fact, all of these reasons are interrelated. The common theme is that the tools and processes
in question simply do not match the requirements of the environment in which they are to be used. Often,
even the developers of an approach do not know how effective it will be in a given environment, what its
limitations are, or to what classes of users it is best suited.
This situation could be avoided if process development and tools selection were based on an
empirical characterization of what works and what does not work in a particular environment. Such an
empirical foundation avoids ad-hoc selection of software tools and techniques based upon assumptions
(about the tasks developers need to do, about the types of information developers need to understand, about
characteristics of the product being developed) that are never checked. In their paper, “Science and
Substance: A Challenge to Software Engineers,” Fenton, Pfleeger, and Glass dwell on the plethora of tools,
standards, and processes available before pointing out that “much of what we believe about which
approaches are best is based on anecdotes, gut feelings, expert opinions, and flawed research.” What is
needed, they conclude, is a careful, empirical approach [Fenton94].
Basili described the basics of such an empirical approach in his paper, “The Experimental Paradigm
in Software Engineering.” To develop an understanding of the effectiveness of software processes in a
particular environment, he stresses the necessity of starting from a real understanding of the software
development process, not from assumptions. He advocates:
· modeling the important components of software development in the environment (e.g. processes,
resources, defects);
· integrating these models to characterize the development process;
· evaluating and analyzing these models through experimentation;
· continuing to refine and tailor these models to the environment. [Basili93]
As demonstrated in this chapter, we use these guidelines as the basis of our empirical approach.
1.2 PROPOSED SOLUTION
1.2.1 Definitions
This dissertation proposal presents an empirical approach for developing tailored techniques for use by
software practitioners. We refer to these techniques as “reading techniques.” The word “reading” was
chosen to emphasize the similarities with the mental processes we use when attempting to understand any
meaningful text; a reading technique is a process a developer can follow in attempting to understand any
textual software work document. It is important to note that, while “reading” in software engineering is
often assumed to apply only to code, in fact any document used in the software lifecycle has to be read
effectively. (For example, requirements documents have to be read effectively in order to find defects in
inspections.) We address the problem space in more detail in section 1.2.2.
More specifically, a reading technique can be characterized as a series of steps for the individual
analysis of a textual software product to achieve the understanding needed for a particular task [Basili96b].
This definition has three important components:
1. A series of steps: The technique must give guidance to the practitioner. The level of guidance must
vary depending on the level best suited to the goal of the task, and may vary from a specific step-by-step
procedure to a set of questions on which to focus.
2. Individual analysis: Reading techniques are concerned with the comprehension process that occurs
within an individual. Although the method in which this technique is embedded may require
practitioners to work together after they have reviewed a document (e.g. to discuss potential defects
found individually in a document), the comprehension of some aspect of the document is still an
individual task.
3. The understanding needed for a particular task: Reading techniques have a particular goal; they aim at
producing a certain level of understanding of some (or all) aspects of a document. Thus, although we
may think about CASE tools to support a technique as it evolves, the focus of this research is not on
the tools themselves but on the developer’s understanding of artifacts (whether supported by tools or
not).
The research presented in this proposal demonstrates a proof of concept that our approach to
developing software reading techniques can be effective, by providing empirical evidence concerning the
effectiveness of the specific techniques developed. We also propose to validate our approach by testing
whether it can be used to build up knowledge about reading techniques in general. These goals are broad
but we narrow them to more detailed evaluation questions in section 1.3.
To be as clear as possible about the scope of the work presented in this proposal, we use the
following definitions:
· Technique: A series of steps, at some level of detail, that can be followed in sequence to complete a
particular task. Unless we qualify the phrase, when we speak of “techniques” in this proposal we are
referring specifically to “reading techniques,” that is, techniques that meet the three criteria presented
at the beginning of this section.
· Method: A management-level description of when and how to apply techniques, which explains not
only how to apply a technique, but also under what conditions the technique’s application is
appropriate. (This definition is taken from [Basili96].)
· Software document: Any textual artifact that is used in the software development process. This
definition encompasses artifacts from any stage of the software lifecycle (from requirements elicitation
through maintenance), either produced during the development process (e.g. a design plan) or
constructed elsewhere but incorporated into the software lifecycle (e.g. a code library).
Our definitions sharply differentiate a technique from a method. Perspective-Based Reading
(PBR), discussed in section 2.1, provides a useful example of each. PBR recommends that reviewers use
one of three distinct techniques in reviewing requirements documents to find defects. Each of these
techniques is expressed as a procedure that reviewers can follow, step by step, to achieve the desired result.
PBR itself is a method, because it contains information about the context in which the techniques can be
effectively applied, and about the manner in which the individual techniques should be used in order to
achieve the most thorough review of the document.
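
The distinction between a technique and a method can be illustrated with a small data model. The following is a minimal Python sketch and is not part of PBR itself: the class names, field names, placeholder technique names ("technique A" and so on), placeholder step text, and the round-robin policy for giving each reviewer one technique are all assumptions introduced purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Technique:
    """A series of steps followed in sequence to complete a particular task."""
    name: str
    document_kind: str        # e.g. "requirements"
    task: str                 # e.g. "defect detection"
    steps: tuple[str, ...]    # procedural guidance for one individual reader


@dataclass
class Method:
    """Management-level description of when and how to apply techniques."""
    name: str
    applicability: str        # conditions under which application is appropriate
    techniques: list[Technique] = field(default_factory=list)

    def assign(self, reviewers: list[str]) -> dict[str, Technique]:
        """Give each reviewer exactly one technique.

        Round-robin assignment is an assumed policy for this sketch only.
        """
        if not self.techniques:
            raise ValueError("a method needs at least one technique")
        return {
            reviewer: self.techniques[i % len(self.techniques)]
            for i, reviewer in enumerate(reviewers)
        }


def make_placeholder_method() -> Method:
    """Build a method with three placeholder techniques (names are hypothetical)."""
    techniques = [
        Technique(
            name=f"technique {label}",
            document_kind="requirements",
            task="defect detection",
            steps=("placeholder step 1", "placeholder step 2"),
        )
        for label in ("A", "B", "C")
    ]
    return Method(
        name="placeholder method",
        applicability="reviewing requirements documents to find defects",
        techniques=techniques,
    )


if __name__ == "__main__":
    method = make_placeholder_method()
    for reviewer, technique in method.assign(["r1", "r2", "r3", "r4"]).items():
        print(reviewer, "uses", technique.name)
```

In this sketch the step-by-step procedure lives entirely in `Technique`, while the context of use and the way techniques are distributed among reviewers live in `Method`, mirroring the separation drawn by the definitions above.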
1.2.2 Focusing on Software Documents
One of the most important characteristics of our approach to developing reading techniques is the focus on
providing guidance for how developers use software documents. Software documents are important
because they contain much of the relevant information for software development and are used to support
almost all of the developer’s tasks. Developers are required every day not only to construct work
documents associated with software development (e.g., requirements, design, code, and test plans) but also
to analyze them for a variety of reasons. For example:
· requirements must be checked to see that they represent the goals of the customer,
· code that was written by another developer must be understood so that maintenance can be
undertaken,
· test plans must be checked for completeness,
· all documents must be checked for correctness.
In short, software documents often require continual understanding, review, and modification throughout
the development life cycle. The individual analysis of software documents is thus the core activity in many
software engineering tasks: verification and validation, maintenance, evolution, and reuse. By focusing on
software documents we focus on the sources of information that are crucial for the entire development
process.
For example, our primary research environment in the NASA Software Engineering Laboratory
(SEL) uses a model of software development that consists of eight different phases (requirements
definition, requirements analysis, preliminary design, detailed design, implementation, system testing,
acceptance testing, maintenance & operation). Each phase is characterized by specific tasks and the
products that they produce [SEL92]. From observation of this environment we draw two important
conclusions about software documents: