The Evolution of the CWE Development and Research Views
Document version: 1.0 Date: September 9, 2008
This is a draft document. It is intended to support maintenance of CWE, and to educate and solicit feedback from a specific technical
audience. This document does not reflect any official position of the MITRE Corporation or its sponsors. Copyright © 2008, The MITRE Corporation. All rights reserved. Permission is granted to redistribute this document if this paragraph is not removed. This document is subject to change without notice.
Author: Steve Christey
URL: http://cwe.mitre.org/documents/views/view-evolution.html
Introduction
Since CWE is intended to facilitate communication, it must support
multiple stakeholders. In the past year, we have been developing
multiple views that serve different purposes and audiences. Each view
selects and organizes the appropriate content in a manner that makes
the most sense for its intended audience.
Before Draft 8, CWE only provided one way of organizing data, in a
roughly hierarchical fashion. This scheme borrowed heavily from
multiple past efforts, including Seven Pernicious Kingdoms [7PK],
[CLASP], [PLOVER], and Landwehr et al. [Landwehr]. It also used
content from many different sources with different perspectives.
In the early stages of CWE's development, many different people took
turns at building up and maintaining the CWE content, which led to
many inconsistent classifications. This reduced the utility of the
tree for navigation and browsing, sometimes making it difficult for a
user to find the proper CWE ID to use for a weakness, leading to
frustration and mapping errors. This also made it easier for CWE
maintainers to accidentally introduce a duplicate entry.
These problems were exacerbated by the prevalence of inconsistent or
vague terminology within CWE names and descriptions. Some of these
terminology problems are endemic to the security industry, some were
inherited from previous sources that CWE used, and some were
introduced by the CWE team itself.
These difficulties are not unique to CWE. For example, the problems
of repeatability and mutual exclusiveness in vulnerability taxonomies
have been well documented since at least Ivan Krsul's PhD thesis
[Krsul].
In the months leading up to the release of CWE 1.0, we have been
addressing these problems by:
- building multiple different views within CWE, to support multiple
  audiences
- improving the existing views so that their organization is more
  consistent
- changing names and descriptions to provide more precise
  information that minimizes the amount of confusion for each CWE
  entry
This document describes how we have organized the two main
organizational views of CWE:
- Development Concepts (CWE-699) is geared towards developers and
  people who are familiar with other vulnerability-related
  taxonomies.
- Research Concepts (CWE-1000) is oriented towards academic research,
  creating a new framework for classifying weaknesses.
We also describe how we are trying to address the limitations of
existing names and descriptions.
A separate document contrasts the CWE-699 and CWE-1000 approaches with
those of Seven Pernicious Kingdoms, which has its own view in CWE 1.0,
CWE-700. It can be found at:
http://cwe.mitre.org/documents/views/view-comparison.html
Seven Pernicious Kingdoms (CWE-700)
One of the most familiar taxonomies is Seven Pernicious Kingdoms
[7PK]. Its focus is on weaknesses that can be analyzed by tools,
preferring "concrete and specific problems" over "abstract and
theoretical ones." It defines 7 kingdoms (plus an extra kingdom),
which are broad categories that decompose into phyla. Some examples
of the pernicious kingdoms are input validation, API abuse, and
security features.
The Seven Pernicious Kingdoms view (CWE-700) is based on the original
7PK paper, which is focused primarily on usefulness to developers. It
groups weaknesses by categories.
Development Concepts View (CWE-699)
The Development Concepts view (CWE-699), sometimes referred to as the
Development view, organizes weaknesses around concepts that are
frequently used or encountered in software development. Accordingly,
this view can align closely with the perspectives of developers,
educators, and assessment vendors.
Some of the goals of CWE-699 include flexible navigation (useful
categories), familiarity (similarity to other efforts), and coverage
(identifying all low-level CWE weaknesses).
To achieve familiarity, the higher-level nodes in the view borrow
heavily from the structure used by Seven Pernicious Kingdoms [7PK],
the categories of errors in [CLASP], the Genesis and Location
classifications used by [Landwehr] et al., and the Preliminary List of
Vulnerability Examples for Researchers [PLOVER]. As a result, the
Development view can be readily understood by users who are already
familiar with these other taxonomies.
With respect to navigation, these past taxonomies and ongoing CWE
maintenance have introduced a variety of different categories for
weaknesses. This provides a mechanism for CWE users to navigate
through the large number of weaknesses that are covered, from a
variety of perspectives. Many of these categories define groups of
weaknesses based on common attributes such as language, resource, or
consequence. Categories include pointer issues, mobile code issues,
error handling, data handling, time and state, temporary file
problems, weaknesses in J2EE and ASP environments, web problems, and
so on.
The underpinnings of the Development view have been in place since the
earliest drafts, serving as the main organizational structure in CWE
until it was formalized as CWE-699 in CWE 1.0.
One challenge for CWE-699 is that the large number of available
categories can lead to inconsistencies in how people perform
mappings to CWE identifiers. For example, in Draft 8, CWE-444 (HTTP
Request Smuggling) was a child of the category CWE-442 (Web Problems).
However, this category was incomplete, so somebody who navigated to
the Web Problems category wouldn't find CWE-79 (XSS). In addition,
CWE-444 was not classified in any other way, so a developer or
researcher would not be able to see how it related to other issues.
For information on other considerations for the Development view, see
the "Taxonomic and Related Properties" section at
http://cwe.mitre.org/documents/views/view-comparison.html
Research Concepts View (CWE-1000)
While multiple perspectives and category-based views into CWE are very
useful, a view based solely on weaknesses and their abstractions is
also needed to provide a more formal mechanism for classifying
weaknesses, maintaining CWE, and performing vulnerability research.
CWE-1000, the Research Concepts View (sometimes referred to as the
Research view), was developed to address this need. It classifies
weaknesses in a way that largely ignores how they can be detected,
where they appear in code, and when they are introduced in the
software development lifecycle. Accordingly, it avoids capturing
relationships based on specific language, environment, technology,
framework, frequency of occurrence, impact, and mitigation. (Since
these relationships are convenient for many users, they are captured
in CWE-699). By doing so, we have been able to concentrate on
canonical factors that make each weakness unique.
The Research view is mainly organized according to abstractions of
software behaviors and the resources that are manipulated by those
behaviors, which aligns with MITRE's research into vulnerability
theory. In addition to classification, the Research view explicitly
models the inter-dependencies between weaknesses, which have not been
a formal part of past classification efforts. The main examples are
chains and composites.
The view uses multiple deep hierarchies as its organizational structure,
with more levels of abstraction than other classification schemes.
Ideally, the abstraction is only on weakness-to-weakness
relationships, with minimal overlap and no categories. Thus,
weaknesses would be present from the lowest levels all the way to
roots of the tree. Each member weakness would cover a single error.
The top-level entries of each of the hierarchies are called Pillars.
This approach lays the groundwork for a more solid understanding of
the complexity of weaknesses and their impact. Once the foundation is
established, the Research view can then be used to systematically
identify theoretical gaps within CWE. We can then start adding in
lower-level variations to create a more complex and thorough
understanding of weaknesses.
The organizational structure of the Development view (CWE-699) is not
appropriate for the goals of the Research view (CWE-1000).
In CWE-699, weaknesses belong to multiple categories, the relevance of
which can change based on the context in which the weakness occurs, or
the perspective of the viewer. As a result, weaknesses are often
children of categories that have little to do with the nature of the
underlying software error.
Process for Creating Abstractions in the Research View
To unify the Research view to conform to a single perspective, we
tried to build abstractions of the core issues behind each weakness.
This work was performed inductively, using available weakness data.
The primary source was the existing repository of weaknesses from CWE
Drafts 8 and 9. Other sources were background knowledge and observed
examples from CVE work that were not yet classified, repositories that
we received under NDA, and substantial suggestions from an anonymous
source.
The inductive process was not formal, but the general approach evolved
so that "core issues" were defined in terms of invalid behaviors on
resources. Abstraction was then performed on these two key concepts.
This was not always done, however.
Identification of Abstraction Levels
The CWE team labeled each weakness according to different levels
of abstraction: Class, Base, and Variant. These levels have not been
formally defined, so there are some inconsistencies in the
definitions. In general, however, a Class describes a weakness in a
very abstract fashion, typically independent of any specific language
or technology. At the lowest level, a Variant describes a weakness at
a very low level of detail, typically limited to a specific language,
technology, programming idiom, or resource type. Between the Class
and Variant levels, the Base describes a weakness in an abstract
fashion, but with sufficient details that the reader could infer
specific methods for detection and prevention based solely on the
description.
For CWE-1000 (Research View), the team then identified and resolved
logical inconsistencies in weakness relationships, such as a Variant
that was a parent of a Class, or a Base that had no Class parents.
This focused efforts in undeveloped areas of the hierarchy.
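The consistency checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Weakness record, the find_inconsistencies helper, and the entry IDs 9001 to 9005 are hypothetical and do not correspond to real CWE entries or to any tooling the CWE team used.

```python
from dataclasses import dataclass, field

# Rank abstraction levels from most abstract (0) to most specific (2).
LEVEL_RANK = {"Class": 0, "Base": 1, "Variant": 2}


@dataclass
class Weakness:
    cwe_id: int
    level: str                                   # "Class", "Base", or "Variant"
    parents: list = field(default_factory=list)  # parent IDs within one view


def find_inconsistencies(entries):
    """Report parents that are more specific than their children,
    and Base entries that have no Class parent."""
    problems = []
    for w in entries.values():
        parent_levels = [entries[p].level for p in w.parents]
        for p in w.parents:
            if LEVEL_RANK[entries[p].level] > LEVEL_RANK[w.level]:
                problems.append(
                    f"CWE-{p} ({entries[p].level}) is a parent of "
                    f"CWE-{w.cwe_id} ({w.level})"
                )
        if w.level == "Base" and "Class" not in parent_levels:
            problems.append(f"CWE-{w.cwe_id} (Base) has no Class parent")
    return problems


# Hypothetical entries, for illustration only.
entries = {
    9001: Weakness(9001, "Class"),
    9002: Weakness(9002, "Base", parents=[9001]),
    9003: Weakness(9003, "Variant", parents=[9002]),
    9004: Weakness(9004, "Class", parents=[9003]),  # Variant above a Class
    9005: Weakness(9005, "Base"),                   # Base with no Class parent
}

problems = find_inconsistencies(entries)
for line in problems:
    print(line)
```

Each reported line points at a part of the hierarchy that probably needs a new intermediate node or a corrected relationship.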
Performing Abstraction
Roughly speaking, the informal process was:
- examine the weakness if it does not have any weakness parents.
- determine if the issue is really describing a more complex
  combination of lower-level weaknesses, such as a chain or
  composite
- if a chain or composite, or other compound element, treat each
  weakness separately, and capture appropriate relationships between
  them (e.g. using the CanPrecede relationship for chains)
- label each weakness according to different levels of abstraction:
  Class, Base, and Variant.
- for each weakness, identify the key behavior that seems to be
  invariant across any weakness of this type, and how the behavior
  is erroneous. Ensure that the behavior is reflected in the
  weakness, and build an abstraction of behaviors.
- if multiple behaviors are involved, then the issue is probably a
  category that requires a split into separate weaknesses.
- if the weakness is specific to a particular type of resource,
  identify the type of resources being manipulated, and build an
  abstraction of that resource.
- for each arbitrary grouping (i.e. category), look at each child
  weakness, make sure that child has a parent that is a weakness,
  then remove the category from the view (typically this involved
  shifting the category relationships to the Development view).
- when a weakness cannot be abstracted any more, it is treated as a
  Pillar - a root of the hierarchy.
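The category-removal step can be illustrated with a small Python sketch in which every parent/child relationship is scoped to a view. The data layout and the helper names (parents, needs_weakness_parent, retire_category) are assumptions made for illustration and are not part of the CWE schema. The IDs are those of CWE-560, CWE-559, and CWE-687, which this document discusses as a worked example.

```python
RESEARCH, DEVELOPMENT = 1000, 699

# Whether each node is a weakness or an arbitrary grouping (category).
kinds = {560: "weakness", 559: "category", 687: "weakness"}

# (child, parent, view) triples. Starting point: within the Research
# view, CWE-560 has only the category CWE-559 as a parent.
relationships = {(560, 559, RESEARCH)}


def parents(child, view):
    """Parents of `child` within a single view."""
    return {p for (c, p, v) in relationships if c == child and v == view}


def needs_weakness_parent(child, view):
    """True if none of the child's parents in this view is a weakness."""
    return not any(kinds[p] == "weakness" for p in parents(child, view))


def retire_category(category, view, target_view):
    """Shift all relationships to `category` from `view` to `target_view`,
    refusing if any child would be left without a weakness parent."""
    children = [c for (c, p, v) in relationships
                if p == category and v == view]
    for child in children:
        if needs_weakness_parent(child, view):
            raise ValueError(f"CWE-{child} still needs a weakness parent")
    for child in children:
        relationships.discard((child, category, view))
        relationships.add((child, category, target_view))


print(needs_weakness_parent(560, RESEARCH))  # True: only a category parent
relationships.add((560, 687, RESEARCH))      # give it a weakness parent
retire_category(559, RESEARCH, DEVELOPMENT)  # move the category relationship
print(parents(560, RESEARCH), parents(560, DEVELOPMENT))
```

After the call, CWE-560 has only a weakness parent in the Research view, while the category relationship survives in the Development view.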
At the release of CWE 1.0, only one category remains, which could not
be resolved by release time because of the need for community
involvement.
It should be noted that we did not merge pillars by following the
"seven plus one" principle as used in 7PK. For example, both CWE-682
(Incorrect Calculation) and CWE-330 (Use of Insufficiently Random
Values) are related to numbers, but they were not merged into one
pillar, because it would force the creation of a category.
The modeling of behaviors and resources is most evident under the
CWE-664 pillar, Insufficient Control of a Resource Through its
Lifetime. Note that CWE-664 is still a work in progress, e.g. with
respect to multiple perspectives.
Abstraction in Action
Consider CWE-8 (J2EE Misconfiguration: Entity Bean Declared Remote),
as it was in Draft 8 (with a more up-to-date name for clarity).
- examine parents: the only parent is the category CWE-4 (J2EE
  Environment Issues). This weakness needs a new parent under
  CWE-1000.
- identify abstraction level: the weakness is technology-specific
  (J2EE) and resource-specific (Entity Bean). Thus the abstraction
  level is Variant.
- identify behavior: the code specifies that an entity bean can be
  accessed remotely.
- abstract behavior: the code specifies improper access for an entity
  bean, defining a sphere of control that is broader than intended.
- abstract resource: the code specifies improper access for a
  resource, defining an improper sphere of control.
- match CWE: this closely matches CWE-668 (Exposure of Resource to
  Wrong Sphere), a Class. Thus the mapping fit would be ABSTRACTION
  [Loveless].
The closest parent, CWE-668, is a Class and is therefore very abstract,
whereas CWE-8 is a Variant. This is an inconsistency in the level of
abstraction that suggests that a Base should probably be
identified. In CWE 1.0, CWE-668 does not have any children that
would be more appropriate.
Another example involves CWE-560 (Use of umask() with chmod-style
Argument).
In Draft 8, this had an uninformative name of "Often Misused:
umask()". The description was not much more specific, stating "The
mask specified by the argument umask() is often confused with the
argument to chmod()."
- examine parents: the only parent is the category CWE-559 (Often
  Misused: Arguments and Parameters), which is intended to collect
  anything related to arguments and parameters. So, this item needs
  a new parent under CWE-1000.
- identify abstraction level: the weakness is OS-specific (UNIX) and
  language-specific (C). Thus the abstraction level is Variant.
- identify behavior: since CWE-560's name and description were not
  sufficiently clear, examine the context notes within CWE-560. These
  explain that the core error is that the programmer calls umask(),
  but the argument to umask() is specified using a number that
  specifies permissions in chmod(). The same value has different
  interpretations in chmod() versus umask(), so the programmer is not
  specifying the argument to umask() correctly. This could cause the
  program to create files with insecure permissions.
- identify chain: notice that the "incorrect specification of umask()
  argument" is primary to a later behavior, in which files are
  created with insecure permissions. Thus there is a chaining
  relationship; here, we will concentrate on the primary weakness,
  the incorrect specification of the argument.
- abstract behavior: specification of the behavior is already pretty
  abstract, so there is not much room to go. "The programmer invokes
  a function with an incorrectly specified umask."
- abstract resource: the argument to the function is a code-layer
  resource, so one could abstract to: "The programmer invokes a
  function with an incorrectly specified argument."
- match CWE: this is an exact match with CWE-687 (Function Call With
  Incorrectly Specified Argument Value). This is labeled as a
  Variant, but it is arguably a Base or Class, since it is not
  specific to any technology or language.
- set CWE-687 as the parent of CWE-560 under view 1000.
- preserve CWE-559 as another parent of CWE-560, but shift the
  relationship to the Development Concepts view (CWE-699).
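The difference between the two interpretations can be made concrete with a little arithmetic. The following Python sketch does not touch the file system; it only computes the permission bits that would result, assuming a file is created with a requested mode of 0666, which is typical for regular files. The helper name mode_after_umask and the intended mode 0644 are illustrative choices, not taken from CWE-560.

```python
def mode_after_umask(requested_mode, mask):
    """Permission bits a newly created file ends up with.
    Bits that are SET in the mask are CLEARED in the result."""
    return requested_mode & ~mask & 0o777


REQUESTED = 0o666  # assumed mode requested when creating a regular file
INTENDED = 0o644   # what the programmer wants: rw-r--r--

# Primary weakness: the chmod-style value is passed as the mask.
wrong = mode_after_umask(REQUESTED, 0o644)

# Correct usage: the mask lists the bits to REMOVE (group/other write).
right = mode_after_umask(REQUESTED, 0o022)

print(oct(wrong))  # 0o22  -> ----w--w-
print(oct(right))  # 0o644 -> rw-r--r--
```

With the chmod-style argument, the owner loses read and write access while group and other gain write access, which is the resultant "insecure permissions" link of the chain.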
Identification of Perspective Issues
During construction of the Research view, sometimes there would be
confusion with respect to how to classify a weakness. In some cases,
this was due to knowledge gaps within CWE, and sometimes these would
be resolved. In other cases, multiple classifications remained, and
we preserved these. We suspect that when there are multiple
classifications, there are differences of perspective. However, we
could not investigate this problem deeply before the CWE 1.0 release.
One of the main examples of the perspective problem is in chains and
composites. We found that much of the classification task was
simplified once we learned how to recognize chains. It may be that
some of the weaknesses with multiple parents are composites.
Consider CWE-568 (finalize() Method Without super.finalize()). In CWE
1.0, it has two parents under View 1000:
- CWE-573 (Failure to Follow Specification)
- CWE-404 (Improper Resource Shutdown or Release)
In some sense, the CWE-568 finalize() weakness is a composite, since
it's a failure to follow specification in code that is intended to
perform resource shutdown. Thus it would be a child of the (573+404)
composite. An alternate interpretation could be that this is a chain:
by not following specification, the program performs an improper
resource shutdown.
While the previous example demonstrates the level of complexity that
can arise when trying to identify the core aspects of a weakness, the
chain and composite concepts have also helped us to understand why
classification has been such a difficult challenge in the past. In
turn, these are probably responsible for many mapping errors and
inconsistencies.
Many issues could be described by two or more high-level Pillars. For
example, everything could be argued as "API Abuse" - by calling
strcpy() with an overly long buffer, the programmer is violating the
contract. However, this is not regarded as "API Abuse" under view
1000, nor under 7PK. How to clearly communicate these concepts will
remain a challenge.
Some further description of the perspectives problem is at:
http://www.nabble.com/Identifying-Perspective-Issues-in-CWE-to18245971.html
Using the Research View to Identify Gaps
Another benefit of collocating similar core weaknesses is the ease of
applying a consistent vocabulary across CWE. Weaknesses that had no
relationships before are brought together and allow the CWE team to
see inconspicuous relationships and trends. For example, CWE-696
(Incorrect Behavior Order) was created as a weakness class for several
entries where the fundamental problem was performing operations in the
incorrect sequence. The first level of children of CWE-696 under the
Research view now reads:
- CWE-179 Incorrect Behavior Order: Early Validation
- CWE-408 Incorrect Behavior Order: Early Amplification
- CWE-551 Incorrect Behavior Order: Authorization Before Parsing
  and Canonicalization
Since abstraction of nodes is primarily based on behaviors and
resources, and the view only covers weaknesses, a higher-level node
could be used to provide guidance on how to identify potential new
children. For example, CWE-118 (Improper Access of Indexable
Resource) only has one child under view 1000, CWE-119 (Failure to
Constrain Operations within the Bounds of an Allocated Memory Buffer).
By considering other types of indexable resources, one could identify
equivalent weaknesses, such as access of file segments through
user-controlled file offsets.
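One simple way to surface such candidates is to list higher-level nodes that have very few children. The Python sketch below is only a heuristic under an assumed threshold (fewer than two children); the function name sparse_nodes and the threshold are illustrative, and the child lists are the ones given in this section.

```python
# Children under the Research view, as listed in this section.
children_by_parent = {
    118: [119],            # Improper Access of Indexable Resource
    696: [179, 408, 551],  # Incorrect Behavior Order
}


def sparse_nodes(children_by_parent, min_children=2):
    """Return parents with fewer than `min_children` children.
    These are only candidates for review, not confirmed gaps."""
    return sorted(
        parent
        for parent, kids in children_by_parent.items()
        if len(kids) < min_children
    )


candidates = sparse_nodes(children_by_parent)
print(candidates)  # [118]
```

An analyst would still need to reason about each flagged node, for example by asking which other indexable resources CWE-118 could apply to.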
In views such as 699, which have a less formal organization, the
identification of gaps would be more haphazard and less systematic.
Identification of Duplicates
With an organization that focuses mostly on behaviors and resources,
the development of View 1000 helped to identify duplicate CWE entries.
For example, duplicates CWE-132 (Miscalculated Null Termination) and
CWE-170 (Improper Null Termination) used to be in disjoint segments of
the hierarchy. CWE-132 was under CWE-682 (Incorrect Calculation),
which is really a chaining relationship, while CWE-170 was under
CWE-169 (Technology-Specific Special Elements), which is a category
for any technology-specific weakness related to special elements.
Since CWE-170 was unique to null-terminated strings, it was reasonable
that CWE-170 would be categorized that way.
When reorganized by core weakness, these two entries fell in the same
location, because the behaviors and resource types were the same.
Thus, identifying these entries as duplicates was much more obvious.
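The effect of reorganizing by core weakness can be sketched as grouping entries by a (behavior, resource) key. In the Python example below, the behavior and resource labels are hypothetical annotations written for this illustration; they are not fields taken from CWE.

```python
from collections import defaultdict

# (CWE ID, behavior label, resource label); labels are illustrative.
annotated = [
    (132, "omits terminator", "null-terminated string"),
    (170, "omits terminator", "null-terminated string"),
    (119, "operates outside bounds", "memory buffer"),
]


def duplicate_candidates(annotated):
    """Group entries by (behavior, resource) and keep groups that
    contain more than one entry."""
    groups = defaultdict(list)
    for cwe_id, behavior, resource in annotated:
        groups[(behavior, resource)].append(cwe_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


duplicates = duplicate_candidates(annotated)
print(duplicates)
```

Entries that land under the same key are not necessarily duplicates, but they are the ones worth comparing side by side.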
One critical step was to recognize that one way of describing the
missing null termination was through a chain:
- CWE-682 Incorrect Calculation
- CWE-170 Improper Null Termination
- CWE-119 Failure to Constrain Operations within the Bounds of an
  Allocated Memory Buffer
Notice how in this case, CWE-119 would be resultant, and CWE-682 would
be primary. Due to an incorrect calculation (CWE-682), proper null
termination would not occur (CWE-170), and ultimately the program
might read or write memory outside of a buffer (CWE-119). The
chaining relationship of these three weaknesses could introduce a
vulnerability.
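The three links of this chain can be simulated without real memory access. The Python sketch below models memory as a bytearray and imitates a C-style strlen() that scans for a NUL byte with no knowledge of buffer bounds. The buffer size, the adjacent "SECRET" data, and the helper c_strlen are all assumptions made for illustration.

```python
def c_strlen(memory, start):
    """Scan for a NUL byte the way C's strlen() would,
    with no knowledge of where the buffer ends."""
    end = start
    while memory[end] != 0:
        end += 1
    return end - start


BUF_SIZE = 8

# Simulated memory: an 8-byte buffer followed by unrelated adjacent data.
memory = bytearray(BUF_SIZE) + bytearray(b"SECRET\x00")
src = b"ABCDEFGH"  # 8 bytes, exactly as large as the buffer

# CWE-682 (primary): the copy length is miscalculated. It should be at
# most BUF_SIZE - 1 so that one byte remains for the terminator.
n = min(len(src), BUF_SIZE)
memory[0:n] = src[:n]

# CWE-170: as a result, no NUL terminator exists inside the buffer.
terminated = 0 in memory[0:BUF_SIZE]

# CWE-119 (resultant): the scan runs past the end of the buffer and
# into the adjacent data.
length = c_strlen(memory, 0)

print(terminated)  # False
print(length)      # 14, which is greater than BUF_SIZE
```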
The question then becomes how to classify CWE-170, the middle step of
this chain. During this step, the associated behavior does not
properly maintain the structure of data (i.e. the null termination),
which is relied on by the code. So, this falls under CWE-707 (Failure
to Enforce that Messages or Data are Well-Formed).
Clarity of Names and Descriptions
All CWE entries should be clearly written to avoid confusion. This is
important for educating the public, but it is also critical for
ensuring that CWE mappings are repeatable.
Based on our experiences with how people are mapping to CWE, we have
seen evidence that sometimes people can rely exclusively on the CWE
name when deciding how to map an issue to CWE. This was seen when an
issue was incorrectly mapped to a CWE entry when the entry had a vague
name (which could match), and a precise description that clearly was
not a match. There is frequent confusion based on the name,
description, observed examples, and demonstrative examples.
One problem is that weakness/vulnerability terminology is not
particularly expressive, outside of a handful of terms. Some terms
have multiple uses (such as "overflow" and "leak"), and there are often
perspective problems, e.g. the attack-focused "SQL injection" phrase.
We are trying to make the CWE name and description more clear about
the weakness being covered, and to keep the perspective on the
weakness itself, instead of the attack or consequence. However, to
keep CWE entries as accessible to the everyday user as possible, we
want to preserve commonplace terminology where feasible.
We tried to change the names so that a CWE consumer would not have to
depend so much on looking up the item's description and associated
notes, just to figure out what the item is talking about.
Support for Alternate Names
Sometimes, multiple terms are used for the same weakness, such as
"Cross-site Request Forgery" and "Cross-site Reference Forgery".
These alternate terms are recorded in CWE. For terms
that are extremely common, we include them in parentheses.
Some entries contain Terminology Notes that describe the
terminological issues for that entry and related weaknesses.
Many entries contain mappings to various taxonomies. These mappings
contain the original names as used in those taxonomies.
Finally, when an entry's name changes, the previous node names are
also recorded.
Properties of Good CWE Names
The criteria for a good name include:
The litmus test for a name change is simple: if a CWE analyst doesn't
have a good idea of the issue after reading the name, then it needs to
be changed. As a result, we removed a lot of non-specific terms such
as "insecure," "improper," and "erroneous." We are slowly building a
more consistent vocabulary, but this is still a work in progress.
Examples of Problematic Names
Some examples include:
- Dangerous Functions: this phrase was originally used in 7PK. CWE
  inherited this phrase from 7PK and used it as the name for CWE-242,
  until the name was changed in Draft 9. Both the 7PK and CWE
  descriptions tried to make it clear that this category was for
  functions that have inherent vulnerabilities that can never be
  guaranteed to work safely, such as gets(). However, it was
  sometimes interpreted by people as including any function that has
  certain risks but can be safe if used properly, such as strcpy().
- Overflow of static internal buffer: this phrase was inherited from
  CLASP and used in CWE-500 in Draft 8 and earlier. The phrase
  implies a general type of buffer overflow, but the core issue
  discussed in CLASP is that an object contains a field that hasn't
  been marked as final, which might enable modification of the field
  to trigger a buffer overflow. Presumably, other attacks are also
  possible. In CWE, the non-final field is regarded as a primary
  weakness in a chain that could have multiple resultant issues, such
  as an overflow. In this context, the CLASP-style name is
  describing a consequence, not the underlying weakness.
- Improperly Freeing Heap Memory: this name was used in the Draft 8
  version of CWE-590. Here, the "improper" term has multiple
  interpretations:
  - double-free
  - not clearing sensitive heap memory before freeing (heap inspection)
  - freeing memory too early because it's referenced later (use after
    free)
  - running free() on an object that was allocated using new
  - running delete on memory that was allocated using malloc()
  Another problem is that, since the name is in accordance with the
  Research Concepts model of behaviors and resources, it could be
  interpreted as a Base-level weakness with multiple Variant children,
  such as those outlined above. This would be an incorrect assumption,
  since the core issue being described by CWE-590 is actually a Variant
  instead of a Base.
  In Draft 9, CWE-590 was renamed to "Free of Invalid Pointer Not on the
  Heap."
- Often Misused: Authentication: this phrase was inherited from 7PK
  and used in CWE-247 in Draft 8 and earlier. If only the name is
  used, it could be interpreted as a very general category that
  covers anything related to authentication. Closer examination of
  the description, however, makes it clear that this is a
  language-specific, OS-specific, low-level variant related to
  reliance on the getlogin() function.
- Mobile Code: Object Hijack: this phrase was inherited from 7PK and
  used in CWE-491 in Draft 8 and earlier. However, it is attack
  oriented, and it does not specify the nature of the weakness that
  allows this attack to occur. In a mapping task, this allows room
  for multiple interpretations and increases the chance of mapping
  errors. At least, it forces the reader to perform more research in
  order to figure out what is being discussed.
- Memory Locking: this phrase was used in CWE-591 in Draft 8 and
  earlier. It is not weakness-focused; it is about a behavior
  without any specification of what is wrong with the behavior. So
  it has multiple interpretations, such as:
  - this could be a category listing all weaknesses that are related to
    memory locking
  - this could be a weakness about a program that does not lock memory
    when it should
  - this could be a weakness about a program that attempts to lock
    memory correctly, but does not
  - this could be a weakness about a program that locks memory, but
    doesn't release the lock and enables deadlock conditions
  In Draft 9, the name was changed to "Sensitive Data Storage in
  Improperly Locked Memory."
Other examples of name changes can be found by reviewing the
Previous_Entry_Names elements of individual CWE entries.
References
[7PK] Katrina Tsipenyuk, Brian Chess, Gary McGraw. "Seven Pernicious
Kingdoms: A Taxonomy of Software Security Errors". November 2005.
NIST Workshop on Software Security Assurance Tools, Techniques, and
Metrics.
http://cwe.mitre.org/documents/sources/SevenPerniciousKingdoms.pdf
[CLASP] John Viega. "The CLASP Application Security Process". 2005.
http://searchsoftwarequality.techtarget.com/searchAppSecurity/downloads/clasp_v20.pdf
[Krsul] Ivan Krsul. "Software Vulnerability Analysis". 1998. Purdue
University PhD Thesis.
ftp://ftp.cerias.purdue.edu/pub/papers/ivan-krsul/krsul-phd-thesis.pdf
[Landwehr] Carl E. Landwehr, Alan R. Bull, John P. McDermott, William
S. Choi.
"A Taxonomy of Computer Program Security Flaws, with Examples".
November 1993. NRL/FR/5542--93-9591.
http://cwe.mitre.org/documents/sources/ATaxonomyofComputerProgramSecurityFlawswithExamples%5BLand...
[PLOVER] Steve Christey. The Preliminary List of Vulnerability
Examples for Researchers (PLOVER). August 2005. NIST Workshop on
Defining the State of the Art of Software Security Tools.
http://cwe.mitre.org/documents/sources/PLOVER.pdf
[Vulncat] Fortify Software Security Research Group, Dr. Gary McGraw.
"Fortify Taxonomy: Software Security Errors". 2008.
http://www.fortify.com/vulncat/en/vulncat/index.html