The Evolution of the CWE Development and Research Views

Document version: 1.0    Date: September 9, 2008

This is a draft document. It is intended to support maintenance of CWE, and to educate and solicit feedback from a specific technical audience. This document does not reflect any official position of the MITRE Corporation or its sponsors. Copyright © 2008, The MITRE Corporation. All rights reserved. Permission is granted to redistribute this document if this paragraph is not removed. This document is subject to change without notice.

Author: Steve Christey
URL: http://cwe.mitre.org/documents/views/view-evolution.html

Table of Contents
  1. Introduction
  2. Seven Pernicious Kingdoms (CWE-700)
  3. Development Concepts View (CWE-699)
  4. Research Concepts View (CWE-1000)
  5. Process for Creating Abstractions in the Research View
  6. Clarity of Names and Descriptions
  7. References
Introduction

Since CWE is intended to facilitate communication, it must support multiple stakeholders. In the past year, we have been developing multiple views that serve different purposes and audiences. Each view selects and organizes the appropriate content in a manner that makes the most sense for its intended audience.

Before Draft 8, CWE only provided one way of organizing data, in a roughly hierarchical fashion. This scheme borrowed heavily from multiple past efforts, including Seven Pernicious Kingdoms [7PK], [CLASP], [PLOVER], and Landwehr et al. [Landwehr]. It also used content from many different sources with different perspectives.

In the early stages of CWE's development, many different people took turns at building up and maintaining the CWE content, which led to many inconsistent classifications. This reduced the utility of the tree for navigation and browsing, sometimes making it difficult for a user to find the proper CWE ID to use for a weakness, leading to frustration and mapping errors. This also made it easier for CWE maintainers to accidentally introduce a duplicate entry.

These problems were exacerbated by the prevalence of inconsistent or vague terminology within CWE names and descriptions. Some of these terminology problems are endemic to the security industry, some were inherited from previous sources that CWE used, and some were introduced by the CWE team itself.

These difficulties are not unique to CWE. For example, the problems of repeatability and mutual exclusiveness in vulnerability taxonomies have been well documented since at least Ivan Krsul's PhD thesis [Krsul].

In the months leading to the release of CWE 1.0, we have been addressing these problems by:

  • building multiple different views within CWE, to support multiple audiences

  • improving the existing views so that their organization is more consistent

  • changing names and descriptions to provide more precise information that minimizes the amount of confusion for each CWE entry

This document describes how we have organized the two main organizational views of CWE:

  • Development Concepts (CWE-699) is geared towards developers and people who are familiar with other vulnerability-related taxonomies.

  • Research Concepts (CWE-1000) is oriented towards academic research, creating a new framework for classifying weaknesses.

We also describe how we are trying to address the limitations of existing names and descriptions.

A separate document contrasts the CWE-699 and CWE-1000 approaches with those of Seven Pernicious Kingdoms, which has its own view in CWE 1.0, CWE-700. It can be found at: http://cwe.mitre.org/documents/views/view-comparison.html

Seven Pernicious Kingdoms (CWE-700)

One of the most familiar taxonomies is Seven Pernicious Kingdoms [7PK]. Its focus is on weaknesses that can be analyzed by tools, preferring "concrete and specific problems" over "abstract and theoretical ones." It defines 7 kingdoms (plus an extra kingdom), which are broad categories that decompose into phyla. Some examples of the pernicious kingdoms are input validation, API abuse, and security features.

The Seven Pernicious Kingdoms view (CWE-700) is based on the original 7PK paper, which is focused primarily on usefulness to developers. It groups weaknesses by categories.

Development Concepts View (CWE-699)

The Development Concepts view (CWE-699), sometimes referred to as the Development view, organizes weaknesses around concepts that are frequently used or encountered in software development. Accordingly, this view can align closely with the perspectives of developers, educators, and assessment vendors.

Some of the goals of CWE-699 include flexible navigation (useful categories), familiarity (similarity to other efforts), and coverage (identifying all low-level CWE weaknesses).

To achieve familiarity, the higher-level nodes in the view borrow heavily from the structure used by Seven Pernicious Kingdoms [7PK], the categories of errors in [CLASP], the Genesis and Location classifications used by [Landwehr] et al., and the Preliminary List of Vulnerability Examples for Researchers [PLOVER]. As a result, the Development view can be readily understood by users who are already familiar with these other taxonomies.

With respect to navigation, these past taxonomies and ongoing CWE maintenance have introduced a variety of different categories for weaknesses. This provides a mechanism for CWE users to navigate through the large number of weaknesses that are covered, from a variety of perspectives. Many of these categories define groups of weaknesses based on common attributes such as language, resource, or consequence. Categories include pointer issues, mobile code issues, error handling, data handling, time and state, temporary file problems, weaknesses in J2EE and ASP environments, web problems, and so on.

The underpinnings of the Development view have been in place since the earliest drafts, serving as the main organizational structure in CWE until it was formalized as CWE-699 in CWE 1.0.

One challenge for CWE-699 is that the large number of available categories can lead to inconsistencies in how people map issues to CWE identifiers. For example, in Draft 8, CWE-444 (HTTP Request Smuggling) was a child of the category CWE-442 (Web Problems). However, this category was incomplete, so somebody who navigated to the Web Problems category would not find CWE-79 (XSS). In addition, CWE-444 was not classified in any other way, so a developer or researcher would not be able to see how it related to other issues.

For information on other considerations for the Development view, see the "Taxonomic and Related Properties" section at http://cwe.mitre.org/documents/views/view-comparison.html

Research Concepts View (CWE-1000)

While multiple perspectives and category-based views into CWE are very useful, a view based solely on weaknesses and their abstractions is also needed to provide a more formal mechanism for classifying weaknesses, maintaining CWE, and performing vulnerability research.

CWE-1000, the Research Concepts View (sometimes referred to as the Research view), was developed to address this need. It classifies weaknesses in a way that largely ignores how they can be detected, where they appear in code, and when they are introduced in the software development lifecycle. Accordingly, it avoids capturing relationships based on specific language, environment, technology, framework, frequency of occurrence, impact, and mitigation. (Since these relationships are convenient for many users, they are captured in CWE-699). By doing so, we have been able to concentrate on canonical factors that make each weakness unique.

The Research view is mainly organized according to abstractions of software behaviors and the resources that are manipulated by those behaviors, which aligns with MITRE's research into vulnerability theory. In addition to classification, the Research view explicitly models the inter-dependencies between weaknesses, which have not been a formal part of past classification efforts. The main examples are chains and composites.

The view uses multiple deep hierarchies as its organization structure, with more levels of abstraction than other classification schemes. Ideally, the abstraction is only on weakness-to-weakness relationships, with minimal overlap and no categories. Thus, weaknesses would be present from the lowest levels all the way to roots of the tree. Each member weakness would cover a single error. The top-level entries of each of the hierarchies are called Pillars.

This approach lays the groundwork for a more solid understanding of the complexity of weaknesses and their impact. Once the foundation is established, the Research view can then be used to systematically identify theoretical gaps within CWE. We can then start adding in lower-level variations to create a more complex and thorough understanding of weaknesses.

The organizational structure of the Development view (CWE-699) is not appropriate for the goals of the Research view (CWE-1000). In CWE-699, weaknesses belong to multiple categories, the relevance of which can change based on the context in which the weakness occurs, or the perspective of the viewer. As a result, weaknesses are often children of categories that have little to do with the nature of the underlying software error.

Process for Creating Abstractions in the Research View

To make the Research view conform to a single perspective, we tried to build abstractions of the core issues behind each weakness.

This work was performed inductively, using available weakness data. The primary source was the existing repository of weaknesses from CWE Drafts 8 and 9. Other sources were background knowledge and observed examples from CVE work that were not yet classified, repositories that we received under NDA, and substantial suggestions from an anonymous source.

The inductive process was not formal, but the general approach evolved so that "core issues" were defined in terms of invalid behaviors on resources. Abstraction was then performed on these two key concepts. This was not always done, however.

Identification of Abstraction Levels

The CWE team labeled each weakness according to different levels of abstraction: Class, Base, and Variant. These levels have not been formally defined, so there are some inconsistencies in the definitions. In general, however, a Class describes a weakness in a very abstract fashion, typically independent of any specific language or technology. At the lowest level, a Variant describes a weakness at a very low level of detail, typically limited to a specific language, technology, programming idiom, or resource type. Between the Class and Variant levels, the Base describes a weakness in an abstract fashion, but with sufficient details that the reader could infer specific methods for detection and prevention based solely on the description.

For CWE-1000 (Research View), the team then identified and resolved logical inconsistencies in weakness relationships, such as a Variant that was a parent of a Class, or a Base that had no Class parents. This focused efforts in undeveloped areas of the hierarchy.
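The two kinds of inconsistency named above can be checked mechanically. The following is a minimal Python sketch, not the CWE team's actual tooling: the `Level` ordering, the `find_inconsistencies` helper, and the entry identifiers `W1` through `W4` are hypothetical and chosen only for illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    # Assumption of this sketch: a larger value means a more abstract entry.
    VARIANT = 1
    BASE = 2
    CLASS = 3


def find_inconsistencies(levels, parents):
    """Return (child, parent, reason) tuples for suspicious relationships."""
    problems = []
    for child in sorted(levels):
        child_parents = parents.get(child, [])
        for parent in child_parents:
            # Example: a Variant that is the parent of a Class.
            if levels[parent] < levels[child]:
                problems.append((child, parent, "parent is less abstract than child"))
        # A Base is expected to have at least one Class parent.
        if levels[child] == Level.BASE and not any(
            levels[p] == Level.CLASS for p in child_parents
        ):
            problems.append((child, None, "Base has no Class parent"))
    return problems


# Hypothetical entries, not real CWE identifiers.
levels = {"W1": Level.CLASS, "W2": Level.BASE, "W3": Level.VARIANT, "W4": Level.BASE}
parents = {"W1": ["W3"], "W2": ["W1"], "W4": []}

problems = find_inconsistencies(levels, parents)
for problem in problems:
    print(problem)
```

In this toy data, `W1` is flagged because its parent is a Variant, and `W4` is flagged because it is a Base without a Class parent. Each flagged entry marks a part of the hierarchy that probably needs more work.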

Performing Abstraction

Roughly speaking, the informal process was:

  • examine the weakness if it does not have any weakness parents.

  • determine if the issue is really describing a more complex combination of lower-level weaknesses, such as a chain or composite

  • if a chain or composite, or other compound element, treat each weakness separately, and capture appropriate relationships between them (e.g. using the CanPrecede relationship for chains)

  • label each weakness according to different levels of abstraction: Class, Base, and Variant.

  • for each weakness, identify the key behavior that seems to be invariant across any weakness of this type, and how the behavior is erroneous. Ensure that the behavior is reflected in the weakness, and build an abstraction of behaviors.

  • if multiple behaviors are involved, then the issue is probably a category that requires a split into separate weaknesses.

  • if the weakness is specific to a particular type of resource, identify the type of resources being manipulated, and build an abstraction of that resource.

  • for each arbitrary grouping (i.e. category), look at each child weakness, make sure that child has a parent that is a weakness, then remove the category from the view (typically this involved shifting the category relationships to the Development view).

  • when a weakness cannot be abstracted any more, it is treated as a Pillar - a root of the hierarchy.
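The first step of this process, finding weaknesses that have no weakness parents, can be sketched as a simple filter. This is an illustrative Python sketch only. The `lacks_weakness_parent` helper and the dictionary layout are assumptions, and the toy data reuses a few relationships mentioned elsewhere in this document (some from Draft 8, one from CWE 1.0) rather than a real CWE export.

```python
def lacks_weakness_parent(cwe_id, kinds, parents):
    """True if none of the entry's parents in this view is a weakness."""
    return not any(kinds.get(p) == "weakness" for p in parents.get(cwe_id, []))


# Entry kinds: "weakness" or "category" (labels are an assumption of this sketch).
kinds = {
    "CWE-4": "category",
    "CWE-8": "weakness",
    "CWE-559": "category",
    "CWE-560": "weakness",
    "CWE-118": "weakness",
    "CWE-119": "weakness",
}

# Parent relationships within a single view.
parents = {
    "CWE-8": ["CWE-4"],       # only a category parent
    "CWE-560": ["CWE-559"],   # only a category parent
    "CWE-119": ["CWE-118"],   # already has a weakness parent
}

to_examine = [
    cwe_id
    for cwe_id in ("CWE-8", "CWE-560", "CWE-119")
    if lacks_weakness_parent(cwe_id, kinds, parents)
]
print(to_examine)
```

Entries returned by such a filter either need a new weakness parent or, if they cannot be abstracted further, are candidates for Pillars.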

At the release of CWE 1.0, only one category remains, which could not be resolved by release time because of the need for community involvement.

It should be noted that we did not merge pillars by following the "seven plus one" principle as used in 7PK. For example, both CWE-682 (Incorrect Calculation) and CWE-330 (Use of Insufficiently Random Values) are related to numbers, but they were not merged into one pillar, because it would force the creation of a category.

The modeling of behaviors and resources is most evident under the CWE-664 pillar, Insufficient Control of a Resource Through its Lifetime. Note that CWE-664 is still a work in progress, e.g. with respect to multiple perspectives.

Abstraction in Action

Consider CWE-8 (J2EE Misconfiguration: Entity Bean Declared Remote), as it was in Draft 8 (with a more up-to-date name for clarity).

  • examine parents: the only parent is the category CWE-4 (J2EE Environment Issues). This weakness needs a new parent under CWE-1000.

  • identify abstraction level: the weakness is technology-specific (J2EE) and resource-specific (Entity Bean). Thus the abstraction level is Variant.

  • identify behavior: the code specifies that an entity bean can be accessed remotely.

  • abstract behavior: the code specifies improper access for an entity bean, defining a sphere of control that is broader than intended.

  • abstract resource: the code specifies improper access for a resource, defining an improper sphere of control.

  • match CWE: this closely matches CWE-668 (Exposure of Resource to Wrong Sphere), a Class. Thus the mapping fit would be ABSTRACTION [Loveless].

  • The closest parent, CWE-668, is a Class and therefore very abstract, whereas CWE-8 is a Variant. This is an inconsistency in the level of abstraction that suggests that a Base should probably be identified. In CWE 1.0, CWE-668 does not have any children that would be more appropriate.

Another example involves CWE-560 (Use of umask() with chmod-style Argument).

In Draft 8, this had an uninformative name of "Often Misused: umask()". The description was not much more specific, stating "The mask specified by the argument umask() is often confused with the argument to chmod()."

  • examine parents: the only parent is the category CWE-559 (Often Misused: Arguments and Parameters), which is intended to collect anything related to arguments and parameters. So, this item needs a new parent under CWE-1000.

  • identify abstraction level: the weakness is OS-specific (UNIX) and language-specific (C). Thus the abstraction level is Variant.

  • identify behavior: since CWE-560's name and description were not sufficiently clear, examine the context notes within CWE-560. This explains that the core error is that the programmer calls umask(), but the argument to umask is specified using a number that specifies permissions in chmod(). The same value has different interpretations in chmod() versus umask(), so the programmer is not specifying the argument to umask() correctly. This could cause the program to create files with insecure permissions.

  • identify chain: notice that the "incorrect specification of umask() argument" is primary to a later behavior, in which files are created with insecure permissions. Thus there is a chaining relationship; here, we will concentrate on the primary weakness, the incorrect specification of the argument.

  • abstract behavior: specification of the behavior is already pretty abstract, so there is not much room to go. "The programmer invokes a function with an incorrectly specified umask."

  • abstract resource: the argument to the function is a code-layer resource, so one could abstract to: "The programmer invokes a function with an incorrectly specified argument."

  • match CWE: this is an exact match with CWE-687 (Function Call With Incorrectly Specified Argument Value). This is labeled as a Variant, but it is arguably a Base or Class, since it is not specific to any technology or language.

  • set CWE-687 as the parent of CWE-560 under view 1000.

  • preserve CWE-559 as another parent of CWE-560, but shift the relationship to the Development Concepts view (CWE-699).
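The arithmetic behind the umask()/chmod() confusion in CWE-560 can be shown without touching the file system. The sketch below relies on the standard UNIX rule that permission bits set in the umask are cleared from the requested mode. The helper name `mode_after_umask` and the requested mode of 0666 are assumptions made for illustration.

```python
def mode_after_umask(requested_mode, umask):
    """Permission bits a new file receives: requested bits minus the umask bits."""
    return requested_mode & ~umask & 0o777


# Assumption: the program asks for 0666 when creating a regular file.
requested = 0o666

# The programmer wants files to end up as 0644 (rw-r--r--).
# Correct: pass the bits to REMOVE, i.e. a umask of 022.
correct = mode_after_umask(requested, 0o022)

# Mistaken: pass the chmod-style value 0644 as if it were the desired mode.
mistaken = mode_after_umask(requested, 0o644)

print(oct(correct))   # 0o644
print(oct(mistaken))  # 0o22
```

With the chmod-style argument, the resulting mode is 0022: the owner loses access while group and others gain write permission, which illustrates how the primary weakness (the incorrectly specified argument) can lead to the resultant weakness (insecure file permissions).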

Identification of Perspective Issues

During construction of the Research view, sometimes there would be confusion with respect to how to classify a weakness. In some cases, this was due to knowledge gaps within CWE, and sometimes these would be resolved. In other cases, multiple classifications remained, and we preserved these. We suspect that when there are multiple classifications, there are differences of perspective. However, we could not investigate this problem deeply before the CWE 1.0 release.

One of the main examples of the perspective problem is in chains and composites. We found that much of the classification task was simplified once we learned how to recognize chains. It may be that some of the weaknesses with multiple parents are composites.

Consider CWE-568 (finalize() Method Without super.finalize()). In CWE 1.0, it has two parents under View 1000:

  • 573 - Failure to Follow Specification

  • 404 - Improper Resource Shutdown or Release

In some sense, the CWE-568 finalize() weakness is a composite, since it is a failure to follow specification in code that is intended to perform resource shutdown. Thus it would be a child of the (573+404) composite. An alternate interpretation could be that this is a chain: by not following specification, the program performs an improper resource shutdown.

While the previous example demonstrates the level of complexity that can arise when trying to identify the core aspects of a weakness, the chain and composite concepts have also helped us to understand why classification has been such a difficult challenge in the past. In turn, these are probably responsible for many mapping errors and inconsistencies.

Many issues could be described by two or more high-level Pillars. For example, everything could be argued as "API Abuse" - by calling strcpy() with an overly long buffer, the programmer is violating the contract. However, this is not regarded as "API Abuse" under view 1000, nor under 7PK. How to clearly communicate these concepts will remain a challenge.

Some further description of the perspectives problem is at:

http://www.nabble.com/Identifying-Perspective-Issues-in-CWE-to18245971.html

Using the Research View to Identify Gaps

Another benefit of collocating similar core weaknesses is the ease of applying a consistent vocabulary across CWE. Weaknesses that had no relationships before are brought together and allow the CWE team to see inconspicuous relationships and trends. For example, CWE-696 (Incorrect Behavior Order) was created as a weakness class for several entries where the fundamental problem was performing operations in the incorrect sequence. The first level of children of CWE-696 under the Research view now reads:

  • CWE-179 Incorrect Behavior Order: Early Validation

  • CWE-408 Incorrect Behavior Order: Early Amplification

  • CWE-551 Incorrect Behavior Order: Authorization Before Parsing and Canonicalization

Since abstraction of nodes is primarily based on behaviors and resources, and the view only covers weaknesses, a higher-level node could be used to provide guidance on how to identify potential new children. For example, CWE-118 (Improper Access of Indexable Resource) only has one child under view 1000, CWE-119 (Failure to Constrain Operations within the Bounds of an Allocated Memory Buffer). By considering other types of indexable resources, one could identify equivalent weaknesses, such as access of file segments through user-controlled file offsets.
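One rough way to surface such candidates is to look for abstract nodes that have only a single child. This heuristic is an assumption of the sketch below, not a documented CWE procedure, and the `single_child_nodes` helper is hypothetical. The data reflects the two parent/child groupings described in this section.

```python
def single_child_nodes(children):
    """Return nodes with exactly one child, as candidates for gap analysis."""
    return sorted(node for node, kids in children.items() if len(kids) == 1)


# Children under the Research view, as described in this section.
children = {
    "CWE-118": ["CWE-119"],
    "CWE-696": ["CWE-179", "CWE-408", "CWE-551"],
}

candidates = single_child_nodes(children)
print(candidates)
```

A node flagged this way is only a prompt for an analyst to ask whether sibling weaknesses exist. A single child does not prove that a gap is present.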

In views such as 699, which have a less formal organization, the identification of gaps would be more haphazard and less systematic.

Identification of Duplicates

With an organization that focuses mostly on behaviors and resources, the development of View 1000 helped to identify duplicate CWE entries. For example, duplicates CWE-132 (Miscalculated Null Termination) and CWE-170 (Improper Null Termination) used to be in disjoint segments of the hierarchy. CWE-132 was under CWE-682 (Incorrect Calculation), which is really a chaining relationship, while CWE-170 was under CWE-169 (Technology-Specific Special Elements), which is a category for any technology-specific weakness related to special elements. Since CWE-170 was unique to null-terminated strings, it was reasonable that CWE-170 would be categorized that way.

When reorganized by core weakness, these two entries fell in the same location, because the behaviors and resource types were the same. Thus, identifying these entries as duplicates was much more obvious.

One critical step was to recognize that one way of describing the missing null termination was through a chain:

  • CWE-682 Incorrect Calculation

  • CWE-170 Improper Null Termination

  • CWE-119 Failure to Constrain Operations within the Bounds of an Allocated Memory Buffer

Notice how in this case, CWE-119 would be resultant, and CWE-682 would be primary. Due to an incorrect calculation (CWE-682), proper null termination would not occur (CWE-170), and ultimately the program might read or write memory outside of a buffer (CWE-119). The chaining relationship of these three weaknesses could introduce a vulnerability.
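The chain above can be represented as an ordered list, from which the roles and the pairwise CanPrecede relationships follow directly. This is a minimal sketch: the helper names are hypothetical, and the label "intermediate" for the middle step is an assumption, since this document only names the primary and resultant roles.

```python
def chain_roles(chain):
    """Label the first link as primary and the last link as resultant."""
    roles = []
    last = len(chain) - 1
    for index, cwe_id in enumerate(chain):
        if index == 0:
            role = "primary"
        elif index == last:
            role = "resultant"
        else:
            role = "intermediate"  # label assumed for this sketch
        roles.append((cwe_id, role))
    return roles


def can_precede_pairs(chain):
    """Adjacent links, each of which could be recorded as a CanPrecede relationship."""
    return list(zip(chain, chain[1:]))


chain = ["CWE-682", "CWE-170", "CWE-119"]

roles = chain_roles(chain)
pairs = can_precede_pairs(chain)
print(roles)
print(pairs)
```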

The question then becomes how to classify CWE-170, the middle step of this chain. During this step, the associated behavior does not properly maintain the structure of data (i.e. the null termination), which is relied on by the code. So, this falls under CWE-707 (Failure to Enforce that Messages or Data are Well-Formed).

Clarity of Names and Descriptions

All CWE entries should be clearly written to avoid confusion. This is important for educating the public, but it is also critical for ensuring that CWE mappings are repeatable.

Based on our experiences with how people are mapping to CWE, we have seen evidence that sometimes people can rely exclusively on the CWE name when deciding how to map an issue to CWE. This was seen when an issue was incorrectly mapped to a CWE entry when the entry had a vague name (which could match), and a precise description that clearly was not a match. There is frequent confusion based on the name, description, observed examples, and demonstrative examples.

One problem is that weakness/vulnerability terminology is not particularly expressive, outside of a handful of terms. Some terms have multiple uses (such as "overflow" and "leak"), and there are often perspective problems, e.g. the attack-focused "SQL injection" phrase.

We are trying to make the CWE name and description more clear about the weakness being covered, and to keep the perspective on the weakness itself, instead of the attack or consequence. However, to keep CWE entries as accessible to the everyday user as possible, we want to preserve commonplace terminology where feasible.

We tried to change the names so that a CWE consumer would not have to depend so much on looking up the item's description and associated notes, just to figure out what the item is talking about.

Support for Alternate Names

Sometimes, multiple terms are used for the same weakness, such as "Cross-site Request Forgery" and "Cross-site Reference Forgery". These alternate terms are recorded in CWE. For terms that are extremely common, we include them in parentheses, such as:

  • Failure to Sanitize CRLF Sequences (aka 'CRLF Injection')

  • Inconsistent Interpretation of HTTP Requests (aka 'HTTP Request Smuggling')

Some entries contain Terminology Notes that describe the terminological issues for that entry and related weaknesses.

Many entries contain mappings to various taxonomies. These mappings contain the original names as used in those taxonomies.

Finally, when an entry's name changes, the previous node names are also recorded.

Properties of Good CWE Names

The criteria for a good name include:

  • single interpretation

  • avoid confusing terms

  • avoid vagueness

  • avoid multi-use or multi-perspective terms

  • use established terminology if available

The litmus test for a name change is simple: if a CWE analyst doesn't have a good idea of the issue after reading the name, then it needs to be changed. As a result, we removed a lot of non-specific terms such as "insecure," "improper," and "erroneous." We are slowly building a more consistent vocabulary, but this is still a work in progress.
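A simple screen for the non-specific terms listed above could look like the following Python sketch. It is only an illustration of the idea, not a tool used by the CWE team. The `non_specific_terms` helper and the substring matching are assumptions, and the two sample names are taken from the CWE-590 example discussed in this document.

```python
# Terms this document lists as non-specific.
NON_SPECIFIC_TERMS = ("insecure", "improper", "erroneous")


def non_specific_terms(name):
    """Return the non-specific terms that occur in a CWE entry name."""
    lowered = name.lower()
    # Substring matching also catches forms such as "Improperly".
    return [term for term in NON_SPECIFIC_TERMS if term in lowered]


old_name = "Improperly Freeing Heap Memory"
new_name = "Free of Invalid Pointer Not on the Heap"

print(non_specific_terms(old_name))
print(non_specific_terms(new_name))
```

A check like this can only flag candidates for review. Whether a name has a single interpretation still requires an analyst's judgment.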

Examples of Problematic Names

Some examples include:

  • Dangerous Functions: this phrase was originally used in 7PK. CWE inherited this phrase from 7PK and used it as the name for CWE-242, until the name was changed in Draft 9. Both the 7PK and CWE descriptions tried to make it clear that this category was for functions that have inherent vulnerabilities that can never be guaranteed to work safely, such as gets(). However, it was sometimes interpreted by people as including any function that has certain risks but can be safe if used properly, such as strcpy().

  • Overflow of static internal buffer: this phrase was inherited from CLASP and used in CWE-500 in Draft 8 and earlier. The phrase implies a general type of buffer overflow, but the core issue discussed in CLASP is that an object contains a field that hasn't been marked as final, which might enable modification of the field to trigger a buffer overflow. Presumably, other attacks are also possible. In CWE, the non-final field is regarded as a primary weakness in a chain that could have multiple resultant issues, such as an overflow. In this context, the CLASP-style name is describing a consequence, not the underlying weakness.

  • Improperly Freeing Heap Memory: this name was used in the Draft 8 version of CWE-590. Here, the "improper" term has multiple interpretations:

    • double-free

    • not clearing sensitive heap memory before freeing (heap inspection)

    • freeing memory too early because it's referenced later (use after free)

    • running free() on an object that was allocated using new()

    • running delete() on memory that was allocated using malloc()

    Another problem is that, since the name is in accordance with the Research Concepts model of behaviors and resources, it could be interpreted as a Base-level weakness with multiple Variant children, such as those outlined above. This would be an incorrect assumption, since the core issue being described by CWE-590 is actually a Variant instead of a Base.

    In Draft 9, CWE-590 was renamed to "Free of Invalid Pointer Not on the Heap."

  • Often Misused: Authentication: this phrase was inherited from 7PK and used in CWE-247 in Draft 8 and earlier. If only the name is used, it could be interpreted as a very general category that covers anything related to authentication. Closer examination of the description, however, makes it clear that this is a language-specific, OS-specific, low-level variant related to reliance on the getlogin() function.

  • Mobile Code: Object Hijack: this phrase was inherited from 7PK and used in CWE-491 in Draft 8 and earlier. However, it is attack oriented, and it does not specify the nature of the weakness that allows this attack to occur. In a mapping task, this allows room for multiple interpretations and increases the chance of mapping errors. At least, it forces the reader to perform more research in order to figure out what is being discussed.

  • Memory Locking: this phrase was used in CWE-591 in Draft 8 and earlier. It is not weakness-focused; it is about a behavior without any specification of what is wrong with the behavior. So it has multiple interpretations, such as:

    • this could be a category listing all weaknesses that are related to memory locking

    • this could be a weakness about a program that does not lock memory when it should

    • this could be a weakness about a program that attempts to lock memory correctly, but does not

    • this could be a weakness about a program that locks memory, but doesn't release the lock and enables deadlock conditions

    In Draft 9, the name was changed to "Sensitive Data Storage in Improperly Locked Memory."

Other examples of name changes can be found by reviewing the Previous_Entry_Names elements of individual CWE entries.

References

[7PK] Katrina Tsipenyuk, Brian Chess, Gary McGraw. "Seven Pernicious Kingdoms: A Taxonomy of Software Security Errors". November 2005. NIST Workshop on Software Security Assurance Tools, Techniques, and Metrics. http://cwe.mitre.org/documents/sources/SevenPerniciousKingdoms.pdf

[CLASP] John Viega. "The CLASP Application Security Process". 2005. http://searchsoftwarequality.techtarget.com/searchAppSecurity/downloads/clasp_v20.pdf

[Krsul] Ivan Krsul. "Software Vulnerability Analysis". 1998. Purdue University PhD Thesis. ftp://ftp.cerias.purdue.edu/pub/papers/ivan-krsul/krsul-phd-thesis.pdf

[Landwehr] Carl E. Landwehr, Alan R. Bull, John P. McDermott, William S. Choi. "A Taxonomy of Computer Program Security Flaws, with Examples". November 1993. NRL/FR/5542--93-9591. http://cwe.mitre.org/documents/sources/ATaxonomyofComputerProgramSecurityFlawswithExamples%5BLand...

[PLOVER] Steve Christey. The Preliminary List of Vulnerability Examples for Researchers (PLOVER). August 2005. NIST Workshop on Defining the State of the Art of Software Security Tools. http://cwe.mitre.org/documents/sources/PLOVER.pdf

[Vulncat] Fortify Software Security Research Group, Dr. Gary McGraw. "Fortify Taxonomy: Software Security Errors". 2008. http://www.fortify.com/vulncat/en/vulncat/index.html


Page Last Updated: October 29, 2008