CWE-1434: Insecure Setting of Generative AI/ML Model Inference Parameters

Weakness ID: 1434
Vulnerability Mapping: ALLOWED (this CWE ID may be used to map to real-world vulnerabilities)
Abstraction: Base (a weakness that is still mostly independent of a resource or technology, but with sufficient details to provide specific methods for detection and prevention; Base-level weaknesses typically describe issues in terms of 2 or 3 of the following dimensions: behavior, property, technology, language, and resource)
+ Description
The product has a component that relies on a generative AI/ML model configured with inference parameters that produce an unacceptably high rate of erroneous or unexpected outputs.
+ Extended Description

Generative AI/ML models, such as those used for text generation, image synthesis, and other creative tasks, rely on inference parameters that control model behavior, such as temperature, Top P, and Top K. These parameters affect the model's internal decision-making processes, learning rate, and probability distributions. Incorrect settings can lead to unusual behavior such as text "hallucinations," unrealistic images, or failure to converge during training. The impact of such misconfigurations can compromise the integrity of the application. If the results are used in security-critical operations or decisions, then this could violate the intended security policy, i.e., introduce a vulnerability.
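To make concrete how these parameters reshape a model's output probabilities, the following sketch shows temperature and top-k applied to raw logits before a token is sampled. It is illustrative only; the function and variable names are not drawn from this entry, and real inference stacks implement this inside the serving framework.

(informative)
Example Language: Python
import numpy as np

def sample_token(logits, temperature=1.0, top_k=None, rng=None):
    """Sample one token id from raw logits, showing how temperature and
    top-k reshape the probability distribution before sampling."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=float)

    # Temperature rescales the logits: values above 1 flatten the
    # distribution (more surprising tokens become likely), while values
    # near 0 sharpen it toward deterministic argmax behavior.
    scaled = logits / max(temperature, 1e-6)

    # Top-k keeps only the k highest-scoring candidate tokens.
    if top_k is not None:
        k = min(top_k, len(scaled))
        cutoff = np.sort(scaled)[-k]
        scaled = np.where(scaled >= cutoff, scaled, -np.inf)

    # Softmax over the rescaled, truncated logits, then sample.
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

Top P (nucleus) sampling works similarly, except that it keeps the smallest set of tokens whose cumulative probability exceeds P.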

+ Common Consequences
Impact: Varies by Context; Unexpected State
Scope: Integrity, Other
The product can generate inaccurate, misleading, or nonsensical information.

Impact: Alter Execution Logic; Unexpected State; Varies by Context
Scope: Other
If outputs are used in critical decision-making processes, errors could be propagated to other systems or components.
+ Potential Mitigations
Phase(s): Implementation; System Configuration; Operation
Develop and adhere to robust parameter tuning processes that include extensive testing and validation (a rough sketch of such a process appears after this list).

Phase(s): Implementation; System Configuration; Operation
Implement feedback mechanisms to continuously assess and adjust model performance.

Phase(s): Documentation
Provide comprehensive documentation and guidelines for parameter settings to ensure consistent and accurate model behavior.
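As one illustration of the parameter-tuning mitigation above, configuration changes can be gated on a measured quality metric rather than intuition. The sketch below is an assumption-laden outline: generate(), the prompt set, and is_acceptable() are placeholders for whatever inference API and evaluation criteria the product actually uses.

(informative)
Example Language: Python
def evaluate_temperatures(generate, prompts, is_acceptable, temperatures):
    """Return the fraction of acceptable outputs for each candidate
    temperature so that a parameter change can be validated empirically."""
    results = {}
    for temp in temperatures:
        ok = sum(
            1 for prompt in prompts
            if is_acceptable(prompt, generate(prompt, temperature=temp))
        )
        results[temp] = ok / len(prompts)
    return results

def passes_quality_gate(results, threshold=0.95):
    """Reject any configuration whose acceptable-output rate falls below
    a project-chosen threshold."""
    return {temp: rate >= threshold for temp, rate in results.items()}

The same harness can be re-run whenever the model, its parameters, or its prompts change, which also supports the continuous-feedback mitigation.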
+ Relationships
+ Relevant to the view "Research Concepts" (View-1000)
Nature Type ID Name
ChildOf Base 440 Expected Behavior Violation
ChildOf Class 665 Improper Initialization
PeerOf Pillar 691 Insufficient Control Flow Management
CanPrecede Class 684 Incorrect Provision of Specified Functionality
+ Modes Of Introduction
Phase: Build and Compilation
During model training, hyperparameters may be set without adequate validation or understanding of their impact.

Phase: Installation
During deployment, model parameters may be adjusted to optimize performance without comprehensive testing.

Phase: Patching and Maintenance
Updates or modifications may be made to the model that alter its behavior without thorough re-evaluation.
+ Applicable Platforms
Languages

Class: Not Language-Specific (Undetermined Prevalence)

Architectures

Class: Not Architecture-Specific (Undetermined Prevalence)

Technologies

AI/ML (Undetermined Prevalence)

Class: Not Technology-Specific (Undetermined Prevalence)

+ Demonstrative Examples

Example 1


Assume the product offers an LLM-based AI coding assistant to help users write code as part of an Integrated Development Environment (IDE). Assume the model has been trained on real-world code and behaves normally under its default settings. Suppose the default temperature is 1, with a valid range from 0 (most deterministic) to 2.

Consider the following configuration.

(bad code)
Example Language: JSON 
{
"model": "my-coding-model",
"context_window": 8192,
"max_output_tokens": 4096,
"temperature", 1.5,
...
}

The problem is that the configuration contains a temperature hyperparameter that is higher than the default. This significantly increases the likelihood that the LLM will suggest a package that did not exist at training time, a behavior sometimes referred to as "package hallucination." Note that other possible behaviors could arise from higher temperature, not just package hallucination.

An adversary could anticipate which package names could be generated and create a malicious package. For example, it has been observed that the same LLM might hallucinate the same package regularly. Any code that is generated by the LLM, when run by the user, would download and execute the malicious package. This is similar to typosquatting.

The risk could be reduced by lowering the temperature, which reduces unpredictable outputs and keeps generation more in line with the training data. If the temperature is set too low, some of the power of the model is lost, and it may be less capable of producing solutions for rarely-encountered problems that are not reflected in the training data. However, if the temperature is not set low enough, the risk of hallucinated package names may still be too high. Unfortunately, the "best" temperature cannot be determined a priori, so sufficient empirical testing is needed.

(good code)
Example Language: JSON 
{
...
"temperature", 0.2,
...
}

In addition to more restrictive temperature settings, consider adding guardrails that independently verify any referenced package to ensure that it exists, is not obsolete, and comes from a trusted party (a rough sketch of such a check follows this example).

Note that reducing temperature does not entirely eliminate the risk of package hallucination. Even with very low temperatures or other settings, there is still a small chance that a non-existent package name will be generated.
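One way to implement the package guardrail suggested above, assuming the generated code targets the Python/PyPI ecosystem, is to reject or flag any suggested dependency that is unknown to the package index or absent from a project-approved allowlist. The sketch below queries the public PyPI JSON API; the function names and the approved_packages list are illustrative, and an existence check alone does not establish that a package is trustworthy.

(informative)
Example Language: Python
import urllib.error
import urllib.request

def package_exists_on_pypi(name):
    """Return True if the package index has metadata for this name."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        # A 404 here means the package name is not registered.
        return False

def flag_suspicious_packages(suggested, approved_packages):
    """Flag suggestions that are not on the approved list or that do not
    exist on the index, so a human can review them before any install."""
    return [
        name for name in suggested
        if name not in approved_packages or not package_exists_on_pypi(name)
    ]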



+ Weakness Ordinalities
Ordinality: Primary (where the weakness exists independent of other weaknesses)
+ Detection Methods
Method: Automated Dynamic Analysis
Manipulate inference parameters and perform comparative evaluation to assess the impact of selected values. Build a suite of systems using targeted tools that detect problems such as prompt injection (CWE-1427) and other problems. Consider statistically measuring token distribution to see if it is consistent with expected results (a rough sketch appears after these methods).
Effectiveness: Moderate
Note: Given the large variety of outcomes, it can be difficult to design testing to be comprehensive enough, and there is still a risk of unpredictable behavior.

Method: Manual Dynamic Analysis
Manipulate inference parameters and perform comparative evaluation to assess the impact of selected values. Build a suite of systems using targeted tools that detect problems such as prompt injection (CWE-1427) and other problems. Consider statistically measuring token distribution to see if it is consistent with expected results.
Effectiveness: Moderate
Note: Given the large variety of outcomes, it can be difficult to design testing to be comprehensive enough, and there is still a risk of unpredictable behavior.
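One way to realize the "statistically measure token distribution" idea is to compare the token-frequency distribution produced by a candidate parameter configuration against a previously validated reference run. The sketch below uses KL divergence for that comparison; tokenize() and the review threshold are placeholders, not part of this entry.

(informative)
Example Language: Python
import math
from collections import Counter

def token_distribution(texts, tokenize):
    """Empirical token-frequency distribution over a set of model outputs."""
    counts = Counter(token for text in texts for token in tokenize(text))
    total = sum(counts.values())
    return {token: count / total for token, count in counts.items()}

def kl_divergence(p, q, epsilon=1e-9):
    """KL(p || q) over the union of observed tokens; epsilon stands in
    for tokens missing from one of the distributions."""
    tokens = set(p) | set(q)
    return sum(
        p.get(t, epsilon) * math.log(p.get(t, epsilon) / q.get(t, epsilon))
        for t in tokens
    )

A large divergence between the candidate configuration's outputs and the reference outputs signals that the new inference parameters changed behavior enough to warrant manual review before deployment.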
+ Memberships
Nature Type ID Name
MemberOf Category 1412 Comprehensive Categorization: Poor Coding Practices
+ Vulnerability Mapping Notes
Usage ALLOWED
(this CWE ID may be used to map to real-world vulnerabilities)
Reason Acceptable-Use

Rationale

This CWE entry is at the Base level of abstraction, which is a preferred level of abstraction for mapping to the root causes of vulnerabilities.

Comments

Carefully read both the name and description to ensure that this mapping is an appropriate fit. Do not try to 'force' a mapping to a lower-level Base/Variant simply to comply with this preferred level of abstraction.
+ Notes

Research Gap

This weakness might be under-reported as of CWE 4.18, since there are no clear observed examples in CVE. However, inference parameters may be the root cause of - or important contributing factors to - various vulnerabilities, while the vulnerability reports concentrate more on the negative impact (e.g., code execution) or on the weaknesses that the insecure settings contribute to. Alternatively, dynamic techniques might not reveal the root cause if the researcher does not have access to the underlying source code and environment.
+ References
[REF-1487] Joseph Spracklen, Raveen Wijewickrama, A H M Nazmus Sakib, Anindya Maiti, Bimal Viswanath and Murtuza Jadliwala. "We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs". 2025-03-02.
<https://arxiv.org/abs/2406.10279>. (URL validated: 2025-09-08)
+ Content History
+ Submissions
Submission Date Submitter Organization
2024-06-28
(CWE 4.18, 2025-09-09)
Lily Wong MITRE
+ Contributions
Contribution Date Contributor Organization
2025-02-28
(CWE 4.18, 2025-09-09)
AI WG "New Entry" subgroup
Participated in regular meetings from February to August 2025 to develop and refine most elements of this entry.
Page Last Updated: September 09, 2025