decompiler
Module containing functions for decompiling binaries.
This module manages decompiler registration and configuration, allowing codablellm
to use different backends for binary decompilation.
GET_C_SYMBOLS_QUERY = '(function_definition declarator: (function_declarator declarator: (identifier) @function.symbols ))(call_expression function: (identifier) @function.symbols)'
module-attribute
Tree-sitter query that captures the name of each C function definition and the identifier of each called function.
DecompileConfig
dataclass
Configuration for decompiling binaries.
Source code in src/codablellm/core/decompiler.py
decompiler_args = field(default_factory=list)
class-attribute
instance-attribute
Positional arguments to pass to the decompiler's __init__ method.
decompiler_kwargs = field(default_factory=dict)
class-attribute
instance-attribute
Keyword arguments to pass to the decompiler's __init__ method.
max_workers = None
class-attribute
instance-attribute
Maximum number of binaries to decompile in parallel.
recursive = False
class-attribute
instance-attribute
If True, recursively scan directories for binaries to decompile.
strict = False
class-attribute
instance-attribute
If True, raise exceptions on decompilation failures; otherwise, continue and log warnings.
symbol_remover = None
class-attribute
instance-attribute
Optional strategy used to remove symbols from decompiled functions.
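The fields above can be pictured with a small stand-in dataclass. This is only a sketch: `DecompileConfigSketch` is a hypothetical class that mirrors the documented field names and defaults, it is not imported from codablellm, and the `symbol_remover` type is assumed to be a plain strategy name.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class DecompileConfigSketch:
    """Hypothetical stand-in mirroring the fields documented above."""

    max_workers: Optional[int] = None
    decompiler_args: List[Any] = field(default_factory=list)
    decompiler_kwargs: Dict[str, Any] = field(default_factory=dict)
    recursive: bool = False
    strict: bool = False
    symbol_remover: Optional[str] = None  # assumption: a strategy name such as "strip"


default_config = DecompileConfigSketch()
custom_config = DecompileConfigSketch(max_workers=4, recursive=True, strict=True)

# default_factory gives every instance its own list, so mutating one
# configuration never leaks into another.
custom_config.decompiler_args.append("--example-flag")  # hypothetical argument
```

Using `field(default_factory=...)` rather than a literal `[]` or `{}` is what keeps `default_config.decompiler_args` empty after the append on `custom_config`.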
Decompiler
Bases: ABC
Abstract base class for a decompiler that extracts decompiled functions from compiled binaries.
decompile(path)
abstractmethod
Decompiles a binary and retrieves all decompiled functions contained in it.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `PathLike` | The path to the binary file to be decompiled. | required |
Returns:
| Type | Description |
|---|---|
| `Sequence[DecompiledFunction]` | A sequence of `DecompiledFunction` objects. |
decompile_stripped(path, strategy)
Decompiles a binary and applies a symbol removal strategy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `PathLike` | Path to the binary to decompile. | required |
| `strategy` | `SymbolRemovalStrategy` | Strategy for symbol removal. Options include "strip" (using the … | required |
Returns:
| Type | Description |
|---|---|
| `Sequence[DecompiledFunction]` | A sequence of `DecompiledFunction` objects. |
get_stripped_function_name(address)
abstractmethod
Returns the anonymized name for a function at the given address.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `address` | `int` | The memory address of the function. | required |
Returns:
| Type | Description |
|---|---|
| `str` | A stripped-down or anonymized function name (e.g., …). |
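To illustrate how a backend might satisfy the two abstract methods, here is a minimal sketch. Everything in it is a stand-in: `DecompilerSketch`, `FunctionSketch`, and `FakeDecompiler` are hypothetical names, the fields on `FunctionSketch` are assumed, and no real decompiler is invoked.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from os import PathLike
from typing import Sequence, Union


@dataclass(frozen=True)
class FunctionSketch:
    """Hypothetical stand-in for DecompiledFunction (fields are assumed)."""

    name: str
    address: int
    definition: str


class DecompilerSketch(ABC):
    """Stand-in for the abstract interface documented above."""

    @abstractmethod
    def decompile(self, path: Union[str, PathLike]) -> Sequence[FunctionSketch]:
        ...

    @abstractmethod
    def get_stripped_function_name(self, address: int) -> str:
        ...


class FakeDecompiler(DecompilerSketch):
    """Toy backend that returns one canned function instead of running a real tool."""

    def decompile(self, path):
        return [FunctionSketch("main", 0x401000, "int main(void) { return 0; }")]

    def get_stripped_function_name(self, address):
        # Assumed naming scheme, modeled on the FUN_<addr> placeholder style.
        return f"FUN_{address:08x}"


backend = FakeDecompiler()
functions = backend.decompile("a.out")  # the path is never opened by this toy backend
stripped_name = backend.get_stripped_function_name(functions[0].address)
```

Because both methods are abstract, instantiating `DecompilerSketch` directly raises `TypeError`; only a subclass that implements both can be created.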
create_decompiler(*args, **kwargs)
Initializes an instance of the decompiler that is being used by codablellm.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `*args` | `Any` | Positional arguments to pass to the decompiler's `__init__` method. | `()` |
| `**kwargs` | `Any` | Keyword arguments to pass to the decompiler's `__init__` method. | `{}` |
Returns:
| Type | Description |
|---|---|
| `Decompiler` | An instance of the specified `Decompiler`. |
Raises:
| Type | Description |
|---|---|
| `DecompilerNotFound` | If the specified decompiler cannot be imported or if the class cannot be found. |
decompile(*paths, config, as_flow=True)
Decompiles one or more binaries.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `paths` | `PathLike` | One or more paths pointing to binary files or directories. | `()` |
| `config` | `DecompileConfig` | A `DecompileConfig` object. | required |
Returns:
| Type | Description |
|---|---|
| `List[DecompiledFunction]` | A list of all decompiled functions from the provided binaries. |
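The interaction between `max_workers` and `strict` can be sketched as follows. This is an illustration under stated assumptions, not codablellm's implementation: it uses a standard-library `ThreadPoolExecutor` in place of whatever orchestration the library uses, and `decompile_all` and `fake_decompile` are hypothetical helpers.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional

logger = logging.getLogger(__name__)


def decompile_all(
    paths: Iterable[str],
    decompile_one: Callable[[str], List[str]],
    max_workers: Optional[int] = None,
    strict: bool = False,
) -> List[str]:
    """Run decompile_one over every path in parallel and flatten the results."""
    results: List[str] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [(path, pool.submit(decompile_one, path)) for path in paths]
        # Results are collected in submission order, so output order is stable.
        for path, future in futures:
            try:
                results.extend(future.result())
            except Exception as error:
                if strict:
                    raise  # strict mode: surface the first failure
                logger.warning("Skipping %s: %s", path, error)
    return results


def fake_decompile(path: str) -> List[str]:
    """Hypothetical worker: fails for one path, returns a function name otherwise."""
    if path == "broken.bin":
        raise ValueError("unsupported format")
    return [f"main@{path}"]


functions = decompile_all(["a.bin", "broken.bin", "b.bin"], fake_decompile, max_workers=2)
```

With `strict=False` the failing path is logged and skipped, leaving the results for `a.bin` and `b.bin`; with `strict=True` the same call would re-raise the `ValueError`.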
decompile_bins_task(*paths, config)
Decompiles binaries and extracts decompiled functions from the given path or list of paths.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `paths` | `PathLike` | A single path or sequence of paths pointing to binary files or directories containing binaries. | `()` |
| `config` | `DecompileConfig` | Decompilation configuration options. | required |
| `as_callable_pool` | | If … | required |
Returns:
| Type | Description |
|---|---|
| `List[DecompiledFunction]` | Either a list of … |
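Since `paths` may mix files and directories, and `recursive` controls how deep directories are scanned, the expansion step might look like the sketch below. `expand_paths` is a hypothetical helper; how codablellm actually decides which files count as binaries is not stated here, so this version simply keeps every regular file.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def expand_paths(paths: Iterable[Path], recursive: bool = False) -> List[Path]:
    """Expand directories into the files they contain; keep plain files as-is."""
    found: List[Path] = []
    for path in paths:
        if path.is_dir():
            # "**/*" descends into subdirectories, "*" stays at the top level.
            pattern = "**/*" if recursive else "*"
            found.extend(sorted(p for p in path.glob(pattern) if p.is_file()))
        else:
            found.append(path)
    return found


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "nested").mkdir()
    (root / "top.bin").write_bytes(b"\x7fELF")
    (root / "nested" / "deep.bin").write_bytes(b"\x7fELF")

    shallow = [p.name for p in expand_paths([root])]
    deep = [p.name for p in expand_paths([root], recursive=True)]
```

Here `shallow` contains only `top.bin`, while `deep` also picks up `nested/deep.bin`.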
decompile_task(decompiler, path, symbol_remover)
Prefect task for decompiling a single binary file using the specified decompiler.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `decompiler` | `Decompiler` | An instance of a `Decompiler`. | required |
| `path` | `PathLike` | Path to the binary to decompile. | required |
| `symbol_remover` | `Optional[SymbolRemovalStrategy]` | Optional symbol removal strategy to apply. | required |
Returns:
| Type | Description |
|---|---|
| `Sequence[DecompiledFunction]` | A list of `DecompiledFunction` objects. |
get()
Returns the currently registered decompiler.
Returns:
| Type | Description |
|---|---|
| `RegisteredDecompiler` | A `RegisteredDecompiler` object. |
pseudo_strip(decompiler, function)
Creates a stripped version of the decompiled function with anonymized symbol names.
This function replaces every function symbol in both the function definition and the assembly code
with a generated placeholder (e.g., FUN_<addr> or sub_<addr>), so that the original identifiers
are removed. The resulting DecompiledFunction has an updated definition,
a stripped function name, and modified assembly code.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `decompiler` | `Decompiler` | The `Decompiler` instance. | required |
| `function` | `DecompiledFunction` | The `DecompiledFunction` object. | required |
Returns:
| Type | Description |
|---|---|
| `DecompiledFunction` | A new `DecompiledFunction`. |
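The replacement step described above can be approximated with a short sketch. Note the swap in technique: this version matches whole identifiers with a regular expression instead of parsing the code, and `anonymize_symbols`, the sample function, and the placeholder addresses are all hypothetical.

```python
import re
from typing import Dict


def anonymize_symbols(code: str, replacements: Dict[str, str]) -> str:
    """Replace whole-identifier occurrences of each symbol with its placeholder."""
    if not replacements:
        return code
    # \b anchors keep "hash_input" from matching inside "rehash_input_fast".
    alternatives = "|".join(re.escape(name) for name in replacements)
    pattern = re.compile(r"\b(" + alternatives + r")\b")
    return pattern.sub(lambda match: replacements[match.group(1)], code)


definition = "int check_password(char *s) { return hash_input(s) == 42; }"
assembly = "call hash_input\nret"
placeholders = {"check_password": "FUN_00401000", "hash_input": "FUN_00401050"}

stripped_definition = anonymize_symbols(definition, placeholders)
stripped_assembly = anonymize_symbols(assembly, placeholders)
```

Applying the same mapping to both the definition and the assembly keeps the two views consistent, which is the property the description above emphasizes. A regular expression can also hit identifiers inside string literals or comments, which a parser-based approach would avoid.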
set(name, symbol)
Sets the decompiler used by codablellm.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | The display name of the decompiler (e.g., "Ghidra", "Angr"). | required |
| `symbol` | `DynamicSymbol` | A tuple containing the file path and class name of the decompiler implementation. | required |
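Registration by file path and class name suggests a dynamic import at creation time. The sketch below shows one way that could work with the standard library; it is an assumption about the mechanism, not codablellm's code, and `set_decompiler`, `create_decompiler_sketch`, `RegisteredSketch`, and `DecompilerNotFoundSketch` are hypothetical stand-ins.

```python
import importlib.util
import tempfile
from pathlib import Path
from typing import Any, NamedTuple, Optional, Tuple


class RegisteredSketch(NamedTuple):
    """Hypothetical stand-in for RegisteredDecompiler."""

    name: str
    symbol: Tuple[Path, str]  # (file path, class name), as described for DynamicSymbol


class DecompilerNotFoundSketch(Exception):
    """Stand-in for DecompilerNotFound."""


_registered: Optional[RegisteredSketch] = None


def set_decompiler(name: str, symbol: Tuple[Path, str]) -> None:
    """Remember which implementation should be loaded later."""
    global _registered
    _registered = RegisteredSketch(name, symbol)


def create_decompiler_sketch(*args: Any, **kwargs: Any) -> Any:
    """Import the registered file, look up the class, and instantiate it."""
    if _registered is None:
        raise DecompilerNotFoundSketch("no decompiler registered")
    file_path, class_name = _registered.symbol
    try:
        spec = importlib.util.spec_from_file_location(file_path.stem, str(file_path))
        if spec is None or spec.loader is None:
            raise ImportError(f"cannot load {file_path}")
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        decompiler_class = getattr(module, class_name)
    except (ImportError, OSError, AttributeError) as error:
        raise DecompilerNotFoundSketch(f"{class_name} not found in {file_path}") from error
    return decompiler_class(*args, **kwargs)


with tempfile.TemporaryDirectory() as tmp:
    plugin = Path(tmp) / "toy_backend.py"
    plugin.write_text("class Toy:\n    def __init__(self, level=0):\n        self.level = level\n")
    set_decompiler("Toy", (plugin, "Toy"))
    instance = create_decompiler_sketch(level=3)
```

Deferring the import until creation means a missing file or class only fails when a decompiler is actually requested, which matches the documented `DecompilerNotFound` behavior of `create_decompiler`.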