
ADFA-1843 | Refactor Local LLM agent and tool handling#911

Merged
jatezzz merged 45 commits into stage from feat/local-llm-work on Feb 11, 2026

Conversation

jatezzz (Collaborator) commented on Feb 2, 2026

Description

This PR refactors the LocalAgenticRunner to implement a simplified single-call workflow. This change is designed to improve the reliability of on-device models by bypassing complex multi-step planning loops that often lead to "max steps" or retry exhaustion errors on local hardware.

Details

  • Simplified Workflow: Added runSimplifiedWorkflow to handle direct responses and tool selection in a single LLM pass.
  • Model Support: Added specific prompt builders for H2O, Qwen, and Gemma 3 model families.
  • Tool Improvements: Added the list_files tool and improved argument parsing and inference for missing parameters.
  • Stability: Fixed P0/P1 issues including Zip Slip vulnerabilities, thread safety, and memory leaks during batch processing.
  • Benchmarks: Integrated a local hardware benchmark runner to track inference performance (TTFT, tokens/sec).
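
The single LLM pass described above has to split one model response into a direct answer and an optional tool call. A minimal sketch of that split in Java, assuming a `<tool_call>` tag convention like the one referenced later in this review; the regex-based approach is an illustration, not the project's actual parser:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ToolCallSketch {
    // Extracts the payload of the first <tool_call>...</tool_call> tag, or null if absent.
    static String extractToolCall(String modelOutput) {
        Matcher m = Pattern.compile("<tool_call>(.*?)</tool_call>", Pattern.DOTALL)
                .matcher(modelOutput);
        return m.find() ? m.group(1).trim() : null;
    }

    // Everything before the first tool tag is treated as the direct answer.
    static String directAnswer(String modelOutput) {
        int idx = modelOutput.indexOf("<tool_call>");
        return (idx >= 0 ? modelOutput.substring(0, idx) : modelOutput).trim();
    }

    public static void main(String[] args) {
        String out = "Listing files now.\n<tool_call>{\"name\":\"list_files\",\"args\":{}}</tool_call>";
        System.out.println(extractToolCall(out)); // {"name":"list_files","args":{}}
        System.out.println(directAnswer(out));    // Listing files now.
    }
}
```

The point of the single pass is that both pieces come from one inference call, so no multi-step planning loop is needed.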

Ticket

ADFA-1843

Observation

The simplified workflow is now forced for all local LLM configurations, as it is significantly more stable than the agentic multi-step loop used by cloud models. SHA-256 verification has been added to ensure model integrity.
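
The SHA-256 check can be sketched with the JDK's MessageDigest; this is an illustrative stand-in (class and method names are hypothetical), not the PR's actual verification code:

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

public class ModelVerifier {
    // Streams the file through SHA-256 so large model files are never fully buffered.
    static String sha256Hex(Path file) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) md.update(buf, 0, n);
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    // A null expected hash skips verification, mirroring the optional check noted below.
    static boolean verify(Path file, String expectedSha256) throws Exception {
        return expectedSha256 == null || sha256Hex(file).equalsIgnoreCase(expectedSha256);
    }
}
```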

coderabbitai bot commented Feb 2, 2026

📝 Walkthrough

Release Notes

New Features

  • Simplified Local LLM Workflow: Implemented single-pass inference workflow via runSimplifiedWorkflow() for on-device models, eliminating multi-step planning loops and reducing "max steps" exhaustion errors
  • Model-Specific Prompt Builders: Added detection and custom prompt formatting for H2O, Qwen, and Gemma 3 model families (in addition to existing support)
  • Chat UI Fragment: New ChatFragment with rich UI including message history, context management, image picking, and tool testing interface
  • File Listing Tool: Introduced list_files tool handler with project-scoped directory access and optional recursive listing
  • Enhanced Tool Call Parsing: Improved parser (Util.parseToolCall) supporting function-style calls, JSON extraction, argument inference, and multiple fallback formats
  • Token Counting: Added countTokens(text) API for better context management and prompt clipping
  • Adaptive Orientation: New setAdaptiveOrientation() utility for responsive layout on large vs. small screens
  • Local Model SHA-256 Verification: Added SHA-256 hash verification for model integrity and persistence in preferences
  • Inference Metrics: Integrated performance telemetry (TTFT, tokens/sec, duration tracking)

Stability & Security Improvements

  • Zip Slip Mitigation: Implemented path canonicalization and bounds checking in DynamicLibraryLoader to prevent directory traversal attacks during AAR extraction
  • Thread-Safety: Added Mutex-based synchronization (modelLoadMutex) around model loading paths in LlmInferenceEngine
  • Path Traversal Protection: list_files and ListFilesCommand now enforce project-scoped directory access with canonical path validation
  • Deterministic Retry Behavior: Improved tool-call retry logic with required-argument validation via Executor.requiredArgsForTool()
  • Memory Leak Fixes: Enhanced batch processing lifecycle with proper cleanup in native layer (free_1batch)
  • UTF-8 Sanitization: New new_jstring_utf8() helper in C++ layer eliminates encoding issues in JNI string emissions
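
The Mutex-guarded model loading mentioned above can be illustrated on the JVM with a ReentrantLock; GuardedModelLoader and its fields are hypothetical names, not the project's LlmInferenceEngine:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class GuardedModelLoader {
    private final ReentrantLock loadLock = new ReentrantLock();
    private volatile String loadedModelPath;
    final AtomicInteger loadCount = new AtomicInteger();

    // Only one thread performs the (expensive) load; later callers see the cached path.
    void ensureLoaded(String modelPath) {
        loadLock.lock();
        try {
            if (modelPath.equals(loadedModelPath)) return;
            loadCount.incrementAndGet(); // stands in for the real native model load
            loadedModelPath = modelPath;
        } finally {
            loadLock.unlock();
        }
    }

    public static void main(String[] args) {
        GuardedModelLoader l = new GuardedModelLoader();
        l.ensureLoaded("/models/a.gguf");
        l.ensureLoaded("/models/a.gguf");
        System.out.println(l.loadCount.get()); // prints 1
    }
}
```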

Performance Enhancements

  • KV Cache Reuse: Implemented configurable cache persistence across inference calls to reduce latency
  • Dynamic Context Sizing: Context window now determined at runtime based on device capabilities
  • Configurable Threading/Sampling: New native entry points (configureThreads(), configureSampling(), configureContext())
  • Prompt Clipping: Automatic history and prompt truncation to fit configured context window
  • Token Counting: Enables accurate prompt budgeting before inference
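
Prompt clipping against a token budget might look like the following sketch; the whitespace-based countTokens is a crude stand-in for the engine's real tokenizer, and the drop-oldest policy is an assumption:

```java
import java.util.ArrayList;
import java.util.List;

public class PromptClipper {
    // Crude stand-in for countTokens(): real engines use the model's own tokenizer.
    static int countTokens(String text) {
        return text.isEmpty() ? 0 : text.trim().split("\\s+").length;
    }

    // Drops the oldest history turns until prompt + history fit the context budget.
    static List<String> clipHistory(List<String> history, String userPrompt, int contextBudget) {
        List<String> kept = new ArrayList<>(history);
        int promptCost = countTokens(userPrompt);
        while (!kept.isEmpty()) {
            int total = promptCost;
            for (String turn : kept) total += countTokens(turn);
            if (total <= contextBudget) break;
            kept.remove(0); // oldest turn goes first
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> history = new ArrayList<>(List.of("alpha beta", "gamma"));
        System.out.println(clipHistory(history, "one two", 4)); // [gamma]
    }
}
```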

Testing & Quality

  • Integration Test Suite (LocalLlmIntegrationTest): Comprehensive end-to-end tests including engine initialization, model loading, tool invocation, and benchmarking
  • Unit Tests (LocalAgenticRunnerSimplifiedWorkflowTest): Validates simplified workflow state transitions, error handling, and tool-call parsing
  • UI Test Coverage (AgentPromptTest): End-to-end UI test with project setup, agent prompt submission, and response validation
  • Whitelist Rule Tests: Added StrictMode violation rules for expected disk reads during service loading and onboarding checks
  • Auto-Install Test Assets: Gradle task extractLlamaAarForTests automatically extracts Llama AAR for integration tests

Infrastructure & Configuration

  • Constants Centralization: New Constants.kt with preference keys and default model configurations (Gemini 2.5, 1.5 variants)
  • Dynamic Library Loader: DynamicLibraryLoader v2 with ChildFirstDexClassLoader for prioritized class loading from AAR
  • Gemini Model Options: Updated supported models including Gemini 2.5-pro and 2.0-flash variants
  • Protobuf Dependency Management: Excluded protobuf-lite from AndroidTest configs to align with test runtime

UI/UX Enhancements

  • Localized Agent Messages: Replaced hardcoded error/status strings with resource-based localization (45+ new string resources)
  • Disclaimer Dialog: Added experimental AI use disclaimer with DisclaimerDialogFragment
  • Settings UI: New controls for model SHA-256 verification and simplified prompt toggle
  • Tool Testing Interface: Dynamic form generation with parameter hints, defaults, and JSON serialization
  • Transcript Sharing: ChatTranscriptUtils for exporting and sharing timestamped chat transcripts
  • Approval Workflow: UI support for tool approval with once/session/deny options

⚠️ Risks & Best Practice Violations

High Risk

  1. Reflection-Based Native Loading: LlmInferenceEngine uses reflection to load LLamaAndroid class with multiple fallback paths (_instance, Companion, getInstance). This is fragile and makes the code harder to maintain. Consider using dependency injection or explicit class loading.

  2. Dynamic ClassLoader Creation: DynamicLibraryLoader creates a custom ChildFirstDexClassLoader that bypasses standard class loading. While it includes Zip Slip protection, this approach:

    • Circumvents framework security mechanisms
    • Makes debugging and profiling difficult
    • Could cause class loading conflicts if not carefully isolated
  3. Mutable Native State with Synchronization Only at Java Level: C++ globals (g_batch_n_tokens, g_generated_text, g_stop_strings) are mutexed at Java level but accessed concurrently from native code. Race conditions could occur if native callbacks bypass Java synchronization.
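
Why the reflection-based loading is fragile can be seen in a small sketch: each fallback encodes an assumption about the target class's shape, and a rename breaks it silently at runtime. FakeLlama and LegacyLlama are stand-ins, not the real LLamaAndroid:

```java
public class ReflectiveLoaderSketch {
    // Stand-in with a conventional singleton accessor.
    public static class FakeLlama {
        private static final FakeLlama INSTANCE = new FakeLlama();
        public static FakeLlama getInstance() { return INSTANCE; }
    }

    // Stand-in exposing only a static _instance field.
    public static class LegacyLlama {
        static final LegacyLlama _instance = new LegacyLlama();
    }

    // Tries getInstance(), then falls back to a static _instance field; every
    // branch is another structural assumption a refactor can invalidate.
    static Object loadSingleton(Class<?> cls) throws Exception {
        try {
            return cls.getMethod("getInstance").invoke(null);
        } catch (NoSuchMethodException ignored) {
            java.lang.reflect.Field f = cls.getDeclaredField("_instance");
            f.setAccessible(true);
            return f.get(null);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(loadSingleton(FakeLlama.class) == FakeLlama.getInstance());   // true
        System.out.println(loadSingleton(LegacyLlama.class) == LegacyLlama._instance);   // true
    }
}
```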

Medium Risk

  1. Tool Argument Inference: The fillMissingArgs() logic infers missing tool parameters heuristically (e.g., extracting filenames from query text). This could lead to unexpected behavior or security issues if inference patterns are exploited.

  2. Path Normalization Complexity: Multiple path normalization routines in different handlers (ListFilesHandler.normalizePath(), ListFilesCommand, Executor.execute() for "list_files"). Inconsistencies could enable path traversal if normalization isn't uniform.

  3. Hard-Coded Memory/Context Defaults: Native defaults (4096 context, 512 max tokens) are set statically. No validation that these fit device memory constraints; could cause OOM crashes on low-end devices.

Low Risk / Best Practice Notes

  1. Test Asset Auto-Install: Gradle task automatically extracts test assets to buildDir/generated. Ensure build output directory is cleaned between test runs to avoid stale assets.

  2. SHA-256 Verification Optional: Model SHA-256 is optional (expectedSha256: String? = null). This weakens integrity checks; consider making it mandatory or warning users when unavailable.

  3. Extensive Native Layer Changes: 300+ lines of C++ code added for sampling, batching, and context management. These should undergo thorough testing on multiple device/API levels to catch memory leaks or ABI incompatibilities.

  4. New String Resources (45+ added): Ensure all locales have these strings translated before release; missing translations will show resource keys to users.

Walkthrough

Adds an on-device AI agent: chat UI and ViewModel, local LLM integration (native + dynamic AAR loading), tool parsing/handlers and executor/runner, new prefs/resources, tests, build tasks for extracting Llama AARs, protobuf exclusions for AndroidTest, and several app utilities (orientation, logcat, strictmode whitelist).

Changes

  • Agent build & test scaffolding (agent/build.gradle.kts): New Gradle config for Android/Kotlin, dependencies including v7/v8 AARs, and extractLlamaAarForTests task wired into AndroidTest assets.
  • Agent UI & dialogs (agent/src/main/java/.../fragments/ChatFragment.kt, agent/src/main/java/.../fragments/DisclaimerDialogFragment.kt, agent/src/main/res/layout/layout_ai_disclaimer.xml): New ChatFragment (rich UI, image picking, tool testing, approvals, transcripts) and DisclaimerDialogFragment with layout resource.
  • ViewModel & UI state (agent/src/main/java/.../viewmodel/ChatViewModel.kt, app/src/main/java/.../agent/viewmodel/AiSettingsViewModel.kt): Large ChatViewModel added; session/backend management, local-model reload using stored SHA, and AiSettingsViewModel prefs for model SHA and simple-prompt toggle.
  • Local agent runtime & orchestration (agent/src/main/java/.../repository/LocalAgenticRunner.kt, agent/src/main/java/.../repository/Executor.kt, agent/src/main/java/.../repository/Constants.kt): LocalAgenticRunner implements the simplified on-device workflow with tool selection and argument inference; Executor dispatches function calls; new constants for prefs/models.
  • Parsing & utilities (agent/src/main/java/.../repository/utils.kt, agent/src/main/java/.../utils/ChatTranscriptUtils.kt, agent/src/main/java/.../utils/DynamicLibraryLoader.kt): Robust tool-call parser, transcript write/share helpers, dynamic AAR unzip plus child-first DexClassLoader for llama AARs.
  • Tool handlers & tools (agent/src/main/java/.../tool/ListFilesHandler.kt, .../ReadFileHandler.kt, .../SearchProjectHandler.kt, app/src/main/java/.../repository/Tool.kt, app/src/main/java/.../api/commands/ListFilesCommand.kt): New tool handlers (list_files, read_file, search_project); ListFilesTool and ListFilesCommand tightened with canonicalization and project-root access checks.
  • LLM engine & native (app/src/main/java/.../repository/LlmInferenceEngine.kt, agent/src/main/java/.../repository/LocalLlmRepositoryImpl.kt, llama-api/.../ILlamaController.kt, llama-impl/src/main/.../llama-android.cpp, llama-impl/src/main/java/.../LLamaAndroid.kt): Reflective LLama loading, model lifecycle, sampling/context config, countTokens API; native JNI adds config hooks, batching, stop strings, and UTF-8-safe emissions.
  • Settings, prefs & UI (agent/src/main/java/.../fragments/AiSettingsFragment.kt, app/src/main/res/layout/layout_settings_local_llm.xml, preferences/src/main/java/.../GeneralPreferences.kt, app/src/main/res/values/strings.xml): UI for model SHA and simple-prompt toggle; prefs read/write for SHA and toggle; new GeneralPreferences key for logcat capture and added resource strings.
  • Tests & test infra (agent/src/androidTest/.../LocalLlmIntegrationTest.kt, agent/src/test/.../LocalAgenticRunnerSimplifiedWorkflowTest.kt, app/src/androidTest/.../AgentPromptTest.kt, app/src/androidTest/.../WhitelistRulesTest.kt, testing/android/build.gradle.kts, app/build.gradle.kts): New integration/unit tests for engine/runner/UI and the StrictMode whitelist; AndroidTest protobuf-lite exclusions and testing-module protobuf runtime alignment.
  • StrictMode & whitelist (app/src/main/java/.../app/strictmode/WhitelistEngine.kt): Adds two DiskReadViolation whitelist rules and matching tests.
  • Logcat capture (app/src/main/java/.../activities/editor/IDELogcatReader.kt, preferences/src/main/.../GeneralPreferences.kt): Optional "-b all" controlled by preference, buffered read with a 5 MB cap, and safer process lifecycle handling.
  • Orientation & miscellaneous (common/src/main/java/.../OrientationUtilities.kt, app/src/main/java/.../activities/OnboardingActivity.kt, layouteditor/src/.../BaseActivity.kt): New adaptive orientation API; replaces fixed portrait assignments with adaptive calls.
  • Project manager & build logic (subprojects/projects/.../ProjectManagerImpl.kt, subprojects/projects/.../IProjectManager.kt, composite-builds/.../IDEWrapperGenerator.java): Adds cached pluginProject flag and IO existence checks; conditional wrapper URL validation via system property.
  • Strings & resources (resources/src/main/res/values/strings.xml): Large set of new agent/AI-related string resources.
  • Minor formatting (various layout files): Tiny formatting/newline adjustments across some layout files.

Sequence Diagram(s)

sequenceDiagram
  participant UI as ChatFragment (UI)
  participant VM as ChatViewModel
  participant Repo as Local/Gemini Repository
  participant Engine as LlmInferenceEngine
  participant Native as LLamaAndroid (JNI)
  participant Tooling as Tooling API

  UI->>VM: submitUserPrompt(prompt)
  VM->>Repo: requestResponse(prompt, session)
  Repo->>Engine: buildPromptAndRunInference(prompt, history)
  Engine->>Native: configure(...) / runInference(...)
  Native-->>Engine: streamTokens / result
  Engine-->>Repo: inferenceResult
  alt toolCallDetected
    Repo->>VM: notifyToolCall(toolCall)
    VM->>Tooling: requestApproval / executeTool(args)
    Tooling-->>VM: ToolResult
    VM->>Repo: provideToolResult(toolResult)
    Repo->>Engine: runFollowupInference(with tool result)
  end
  Repo-->>VM: finalMessage
  VM-->>UI: displayMessage

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • Daniel-ADFA
  • itsaky-adfa
  • jomen-adfa

Poem

🐇 I hopped through code and unzipped AARs with flair,
I taught threads to chatter and models to care.
Tools now obey, transcripts tucked neat,
The IDE listens — the rabbit takes a seat.
Hop, test, and ship — the agent's in the air.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 6.62%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
  • Title check: ✅ Passed. The PR title clearly and concisely summarizes the main change: refactoring the Local LLM agent and tool handling, which is directly reflected throughout the changeset.
  • Description check: ✅ Passed. The PR description is directly related to the changeset, detailing the simplified workflow, model support, tool improvements, stability fixes, and benchmarks that are present in the files changed.


@jatezzz jatezzz marked this pull request as draft February 2, 2026 20:58

coderabbitai bot left a comment


Actionable comments posted: 10

Note

Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
app/src/main/java/com/itsaky/androidide/app/DeviceProtectedApplicationLoader.kt (1)

61-66: ⚠️ Potential issue | 🟡 Minor

Confirm intent: hard-coding isReprieveEnabled = true removes feature-flag control.

If “reprieve” relaxes StrictMode penalties, always enabling it (even in debug) could hide violations and make debugging harder. If the flag was removed intentionally, consider documenting why; otherwise, restore the flag to keep runtime control.

🔧 Suggested change (restore flag-based behavior)
-                isReprieveEnabled = true,
+                isReprieveEnabled = FeatureFlags.isReprieveEnabled,
🤖 Fix all issues with AI agents
In `@agent/build.gradle.kts`:
- Around line 41-77: The task extractLlamaAarForTests currently shells out to
the external "unzip" CLI and writes to v8Dest/v7Dest output streams without
explicit closing; replace the project.exec blocks with a portable JVM-based
extraction using java.util.zip.ZipInputStream (or Gradle's zipTree) to open
v8AssetsZip and v7AssetsZip, iterate entries to find "dynamic_libs/llama.aar",
copy that entry to the corresponding File (v8Dest/v7Dest) using buffered
streams, and ensure all input/output streams are closed (use
use/try-with-resources) so extraction works cross-platform and deterministically
while keeping the same inputs.files(v8AssetsZip, v7AssetsZip) and
outputs.dir(outputDir) behavior.
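
A portable JVM-based extraction along the lines suggested above, using java.util.zip.ZipInputStream with a containment check against the output directory; the class and method names here are illustrative, not the task's actual code:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class AarExtractor {
    // Copies a single named entry (e.g. "dynamic_libs/llama.aar") out of a zip.
    // Returns the extracted path, or null if the entry is absent.
    static Path extractEntry(Path zipFile, String entryName, Path outputDir) throws IOException {
        Path root = outputDir.toAbsolutePath().normalize();
        try (ZipInputStream zin = new ZipInputStream(Files.newInputStream(zipFile))) {
            ZipEntry e;
            while ((e = zin.getNextEntry()) != null) {
                if (!e.getName().equals(entryName)) continue;
                Path target = root.resolve(e.getName()).normalize();
                // Zip Slip guard: the resolved path must stay inside outputDir.
                if (!target.startsWith(root)) {
                    throw new IOException("Blocked traversal: " + e.getName());
                }
                Files.createDirectories(target.getParent());
                Files.copy(zin, target, StandardCopyOption.REPLACE_EXISTING);
                return target;
            }
        }
        return null;
    }
}
```

The try-with-resources block closes all streams deterministically, and no external unzip binary is required, so the task behaves the same on every host platform.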

In `@agent/src/main/java/com/itsaky/androidide/agent/repository/Executor.kt`:
- Around line 38-45: The requiredArgsForTool mapping in
Executor.requiredArgsForTool is incorrect for two tools: change the
"add_dependency" key to require "dependency_string" (replace "dependency") and
keep or include "build_file_path" if the tool schema expects it; and make
"get_build_output" return an empty list (no required args) so calls with {} are
accepted. Update the when-branches for "add_dependency" and "get_build_output"
in the requiredArgsForTool function to match the actual tool input schema.

In
`@app/src/main/java/com/itsaky/androidide/activities/editor/IDELogcatReader.kt`:
- Around line 89-95: The current ProcessBuilder call in IDELogcatReader (the
"logcat", "-b", "all", "-v", "threadtime" invocation that creates variable
process) should not unconditionally capture all buffers; change it to either (A)
restrict to the app's process by adding a PID filter (use Android process PID
from android.os.Process.myPid() or logcat's --pid/UID option) so only AndroidIDE
logs are collected, or (B) gate "-b all" behind an explicit user opt‑in
preference and implement a size cap/rotation policy for saved logs to prevent
large/sensitive dumps; update IDELogcatReader to read the opt‑in flag and apply
the PID/UID filter by default and only append "-b", "all" when the user enabled
broad capture, and ensure log files are rotated/truncated to a configured
maximum size.

In `@app/src/main/java/com/itsaky/androidide/agent/repository/Executor.kt`:
- Around line 79-92: The "list_files" handler in Executor.kt currently
normalizes paths weakly and calls IDEApiFacade.listFiles which ultimately uses
File(baseDir, sanitizedPath); fix this by resolving and canonicalizing both the
project root and the requested path (use File(...).canonicalFile), reject or
normalize absolute/parent-traversal attempts, then verify the canonical child
path starts with the canonical project root (using startsWith on path strings or
Path#startsWith) before calling IDEApiFacade.listFiles; if the check fails,
return an explicit error/permission denied instead of proceeding. Ensure you
reference and update the code around the "list_files" branch, the creation of
File(baseDir, sanitizedPath) and any use of relativeTo(baseDir) to perform the
canonicalization and boundary enforcement.
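
The canonicalize-then-contain check described in this fix can be sketched as follows; PathGuard is a hypothetical helper, not code from the PR:

```java
import java.io.File;
import java.io.IOException;

public class PathGuard {
    // Resolves the requested path against the project root and rejects anything
    // whose canonical form escapes the root (absolute paths, "..", symlinks).
    static File resolveInsideProject(File projectRoot, String requested) throws IOException {
        String rootCanonical = projectRoot.getCanonicalPath();
        String cleaned = (requested == null || requested.isEmpty() || requested.equals("."))
                ? "" : requested;
        File candidate = new File(projectRoot, cleaned).getCanonicalFile();
        String candidatePath = candidate.getPath();
        if (!candidatePath.equals(rootCanonical)
                && !candidatePath.startsWith(rootCanonical + File.separator)) {
            throw new IOException("Access denied: path outside project directory");
        }
        return candidate;
    }
}
```

Comparing canonical paths (rather than the raw strings) is what defeats `../` sequences and symlinked escapes; the `File.separator` suffix in the prefix check prevents `/project` from accidentally matching `/project-other`.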

In
`@app/src/main/java/com/itsaky/androidide/agent/repository/LocalLlmRepositoryImpl.kt`:
- Around line 563-588: The H2O tool prompt builder is being invoked even when
requiredTool is null, causing empty-tool forced tool-call responses; update the
logic so that when requiredTool is null you do not force a tool-only prompt.
Either (A) modify buildPromptWithHistory to only call buildH2oToolPrompt when
requiredTool != null and otherwise call the regular/direct-answer prompt
builder, or (B) update buildH2oToolPrompt to detect requiredTool == null and
return a fallback direct-answer/tool-optional prompt instead; reference
buildH2oToolPrompt and buildPromptWithHistory when making the change.

In `@app/src/main/java/com/itsaky/androidide/agent/repository/Tool.kt`:
- Around line 50-63: The ListFilesCommand invocation in ListFilesTool currently
allows paths that escape the project workspace; update the code that builds the
target directory (where File(baseDir, sanitizedPath) is used) to normalize and
validate canonical paths before listing: compute canonicalPath for both baseDir
and the resolved targetDir (handle empty/"." cases mapping to baseDir), then
reject requests where targetCanonical is neither equal to baseCanonical nor
startsWith(baseCanonical + File.separator) by returning a ToolResult failure
like "Access denied: path outside project directory"; ensure this validation
lives in the code path used by ListFilesTool/ListFilesCommand so directory
traversal and absolute paths are prevented.

In `@app/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt`:
- Around line 140-168: needsReload currently only checks engine.isModelLoaded
and engine.loadedModelPath, so changes to the stored SHA-256 won't trigger a
reload; update the reload logic in the block that computes needsReload (the code
using modelPath, needsReload, prefs and PREF_KEY_LOCAL_MODEL_SHA256) to also
compare the current stored hash (prefs.getString(PREF_KEY_LOCAL_MODEL_SHA256,
null)) against a tracked last-used hash (add a property to ChatViewModel, e.g.,
lastLoadedModelHash) and include that comparison in needsReload; after a
successful engine.initModelFromFile call, set lastLoadedModelHash to the
expectedHash (or null) so future changes to PREF_KEY_LOCAL_MODEL_SHA256 force
re-verification and reload when necessary, keeping the rest of the
LocalLlmRepositoryImpl creation and onStateUpdate assignment unchanged.

In `@app/src/main/java/com/itsaky/androidide/app/strictmode/WhitelistEngine.kt`:
- Around line 173-204: Add two unit tests to WhitelistRulesTest.kt that cover
the new DiskReadViolation rules: one that constructs a stack trace (or uses the
test helper) matching com.itsaky.androidide.utils.ServiceLoader.parse followed
by
com.itsaky.androidide.app.configuration.IJdkDistributionProvider$Companion._instance_delegate$lambda$0
and asserts that the rule allows the violation, and another that matches
java.io.File.exists followed by
com.itsaky.androidide.activities.OnboardingActivity.checkToolsIsInstalled and
asserts allowance; place these tests alongside existing whitelist tests and use
the same test helpers/assertions used elsewhere in WhitelistRulesTest.kt to
validate the rules.

In `@llama-impl/src/main/cpp/llama-android.cpp`:
- Line 26: The stop-string handling currently drops the entire
cached_token_chars when a stop substring is found; instead, locate the stop
substring inside cached_token_chars, append only the prefix before the stop to
g_generated_text, remove/trim that prefix (and the stop bytes) from
cached_token_chars (or mark the remainder for later), and set the generator's
stop flag so generation halts on the next call; update all occurrences of this
logic (places handling cached_token_chars and stop_strings around the symbols
cached_token_chars and g_generated_text) to perform prefix emission + trim +
set-stop rather than discarding the whole chunk.
- Line 552: completion_init currently returns only batch->n_tokens when KV-cache
reuse skips re-evaluating the whole prompt, which makes n_cur too small and
causes negative 'generated' and extra tokens; change completion_init to return
the full prompt token count (the total prompt length) instead of just
batch->n_tokens and use that value to initialize n_cur wherever completion_init
is called (references: function completion_init, variable n_cur,
batch->n_tokens, g_prompt_tokens, n_len, generated). Update all call sites
(including the regions around the mentioned spans) so n_cur reflects the entire
prompt length when reuse is active, ensuring token positions and generated
calculations remain correct.
🟡 Minor comments (13)
app/src/androidTest/kotlin/com/itsaky/androidide/AgentPromptTest.kt-127-137 (1)

127-137: ⚠️ Potential issue | 🟡 Minor

Verification may pass without actual agent functionality.

The check accepts "Agent", "System", or "Open AI Settings" as valid responses. The "Open AI Settings" case indicates the AI isn't configured, which means the test passes even when the agent prompt feature isn't actually working.

Consider separating this into distinct assertions or making the test skip/fail gracefully when AI settings aren't configured, to avoid false positives.

app/src/androidTest/kotlin/com/itsaky/androidide/AgentPromptTest.kt-215-222 (1)

215-222: ⚠️ Potential issue | 🟡 Minor

Fragile reflection and silent exception swallowing.

  1. Reflection on _installedDistributions will silently fail if the field is renamed/removed.
  2. Catching all exceptions with catch (_: Exception) hides potential issues during test setup.

Consider at minimum logging the exception to aid debugging when this fails:

Suggested improvement
         try {
             val provider = IJdkDistributionProvider.getInstance()
             val field = provider::class.java.getDeclaredField("_installedDistributions")
             field.isAccessible = true
             field.set(provider, listOf(JdkDistribution("21.0.0", jvmDir.absolutePath)))
-        } catch (_: Exception) {
+        } catch (e: Exception) {
+            android.util.Log.w("AgentPromptTest", "Failed to seed JDK distributions via reflection, falling back", e)
             IJdkDistributionProvider.getInstance().loadDistributions()
         }
app/src/androidTest/kotlin/com/itsaky/androidide/AgentPromptTest.kt-41-56 (1)

41-56: ⚠️ Potential issue | 🟡 Minor

Resource leak: executeShellCommand returns ParcelFileDescriptor that should be closed.

The UiAutomation.executeShellCommand() returns a ParcelFileDescriptor for stdout. Not closing it may cause resource leaks across test runs.

Proposed fix
     @Before
     fun enableExperiments() {
         val instrumentation = InstrumentationRegistry.getInstrumentation()
         preseedSetupState(instrumentation)
-        instrumentation.uiAutomation.executeShellCommand(
+        instrumentation.uiAutomation.executeShellCommand(
             "mkdir -p /sdcard/Download && touch /sdcard/Download/CodeOnTheGo.exp"
-        )
+        ).close()
     }

     @After
     fun cleanUp() {
         val instrumentation = InstrumentationRegistry.getInstrumentation()
-        instrumentation.uiAutomation.executeShellCommand(
+        instrumentation.uiAutomation.executeShellCommand(
             "rm -f /sdcard/Download/CodeOnTheGo.exp"
-        )
+        ).close()
     }
app/src/androidTest/kotlin/com/itsaky/androidide/AgentPromptTest.kt-226-238 (1)

226-238: ⚠️ Potential issue | 🟡 Minor

Resource leak: Multiple executeShellCommand calls without closing returned descriptors.

Each executeShellCommand() returns a ParcelFileDescriptor that should be closed to avoid resource leaks. Consider creating a helper extension:

Suggested helper and usage
private fun android.app.UiAutomation.runShellCommand(cmd: String) {
    executeShellCommand(cmd).close()
}

Then replace all calls:

-        uiAutomation.executeShellCommand("appops set $pkg MANAGE_EXTERNAL_STORAGE allow")
+        uiAutomation.runShellCommand("appops set $pkg MANAGE_EXTERNAL_STORAGE allow")
composite-builds/build-logic/plugins/src/main/java/com/itsaky/androidide/plugins/tasks/IDEWrapperGenerator.java-466-468 (1)

466-468: ⚠️ Potential issue | 🟡 Minor

--validate-url becomes a no‑op; add a backing field and setter to enable CLI option binding.

The @Option annotation requires a setter (preferred) or field (alternative) to bind command-line options. Annotating only a getter is ineffective—Gradle cannot set the property from the CLI. Currently, getValidateDistributionUrl() reads only a system property, making --validate-url inert. Add a backing field and setter so the CLI option can actually configure the task, using the system property only as a default initial value.

Suggested fix
+  private boolean validateDistributionUrl =
+      Boolean.parseBoolean(System.getProperty("androidide.wrapper.validateUrl", "false"));
+
+  @Option(option = "validate-url", description = "Sets task to validate the configured distribution url.")
+  public void setValidateDistributionUrl(boolean validateDistributionUrl) {
+    this.validateDistributionUrl = validateDistributionUrl;
+  }
...
  @Option(option = "validate-url", description = "Sets task to validate the configured distribution url.")
   public Boolean getValidateDistributionUrl() {
-      return Boolean.parseBoolean(System.getProperty("androidide.wrapper.validateUrl", "false"));
+      return validateDistributionUrl;
   }
agent/src/main/java/com/itsaky/androidide/agent/fragments/DisclaimerDialogFragment.kt-11-16 (1)

11-16: ⚠️ Potential issue | 🟡 Minor

Localize the positive button label.

Hardcoding "OK" skips localization. Prefer a string resource or android.R.string.ok.

🔧 Proposed fix
-            .setPositiveButton("OK", null)
+            .setPositiveButton(android.R.string.ok, null)
app/src/main/java/com/itsaky/androidide/activities/OnboardingActivity.kt-84-88 (1)

84-88: ⚠️ Potential issue | 🟡 Minor

Unnecessary setOrientation wrapper adds no value.

setOrientation is a pass-through that immediately invokes its function parameter without adding any logic. Call setAdaptiveOrientation directly within setOrientationFunction:

setOrientationFunction {
    OrientationUtilities.setAdaptiveOrientation(this) { requestedOrientation = it }
}
agent/src/main/java/com/itsaky/androidide/agent/repository/LocalAgenticRunner.kt-348-391 (1)

348-391: ⚠️ Potential issue | 🟡 Minor

Align inferred arg key with tool schema.

add_dependency is using dependency_string in tests/inputs, but inference fills dependency. This prevents inference from satisfying required args after the Executor map is corrected.

🔧 Suggested fix
             "add_dependency" -> {
-                if (updated["dependency"].isNullOrBlank()) {
-                    inferDependency(userPrompt)?.let { updated["dependency"] = it }
+                if (updated["dependency_string"].isNullOrBlank()) {
+                    inferDependency(userPrompt)?.let { updated["dependency_string"] = it }
                 }
                 if (updated["build_file_path"].isNullOrBlank()) {
                     inferBuildFilePath(userPrompt)?.let { updated["build_file_path"] = it }
                 }
             }
agent/src/main/java/com/itsaky/androidide/agent/repository/LocalAgenticRunner.kt-68-158 (1)

68-158: ⚠️ Potential issue | 🟡 Minor

Protect sampling reset and avoid truncating valid '<' content.

If the retry inference throws, sampling defaults won’t be restored. Also, the direct-response path truncates any response containing <, even when it’s legitimate text. Use a try/finally for sampling reset and only strip content when a <tool_call> tag is present.

🔧 Suggested fix
                     if (initial.isBlank()) {
                         log.warn("Simplified workflow: empty response, retrying with no stop strings.")
                         engine.updateSampling(temperature = 0.2f, topP = 0.9f, topK = 40)
-                        val retry = engine.runInference(
-                            simplifiedPrompt,
-                            clearCache = history.isEmpty()
-                        )
-                        engine.resetSamplingDefaults()
+                        val retry = try {
+                            engine.runInference(
+                                simplifiedPrompt,
+                                clearCache = history.isEmpty()
+                            )
+                        } finally {
+                            engine.resetSamplingDefaults()
+                        }
                         retry
                     } else {
                         initial
                     }
                 }
@@
-                val cleanResponse = cleanedResponse.trim().takeWhile { it != '<' }.trim()
+                val cleanResponse = if (cleanedResponse.contains("<tool_call>")) {
+                    cleanedResponse.substringBefore("<tool_call>")
+                } else {
+                    cleanedResponse
+                }.trim()
agent/src/main/java/com/itsaky/androidide/agent/repository/utils.kt-34-54 (1)

34-54: ⚠️ Potential issue | 🟡 Minor

Normalize tool-only fallback before validation.

If the model emits <tool_call>list_dir</tool_call>, the current fallback rejects it because normalization happens only in the JSON path. Apply the same normalization for the tool-only fallback.

🔧 Suggested fix
-            if (toolOnly != null && toolKeys.contains(toolOnly)) {
+            if (toolOnly != null) {
+                val resolvedToolOnly = if (toolOnly == "list_dir") "list_files" else toolOnly
+                if (toolKeys.contains(resolvedToolOnly)) {
                     val fallback = buildJsonObject {
-                        put("name", toolOnly)
+                        put("name", resolvedToolOnly)
                         put("args", buildJsonObject { })
                     }.toString()
                 logger.debug("Fallback tool-only call parsed as JSON: {}", fallback)
                 return parseJsonObject(fallback)?.let { jsonObject ->
                     val argsObject = jsonObject["args"]?.jsonObject
                     val args = argsObject?.mapValues { (_, value) ->
                         value.toToolArgString()
                     } ?: emptyMap()
-                    val result = LocalLLMToolCall(toolOnly, args)
+                    val result = LocalLLMToolCall(resolvedToolOnly, args)
                     logger.debug("SUCCESS: Parsed tool-only call: {}", result)
                     result
                 }
+                }
             }
app/src/main/res/layout/layout_settings_local_llm.xml-47-69 (1)

47-69: ⚠️ Potential issue | 🟡 Minor

Move the SHA-256 hint into a string resource.

The hint is hardcoded, which blocks localization. Use a string resource instead.

💬 Proposed change
-    android:hint="Optional SHA-256 (for verification)">
+    android:hint="@string/ai_setting_optional_sha256_hint">

Add the new string in strings.xml.

agent/src/main/java/com/itsaky/androidide/agent/fragments/ChatFragment.kt-479-495 (1)

479-495: ⚠️ Potential issue | 🟡 Minor

Unguarded access to IProjectManager may throw.

Accessing IProjectManager.getInstance().projectDir.path without error handling could crash if the project manager isn't initialized. Wrap in runCatching for safety.

🛡️ Proposed fix
 private fun getDefaultValueForParameter(toolName: String, paramName: String): String {
     return when {
         paramName == "offset" -> "0"
         paramName == "limit" -> "1000"
         paramName == "file_path" && toolName == "read_file" -> {
-            // Try to get current project path
-            com.itsaky.androidide.projects.IProjectManager.getInstance().projectDir.path + "/"
+            runCatching {
+                com.itsaky.androidide.projects.IProjectManager.getInstance().projectDir.path + "/"
+            }.getOrDefault("")
         }
 
         paramName == "path" -> {
-            // Try to get current project path
-            com.itsaky.androidide.projects.IProjectManager.getInstance().projectDir.path + "/"
+            runCatching {
+                com.itsaky.androidide.projects.IProjectManager.getInstance().projectDir.path + "/"
+            }.getOrDefault("")
         }
 
         else -> ""
     }
 }
app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt-208-218 (1)

208-218: ⚠️ Potential issue | 🟡 Minor

Exception is swallowed without logging.

When countTokens fails, the exception is silently caught and the code falls back to estimation. Consider logging the error for debugging purposes.

🔍 Proposed fix
             try {
                 controller.countTokens(text)
             } catch (e: Throwable) {
+                log.warn("Token counting failed, using estimation", e)
                 estimateTokenCount(text)
             }
🧹 Nitpick comments (25)
app/src/androidTest/kotlin/com/itsaky/androidide/AgentPromptTest.kt (2)

64-81: Clarify step description: "verify init failed message" sounds like an error state.

The step description "Drag bottom sheet up and verify init failed message" suggests verifying an error condition. If this is expected test behavior (e.g., project init fails in test environment), consider renaming to clarify intent, such as "verify project status is displayed" or add a comment explaining why init failure is expected.


142-172: Consider simplifying navigation logic.

The fallback at lines 170-171 calls navigateToMainScreen() unconditionally, even if it was already called at line 158. This could cause duplicate navigation attempts.

Suggested simplification
     private fun TestContext<Unit>.ensureProjectIsOpen() {
         val pkg = BuildConfig.APPLICATION_ID
         val editorSelector = UiSelector().resourceId("$pkg:id/editor_appBarLayout")
         val actionsSelector = UiSelector().resourceId("$pkg:id/actions")
         val onboardingSelector = UiSelector().resourceId("$pkg:id/next")

         if (device.uiDevice.findObject(editorSelector).exists()) {
             return
         }

         if (device.uiDevice.findObject(actionsSelector).exists()) {
             createEmptyProject()
             return
         }

         if (device.uiDevice.findObject(onboardingSelector).exists()) {
             navigateToMainScreen()
+            if (device.uiDevice.findObject(editorSelector).exists()) {
+                return
+            }
+            if (device.uiDevice.findObject(actionsSelector).exists()) {
+                createEmptyProject()
+                return
+            }
         }

-        if (device.uiDevice.findObject(editorSelector).exists()) {
-            return
-        }
-
-        if (device.uiDevice.findObject(actionsSelector).exists()) {
-            createEmptyProject()
-            return
-        }
-
-        navigateToMainScreen()
+        // Fallback: navigate and create project
+        if (!device.uiDevice.findObject(actionsSelector).exists()) {
+            navigateToMainScreen()
+        }
         createEmptyProject()
     }
agent/src/main/java/com/itsaky/androidide/agent/utils/DynamicLibraryLoader.kt (2)

77-77: Consider defensive check for SUPPORTED_ABIS.

While Build.SUPPORTED_ABIS is practically never empty on real devices, adding a defensive check would improve robustness and provide a clearer error message if this edge case ever occurs.

💡 Suggested defensive check
+        if (Build.SUPPORTED_ABIS.isEmpty()) {
+            Log.e("DynamicLoad", "No supported ABIs found on this device")
+            return null
+        }
         val abi = Build.SUPPORTED_ABIS[0]

99-102: Check mkdirs() return value for clearer error handling.

mkdirs() returns false if directory creation fails. Currently, failure here would surface later as a DexClassLoader exception with a less informative message.

💡 Suggested improvement
         if (!optimizedDir.exists()) {
-            optimizedDir.mkdirs()
+            if (!optimizedDir.mkdirs() && !optimizedDir.exists()) {
+                Log.e("DynamicLoad", "Failed to create optimized dex directory: ${optimizedDir.absolutePath}")
+                return null
+            }
         }
app/src/androidTest/kotlin/com/itsaky/androidide/scenarios/NavigateToMainScreenScenario.kt (2)

15-30: Avoid silently continuing when no onboarding CTA is found.

If all fallback selectors miss, the scenario proceeds and can hide UI regressions. Consider failing fast or explicitly aborting the step when nothing is found so unexpected states surface clearly.


91-97: Consider flattening the deep fallback chain.

The nested try/catch ladder is getting hard to maintain (now 8 variants). A small helper that iterates a list of alternatives would make this easier to extend and read.

app/src/main/java/com/itsaky/androidide/agent/repository/SwitchableGeminiRepository.kt (1)

8-11: Consider consolidating preference keys.

If these keys already exist in a shared constants file, re‑exporting from a single source will prevent drift.

testing/android/build.gradle.kts (1)

29-41: Remove redundant protobuf-lite exclude from espresso.contrib dependency.

The global configurations.configureEach { exclude(...) } at line 30 already excludes protobuf-lite from all configurations. The nested exclude on espresso.contrib at lines 39-41 is redundant and can be removed.

♻️ Proposed simplification
-    api(libs.tests.androidx.espresso.contrib) {
-        exclude(group = "com.google.protobuf", module = "protobuf-lite")
-    }
+    api(libs.tests.androidx.espresso.contrib)
common/src/main/java/com/itsaky/androidide/app/BaseIDEActivity.kt (1)

50-64: Consider removing the @SuppressLint annotation.

Since the code now uses adaptive orientation instead of a fixed portrait orientation, the @SuppressLint("SourceLockedOrientationActivity") annotation on line 50 may no longer be necessary. The lint warning was meant for activities that lock to a single orientation, which this code no longer does.

🔧 Proposed fix
-  @SuppressLint("SourceLockedOrientationActivity")
   override fun onCreate(savedInstanceState: Bundle?) {
layouteditor/src/main/java/org/appdevforall/codeonthego/layouteditor/BaseActivity.kt (1)

17-26: Consider removing the @SuppressLint annotation.

Same as in BaseIDEActivity.kt: since adaptive orientation is now used, the @SuppressLint("SourceLockedOrientationActivity") annotation may be obsolete.

🔧 Proposed fix
-  @SuppressLint("SourceLockedOrientationActivity")
   override fun onCreate(savedInstanceState: Bundle?) {
agent/src/main/java/com/itsaky/androidide/agent/tool/ListFilesHandler.kt (1)

42-50: Consider handling trailing slashes in path normalization.

The normalization handles "./" but not paths like "src/" which would retain the trailing slash. This is likely fine depending on how the tooling API handles paths, but worth noting for consistency.

🔧 Optional enhancement
 private fun normalizePath(path: String): String {
     val trimmed = path.trim()
     if (trimmed.isEmpty()) return ""
     val lowered = trimmed.lowercase()
     return when (lowered) {
         ".", "./", "root", "project", "project root", "root project", "project_root" -> ""
-        else -> trimmed
+        else -> trimmed.trimEnd('/')
     }
 }
agent/src/main/java/com/itsaky/androidide/agent/tool/ReadFileHandler.kt (1)

7-17: Add logging for consistency with other tool handlers.

Unlike ListFilesHandler and SearchProjectHandler, this handler lacks debug/warning logging. Adding consistent logging improves observability and debugging across all tool handlers.

🔧 Proposed enhancement
 package com.itsaky.androidide.agent.tool
 
 import com.itsaky.androidide.agent.api.AgentDependencies
 import com.itsaky.androidide.agent.model.ReadFileArgs
 import com.itsaky.androidide.agent.model.ToolResult
+import org.slf4j.LoggerFactory
 
 class ReadFileHandler : ToolHandler {
     override val name: String = "read_file"
 
     override suspend fun invoke(args: Map<String, Any?>): ToolResult {
         val toolArgs = decodeArgs<ReadFileArgs>(args)
         if (toolArgs.filePath.isBlank()) {
             return ToolResult.failure("Missing required argument: file_path")
         }
-        return AgentDependencies.requireToolingApi()
+        logger.debug("Invoking read_file path='{}', offset={}, limit={}", 
+            toolArgs.filePath, toolArgs.offset, toolArgs.limit)
+        val result = AgentDependencies.requireToolingApi()
             .readFile(toolArgs.filePath, toolArgs.offset, toolArgs.limit)
+        if (!result.success) {
+            logger.warn("read_file failed path='{}'. message='{}'", 
+                toolArgs.filePath, result.message)
+        }
+        return result
     }
+
+    companion object {
+        private val logger = LoggerFactory.getLogger(ReadFileHandler::class.java)
+    }
 }
agent/src/test/java/com/itsaky/androidide/agent/repository/LocalAgenticRunnerSimplifiedWorkflowTest.kt (1)

146-163: Strengthen IDE-tool prompt assertions.

The test only checks the prompt is non-empty; consider asserting at least one IDE tool (e.g., list_files) to avoid false positives if the IDE toolset drops out.

Suggested change
         assertTrue(
             "Prompt should be generated",
             capturedPrompt.isNotEmpty()
         )
+        assertTrue(
+            "Prompt should include list_files tool for IDE queries",
+            capturedPrompt.contains("list_files")
+        )
app/src/main/java/com/itsaky/androidide/utils/LocalBenchmarkRunner.kt (1)

66-129: Consider releasing the model after benchmark.

Repeated runs re-initialize and load a model without explicit teardown. If LocalLlmRepositoryImpl or LlmInferenceEngine exposes unload/release APIs, call them in a finally block to avoid native memory growth across runs.

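The TTFT and tokens/sec figures such a runner reports reduce to simple timestamp arithmetic; a minimal sketch of that math (all names hypothetical, not the actual LocalBenchmarkRunner API):

```kotlin
// Illustrative benchmark math: time-to-first-token and decode throughput.
// Names and signature are assumptions, not the real runner's API.
data class BenchmarkResult(val ttftMs: Long, val tokensPerSecond: Double)

fun computeMetrics(
    requestStartNanos: Long,
    firstTokenNanos: Long,
    lastTokenNanos: Long,
    tokensGenerated: Int
): BenchmarkResult {
    val ttftMs = (firstTokenNanos - requestStartNanos) / 1_000_000
    val decodeSeconds = (lastTokenNanos - firstTokenNanos) / 1_000_000_000.0
    // Throughput counts tokens after the first; guard single-token responses.
    val tps = if (decodeSeconds > 0) (tokensGenerated - 1) / decodeSeconds else 0.0
    return BenchmarkResult(ttftMs, tps)
}
```

Measuring throughput from the first token onward keeps TTFT (prefill cost) from polluting the decode rate, which matters when comparing context sizes.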
app/src/main/java/com/itsaky/androidide/activities/MainActivity.kt (1)

151-151: Please confirm benchmark FAB should ship to all users.

LocalBenchmarkRunner(this).attach() runs unconditionally on every app start. If this is intended only for internal benchmarks, gate it behind BuildConfig.DEBUG or a feature flag to avoid impacting production UX.

agent/src/main/java/com/itsaky/androidide/agent/prompt/FirstTurnContextProvider.kt (1)

18-29: Escape XML-like context values.

cwd/shell can contain <, &, etc., which can break the structured block. Consider XML-escaping to keep the prompt well‑formed.

Possible change
-        val cwd = resolveWorkingDirectory(context)
-        val shell = Environment.BASH_SHELL?.name ?: "bash"
+        val cwd = escapeXml(resolveWorkingDirectory(context))
+        val shell = escapeXml(Environment.BASH_SHELL?.name ?: "bash")
         return buildString {
             appendLine("<environment_context>")
             appendLine("  <cwd>$cwd</cwd>")
             appendLine("  <approval_policy>$APPROVAL_POLICY</approval_policy>")
             appendLine("  <sandbox_mode>$SANDBOX_MODE</sandbox_mode>")
             appendLine("  <network_access>$NETWORK_ACCESS</network_access>")
             appendLine("  <shell>$shell</shell>")
             append("</environment_context>")
         }
     }
+
+    private fun escapeXml(value: String): String =
+        value.replace("&", "&amp;")
+            .replace("<", "&lt;")
+            .replace(">", "&gt;")
+            .replace("\"", "&quot;")
+            .replace("'", "&apos;")
app/src/main/java/com/itsaky/androidide/agent/repository/utils.kt (1)

55-69: Improve fallback when <tool_call> doesn’t contain JSON.

If the model outputs <tool_call>name</tool_call>{...}, the current path returns null. Consider falling back to extracting JSON from the full response before giving up (to match the agent module’s robustness).

♻️ Suggested tweak
         val toolBlock = extractFirstToolCallBlock(responseText)
         if (toolBlock != null) {
             extractFirstJsonObject(toolBlock)?.let { return it }
-            return null
+            // Fallback: some models emit <tool_call>name</tool_call>{...}
+            extractFirstJsonObject(responseText)?.let { return it }
         }
agent/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt (2)

463-464: Reassigning the same list reference is fragile.

LiveData dispatches setValue even when the reference is unchanged, but downstream consumers that compare references or equality (e.g. ListAdapter.submitList, distinctUntilChanged) will treat the identical list as a no-op. Emit a new list instance via postValue so internal mutations become observable.

♻️ Suggested fix
     session.messages.add(systemMessage)
-    _sessions.value = _sessions.value
+    _sessions.postValue(_sessions.value?.toMutableList())

761-764: Empty event handlers are intentional for tool testing.

The empty ShellCommandEventEmitter implementations are acceptable here since the testTool function only needs to execute the tool and display results, not track command lifecycle events. However, consider using a named no-op implementation for clarity.

♻️ Optional: Extract to a named constant
private val noOpEventEmitter = object : ShellCommandEventEmitter {
    override suspend fun onCommandBegin(event: ExecCommandBegin) = Unit
    override suspend fun onCommandEnd(event: ExecCommandEnd) = Unit
}
agent/src/main/java/com/itsaky/androidide/agent/fragments/ChatFragment.kt (4)

111-115: Window insets listener may affect other fragments.

Setting the insets listener on activity?.window?.decorView affects all fragments in the activity. Consider using ViewCompat.setOnApplyWindowInsetsListener on the fragment's root view instead for better isolation.

♻️ Suggested approach
 override fun onResume() {
     super.onResume()
-    activity?.window?.decorView?.setOnApplyWindowInsetsListener(insetsListener)
+    ViewCompat.setOnApplyWindowInsetsListener(binding.root, insetsListener)
     chatViewModel.checkBackendStatusOnResume(requireContext())
 }

395-398: Hardcoded color may not work well with dark theme.

Using Color.GRAY directly may not provide sufficient contrast in dark mode. Consider using a theme attribute instead.

♻️ Suggested fix
-                    setTextColor(android.graphics.Color.GRAY)
+                    setTextColor(
+                        com.google.android.material.R.attr.colorOnSurfaceVariant.let { attr ->
+                            val typedValue = android.util.TypedValue()
+                            context.theme.resolveAttribute(attr, typedValue, true)
+                            typedValue.data
+                        }
+                    )

167-172: Hardcoded UI string should be externalized.

The string "Tokens: $percentage%" should be moved to strings.xml for internationalization support.


798-815: Consider cleaning up old transcript exports.

Files in chat_exports directory accumulate over time. Consider deleting old exports before creating new ones, or implementing periodic cleanup.

app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt (2)

91-125: Reflection fallback chain is robust but consider documenting the expected scenarios.

The fallback chain (static instance() → Companion.instance() → _instance field) handles various Kotlin compilation outputs. Consider adding a brief comment explaining when each fallback is expected to trigger.


186-195: updateMaxTokens does not clamp the value.

Unlike configureNativeDefaults which clamps maxTokens to 128-512, the updateMaxTokens function passes the value directly. The native layer clamps to 64-1024 (per LLamaAndroid.kt), but consider adding consistent clamping here for predictable behavior.

♻️ Optional: Add consistent clamping
 fun updateMaxTokens(maxTokens: Int) {
     val klass = llamaAndroidClass ?: return
     try {
-        configuredMaxTokens = maxTokens
+        val clamped = maxTokens.coerceIn(64, 1024)
+        configuredMaxTokens = clamped
         klass.methods.firstOrNull { it.name == "configureMaxTokens" }
-            ?.invoke(null, maxTokens)
+            ?.invoke(null, clamped)
     } catch (e: Exception) {
         android.util.Log.w("LlmEngine", "Failed to update max tokens", e)
     }
 }

@jatezzz jatezzz marked this pull request as ready for review February 3, 2026 18:29
jatezzz and others added 23 commits February 3, 2026 13:37
…eps error

Local LLMs now always use the simplified single-call workflow, bypassing
the multi-step planning loop that can result in "max steps" or step retry
exhaustion errors.

The multi-step loop with maxSteps=20 was designed for more capable cloud
models (Gemini). On-device models work better with the simpler approach
that makes a single LLM call to either respond directly or select a tool.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add unit tests and integration tests to verify Local LLM always uses
the simplified workflow and does not hit "max steps" errors.

Unit tests (LocalAgenticRunnerSimplifiedWorkflowTest.kt):
- Verify shouldUseSimplifiedInstructions() always returns true
- Test direct response handling
- Test tool call parsing
- Test error handling
- Test tool selection for IDE vs non-IDE queries

Integration tests (LocalLlmIntegrationTest.kt):
- Test with actual GGUF model on device
- Verify no max steps error with complex queries
- Test engine initialization and model loading
- Test simple inference and tool calling

To run integration tests:
1. Download a GGUF model (e.g., gemma-3-1b-it.Q4_K_M.gguf)
2. Place in device's Download folder
3. Run: ./gradlew :agent:connectedAndroidTest

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit fixes numerous build errors in the agent module that were
preventing compilation:

Build fixes:
- Add llama-api dependency to agent module
- Copy DynamicLibraryLoader to agent/utils (can't depend on app)
- Copy DisclaimerDialogFragment to agent module
- Copy layout_ai_disclaimer.xml to agent resources
- Fix duplicate loadModelFromUri function in AiSettingsViewModel
- Add missing android.util.Log import
- Fix BaseApplication.getBaseInstance() -> baseInstance
- Fix FragmentAgentContainerBinding import path
- Fix EmptyStateFragment to use StateFlow properly (collect instead of observe)
- Fix ChatFragment val reassignment (use setter methods)
- Remove duplicate retry string resource

Test fixes:
- Update LocalAgenticRunnerSimplifiedWorkflowTest to not access protected method

All 8 new simplified workflow tests pass.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
… not installed

- Check for Llama library installation before creating LlmInferenceEngine
- Make engine nullable to handle tearDown gracefully
- Add more GGUF model paths including gemma-3-1b-it-UD-IQ1_S.gguf
- Add androidTest dependencies for instrumentation tests
- Add packaging excludes for META-INF conflicts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ion tests

- Add Gradle task to extract processed Llama AAR (with classes.dex) from
  app's assets-*.zip for integration tests
- Auto-install Llama library from test assets during test setUp
- Add Android 12+ (API 31) requirement check due to Logback module requirements
- Tests now skip gracefully on older Android versions with clear message

To run the integration tests:
1. Build app assets first: ./gradlew :app:assembleV8Assets
2. Push a GGUF model to device: adb push model.gguf /sdcard/Download/
3. Run tests on Android 12+ device: ./gradlew :agent:connectedV8DebugAndroidTest

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit introduces several fixes to enhance the reliability of local LLM (Large Language Model) integration tests and the dynamic loading of the Llama library.

Key changes:
*   **Replaced Logback with slf4j-android in tests:** Removes the dependency on Logback, which required Android 12+, allowing tests to run on a wider range of Android versions.
*   **Improved dynamic library loading:**
    *   Introduced a `ChildFirstDexClassLoader` to prevent class loading conflicts.
    *   Added caching for the ClassLoader to improve performance.
    *   Ensures `.dex` files are set to read-only after extraction to comply with Android 10+ requirements.
*   **Enhanced test robustness:**
    *   Adopted shell permissions (`READ_EXTERNAL_STORAGE`) to allow tests to find model files in more locations, such as `/sdcard/Download`.
    *   Improved the model search logic to check instrumentation arguments, system properties, and various common directories.
    *   The Llama AAR is now re-extracted for each test session to prevent stale builds from causing issues.
    *   The inference test prompt and validation have been made more flexible.
*   **Refined Llama engine initialization:**
    *   Added more robust reflection logic to find the Llama instance, supporting multiple variations of the library's structure.
    *   Improved handling of `file://` URI schemes when loading a model.
This change introduces several improvements to the local agentic runner and the underlying llama.cpp implementation.

Key changes:
*   **Parameter Configuration**: Adds the ability to configure threading (`n_threads`, `n_threads_batch`) and sampling parameters (`temperature`, `top_p`, `top_k`) for the llama.cpp engine from the Kotlin/Java layer. Default values are now set during engine initialization.
*   **Prompt Clipping**: Implements clipping for conversation history and user prompts to prevent excessively long inputs to the model. This helps manage context window size and improve performance.
*   **Native Layer Updates**: The C++ layer now uses the configured threading and sampling parameters. It also defaults to more sophisticated sampling (Top-K, Top-P, temperature-based) instead of simple greedy sampling if the temperature is above zero.
This commit introduces significant improvements to the local LLM agent's capabilities, focusing on more robust tool parsing, smarter tool selection, and overall better performance and reliability.

- **feat(agent): Add function-style tool call parsing**
  - The agent can now parse tool calls written in a function-like syntax (e.g., `list_dir(".")`) in addition to JSON, making it compatible with a wider range of models.

- **feat(agent): Enhance tool selection and argument handling**
  - Implement keyword-based heuristics to select a relevant subset of tools for the prompt, reducing complexity and improving model accuracy.
  - Add logic to infer missing required arguments (e.g., file paths, dependencies) from the user's prompt.
  - Validate for required arguments before executing a tool, asking the user for clarification if necessary.

- **refactor(agent): Improve local agent workflow**
  - Rework the simplified workflow to provide better feedback for empty or failed model responses.
  - The local model inference now runs indefinitely until an end-of-generation token is produced or the stop strings are met, rather than being limited by a fixed token length.
  - Simplify tool result formatting for a cleaner user experience.

- **refactor(EmptyStateFragment): Move to common-ui module**
  - Relocate `EmptyStateFragment` from the `app` module to the `common-ui` module for better code organization.
  - Improve state caching to prevent UI inconsistencies when the fragment is detached and re-attached.

- **build: Disable Gradle Wrapper URL validation**
  - The wrapper validation task now defaults to `false` and is controlled by the `androidide.wrapper.validateUrl` system property. This speeds up builds by avoiding unnecessary network requests.

- **test: Add integration and benchmark tests for local LLM**
  - Introduce new integration tests to verify that all agent tools can be correctly invoked.
  - Add a benchmark suite to measure the performance and accuracy of tool selection for various prompts.
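The keyword-based tool subsetting described above can be sketched as a map from tool name to trigger words; the keyword table below is invented for illustration and is not the runner's actual heuristics:

```kotlin
// Illustrative keyword map; only the tool names come from this PR.
val toolKeywords = mapOf(
    "list_files" to listOf("list", "files", "directory", "folder"),
    "read_file" to listOf("read", "open", "show", "contents"),
    "search_project" to listOf("search", "find", "grep"),
    "add_dependency" to listOf("dependency", "library", "gradle")
)

// Select tools whose keywords appear in the prompt; if nothing matches,
// fall back to a default tool set rather than offering no tools at all.
fun selectTools(userPrompt: String, fallback: List<String>): List<String> {
    val prompt = userPrompt.lowercase()
    val selected = toolKeywords.filter { (_, keywords) ->
        keywords.any { it in prompt }
    }.keys.toList()
    return selected.ifEmpty { fallback }
}
```

Shrinking the tool list this way shortens the system prompt, which is the main reason small on-device models pick tools more accurately.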
This commit introduces several enhancements to the local LLM engine:

-   **KV Cache Reuse:** Implements KV cache reuse to speed up subsequent inferences with overlapping prompts. This avoids re-evaluating the entire context on every turn. A `clearCache` parameter is added to inference calls for explicit control.
-   **Context-Aware History:** The prompt construction now considers the model's context size. It dynamically includes as much conversation history as possible without exceeding an input budget (70% of total context size), preventing context overflow errors.
-   **Dynamic Context Configuration:** The context size (`n_ctx`) is now configurable and determined at runtime based on the device's total RAM. This allows for larger context windows on more capable devices.
-   **Token Counting:** A `countTokens` method is added to the inference engine to accurately measure prompt and history sizes.
-   **Model SHA-256 Verification:** Adds an optional SHA-256 hash verification step when loading a local model to ensure file integrity.
-   **Retry Logic Improvement:** When a simplified prompt fails, the retry attempt now uses a lower temperature for a more deterministic response.
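The 70%-of-context input budget above can be sketched as a backwards walk over the conversation, keeping the newest turns that still fit; estimateTokens is a chars/4 stand-in for the engine's countTokens, and all names are illustrative:

```kotlin
// Rough token estimate standing in for the engine's countTokens().
fun estimateTokens(text: String): Int = (text.length + 3) / 4

// Walk history from newest to oldest, keeping turns until the budget
// (70% of the context window, minus the base prompt) is exhausted.
fun buildHistoryWithBudget(
    turns: List<String>,
    contextSize: Int,
    basePromptTokens: Int
): List<String> {
    var remaining = (contextSize * 7) / 10 - basePromptTokens
    val kept = ArrayDeque<String>()
    for (turn in turns.asReversed()) {
        val cost = estimateTokens(turn)
        if (cost > remaining) break
        kept.addFirst(turn)
        remaining -= cost
    }
    return kept.toList()
}
```

Reserving the remaining ~30% of the window for generation is what prevents the context-overflow errors the commit mentions.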
This commit introduces an end-to-end UI test for the agent functionality, verifying that a user can open the agent, send a prompt, and receive a response.

Key changes:
-   Added `AgentPromptTest.kt` to simulate user interaction with the agent from the bottom sheet.
-   Improved the `search_project` tool by inferring search queries from file names mentioned in the prompt and using the user's prompt as a fallback if no specific query is found.
-   Made the test scenario for navigating through onboarding more robust by adding fallbacks for finding and clicking "Next" or "Continue" buttons.
-   Resolved a `protobuf-lite` dependency conflict in the `androidTest` configuration by excluding it and aligning on `protobuf-java`.
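The file-name-based query inference for `search_project` could be sketched as below; the regex and the whole-prompt fallback are assumptions, not the shipped heuristic:

```kotlin
// Sketch: pull the first token that looks like a file name (word.ext) out of
// the prompt; otherwise fall back to the full prompt as the search query.
fun inferSearchQuery(userPrompt: String): String {
    val fileName = Regex("""\b[\w-]+\.[A-Za-z0-9]{1,10}\b""").find(userPrompt)?.value
    return fileName ?: userPrompt.trim()
}
```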
- Add canonical path validation in DynamicLibraryLoader to prevent Zip Slip
- Free seq_id sub-arrays in free_batch to fix native memory leak
- Parse and enforce stop strings in completion_init/completion_loop
- Remove duplicate TAG/LOGi/LOGe macro definitions

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
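Canonical-path validation of the kind this commit adds conventionally takes the following shape; this is a hypothetical helper, not the actual DynamicLibraryLoader code:

```kotlin
import java.io.File

// Zip Slip guard: reject any archive entry whose canonical path would
// resolve outside the extraction directory (e.g. "../evil.so").
fun safeDestination(destDir: File, entryName: String): File {
    val target = File(destDir, entryName)
    // Trailing separator prevents sibling-prefix bypasses like "/out-evil".
    val dirPath = destDir.canonicalPath + File.separator
    if (!target.canonicalPath.startsWith(dirPath)) {
        throw SecurityException("Zip entry escapes extraction dir: $entryName")
    }
    return target
}
```

Comparing canonical paths (rather than the raw entry name) is the important part: it collapses `..` segments and symlinks before the prefix check.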
…loop

- Add @Volatile to LlmInferenceEngine mutable state fields (isModelLoaded,
  loadedModelName, loadedModelPath, currentModelFamily) for cross-thread visibility
- Add @Synchronized to DynamicLibraryLoader.getLlamaClassLoader to prevent
  concurrent unzip and class loader creation
- Fix onChanged callback in ChatFragment that called notifyDataSetChanged()
  causing a potential infinite loop; replaced with scrollToPosition

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Deduplicate requiredArgsForTool into Executor.companion and reuse in
  LocalAgenticRunner.getMissingRequiredArgs
- Fix sender label inconsistency: use "assistant" consistently instead of
  "model" in history formatting
- Merge buildHistoryWithBudget and buildHistoryWithBudgetNoTools into a
  single method parameterized by base prompt text
- Remove ~120 lines of dead code: createInitialPlan, planForStep,
  buildToolSelectionPrompt, buildSimplifiedInstructionOverride,
  parsePlanSteps, extractDescription, extractJsonArray (unreachable since
  shouldUseSimplifiedInstructions always returns true)
- Remove unused fields (json, toolNames) and imports
- Fix double inferSearchQuery call in fillMissingArgs for search_project
- Replace fire-and-forget CoroutineScope in destroy() with runBlocking
  to ensure model unload completes before the runner is garbage collected

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
EmptyStateFragment was moved to common-ui but depended on app-only
classes (EmptyStateFragmentViewModel, EmptyView, EditorLongPressEvent).
This caused unresolved reference errors since common-ui cannot depend
on app.

- Restore EmptyStateFragment in app module (from stage)
- Delete broken copy from common-ui
- Change agent ChatFragment to extend FragmentWithBinding directly
  instead of EmptyStateFragment, since it never used the empty state
  (immediately set isEmpty=false, empty onFragmentLongPressed override)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add missing string resource pref_use_simple_local_prompt
- Revert broken import renames to non-existent classes (AppIntentUtils,
  AppIdeToolingApi, AgentDependencies) in BaseEditorActivity,
  EditorHandlerActivity, IDEApplication, AgentFragmentContainer,
  IDEApiFacade
- Fix LlmInferenceEngine DynamicLibraryLoader import to use app-local
  copy instead of agent module copy

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
IJdkDistributionProvider.getInstance() lazily initializes via
ServiceLoader (disk read) and checkToolsIsInstalled calls File.exists(),
both on the main thread during onCreate. Add whitelist rules following
the existing codebase pattern instead of using allowThreadDiskReads.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit introduces a new `list_files` tool for the AI agent, allowing it to list files and directories within the current project.

Key changes include:
*   Adding the `ListFilesTool` implementation.
*   Refactoring the tool call parser in `Util.kt` to be more robust. It now handles function-style tool calls (e.g., `Tool Call: list_files(...)`) and improves JSON argument parsing.
*   Adding performance and memory logging to the agent loop to better diagnose resource usage during a session.
*   Refining the model loading logic in `LlmInferenceEngine` and `ChatViewModel` to ensure the correct model is loaded on-demand and to prevent race conditions using a mutex.
*   Caching the `isPluginProject()` check for better performance.
*   Restricting automatic link detection in chat messages to only web URLs.
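The function-style parsing can be illustrated with a minimal sketch; the regex and helper below are assumptions for illustration, not the actual `Util.kt` implementation, which additionally parses the captured arguments as JSON:

```kotlin
// Recognizes lines like `Tool Call: list_files({"path": "src"})` and
// splits them into a tool name and a raw argument string.
val toolCallPattern = Regex("""Tool Call:\s*(\w+)\s*\((.*)\)""")

fun parseFunctionStyleCall(line: String): Pair<String, String>? {
    val match = toolCallPattern.find(line) ?: return null
    val (name, argsRaw) = match.destructured
    return name to argsRaw.trim()
}
```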
This commit introduces an adaptive orientation feature.

The app will now use sensor-based orientation on large screens (tablets) and lock to portrait mode on smaller devices (phones). This behavior is encapsulated in the new `OrientationUtilities.setAdaptiveOrientation` method.

The following activities have been updated to use this new adaptive orientation logic:
- `BaseIDEActivity`
- `OnboardingActivity`
- `layouteditor.BaseActivity`
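The large-screen decision can be reduced to pure logic for illustration; the 600dp smallest-width threshold is an assumption (the common tablet cutoff), and the real `OrientationUtilities.setAdaptiveOrientation` applies the result through the Activity APIs:

```kotlin
// Tablets (smallest width >= threshold) keep sensor-based rotation;
// phones are locked to portrait.
enum class OrientationPolicy { SENSOR, PORTRAIT_LOCKED }

fun adaptiveOrientation(
    smallestWidthDp: Int,
    tabletThresholdDp: Int = 600
): OrientationPolicy =
    if (smallestWidthDp >= tabletThresholdDp) OrientationPolicy.SENSOR
    else OrientationPolicy.PORTRAIT_LOCKED
```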
This commit introduces significant enhancements to the agent's tool-calling capabilities and adds support for Qwen models.

Key changes include:

*   **Robust Tool Parsing:**
    *   Implements a more reliable JSON object extractor to handle complex or malformed model outputs, replacing the previous brittle string search.
    *   Adds logic to specifically parse content within `<tool_call>` XML tags.
    *   Filters out `null` values from tool arguments.

*   **Forced Tool-Call Mechanism:**
    *   Introduces a keyword-based system (`detectRequiredTool`) to identify when a user prompt requires a specific tool.
    *   If the model fails to generate a tool call on the first attempt, a second attempt is made with a more forceful prompt, increasing the reliability of tool execution.

*   **Qwen Model Family Support:**
    *   Adds `Qwen` to the `ModelFamily` enum.
    *   Implements a dedicated, strict tool-calling prompt for Qwen models.
    *   Updates model detection logic to recognize Qwen models by filename.

*   **Enhanced Prompting and History:**
    *   Refines prompt construction for Llama3 and Gemma2 to include tool requirement hints.
    *   Improves Llama3 history management by cleaning and truncating messages to create a more effective context.

*   **Improved Logging and Metrics:**
    *   Introduces streaming inference to collect detailed performance metrics, including Time To First Token (TTFT) and tokens/sec.
    *   Enhances performance logs with these new metrics.
    *   Changes `IDELogcatReader` to read from all log buffers for more comprehensive debugging.

*   **Native Code and Build:**
    *   Fixes potential UTF-8 conversion crashes in the JNI layer by sanitizing invalid byte sequences.
    *   Bumps the native library version to reflect these changes.
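One way to implement the "more reliable JSON object extractor" is brace counting; this sketch is an assumption about the approach, not the shipped code:

```kotlin
// Scan for the first '{' and walk the string counting brace depth
// (ignoring braces inside string literals) until depth returns to zero.
fun extractFirstJsonObject(text: String): String? {
    val start = text.indexOf('{')
    if (start < 0) return null
    var depth = 0
    var inString = false
    var i = start
    while (i < text.length) {
        val c = text[i]
        when {
            inString && c == '\\' -> i++          // skip the escaped char
            c == '"' -> inString = !inString
            !inString && c == '{' -> depth++
            !inString && c == '}' -> {
                depth--
                if (depth == 0) return text.substring(start, i + 1)
            }
        }
        i++
    }
    return null
}
```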
This commit introduces support for the H2O model family, specifically for tool-calling tasks.

Key changes include:
*   Adding `H2O` to the `ModelFamily` enum.
*   Implementing a new `buildH2oToolPrompt` function to create tool-specific prompts for H2O models.
*   Adding logic to normalize H2O's occasionally non-standard tool call responses to ensure they can be parsed correctly.
*   Introducing the ability to dynamically override and reset the `max_tokens` for an inference run, which is used to limit H2O tool call generation to 128 tokens.
*   Improving JSON object extraction by adding a check for `responseText` that starts with `{`.
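The `max_tokens` override/reset behavior can be sketched with a small holder class; the names here are hypothetical, and in the real code the limit is wired through the inference engine:

```kotlin
// The default limit is remembered, a tool-call run can temporarily lower
// it (128 tokens in the H2O case), and reset() restores the default.
class MaxTokensConfig(private val defaultMax: Int) {
    var current: Int = defaultMax
        private set

    fun override(limit: Int) { current = limit }
    fun reset() { current = defaultMax }
}
```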
@jatezzz force-pushed the feat/local-llm-work branch from 15ceceb to 4d65698 on February 3, 2026 18:38

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 14

🤖 Fix all issues with AI agents
In
`@agent/src/androidTest/java/com/itsaky/androidide/agent/repository/LocalLlmIntegrationTest.kt`:
- Around line 577-615: The benchmark can NPE when
context.getExternalFilesDir(null) returns null; update the
LocalLlmIntegrationTest benchmark code to pick a safe output directory (e.g.,
val outDir = context.getExternalFilesDir(null) ?: context.filesDir), ensure
outDir exists (create directories if needed), then construct the outputFile with
File(outDir, "local_llm_benchmark_$timestamp.csv"); keep using
FileWriter(outputFile).use { ... } but consider catching IOExceptions around
file creation/writing to fail gracefully or skip writing if no writable
directory is available.

In `@agent/src/main/java/com/itsaky/androidide/agent/fragments/ChatFragment.kt`:
- Around line 167-172: Guard against zero/unknown token limits in
onTokenUsageEvent: check TokenUsageEvent.tokenLimit before computing the
percentage (in the onTokenUsageEvent method) and if tokenLimit is 0 or <= 0
treat it as unknown—set binding.tokenUsageText to a safe fallback like "Tokens:
N/A" or "Tokens: 0%" instead of calculating percentage; ensure
binding.agentStatusContainer.isVisible behavior remains the same and use
event.tokenCount and event.tokenLimit only after the limit check to avoid
division-by-zero/overflow.
- Around line 479-495: The getDefaultValueForParameter function currently
assumes IProjectManager.getInstance().projectDir is non-null and uses
projectDir.path, which can NPE if no project is open; update
getDefaultValueForParameter to null-check IProjectManager.getInstance() and its
projectDir (and projectDir.path) before concatenating "/" and return a safe
fallback (e.g., empty string or a known default) when any are null—specifically
modify the branches handling "file_path" for toolName "read_file" and "path" to
guard against null projectDir and avoid dereferencing projectDir.path.
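The tokenLimit guard suggested in the first comment above can be sketched as follows (function name and label text are illustrative, not the actual ChatFragment code):

```kotlin
// Treat a non-positive token limit as "unknown" instead of dividing by zero.
fun tokenUsageLabel(tokenCount: Int, tokenLimit: Int): String =
    if (tokenLimit <= 0) "Tokens: N/A"
    else "Tokens: ${(tokenCount * 100) / tokenLimit}%"
```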

In
`@agent/src/main/java/com/itsaky/androidide/agent/repository/LocalAgenticRunner.kt`:
- Around line 141-148: The current cleanup uses takeWhile { it != '<' } which
removes any content after a '<' (breaking valid text like "2 < 3" or XML);
instead, modify the cleanup around cleanedResponse / cleanResponse in
LocalAgenticRunner.kt to remove only explicit tool-call blocks (e.g., patterns
or markers that denote tool output) rather than truncating at the first '<'.
Introduce or call a small helper (e.g., removeToolBlocks or stripToolCalls) that
uses a targeted regex or marker-aware parsing to strip known tool-call segments
(like <tool>...</tool> or other delimiters) while leaving other '<' characters
intact, then use that sanitized string to compute finalResponse.

In `@agent/src/main/java/com/itsaky/androidide/agent/tool/ListFilesHandler.kt`:
- Around line 42-49: The ListFilesCommand currently constructs targetDir with
File(baseDir, sanitizedPath) which allows path traversal; after resolving the
File (e.g., targetDir) compute canonical paths for target and base
(canonicalTarget and canonicalBase) and validate that canonicalTarget either
equals canonicalBase or startsWith canonicalBase + File.separator, returning a
ToolResult.failure("Path escapes project root") if it does not; add this check
in ListFilesCommand just before any file listing (do not rely only on
normalizePath).

In
`@agent/src/main/java/com/itsaky/androidide/agent/utils/DynamicLibraryLoader.kt`:
- Around line 77-78: In DynamicLibraryLoader.kt update the logic that reads
Build.SUPPORTED_ABIS[0] to defensively handle an empty array: in the
method/initializer that computes abi and nativeLibDir, first obtain a safe ABI
value by checking Build.SUPPORTED_ABIS is not empty (e.g., use firstOrNull()),
falling back to Build.CPU_ABI and then to a sensible default string if both are
null/empty; then use that safe abi when building nativeLibDir ("jni/$abi") and
make sure any downstream code that assumes a valid ABI handles the fallback or
throws a clear exception.

In `@agent/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt`:
- Around line 210-219: The agentRepository is being stopped and set to null
without releasing its resources; call agentRepository.destroy() before setting
agentRepository = null (in the configChanged branch shown and the other similar
branch around lines 511-518) so native/model resources are released, then call
observeRepositoryMessages(null), clear _commandMessages, and proceed with
ensureHistoryVisible using _currentSession.value?.messages as before; reference
the agentRepository.stop(), agentRepository.destroy(),
observeRepositoryMessages, _commandMessages, ensureHistoryVisible and
_currentSession symbols when making the change.
- Around line 456-489: postSystemMessage and postSystemError currently append
the same ChatMessage to both session.messages and _commandMessages which later
get concatenated and cause duplicates; modify both functions (postSystemMessage
and postSystemError) to avoid double-emitting by either 1) only updating
_commandMessages when the session is ephemeral/no active repo (e.g., when the
session is not persisted or when an active repo/session history repository is
absent), or 2) before adding to _commandMessages perform a dedupe check against
existing messages by id/text so you skip emitting if the message already exists
in session.messages; keep adding to session.messages, calling
ensureHistoryVisible, scheduleSaveCurrentSession and setting _agentState as
before.

In `@app/src/main/java/com/itsaky/androidide/activities/MainActivity.kt`:
- Line 151: The benchmark overlay is currently attached unconditionally via
LocalBenchmarkRunner(this).attach(), exposing a developer FAB in production;
modify MainActivity to only call LocalBenchmarkRunner(...).attach() when a
debug/feature flag is enabled (e.g., BuildConfig.DEBUG or your existing feature
flag check), by wrapping the attach() call in an if (BuildConfig.DEBUG) {
LocalBenchmarkRunner(this).attach() } or equivalent feature-flag conditional to
ensure the overlay is only active for dev/test builds or opted-in users.

In
`@app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt`:
- Around line 208-217: In countTokens, the catch block currently swallows
exceptions silently; update the catch in suspend fun countTokens(text: String)
(which uses llamaController, isModelLoaded, ioDispatcher, and
estimateTokenCount) to log the thrown Throwable before falling back to
estimateTokenCount so failures are observable (e.g., call your
logger/processLogger with a concise message and the exception inside the catch).
- Around line 265-273: The DISPLAY_NAME lookup can crash because getColumnIndex
may return -1 and cursor.moveToFirst() may be false; update the
contentResolver.query usage around modelUri/modelUriString and the displayName
assignment to first check that the returned cursor is non-null,
cursor.moveToFirst() returns true, and that
cursor.getColumnIndex(OpenableColumns.DISPLAY_NAME) != -1 before calling
cursor.getString(nameIndex); if any check fails, fall through to the existing
fallback (the run { ... } block) so the fallback path is honored and no
IllegalArgumentException or invalid cursor access occurs.

In `@app/src/main/java/com/itsaky/androidide/agent/repository/utils.kt`:
- Around line 110-122: The regex in parseFunctionStyleToolCall doesn't match
multiline arguments because (.*) doesn't include newlines; update the pattern to
allow dot to match newlines (e.g., enable DOT_MATCHES_ALL via
RegexOption.DOT_MATCHES_ALL or use the (?s) inline flag) so the capture group
for argsRaw can span multiple lines, then keep the existing argsRaw handling
(substringBefore("<"), jsonParser.parseToJsonElement, and buildToolCall)
unchanged.
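The suggested fix can be illustrated as follows, assuming the same function-style pattern; `RegexOption.DOT_MATCHES_ALL` is Kotlin's equivalent of the inline `(?s)` flag:

```kotlin
// With DOT_MATCHES_ALL enabled, the argument capture group can span
// newlines inside the JSON payload of a function-style tool call.
val multilinePattern = Regex(
    """Tool Call:\s*(\w+)\s*\((.*)\)""",
    RegexOption.DOT_MATCHES_ALL
)
```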

In `@app/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt`:
- Around line 379-390: The saveAllSessionsAndState function cancels saveJob but
doesn't reassign it, so multiple rapid calls can run concurrent coroutines;
update saveAllSessionsAndState to assign the launched coroutine to saveJob
(e.g., saveJob = viewModelScope.launch(Dispatchers.IO) { ... }) so subsequent
calls can cancel the active job, and keep existing usage of saveJob?.cancel() at
the top to ensure previously started save coroutines are canceled before
starting a new one; reference the saveAllSessionsAndState function and the
saveJob property when making this change.

In `@app/src/main/java/com/itsaky/androidide/utils/LocalBenchmarkRunner.kt`:
- Around line 33-63: The full-screen overlay FrameLayout named "overlay" blocks
touches to underlying views even though isClickable/isFocusable are false;
remove the overlay and add the FloatingActionButton instance "fab" directly to
the existing "root" view using the same FrameLayout.LayoutParams (gravity
bottom|end and margins) so hits outside the FAB reach underlying UI; delete
creation/attachment of "overlay" and replace overlay.addView(fab, params);
root.addView(overlay) with root.addView(fab, params) (ensuring the params type
matches root's layout if needed).
🧹 Nitpick comments (14)
subprojects/projects/src/main/java/com/itsaky/androidide/projects/IProjectManager.kt (1)

178-185: Leaky abstraction: interface extension depends on implementation class.

This extension function is defined on the IProjectManager interface but directly casts to and manipulates ProjectManagerImpl. This couples the interface layer to a specific implementation, which:

  • Violates interface/implementation separation
  • Silently skips caching for any other IProjectManager implementations
  • Makes testing with mock implementations ineffective for cache behavior

Consider moving the caching logic into ProjectManagerImpl itself (e.g., override a method or use a lazy property), keeping the interface layer implementation-agnostic.

agent/src/main/java/com/itsaky/androidide/agent/utils/DynamicLibraryLoader.kt (1)

50-54: Consider failing entirely on Zip Slip detection rather than skipping.

When a malicious zip entry is detected, the code logs and continues extraction. For a security-sensitive class loader, partial extraction could leave the directory in an inconsistent state. Consider deleting the directory and returning null (or throwing) on any Zip Slip violation.

🛡️ Proposed fix to fail on Zip Slip detection
                         // Zip Slip protection: ensure the entry resolves inside the target dir
                         if (!outputFile.canonicalPath.startsWith(versionedUnzipDir.canonicalPath + File.separator)) {
                             Log.e("DynamicLoad", "Zip entry outside target dir: ${entry.name}")
-                            continue
+                            baseUnzipDir.deleteRecursively()
+                            return null
                         }
app/src/androidTest/kotlin/com/itsaky/androidide/scenarios/NavigateToMainScreenScenario.kt (1)

48-106: Consider refactoring deeply nested try-catch blocks into a loop.

The 8+ levels of nested try-catch blocks for storage permission alternatives are difficult to maintain and read. A loop-based approach would be cleaner and more extensible.

♻️ Suggested refactor using a sequence of alternatives
SystemPermissionsScreen {
    val permissionViews = listOf(
        { storagePermissionView { click() } },
        { storagePermissionViewAlt1 { click() } },
        { storagePermissionViewAlt2 { click() } },
        { storagePermissionViewAlt3 { click() } },
        { storagePermissionViewAlt4 { click() } },
        { storagePermissionViewAlt5 { click() } },
        { storagePermissionViewAlt6 { click() } },
        { storagePermissionViewAlt7 { click() } },
        { storagePermissionViewAlt8 { click() } }
    )
    
    val clicked = permissionViews.any { action ->
        runCatching { action() }.isSuccess
    }
    
    if (!clicked) {
        println("No matching storage permission option found")
    }
}
agent/build.gradle.kts (1)

32-36: Use layout.buildDirectory instead of deprecated $buildDir.

$buildDir is deprecated in Gradle 8+. Use layout.buildDirectory.dir(...) for forward compatibility.

♻️ Suggested fix
     sourceSets {
         getByName("androidTest") {
-            assets.srcDirs("src/androidTest/assets", "$buildDir/generated/androidTest/assets")
+            assets.srcDirs("src/androidTest/assets", layout.buildDirectory.dir("generated/androidTest/assets"))
         }
     }

Similarly update line 46:

-    val outputDir = file("$buildDir/generated/androidTest/assets/dynamic_libs")
+    val outputDir = layout.buildDirectory.dir("generated/androidTest/assets/dynamic_libs").get().asFile
app/src/main/java/com/itsaky/androidide/agent/viewmodel/AiSettingsViewModel.kt (1)

91-101: Validate SHA‑256 format before passing to the engine.

If a user enters an invalid hash, the load will fail later with a vague error. Consider validating for 64 hex chars and surfacing a clear message.

🔧 Suggested guard
         _modelLoadingState.value = ModelLoadingState.Loading
-        val expectedHash = getLocalModelSha256()
-        val success = llmInferenceEngine.initModelFromFile(context, path, expectedHash)
+        val rawHash = getLocalModelSha256()
+        val expectedHash = rawHash
+            ?.trim()
+            ?.lowercase()
+            ?.takeIf { it.matches(Regex("^[0-9a-f]{64}$")) }
+        if (rawHash != null && expectedHash == null) {
+            _modelLoadingState.value = ModelLoadingState.Error("Invalid SHA-256 hash.")
+            return@launch
+        }
+        val success = llmInferenceEngine.initModelFromFile(context, path, expectedHash)
agent/src/androidTest/java/com/itsaky/androidide/agent/repository/LocalLlmIntegrationTest.kt (2)

117-168: Include the exception in the asset-missing log.

Logging the FileNotFoundException would make test setup failures easier to debug.

🔧 Suggested tweak
-        } catch (e: FileNotFoundException) {
-            Log.w(TAG, "Llama AAR not found in test assets: $assetName. " +
-                    "Make sure the copyLlamaAarForTests task ran during build.")
+        } catch (e: FileNotFoundException) {
+            Log.w(
+                TAG,
+                "Llama AAR not found in test assets: $assetName. " +
+                        "Make sure the copyLlamaAarForTests task ran during build.",
+                e
+            )

420-427: Consider logging failures when reading system properties.

Swallowing the exception makes diagnosing device-specific failures harder.

agent/src/test/java/com/itsaky/androidide/agent/repository/LocalAgenticRunnerSimplifiedWorkflowTest.kt (3)

116-144: Add negative assertions for IDE-only tools.
The test doesn’t currently prove IDE tools are excluded for non‑IDE queries.

💡 Suggested assertion additions
         assertTrue(
             "Prompt should include get_device_battery tool",
             capturedPrompt.contains("get_device_battery")
         )
+        assertTrue(
+            "Prompt should not include list_files tool",
+            !capturedPrompt.contains("list_files")
+        )
+        assertTrue(
+            "Prompt should not include read_file tool",
+            !capturedPrompt.contains("read_file")
+        )

146-165: Strengthen IDE-query test to assert IDE tool inclusion.
Right now it only checks the prompt is non-empty.

💡 Suggested assertion addition
         assertTrue(
             "Prompt should be generated",
             capturedPrompt.isNotEmpty()
         )
+        assertTrue(
+            "Prompt should include IDE tools",
+            capturedPrompt.contains("list_files") || capturedPrompt.contains("read_file")
+        )

166-181: Consider consolidating with the earlier “simplified workflow” test.
This duplicates the same plan-absence assertion.

agent/src/main/java/com/itsaky/androidide/agent/fragments/ChatFragment.kt (1)

174-223: Consider capping context file size in the master prompt.
Reading full file contents without a limit can lead to huge prompts or memory spikes.

agent/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt (1)

752-765: Make intentional no-op callbacks explicit.

Detekt flags empty function blocks. Using = Unit (or a suppression) makes intent explicit and clears the warning.

🧹 Minimal tweak
             val emitter = object : ShellCommandEventEmitter {
-                override suspend fun onCommandBegin(event: ExecCommandBegin) {}
-                override suspend fun onCommandEnd(event: ExecCommandEnd) {}
+                override suspend fun onCommandBegin(event: ExecCommandBegin) = Unit
+                override suspend fun onCommandEnd(event: ExecCommandEnd) = Unit
             }
app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt (2)

186-195: Clamp maxTokens to native bounds to keep Kotlin state consistent.

Native configureMaxTokens clamps values; mirroring that here avoids state drift between configuredMaxTokens and actual native settings.

🔧 Suggested adjustment
     fun updateMaxTokens(maxTokens: Int) {
         val klass = llamaAndroidClass ?: return
         try {
-            configuredMaxTokens = maxTokens
-            klass.methods.firstOrNull { it.name == "configureMaxTokens" }
-                ?.invoke(null, maxTokens)
+            val clamped = maxTokens.coerceIn(64, 1024)
+            configuredMaxTokens = clamped
+            klass.methods.firstOrNull { it.name == "configureMaxTokens" }
+                ?.invoke(null, clamped)
         } catch (e: Exception) {
             android.util.Log.w("LlmEngine", "Failed to update max tokens", e)
         }
     }

381-395: Move KV cache clearing off the caller thread.

clearKvCache() can be heavy; running it on the caller thread risks UI jank. Move it to ioDispatcher before returning the Flow.

⚡ Suggested adjustment
     suspend fun runStreamingInference(
         prompt: String,
         stopStrings: List<String> = emptyList(),
         clearCache: Boolean = true,
         formatChat: Boolean = false
     ): Flow<String> {
         val controller = llamaController ?: throw IllegalStateException("Engine not initialized.")
         if (!isModelLoaded) throw IllegalStateException("Model is not loaded.")
 
-        if (clearCache) {
-            controller.clearKvCache()
-        }
+        if (clearCache) {
+            withContext(ioDispatcher) { controller.clearKvCache() }
+        }
         return controller.send(prompt, formatChat = formatChat, stop = stopStrings)
             .flowOn(ioDispatcher)
     }


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
app/src/main/java/com/itsaky/androidide/agent/repository/AgenticRunner.kt (1)

892-906: ⚠️ Potential issue | 🟡 Minor

Returning Unit for null primitives may cause unexpected behavior.

When a JSON primitive is null, the cascaded null-coalescing falls through to return Unit. This is semantically incorrect since Unit is not a meaningful value representation, and consumers of toStandardMap() may not expect Unit in the resulting map values.

🐛 Suggested fix
 private fun JsonElement.toAny(): Any {
     return when (this) {
         is JsonObject -> this.toStandardMap()
         is JsonArray -> this.map { it.toAny() }
         is JsonPrimitive -> {
             if (this.isString) {
                 this.content
             } else {
                 // Handles numbers, booleans, and nulls
-                this.contentOrNull ?: this.booleanOrNull ?: this.longOrNull ?: this.doubleOrNull
-                ?: Unit
+                this.booleanOrNull
+                    ?: this.longOrNull
+                    ?: this.doubleOrNull
+                    ?: this.contentOrNull
+                    ?: "null"  // or throw IllegalArgumentException("Unexpected null primitive")
             }
         }
     }
 }

Note: Also reordered the checks to prioritize typed accessors (booleanOrNull, longOrNull, doubleOrNull) before contentOrNull, since contentOrNull would return string representations of numbers/booleans.

🤖 Fix all issues with AI agents
In
`@agent/src/main/java/com/itsaky/androidide/agent/repository/LocalAgenticRunner.kt`:
- Around line 632-639: The destroy() method currently uses runBlocking which can
block the main thread (called from ChatViewModel.onCleared()); change it to a
non-blocking, fire-and-forget coroutine so unloadModel() runs off the main
thread: replace the runBlocking(Dispatchers.IO) { ... } block with launching a
coroutine on Dispatchers.IO (e.g., CoroutineScope(Dispatchers.IO).launch { if
(engine.isModelLoaded) engine.unloadModel() }) so engine.isModelLoaded and
engine.unloadModel() execute asynchronously; if you need synchronous teardown
instead, ensure destroy() is invoked from a background thread rather than from
ChatViewModel.onCleared().

In
`@agent/src/main/java/com/itsaky/androidide/agent/utils/ChatTranscriptUtils.kt`:
- Around line 36-38: The filename creation currently uses seconds-only timestamp
(SimpleDateFormat("yyyyMMdd-HHmmss") -> timestamp) which can collide; update the
logic that builds file (File(exportsDir, "chat-transcript-$timestamp.txt") in
ChatTranscriptUtils.kt) to include a higher-resolution suffix (e.g., add
milliseconds via SimpleDateFormat pattern "yyyyMMdd-HHmmssSSS" or append a UUID
via java.util.UUID.randomUUID()) so the generated filename is unique before
calling file.writeText(transcript, Charsets.UTF_8); keep exportsDir and the
"chat-transcript-" prefix unchanged for discoverability.
- Around line 31-33: The code creates exportsDir via the variable exportsDir but
only checks exists()/mkdirs(); update the logic to validate that if
exportsDir.exists() then exportsDir.isDirectory() is true, otherwise throw an
IOException with a clear message (including exportsDir.path) before attempting
to write; if it doesn't exist, continue to create it with mkdirs() and handle
failure as currently done.
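The millisecond-plus-UUID variant suggested above can be sketched as follows (the helper name is hypothetical):

```kotlin
import java.text.SimpleDateFormat
import java.util.Date
import java.util.Locale
import java.util.UUID

// Millisecond precision plus a random UUID suffix keeps the discoverable
// "chat-transcript-" prefix while making rapid successive exports unique.
fun transcriptFileName(now: Date = Date()): String {
    val timestamp = SimpleDateFormat("yyyyMMdd-HHmmssSSS", Locale.US).format(now)
    return "chat-transcript-$timestamp-${UUID.randomUUID()}.txt"
}
```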

In `@agent/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt`:
- Around line 290-320: The formatTime function in ChatViewModel is missing its
closing brace which causes onMessageAction and the class braces to be
misaligned; close the formatTime function right after the final return (the else
branch's String.format) by adding the missing '}' so formatTime(...) is properly
terminated, then ensure the subsequent onMessageAction(...) and the class-level
closing brace remain as-is and compile correctly.

In
`@app/src/main/java/com/itsaky/androidide/agent/repository/LocalLlmRepositoryImpl.kt`:
- Around line 183-196: The current stop list for tool outputs adds "}" for
ModelFamily.LLAMA3, QWEN, and H2O which can prematurely truncate nested JSON;
remove the plain "}" entry from toolStops in the engine.currentModelFamily
handling (symbols: toolStops, engine.currentModelFamily,
ModelFamily.LLAMA3/QWEN/H2O) and rely on the tag-based "</tool_call>" stop; if
JSON-only tool outputs must be supported, implement brace-aware
extraction/accumulation (balanced-brace parsing) in the tool parsing logic
instead of using "}" as a stop string.

In
`@subprojects/projects/src/main/java/com/itsaky/androidide/projects/ProjectManagerImpl.kt`:
- Around line 106-114: The destroy() method should also clear the
pluginProjectCached state to avoid retaining stale cache between reuse; update
the destroy() implementation to set pluginProjectCached = null (or the
appropriate reset value) alongside clearing workspace and androidBuildVariants,
referencing the pluginProjectCached property and the destroy() method so the
cache is reset whenever the manager is torn down even if projectPath/projectDir
were previously initialized.
🧹 Nitpick comments (8)
app/src/main/java/com/itsaky/androidide/agent/repository/AgenticRunner.kt (4)

363-383: Inconsistent localization: hardcoded English strings remain.

While other messages have been localized, getCurrentThoughtTitle() still returns hardcoded English strings at lines 373 and 382. For consistency with the PR's localization effort, consider extracting these to string resources as well.

Additionally, line 365 has irregular indentation (appears to use a tab character for the closing brace).

♻️ Suggested improvements
 private fun updateProcessingState(token: Long, stepNumber: Int) {
     onStateUpdateIfActive(token, AgentState.Processing("Step $stepNumber: ${getCurrentThoughtTitle()}"))
-		}
+}

 private fun getCurrentThoughtTitle(): String {
     if (currentProcessTitle.isNotBlank()) {
         return currentProcessTitle
     }

     if (currentAIAgentThought.isBlank()) {
-        return "Processing current request..."
+        return context.getString(R.string.agent_processing_current_request)
     }

     val fallback = currentAIAgentThought
         .lineSequence()
         .map { extractTaskDescription(it) }
         .firstOrNull { it.isNotBlank() }
         .orEmpty()

-    return fallback.ifBlank { "Processing current request..." }
+    return fallback.ifBlank { context.getString(R.string.agent_processing_current_request) }
 }

432-450: Inconsistent indentation style.

Lines 433-435 use tab characters while the rest of the file uses spaces. This creates visual inconsistency.

♻️ Suggested fix
 private fun extractPlannerMetadata(text: String): Pair<String?, String> {
-		if (text.isBlank()) {
-			return null to ""
-        }
+    if (text.isBlank()) {
+        return null to ""
+    }

728-735: Consider localizing user-facing fallback message.

While the instruction strings are prompts for the LLM (and appropriately kept in English), the fallbackText at line 731 is displayed to the user via updateLastMessage(). For consistency with other user-facing messages, consider localizing this fallback text.

♻️ Suggested change
     val thoughtSummary = currentAIAgentThought.takeIf { it.isNotBlank() }
-        ?: "Please review the latest tool outputs to continue where I left off."
+        ?: context.getString(R.string.agent_thought_summary_fallback)
     val fallbackText =
-        "Unable to complete tool-based actions, but here is the current understanding:\n\n$thoughtSummary"
+        context.getString(R.string.agent_unable_to_complete_actions, thoughtSummary)

795-798: Consider handling directory creation failure explicitly.

mkdirs() returns false if directory creation fails (e.g., due to permissions), but the result is ignored. While the outer try-catch will handle the subsequent write failure, explicit checking could provide a more descriptive error message.

♻️ Suggested improvement
     val logDir = File(context.getExternalFilesDir(null), "logs")
-    if (!logDir.exists()) {
-        logDir.mkdirs()
+    if (!logDir.exists() && !logDir.mkdirs()) {
+        log.warn("Failed to create log directory: ${logDir.absolutePath}")
+        return
     }
app/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt (2)

148-179: Minor: Redundant variable assignment.

Line 158 assigns val expectedHash = storedHash but storedHash is already available and immutable in this scope. The variable can be eliminated.

♻️ Suggested simplification
                            val needsReload =
                                !engine.isModelLoaded ||
                                    engine.loadedModelSourceUri != expectedModelPath ||
                                    storedHash != lastLoadedModelHash
-                            val expectedHash = storedHash
                            if (needsReload) {
                                val loaded = withContext(Dispatchers.IO) {
                                    engine.initModelFromFile(
                                        context,
                                        expectedModelPath,
-                                        expectedHash
+                                        storedHash
                                    )
                                }
                                if (!loaded) {
                                    log.error("Initialization failed: Local LLM model load failed.")
                                    return@run null
                                }
                            }

-							lastLoadedModelHash = expectedHash
+                           lastLoadedModelHash = storedHash

417-429: Fire-and-forget coroutine for previous session save is not cancellable.

The coroutine launched at lines 420-422 to save the previous session is not assigned to any job, so it cannot be cancelled if needed (e.g., during ViewModel cleanup). While concurrent saves for different sessions are generally safe, consider tracking this job or using a structured approach.

♻️ Optional: Track previous session saves
+    private var previousSessionSaveJob: Job? = null
+
     fun setCurrentSession(sessionId: String) {
         saveJob?.cancel()
+        previousSessionSaveJob?.cancel()
         val previousSession = _currentSession.value
-        viewModelScope.launch(Dispatchers.IO) {
+        previousSessionSaveJob = viewModelScope.launch(Dispatchers.IO) {
             previousSession?.let { chatStorageManager.saveSession(it) }
         }
agent/src/main/java/com/itsaky/androidide/agent/repository/Executor.kt (1)

26-38: Redundant alias: parallelSafeTools duplicates PARALLEL_SAFE_TOOLS.

Line 38 creates a private member that simply references the same constant. Consider removing the alias and using PARALLEL_SAFE_TOOLS directly in the partition call at line 58.

♻️ Suggested simplification
-        // Tools that only read IDE state and can be executed in parallel safely.
-        private val parallelSafeTools = PARALLEL_SAFE_TOOLS

Then update line 58:

-            parallelSafeTools.contains(call.name().getOrNull())
+            PARALLEL_SAFE_TOOLS.contains(call.name().getOrNull())
agent/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt (1)

802-805: Consider documenting intentional no-op callbacks.

The empty onCommandBegin and onCommandEnd implementations are flagged by detekt. Since these are intentionally no-ops for the test context, consider adding a brief comment or using explicit Unit returns to signal intent.

♻️ Suggested clarification
             val emitter = object : ShellCommandEventEmitter {
-                override suspend fun onCommandBegin(event: ExecCommandBegin) {}
-                override suspend fun onCommandEnd(event: ExecCommandEnd) {}
+                override suspend fun onCommandBegin(event: ExecCommandBegin) = Unit // No-op for test
+                override suspend fun onCommandEnd(event: ExecCommandEnd) = Unit // No-op for test
             }

jatezzz and others added 2 commits February 4, 2026 10:11
- Add milliseconds to chat transcript filenames to avoid conflicts.
- Clear the plugin project cache when the project manager is destroyed.
- Use `CoroutineScope` for model unloading to avoid blocking.
- Remove an unnecessary tool stop for Llama3 models.
- Add stricter checks for the chat exports directory.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@app/src/main/java/com/itsaky/androidide/agent/repository/LocalLlmRepositoryImpl.kt`:
- Around line 436-446: normalizeH2oToolResponse currently prefixes args with
"<tool_call>{\"name\":\"$requiredTool\",\"args\":" but does not ensure the
trailing "}" and "</tool_call>" are appended, causing parsing failures; update
normalizeH2oToolResponse to build the normalized string as prefix + trimmed +
"}</tool_call>" when the trimmed block looks like JSON args (e.g.
startsWith("{") or startsWith("}")) and only append the closing "}</tool_call>"
if the trimmed string does not already contain a closing brace or
"</tool_call>"; use the existing variables trimmed, prefix and requiredTool and
log the fully closed normalized string as before.
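A minimal sketch of the closing-delimiter fix described above; the helper name, `requiredTool`, and the `<tool_call>` prefix format come from the comment, while the closing heuristic (append the suffix only when the wrapper is not already closed) is an assumption:

```kotlin
// Hypothetical sketch: close the <tool_call> wrapper only when the model's
// raw args block did not already include the closing tag.
fun normalizeH2oToolResponse(raw: String, requiredTool: String): String {
    val trimmed = raw.trim()
    // Only treat the block as bare JSON args when it looks like one.
    if (!trimmed.startsWith("{")) return trimmed
    val prefix = "<tool_call>{\"name\":\"$requiredTool\",\"args\":"
    return if (trimmed.contains("</tool_call>")) {
        prefix + trimmed // already closed by the model
    } else {
        prefix + trimmed + "}</tool_call>" // fully close the normalized string
    }
}
```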
🧹 Nitpick comments (2)
agent/src/main/java/com/itsaky/androidide/agent/utils/ChatTranscriptUtils.kt (1)

28-51: Consider adding cache cleanup for old transcript files.

The chat_exports directory can accumulate files over time. While Android may clear the cache automatically, you could add a helper to prune old exports (e.g., files older than 7 days) to keep the cache lean, especially if users export transcripts frequently.

💡 Example cleanup helper
private const val MAX_AGE_MS = 7 * 24 * 60 * 60 * 1000L // 7 days

private fun pruneOldExports(exportsDir: File) {
    val cutoff = System.currentTimeMillis() - MAX_AGE_MS
    exportsDir.listFiles()?.filter { 
        it.isFile && it.lastModified() < cutoff 
    }?.forEach { it.delete() }
}

Call this at the start of writeTranscriptToCache after ensuring the directory exists.

agent/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt (1)

794-844: Optional: silence detekt empty-block warnings.
The no-op emitter methods trigger EmptyFunctionBlock. Consider expression bodies to avoid lint noise.

♻️ Suggested tweak
-            val emitter = object : ShellCommandEventEmitter {
-                override suspend fun onCommandBegin(event: ExecCommandBegin) {}
-                override suspend fun onCommandEnd(event: ExecCommandEnd) {}
-            }
+            val emitter = object : ShellCommandEventEmitter {
+                override suspend fun onCommandBegin(event: ExecCommandBegin) = Unit
+                override suspend fun onCommandEnd(event: ExecCommandEnd) = Unit
+            }

@jatezzz jatezzz changed the title Refactor Local LLM agent and tool handling ADFA-1843 | Refactor Local LLM agent and tool handling Feb 4, 2026
jatezzz and others added 2 commits February 4, 2026 12:34
This commit enhances the normalization of tool responses from the H2O model.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@app/src/main/java/com/itsaky/androidide/agent/repository/LocalLlmRepositoryImpl.kt`:
- Around line 109-118: The H2O branch in ModelFamily.H2O currently returns only
the last user message when requiredTool == null, losing conversation context;
implement a new buildH2oPrompt(history: List<Message>) (or similarly named) that
mirrors the pattern used by buildGemma2Prompt/buildLlama3Prompt but uses
H2O-appropriate system instructions and formatting (not Gemma-specific JSON/tool
instructions), then replace the last-user fallback in the H2O branch with a call
to buildH2oPrompt(history) and keep buildH2oToolPrompt(history, requiredTool)
for tool-required cases so both tool and non-tool flows preserve recent
conversation context.
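The suggested `buildH2oPrompt` could look roughly like the sketch below. The `Message`/`Sender` types are simplified stand-ins for the project's own, and the `<|user|>`/`<|assistant|>` chat-template tokens are assumptions rather than the H2O model's confirmed format:

```kotlin
// Simplified stand-ins for the project's message types.
enum class Sender { USER, AGENT, TOOL }
data class Message(val sender: Sender, val text: String)

// Hypothetical H2O prompt builder that preserves recent conversation context,
// mirroring the buildGemma2Prompt/buildLlama3Prompt pattern.
fun buildH2oPrompt(history: List<Message>, maxTurns: Int = 6): String {
    val recent = history.takeLast(maxTurns)
    val sb = StringBuilder()
    sb.append("<|system|>You are a helpful coding assistant.<|end|>\n")
    for (msg in recent) {
        val role = when (msg.sender) {
            Sender.USER -> "<|user|>"
            Sender.AGENT -> "<|assistant|>"
            Sender.TOOL -> "<|tool|>"
        }
        sb.append("$role${msg.text}<|end|>\n")
    }
    sb.append("<|assistant|>") // generation cue for the model's next turn
    return sb.toString()
}
```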
🧹 Nitpick comments (2)
app/src/main/java/com/itsaky/androidide/agent/repository/LocalLlmRepositoryImpl.kt (2)

341-349: Consider displaying list_files output as an AGENT message.
Right now the “Files in project root” text overwrites the TOOL message, so duration tracking won’t attach to a user-visible agent bubble.

♻️ Suggested tweak
                         addMessage(result, Sender.TOOL)
                         if (toolCall.name == "list_files") {
-                            updateLastMessage("Files in project root:\n$result")
-                            updateLastMessageDuration(totalDurationMs)
-                            logSessionSummary(sessionStartWall, sessionStartCpu)
-                            break
+                            addMessage("Files in project root:\n$result", Sender.AGENT)
+                            updateLastMessageDuration(totalDurationMs)
+                            logSessionSummary(sessionStartWall, sessionStartCpu)
+                            break
                         }

755-780: Use word boundaries in detectRequiredTool to reduce false positives.
Substring matching (e.g., “estimate”) can trigger unintended tool calls.

♻️ Suggested refinement
         val text = prompt.lowercase()

-        if (text.contains("battery")) return "get_device_battery"
+        if (Regex("""\bbattery\b""").containsMatchIn(text)) return "get_device_battery"

-        val timeKeywords = listOf("time", "date", "datetime", "current time", "current date", "now")
-        if (timeKeywords.any { text.contains(it) }) return "get_current_datetime"
+        val timeKeywords = listOf(
+            Regex("""\btime\b"""),
+            Regex("""\bdate\b"""),
+            Regex("""\bdatetime\b"""),
+            Regex("""\bcurrent time\b"""),
+            Regex("""\bcurrent date\b"""),
+            Regex("""\bnow\b""")
+        )
+        if (timeKeywords.any { it.containsMatchIn(text) }) return "get_current_datetime"

@jatezzz jatezzz requested a review from Daniel-ADFA February 5, 2026 13:19

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt (2)

94-98: ⚠️ Potential issue | 🟡 Minor

Null check missing before unsafe cast — resolveLlamaInstance can return null.

If all reflection fallbacks fail, resolveLlamaInstance returns null. The cast at line 96 (null as ILlamaController) throws a TypeCastException that is caught but produces a misleading "Failed to initialize Llama class via reflection" log without indicating that no instance was resolved.

Proposed fix
             val llamaInstance = resolveLlamaInstance(llamaAndroidClass)
+            if (llamaInstance == null) {
+                log.error("Failed to resolve a LLamaAndroid instance from any fallback path.")
+                return@withContext false
+            }
 
-            llamaController = llamaInstance as ILlamaController
+            llamaController = llamaInstance as? ILlamaController
+                ?: run {
+                    log.error("Resolved instance does not implement ILlamaController: {}", llamaInstance.javaClass)
+                    return@withContext false
+                }
             isInitialized = true

404-414: ⚠️ Potential issue | 🟠 Major

unloadModel duplicates resetLoadedModelState and is not guarded by modelLoadMutex.

Two issues:

  1. DRY: Lines 409–413 duplicate the reset logic in resetLoadedModelState() (lines 383–389). If a new state field is added later, one site may be missed.

  2. Race: initModelFromFile guards model lifecycle with modelLoadMutex, but unloadModel does not. A concurrent call to unloadModel from a ViewModel (e.g., user navigates away) while initModelFromFile is in progress can corrupt state.

Proposed fix
     suspend fun unloadModel() {
+        modelLoadMutex.withLock {
         if (!isModelLoaded) return
         withContext(ioDispatcher) {
             llamaController?.unload()
         }
-        isModelLoaded = false
-        loadedModelPath = null
-        loadedModelSourceUri = null
-        loadedModelName = null
-        currentModelFamily = ModelFamily.UNKNOWN
+        resetLoadedModelState()
+        }
     }
🤖 Fix all issues with AI agents
In
`@app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt`:
- Around line 445-449: The call to controller.clearKvCache() is running on the
caller's thread and must be executed on ioDispatcher; wrap or move the
clearKvCache() call into a withContext(ioDispatcher) (same pattern as
runInference) so the cache clearing occurs on the IO dispatcher before returning
controller.send(...). Update the method containing controller.send(...) to
ensure clearKvCache() runs inside withContext(ioDispatcher) (or otherwise
dispatch to ioDispatcher) and then call controller.send(prompt, formatChat =
formatChat, stop = stopStrings).flowOn(ioDispatcher).
- Around line 308-314: The code sets isModelLoaded and related state
unconditionally after calling llamaController?.load(), which silently no-ops
when llamaController is null; modify the block around llamaController?.load() so
you explicitly check for null (e.g., if (llamaController == null) {
log.error(...); throw or return a failure) and only set isModelLoaded,
loadedModelPath, loadedModelSourceUri, loadedModelName, and currentModelFamily
after a successful load; reference the llamaController.load call and the state
fields isModelLoaded, loadedModelPath, loadedModelSourceUri, loadedModelName,
and detectModelFamily(displayName) to locate and fix the logic.
- Around line 216-219: The method resetSamplingDefaults currently calls
configureNativeDefaults(context, llamaAndroidClass) which resets threads,
context size, max tokens, KV cache reuse and sampling; rename the method to
reflect the broader behavior (e.g., resetNativeDefaults) or change its
implementation to only reset sampling-related fields by calling a new utility
that restores sampling parameters (leave threads/context/maxTokens/KV cache
untouched). Update references to resetSamplingDefaults accordingly and keep
configureNativeDefaults and cachedContext usage intact.

In
`@app/src/main/java/com/itsaky/androidide/agent/repository/LocalLlmRepositoryImpl.kt`:
- Around line 571-600: buildQwenStrictToolPrompt currently tells the model to
"Return ONLY the JSON object" while ModelFamily.QWEN's stop-strings expect a
</tool_call> wrapper, causing generation to not stop; fix by aligning them:
update buildQwenStrictToolPrompt to require the JSON be wrapped in explicit tags
(e.g., "<tool_call>{...}</tool_call>") so the existing stop string
(</tool_call>) will terminate generation, or alternatively add JSON-aware stops
for QWEN (for example add "}" or "}}\n" as stop strings for ModelFamily.QWEN);
reference buildQwenStrictToolPrompt and the ModelFamily.QWEN stop-strings when
making the change.
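One way to align the prompt with the stop list, sketched under the assumption that QWEN's stop strings include `</tool_call>`; the prompt wording and stop-string list are illustrative:

```kotlin
// Assumed QWEN stop strings: the same closing tag the prompt instructs the
// model to emit, so generation terminates at the end of the tool call.
val qwenStopStrings = listOf("</tool_call>", "<|im_end|>")

// Hypothetical strict tool prompt that requires the <tool_call> wrapper,
// matching the stop string above.
fun buildQwenStrictToolPrompt(userRequest: String, toolName: String): String =
    """
    You must call the tool "$toolName".
    Respond with exactly one tool call wrapped in tags, e.g.:
    <tool_call>{"name":"$toolName","args":{}}</tool_call>
    Request: $userRequest
    """.trimIndent()
```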
- Around line 815-841: The detectRequiredTool function is using String.contains
which matches substrings and causes false positives; change the checks to
whole-word matching by using regular-expression word boundaries (e.g.,
Regex("\\b(...|...)\\b") or individual Regex("\\bkeyword\\b")) instead of
contains for all keyword lists (timeKeywords, fileKeywords, listKeywords,
projectKeywords and single checks like "battery", "weather", "temperature",
"file tree", "project structure"); build combined Regexes for each category (or
compile them once as val patterns) and call
pattern.containsMatchIn(text.lowercase()) so only whole-word matches trigger
get_current_datetime, list_files, get_weather, get_device_battery, etc.
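The whole-word matching described above can be sketched with precompiled patterns; only the tool names come from the comment, the keyword sets are illustrative:

```kotlin
// Precompiled whole-word patterns mapped to tool names. \b boundaries stop
// substrings like "estimate" from matching the "time" keyword.
val toolPatterns: List<Pair<Regex, String>> = listOf(
    Regex("""\bbattery\b""") to "get_device_battery",
    Regex("""\b(weather|temperature)\b""") to "get_weather",
    Regex("""\b(time|date|datetime|now)\b""") to "get_current_datetime",
    Regex("""\b(list files|file tree|project structure)\b""") to "list_files",
)

fun detectRequiredTool(prompt: String): String? {
    val text = prompt.lowercase()
    return toolPatterns.firstOrNull { (pattern, _) -> pattern.containsMatchIn(text) }?.second
}
```

With this shape, a request like "estimate the cost" no longer routes to `get_current_datetime`, because "estimate" only contains "time" as a substring, not as a whole word.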
- Around line 363-370: The code calls addMessage(result, Sender.TOOL) then
updateLastMessage(...) and updateLastMessageDuration(...), but
updateLastMessage/updateLastMessageDuration operate on the last AGENT message;
because addMessage(Tool) was added first they target the TOOL message. Fix by
ensuring the AGENT message exists before updating it: move the addMessage("",
Sender.AGENT) call so it runs before addMessage(result, Sender.TOOL) (or
alternatively call updateLastMessage on the correct message by inserting an
explicit AGENT message first). Update references to addMessage,
updateLastMessage, updateLastMessageDuration and Sender.AGENT/TOLF to reflect
this ordering so logSessionSummary(sessionStartWall, sessionStartCpu) and
duration updating apply to the AGENT final response.

In `@app/src/main/java/com/itsaky/androidide/agent/repository/utils.kt`:
- Around line 29-32: The parseFunctionStyleToolCall path can throw an uncaught
exception when parseFunctionStyleToolCall encounters malformed JSON; wrap the
call to parseFunctionStyleToolCall(responseText) (the block that currently
returns validateToolCall(parsed, toolKeys)) in a try/catch, catch JSON/parse
exceptions, log or ignore the error as appropriate, and return null (or continue
to the next parsing path) instead of letting the exception propagate; ensure
validateToolCall(parsed, toolKeys) is only invoked when parsing succeeded.
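A hedged sketch of that defensive wrapper; the stand-in `parseFunctionStyleToolCall` below only illustrates a parser that throws on malformed input, it is not the project's implementation:

```kotlin
data class ToolCall(val name: String, val args: String)

// Stand-in parser: throws on malformed input, as the real parser may.
fun parseFunctionStyleToolCall(text: String): ToolCall {
    val open = text.indexOf('(')
    require(open > 0 && text.endsWith(")")) { "Malformed function-style call: $text" }
    return ToolCall(text.substring(0, open).trim(), text.substring(open + 1, text.length - 1))
}

// Defensive wrapper: malformed model output returns null instead of
// propagating, so the caller can fall through to the next parsing strategy.
fun parseToolCallSafely(responseText: String): ToolCall? =
    try {
        parseFunctionStyleToolCall(responseText)
    } catch (e: Exception) {
        null
    }
```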

In `@resources/src/main/res/values/strings.xml`:
- Around line 44-45: Remove the fragile leading space inside the string resource
agent_planner_tool_blocked_note and instead include the separator whitespace in
the format pattern of agent_planner_retry_instruction by changing the
concatenation from "%1$s%2$s" to "%1$s %2$s" so callers combining %1$s and %2$s
get a single predictable space; update the two resources
(agent_planner_tool_blocked_note and agent_planner_retry_instruction)
accordingly and run a quick string-format check to ensure no double-spaces occur
in existing usages.
🧹 Nitpick comments (7)
resources/src/main/res/values/strings.xml (1)

1074-1082: Hardcoded emoji in translatable strings may cause i18n and rendering issues.

Several strings (e.g., ai_setting_saved, ai_setting_no_model_is_currently_loaded, ai_api_key_not_found) embed emoji directly. This couples the icon to the translated text—translators on Crowdin would need to preserve them manually, and the ZWJ sequence 🤷‍♀️ on Line 1076 may not render correctly on older Android versions. Consider moving the emoji out of the string resources and prepending them in code, or using a separate non-translatable prefix.

app/src/main/java/com/itsaky/androidide/agent/repository/utils.kt (2)

129-144: Consider a map for tool-name aliases instead of inline if.

Line 137's single hardcoded alias works today, but if more aliases emerge this will grow unwieldy. A private val toolAliases = mapOf("list_dir" to "list_files") lookup would be cleaner.
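A sketch of that alias map; only `list_dir` → `list_files` comes from the code, the second entry is illustrative:

```kotlin
// Alias lookup replacing the inline if; unknown names pass through unchanged.
val toolAliases = mapOf(
    "list_dir" to "list_files",
    "ls" to "list_files", // hypothetical additional alias
)

fun canonicalToolName(raw: String): String = toolAliases[raw] ?: raw
```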


26-27: Debug logging of full response text — verify no sensitive content leaks.

log.debug will emit the entire LLM response at debug level. On-device this is low-risk, but if debug logs are ever shipped off-device (crash reporters, log aggregators), user prompts/responses could leak. Worth a quick sanity check on log routing.

app/src/main/java/com/itsaky/androidide/agent/repository/LocalLlmRepositoryImpl.kt (4)

487-505: Dead code: internal isFinalAnswerTurn check is unreachable.

buildGemma2Prompt is only called from buildPromptWithHistory (line 128) when the caller has already determined isFinalAnswerTurn == false. Since both checks use the same history reference and both inspect history.getOrNull(history.size - 2)?.sender == Sender.TOOL, the internal condition at line 488 can never be true. This dead branch adds confusion about which "final answer" definition governs the flow.

Consider removing lines 487–505 and keeping only the tool-selection prompt logic.


552-569: Hardcoded tool-name cleanup is incomplete and likely a no-op.

Line 555 only strips [Tool Result for get_current_datetime]: but not results from get_weather, get_device_battery, or list_files. Since addMessage(result, Sender.TOOL) on line 363 doesn't prepend any such prefix, this .replace(...) is currently dead code. Either generalize the cleanup (e.g., use a regex like Regex("\\[Tool Result for [a-zA-Z_]+]:")) or remove it entirely to avoid misleading future readers.
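The generalized cleanup could be a single compiled regex, as sketched here using the pattern suggested above:

```kotlin
// Strips any "[Tool Result for <name>]:" prefix (plus trailing whitespace)
// instead of matching a single hardcoded tool name.
val toolResultPrefix = Regex("""\[Tool Result for [a-zA-Z_]+]:\s*""")

fun stripToolResultPrefix(text: String): String = toolResultPrefix.replace(text, "")
```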


255-270: Duplicated step-end logging blocks.

The step-end metric-gathering and logging at lines 255–270 and 328–343 are nearly identical. Consider extracting to a small helper (e.g., logStepEnd(currentTurn, totalDurationMs, modelResponse, lastAttempt)) to reduce duplication and keep runAgentLoop more readable.

Also applies to: 328-343
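The suggested helper could be sketched as below, with an assumed parameter list and log format; returning the formatted line keeps it unit-testable while the caller passes it to the logger:

```kotlin
// Hypothetical extracted step-end formatter shared by both logging sites.
fun formatStepEnd(
    currentTurn: Int,
    totalDurationMs: Long,
    responsePreview: String,
    lastAttempt: Boolean,
): String {
    // Truncate and flatten the model response so log lines stay single-line.
    val preview = responsePreview.take(80).replace('\n', ' ')
    return "step=$currentTurn durationMs=$totalDurationMs lastAttempt=$lastAttempt response=\"$preview\""
}
```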


768-782: Remove unused private method parseToolCall.

This method at lines 768–782 is never called within the class. All tool-call parsing goes through Util.parseToolCall (lines 283, 325).

@jatezzz jatezzz merged commit 9b963c1 into stage Feb 11, 2026
2 checks passed
@jatezzz jatezzz deleted the feat/local-llm-work branch February 11, 2026 13:21