Core
| Feature | Status |
|---|---|
| LLM inference (llama.cpp via FFI) | ✅ Done |
| Default tools: shell, read, write, glob, websearch | ✅ Done |
| MCP server support | ✅ Done |
| REST API | 🚧 Planned |
Orchestrator
A multi-agent orchestration layer allowing multiple specialized workers to collaborate on complex tasks.

| Feature | Status |
|---|---|
| Model routing between workers | 🚧 Planned |
| Per-worker system prompts | 🚧 Planned |
| Per-worker built-in tool sets | 🚧 Planned |
Voice Mode
Fully local speech interaction using embedded inference for all stages.

| Feature | Status |
|---|---|
| Voice Activity Detection (VAD) inference | 🚧 Planned |
| Noise Cancellation inference | 🚧 Planned |
| Speech-to-Text (STT) inference | 🚧 Planned |
| Text-to-Speech (TTS) inference | 🚧 Planned |
Multimodal
| Feature | Status |
|---|---|
| Image generation model inference | ✅ Done |
| Video generation model inference | 🚧 Planned |
Contributing
Want to help build these features? Contributions are welcome.

- Open a pull request to contribute code
- Open an issue to report bugs or propose ideas