Two backends, one tool surface
The point isn't to chat with the model — it's to give it real verbs inside the actual Emacs session. Both backends share one tool surface, switchable in any gptel buffer with C-u C-c RET:
- Local (default) — llama.cpp serving gpt-oss-120b over an OpenAI-compatible HTTP endpoint. Built from source against the Vulkan backend for the AMD W7900; see the Talos page for the build loop.
- Remote — Anthropic API (Opus 4.7, Sonnet 4.6, Haiku 4.5). The key is read from auth-source for api.anthropic.com; never hard-coded in the elisp. Both backend definitions are sketched below.
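For concreteness, here is a minimal sketch of the two backend definitions using gptel's backend constructors. The host, port, and model symbols are assumptions for illustration, not the exact values in use here:

```elisp
(require 'gptel)

;; Local silo: llama.cpp's OpenAI-compatible server (host/port assumed).
(defvar my/gptel-local
  (gptel-make-openai "llama.cpp"
    :host "localhost:8080"
    :protocol "http"
    :stream t
    :key "none"                     ; llama.cpp does not check the key
    :models '(gpt-oss-120b)))

;; Workspace silo: Anthropic API, key pulled from auth-source at call time.
(defvar my/gptel-remote
  (gptel-make-anthropic "Claude"
    :stream t
    :key (lambda ()
           (auth-source-pick-first-password :host "api.anthropic.com"))
    :models '(claude-sonnet-4-5)))  ; illustrative model symbol

;; Default to the local backend; C-u C-c RET (gptel-menu) switches per buffer.
(setq gptel-backend my/gptel-local
      gptel-model 'gpt-oss-120b)
```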
The split maps onto the silo boundary used by the corpus pipeline (see the RAG page): the local backend is the personal silo, the remote backend is the workspace silo. Same Emacs, two different blast radii — and the toolkit is identical so the model behaves consistently across both.
Tools
| Category | Tool | What it does |
|---|---|---|
| web | search_web | DuckDuckGo HTML search, results rendered through shr to plain text. |
| | read_url | Fetch a URL and return shr-rendered text. |
| math | calc_eval | Emacs Calc — arithmetic, big-int, units. |
| | r_eval | Rscript -e with output captured. |
| | maxima_eval | Maxima CAS — algebra, calculus, equation solving. |
| | python_eval | Sandboxed venv with numpy, scipy, pandas, sympy; code written to a tempfile and run as a script. |
| system | safe_shell | Read-only shell, allowlist + blocklist gated. See below. |
| filesystem | search_files | rg --no-heading --line-number --max-count=50 --max-columns=200 over a given directory. |
| | find_files | find ... -name ... -type f, capped at 50 results. |
| emacs | read_buffer | Read an open buffer by name. |
| | list_buffers | List visible buffers (skips internal " " buffers). |
| reference | man_page | man -f plus the first 80 lines of man at COLUMNS=80. |
| util | current_datetime | Wall-clock day, date, and timezone. |
On top of these, llm-tool-collection appends its own grep / bash / filesystem / buffer tools so the model has a deep enough toolbox to finish tasks instead of stopping mid-thought to ask.
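As a sketch of how one row in the table is wired up through gptel's tool API (the argument names, descriptions, and error message here are illustrative, not the exact definitions in use):

```elisp
(gptel-make-tool
 :name "read_buffer"
 :category "emacs"
 :description "Return the contents of an open Emacs buffer."
 :args '((:name "buffer"
          :type string
          :description "Name of the buffer to read."))
 :function (lambda (buffer)
             (let ((buf (get-buffer buffer)))
               (if (not buf)
                   (format "No buffer named %s" buffer)
                 (with-current-buffer buf
                   ;; First 8000 characters, per the output discipline below.
                   (buffer-substring-no-properties
                    (point-min)
                    (min (point-max) (+ (point-min) 8000))))))))
```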
Sandboxed Python: python_eval
The Python tool runs against a dedicated venv under ~/.local/share/, isolated from system Python and Guix Python. Packages are numpy, scipy, pandas, sympy — no matplotlib (pointless in a CLI workstation; for plots, use R or ASCII).
Each call writes the model's code to a tempfile, runs python3 tempfile.py 2>&1 | head -500 from /tmp, and returns the captured output. The venv path, the line cap, and the working directory are baked in elisp — the model controls neither which Python it runs against nor where stdout goes.
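A rough sketch of that mechanism, with a hypothetical venv path and function name standing in for the real ones:

```elisp
(defun my/python-eval (code)
  "Write CODE to a tempfile and run it under the dedicated venv.
Returns at most 500 lines of combined stdout/stderr."
  (let ((tmp (make-temp-file "llm-python-" nil ".py" code))
        (python "~/.local/share/llm-venv/bin/python3")  ; assumed path
        (default-directory "/tmp"))                     ; fixed working dir
    (unwind-protect
        (shell-command-to-string
         (format "%s %s 2>&1 | head -500"
                 (expand-file-name python)
                 (shell-quote-argument tmp)))
      (delete-file tmp))))
```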
SymPy is preferred for chained symbolic → numerical workflows (everything in one Python script). Maxima stays installed for quick one-liner symbolic queries where it's more concise. The two are complementary, not redundant.
Output discipline
Every tool that returns external data caps its output. This isn't politeness; it's flow control. An LLM that gets the entire journalctl output back will choke on its own context.
- Web pages and buffer reads — first 8000 characters.
- Shell, Python, ripgrep — first 500 lines.
- Man pages — first 80 lines at 80 columns.
- find / rg match counts — 50 hits.
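As a sketch of the truncation idea (helper names are hypothetical; in practice each tool applies its own cap inline):

```elisp
(require 'subr-x)
(require 'seq)

(defun my/cap-chars (s n)
  "Return at most the first N characters of string S."
  (substring s 0 (min n (length s))))

(defun my/cap-lines (s n)
  "Return at most the first N lines of string S."
  (string-join (seq-take (split-string s "\n") n) "\n"))

;; Examples of the caps listed above:
;;   (my/cap-chars page-text 8000)   ; web pages, buffer reads
;;   (my/cap-lines shell-output 500) ; shell, Python, ripgrep
```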
The shell allowlist
safe_shell is the most powerful tool in the surface and the most dangerous. Three gates run in order on the trimmed command:
- Blocklist — if the command contains any blocked substring, refuse.
- Allowlist — the command must equal or start with a permitted prefix.
- Run — cd ~ && <command> 2>&1 | head -500.
Allowlist (read-only / diagnostic):
ls find cat head tail wc du df free uptime uname whoami hostname date file stat diff sort uniq tr cut awk sed grep rg git log git status git diff git show git branch git remote git tag git rev-parse systemctl status systemctl is-active systemctl list-units journalctl ip ss ping dig curl spack find spack info spack list zpool status zpool list zfs list zfs get guix package guix system dpkg -l dpkg -L dpkg -s apt list apt show env printenv id groups lsblk lscpu lspci top -bn1 ps pgrep nproc getconf
Blocklist (substring match):
rm rmdir mv cp dd mkfs chmod chown kill killall reboot shutdown poweroff halt mount umount fdisk parted wipefs iptables nft useradd userdel passwd > >> |rm ; rm && rm sudo su eval exec
Allowlist over denylist by design: an allowlist fails as "rejected something legitimate," a denylist fails as "ran something it shouldn't have." The blocklist sits in front of the allowlist to catch shell-escape patterns (>, >>, ; rm, sudo) hiding inside an otherwise-allowed prefix.
The allowlist / blocklist pair is the load-bearing safety boundary for the entire local-LLM workflow. Changes there warrant care.
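A minimal sketch of the three gates, with excerpted lists and hypothetical names; the full allowlist and blocklist are the ones quoted above:

```elisp
(require 'seq)
(require 'subr-x)

;; Excerpts only, for illustration.
(defvar my/shell-blocklist '("rm" "sudo" ">" ">>" "; rm" "&& rm"))
(defvar my/shell-allowlist '("ls" "cat" "git status" "journalctl" "zpool status"))

(defun my/safe-shell (command)
  "Run COMMAND read-only: blocklist first, then allowlist, then execute."
  (let ((cmd (string-trim command)))
    (cond
     ;; Gate 1: refuse if any blocked substring appears anywhere in the command.
     ((seq-some (lambda (bad) (string-match-p (regexp-quote bad) cmd))
                my/shell-blocklist)
      "Refused: blocked pattern.")
     ;; Gate 2: the command must equal, or start with, an allowed prefix.
     ((not (seq-some (lambda (ok)
                       (or (string= cmd ok)
                           (string-prefix-p (concat ok " ") cmd)))
                     my/shell-allowlist))
      "Refused: not on the allowlist.")
     ;; Gate 3: run from $HOME, merge stderr, cap output at 500 lines.
     (t (shell-command-to-string
         (format "cd ~ && %s 2>&1 | head -500" cmd))))))
```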
How the directives are written
The system prompt for both backends includes the line: "It is so nice to meet you in Emacs! Please be concise and correct. If information is beyond your cutoff date, please do a web search. My notes are in ~/notes. My native tongue is org-mode not markdown. Thank you so much! Reasoning: high"
Two choices worth flagging. First, "org-mode not markdown" — the model defaults to markdown formatting, which renders as visual noise inside an Emacs buffer. Asking explicitly for org keeps headings, tables, and code blocks legible at the point of consumption. Second, "Reasoning: high" — for the local backend (gpt-oss), this nudges the model toward fuller chain-of-thought before answering. Anthropic models ignore it harmlessly.
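A sketch of how that directive can be installed as gptel's system message, assuming the stock default entry in gptel-directives is the one being replaced:

```elisp
;; Install the directive as gptel's default system message.
(setf (alist-get 'default gptel-directives)
      (concat "It is so nice to meet you in Emacs! Please be concise and correct. "
              "If information is beyond your cutoff date, please do a web search. "
              "My notes are in ~/notes. My native tongue is org-mode not markdown. "
              "Thank you so much! Reasoning: high"))
```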