Blog
BeeLlama.cpp on a Mac Mini M4 16GB: DFlash and TurboQuant for Local LLMs (May 2026)
Anbeeld's BeeLlama.cpp ships DFlash speculative decoding, TurboQuant/TCQ KV-cache compression, and adaptive draft control. Honest take on which parts pay off on 16GB Apple Silicon.
DFlash and DDTree: 8x Faster LLM Inference via Block Diffusion and Draft Trees (Apr 2026)
Two 2026 papers whose gains compound. DFlash swaps the autoregressive drafter in speculative decoding for a block-diffusion model; DDTree reuses DFlash's per-position distributions as a verified draft tree. Lossless. Up to 8.22x on Qwen3-Coder-30B.
Running Qwen 3.6 Locally on a Mac Mini M4 with 16GB RAM (Apr 2026)
A practical guide to running Qwen 3.6-35B-A3B locally on a $599 Mac Mini, with real benchmarks and setup for llama.cpp, Ollama, LM Studio, and MLX.
The $20K Bug That Changed How We Think About Evals (Mar 2026)
How a $20K bug in SWE-Bench Pro revealed that benchmarking setup fundamentally changes what you measure about AI models.
Prompt Injection Attacks on Agentic Coding Assistants (Jan 2026)
Systematic analysis of prompt injection vulnerabilities in skills, tools, and protocol ecosystems of agentic coding assistants.
Breaking the Protocol: MCP Security Analysis and Prompt Injection in Tool-Integrated LLM Agents (Jan 2026)
Security analysis of the Model Context Protocol specification and prompt injection vulnerabilities in tool-integrated LLM agents.
Zencoder Leads SWE-bench Verified with 70% Success Rate (May 2025)
Zencoder achieved a 70% success rate on SWE-bench Verified using parallel agent execution and critic-based solution selection.
Investigating LLM-as-a-Judge Vulnerability to Prompt Injection (May 2025)
Security analysis of LLM-based evaluation systems and their susceptibility to adversarial manipulation.
Adversarial Attacks on LLM-as-a-Judge Systems (Apr 2025)
Comprehensive study of prompt injection techniques targeting automated LLM evaluation pipelines.
Neurosurgical Instrument Segmentation (Aug 2024)
Computer vision for tracking surgical instruments to assess microsurgical skill.
Prompt Injection Attacks in Defended Systems (Jun 2024)
Evaluating the effectiveness of prompt injection attacks against LLMs with defensive mechanisms.
Trojan Detection in Large Language Models (Apr 2024)
Insights from the Trojan Detection Challenge on identifying backdoors in language models.
Low-Resource Language Text Classification (2023)
Fine-tuning multilingual pretrained models for African language sentiment analysis.
Blind Face Restoration Survey (2023)
Comprehensive survey of deep learning methods for restoring degraded face images.
DIALOG-22 RuATD: Generated Text Detection (2022)
Detecting AI-generated Russian-language text using machine learning classifiers.
Noninvasive Glioma Grading with Deep Learning (2022)
Deep learning approach for non-invasive brain tumor classification from MRI scans.
MR-guided Non-invasive Brain Glioma Typing (2022)
Machine learning for MRI-based glioma classification in clinical neurosurgery settings.
Synthesis of L-coordinate Parallel Mechanism (2020)
Design of singularity-free parallel robotic mechanisms for precise positioning.