LINUS-AI is a self-hosted inference engine that runs powerful language models entirely on your hardware. One binary, no cloud dependencies, no telemetry, no API keys. Your data never leaves your machine.
LINUS-AI packs enterprise-grade inference capabilities into a single binary. No runtime, no containers, no external services.
Your prompts, responses, and embeddings never leave your hardware. No telemetry, no model training on your data, no remote logging.
One self-contained executable. No Python, no Node, no Docker required. Just download, chmod +x, and run on macOS, Linux, or Windows.
Split 70B+ models across up to 8 GPUs automatically. Shard weight tensors horizontally with NVLink-optimized AllReduce synchronization.
Distribute inference across machines on your private network. Peer auto-discovery via mDNS. Encrypted transport. No Kubernetes needed.
AES-256-GCM encrypted storage for conversation history, embeddings, and sensitive context. The vault key never leaves your machine.
Drop-in replacement for OpenAI's API. Works with LangChain, LlamaIndex, Open WebUI, Continue.dev, and any OpenAI SDK without code changes.
Auto backend selection: CUDA (NVIDIA), Metal (Apple Silicon), ROCm (AMD), and optimized CPU with AVX2 and AVX-512 acceleration.
Llama 3.3, Mistral 3, Phi-4, Qwen 2.5, Gemma 3, DeepSeek-R1, and more. GGUF quantization Q2_K through F16. Custom model support.
Spread transformer layers across multiple machines. Micro-batching hides inter-stage latency for near-linear throughput scaling.
Built-in tool calling, function execution, RAG pipeline, and multi-step reasoning. Run autonomous agents entirely on your hardware.
Serve multiple concurrent users efficiently with dynamic batching and priority queue scheduling for production deployments.
Built-in /metrics endpoint with token throughput, latency percentiles, GPU utilization, and queue depth. Grafana dashboards included.
HIPAA, GDPR, SOX, PCI-DSS, FINRA, EEOC, FERPA compliance tiers. HMAC-chained tamper-evident audit logs. Automatic PII scanning with blocking and redaction. Prompt injection detection.
Document-level access control enforced per user, department, division, company, and clearance level. PUBLIC to TOP_SECRET classification. Full tamper-evident RAG access audit trail.
Whether you're a solo developer or a security-conscious enterprise, LINUS-AI fits your deployment model.
HIPAA-compliant AI for clinical notes, research analysis, and internal documentation โ with zero PHI exposure risk.
Analyze contracts, draft documents, and research case law without sending confidential client data to third-party AI providers.
Run AI on trading data, internal reports, and customer analytics in an environment satisfying SOC 2 and regulatory requirements.
Air-gapped deployments on classified networks. No internet dependency after activation. Supports FIPS-adjacent encryption configs.
Run a local coding assistant via Continue.dev or Cursor on your MacBook Pro or Linux workstation. Instant responses, no API costs.
Deploy a company-wide ChatGPT alternative on your servers. Connect to your internal knowledge base. No data leaves your VPC.
vs. cloud AI APIs and other self-hosted solutions.
90-day access from $33 โ pay once or subscribe. Buy once, own forever โ updates are optional, not forced. Annual plans include continuous updates.
LINUS-AI is source-available, not open source. Licensed under LINUS-AI Source License v2.0. Free for personal use & orgs under $100K/yr revenue. Commercial use requires a paid tier.
| Feature | Community | Professional | Team | Enterprise | Enterprise Plus |
|---|---|---|---|---|---|
| Max model size | 5B | 70B+ | 70B+ | Unlimited | Unlimited |
| Industry AI profiles | 6 | 14 | 14 | Custom | Custom |
| Custom system prompts | โ | โ | โ | โ | โ |
| Tensor Parallelism | โ | โ | โ | โ | โ |
| Pipeline Parallelism | โ | โ | โ | โ | โ |
| Mesh Networking | โ | โ | โ | โ | โ |
| Federated Learning | โ | โ | โ | โ | โ |
| Seats | 1 | 1 | 5 | Unlimited | Unlimited |
| Air-gap activation | โ | โ | โ | โ | โ |
| Blockchain audit log | โ | โ | โ | โ | โ |
| SSO / LDAP | โ | โ | โ | โ | โ |
| HIPAA / SOC 2 BAA | โ | โ | โ | โ | โ |
| OEM / White-label | โ | โ | โ | Up to 3 products | Unlimited |
| Support | Community | Priority email | Email + 99% SLA | 24h SLA + named AM | |
| Price | Free | $499 + $99/yr | $1,499 + $199/yr | $7,999/yr | $14,999/yr |
Actively maintained and regularly updated. Full changelog on GitHub.
NEXUM multi-node orchestration, mDNS peer discovery, live thermal throttle rerouting, distributed audit ledger, encrypted vault, Tauri 2.0 packaging, Llama 3.3 + Phi-4 support.
Built-in tool calling, multi-step reasoning agent, RAG with local vector store, full /v1/chat/completions compatibility, SSE streaming. Works with LangChain, LlamaIndex, and Open WebUI.
Native Metal acceleration for Apple Silicon (M1โM4), ROCm backend for AMD GPUs, auto-backend detection, and improved Windows support.
Single binary, CPU + CUDA inference, GGUF model support, basic REST API, CLI chat, license activation system.
Everything you need to deploy, configure, and extend LINUS-AI. From single-node chat to multi-GPU distributed inference.
Install, configure, and chat. Start here if you're new to LINUS-AI. Covers setup, model loading, chat modes, and the CLI reference.
Deploy, manage, and secure LINUS-AI in production. Covers multi-user deployments, access control, monitoring, and updates.
APIs, integrations, and extensions. Build on top of LINUS-AI with the REST API, WebSocket streaming, and plugin system.
Distributed topologies, tensor parallelism, pipeline parallelism, and mesh networking for large-scale private AI.
Quickstart, platform binaries, modes, API reference, CLI reference, compliance overview, and licensing โ the complete README.
Architecture overview, module reference, security model, mesh protocol, inference pipeline, tensor parallelism, compliance layer, and changelog.
Complete REST API reference: all endpoints, request/response formats, and examples for inference, models, compliance, RAG, mesh, and billing.
Interactive flow diagrams for every subsystem: inference pipeline, payment flow, mesh networking, shell handler, blockchain ledger, and build system.
Download the free Community edition and have private AI running on your machine in under 5 minutes. No account required. No credit card. No telemetry.
Available for macOS, Linux, and Windows ยท Source-available ยท LINUS-AI License v2.0