v4.0.2 — NEXUM Multi-Node Mesh & Thermal Routing

Private AI.
Your Hardware.
Zero Compromise.

LINUS-AI is a self-hosted inference engine that runs powerful language models entirely on your hardware. One binary, no cloud dependencies, no telemetry, no API keys. Your data never leaves your machine.

macOS · Linux · Windows ⚡ Single Binary 🔒 100% Offline 🦙 Llama · Mistral · Phi · Qwen 🎛 CUDA · Metal · ROCm · CPU
linus-ai — quick start
$ linus-ai --activate LNAI-XXXX-XXXX-XXXX-XXXX
✓ License activated · Professional · 1/1 seats

$ linus-ai --pull-model llama3.2 && linus-ai --serve
Pulling llama3.2 (4.1 GB) ████████████████ 100%
✓ API server running → http://localhost:9480
✓ OpenAI-compatible · Zero telemetry · Fully offline
50+ Supported Models
8 Max GPUs (Tensor Parallel)
0 External API Calls
1 Binary, No Dependencies
Source-Available (LINUS-AI License v2.0)

Everything you need for private AI

LINUS-AI packs enterprise-grade inference capabilities into a single binary. No runtime, no containers, no external services.

๐Ÿ”’

100% Private by Design

Your prompts, responses, and embeddings never leave your hardware. No telemetry, no model training on your data, no remote logging.

โšก

Single Binary

One self-contained executable. No Python, no Node, no Docker required. Just download, chmod +x, and run on macOS, Linux, or Windows.

โˆ‘

Tensor Parallelism

Split 70B+ models across up to 8 GPUs automatically. Shard weight tensors horizontally with NVLink-optimized AllReduce synchronization.
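For intuition, column-sharding with an AllReduce can be sketched in a few lines of plain Python. This is a toy illustration of the general technique on a matrix-vector product, not LINUS-AI's internals:

```python
def matvec(W, x):
    """Reference dense matrix-vector product."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]

def shard(seq, n):
    """Split a sequence into n contiguous chunks."""
    k = (len(seq) + n - 1) // n
    return [list(seq[i * k:(i + 1) * k]) for i in range(n)]

def tensor_parallel_matvec(W, x, n_devices=2):
    """Column-shard W across 'devices', then sum the partial outputs."""
    cols = list(zip(*W))                 # columns of W
    col_shards = shard(cols, n_devices)  # each device holds a slice of columns
    x_shards = shard(x, n_devices)       # ...and the matching slice of x
    # Each device computes a partial result over the full output dimension.
    partials = []
    for w_cols, x_part in zip(col_shards, x_shards):
        sub_rows = [list(r) for r in zip(*w_cols)]  # shard back to row-major
        partials.append(matvec(sub_rows, x_part))
    # AllReduce step: elementwise sum of the partial outputs.
    return [sum(vals) for vals in zip(*partials)]
```

Real implementations shard multi-dimensional weight tensors on-device and overlap the AllReduce with compute, but the arithmetic is exactly this.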

๐Ÿ•ธ

Mesh Networking

Distribute inference across machines on your private network. Peer auto-discovery via mDNS. Encrypted transport. No Kubernetes needed.

๐Ÿ›ก

Encrypted Vault

AES-256-GCM encrypted storage for conversation history, embeddings, and sensitive context. The vault key never leaves your machine.

๐Ÿ”Œ

OpenAI-Compatible API

Drop-in replacement for OpenAI's API. Works with LangChain, LlamaIndex, Open WebUI, Continue.dev, and any OpenAI SDK without code changes.
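Because the server speaks the OpenAI wire format, a stdlib-only client is enough. A minimal sketch (the `build_chat_request` helper is illustrative; the port and model name follow the quick start above):

```python
import json
from urllib import request

def build_chat_request(prompt: str, model: str = "llama3.2") -> dict:
    """Assemble an OpenAI-style /v1/chat/completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt: str, base_url: str = "http://localhost:9480/v1") -> str:
    """POST a chat completion to the local server and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode()
    req = request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:  # no API key needed for a local server
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

The official OpenAI SDKs work the same way: point `base_url` at `http://localhost:9480/v1` and pass a placeholder API key (none is checked locally).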

๐ŸŽ›

Multi-Backend Inference

Auto backend selection: CUDA (NVIDIA), Metal (Apple Silicon), ROCm (AMD), and optimized CPU with AVX2 and AVX-512 acceleration.

๐Ÿค–

50+ Models Supported

Llama 3.3, Mistral 3, Phi-4, Qwen 2.5, Gemma 3, DeepSeek-R1, and more. GGUF quantization Q2_K through F16. Custom model support.

๐Ÿ”—

Pipeline Parallelism

Spread transformer layers across multiple machines. Micro-batching hides inter-stage latency for near-linear throughput scaling.

๐Ÿ“ฆ

Agentic Mode

Built-in tool calling, function execution, RAG pipeline, and multi-step reasoning. Run autonomous agents entirely on your hardware.
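Tool definitions follow the OpenAI function-calling schema: you send the schema with the chat request, the model returns a tool call, and your code executes it locally. A minimal sketch, where the `get_time` tool and the dispatch registry are hypothetical examples rather than built-in LINUS-AI tools:

```python
import json
from datetime import datetime, timezone

# Tool schema in the OpenAI function-calling format, sent alongside the
# chat request. `get_time` is a made-up example tool.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_time",
        "description": "Return the current UTC time as an ISO 8601 string.",
        "parameters": {"type": "object", "properties": {}},
    },
}]

# Local registry mapping tool names to plain Python callables.
REGISTRY = {
    "get_time": lambda: datetime.now(timezone.utc).isoformat(),
}

def dispatch_tool_call(name: str, arguments: str) -> str:
    """Execute a model-requested tool call and return its result as text."""
    if name not in REGISTRY:
        return json.dumps({"error": f"unknown tool {name!r}"})
    # A fuller agent loop would parse `arguments` (a JSON string) into
    # keyword arguments and feed the result back as a `tool` message.
    return REGISTRY[name]()
```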

๐Ÿ”„

Continuous Batching

Serve multiple concurrent users efficiently with dynamic batching and priority queue scheduling for production deployments.

๐Ÿ“Š

Prometheus Metrics

Built-in /metrics endpoint with token throughput, latency percentiles, GPU utilization, and queue depth. Grafana dashboards included.
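The /metrics endpoint serves the standard Prometheus text exposition format, so any scraper, or a few lines of Python, can read it. A sketch with a sample payload (the metric names shown are illustrative, not LINUS-AI's actual names):

```python
# Parse the Prometheus text exposition format: one "name value" pair per
# line, comments prefixed with '#'.

def parse_metrics(text: str) -> dict:
    """Return a {metric_name: float_value} dict from exposition text."""
    out = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        try:
            out[name] = float(value)
        except ValueError:
            pass  # skip malformed lines
    return out

SAMPLE = """\
# HELP tokens_per_second Current generation throughput
# TYPE tokens_per_second gauge
tokens_per_second 42.5
queue_depth 3
"""
```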

โš–๏ธ

Compliance & Audit

HIPAA, GDPR, SOX, PCI-DSS, FINRA, EEOC, FERPA compliance tiers. HMAC-chained tamper-evident audit logs. Automatic PII scanning with blocking and redaction. Prompt injection detection.

๐Ÿ—‚

RAG Access Control

Document-level access control enforced per user, department, division, company, and clearance level. PUBLIC to TOP_SECRET classification. Full tamper-evident RAG access audit trail.

Built for teams that can't afford data exposure

Whether you're a solo developer or a security-conscious enterprise, LINUS-AI fits your deployment model.

๐Ÿฅ

Healthcare & Life Sciences

HIPAA-compliant AI for clinical notes, research analysis, and internal documentation — with zero PHI exposure risk.

โš–๏ธ

Legal & Compliance

Analyze contracts, draft documents, and research case law without sending confidential client data to third-party AI providers.

๐Ÿฆ

Financial Services

Run AI on trading data, internal reports, and customer analytics in an environment satisfying SOC 2 and regulatory requirements.

๐Ÿ›ก

Defense & Government

Air-gapped deployments on classified networks. No internet dependency after activation. Supports FIPS-adjacent encryption configs.

๐Ÿ’ป

Developer Workstation

Run a local coding assistant via Continue.dev or Cursor on your MacBook Pro or Linux workstation. Instant responses, no API costs.

๐Ÿข

Enterprise Private Chat

Deploy a company-wide ChatGPT alternative on your servers. Connect to your internal knowledge base. No data leaves your VPC.

How LINUS-AI compares

vs. cloud AI APIs and other self-hosted solutions.

โ˜๏ธ Cloud AI APIs

PrivacyNone
Data Retention30+ days
Offline UseImpossible
Per-Token Cost$0.002โ€“0.06
LatencyVariable
Setup TimeMinutes
DependenciesSDK only

โ˜… LINUS-AI

PrivacyTotal
Data RetentionYou control
Offline UseFull support
Per-Token Cost$0
LatencyLocal speed
Setup Time60 seconds
DependenciesNone

๐Ÿณ Other Self-Hosted

PrivacyGood
Data RetentionYou control
Offline UseYes
Per-Token Cost$0
LatencyLocal speed
Setup TimeHours (Docker)
DependenciesPython, Docker

Try 90 days.
Scale when you need to.

Start with 90-day access from $33, or buy a perpetual licence once and own it forever; updates are optional, never forced. Annual plans include continuous updates.

LINUS-AI is source-available, not open source. Licensed under LINUS-AI Source License v2.0. Free for personal use & orgs under $100K/yr revenue. Commercial use requires a paid tier.

Community
$0
Free forever
  • Single node inference
  • 5B model limit
  • 6 core AI profiles
  • 90-day access available from $33
  • Local OpenAI-compatible API
  • Browser control panel GUI
  Not included: Tensor Parallelism, Pipeline Parallelism, Mesh Networking, 70B+ models, all 14 profiles, paid support
Download
Team
$1,499
one-time · perpetual + $199/year optional updates
  • Everything in Professional
  • Up to 5 seats
  • Centralised licence management
  • Priority email support (<24h)
  • Licence transfer between machines
  • Federated Learning support
  • Blockchain audit ledger
  • Multi-node tensor clusters
  • Custom profile deployment
  • 1 year of updates included (licence is perpetual)
  Not included: Air-gap activation, SLA
Enterprise
$7,999
per year ยท unlimited seats
  • Everything in Team
  • Unlimited seats
  • Air-gap / offline activation
  • Custom SLA (99.9% uptime)
  • Custom model fine-tuning
  • SSO / LDAP integration
  • HIPAA / SOC 2 BAA available
  • OEM rights (up to 3 products)
  • Email support + 99% uptime SLA
  • Invoice / PO available
Need an invoice? Contact Sales โ†’
OEM ยท White-label
Enterprise Plus
$14,999
per year ยท unlimited OEM
  • Everything in Enterprise
  • Unlimited OEM products
  • Full white-label / brand removal
  • 24h SLA for critical issues
  • Named account manager
  • Annual architecture review
  • Source code escrow available
  • Custom build flags on request
Prefer to talk first? โ†’
All paid plans include no per-token costs, full offline operation after activation, and 12 months of updates from purchase. Perpetual licences are buy once, own forever; updates are optional, not forced.

Feature comparison

| Feature | Community | Professional | Team | Enterprise | Enterprise Plus |
| --- | --- | --- | --- | --- | --- |
| Max model size | 5B | 70B+ | 70B+ | Unlimited | Unlimited |
| Industry AI profiles | 6 | 14 | 14 | Custom | Custom |
| Custom system prompts | — | ✓ | ✓ | ✓ | ✓ |
| Tensor Parallelism | — | ✓ | ✓ | ✓ | ✓ |
| Pipeline Parallelism | — | ✓ | ✓ | ✓ | ✓ |
| Mesh Networking | — | ✓ | ✓ | ✓ | ✓ |
| Federated Learning | — | — | ✓ | ✓ | ✓ |
| Seats | 1 | 1 | 5 | Unlimited | Unlimited |
| Air-gap activation | — | — | — | ✓ | ✓ |
| Blockchain audit log | — | — | ✓ | ✓ | ✓ |
| SSO / LDAP | — | — | — | ✓ | ✓ |
| HIPAA / SOC 2 BAA | — | — | — | ✓ | ✓ |
| OEM / White-label | — | — | — | Up to 3 products | Unlimited |
| Support | Community | Email | Priority email | Email + 99% SLA | 24h SLA + named AM |
| Price | Free | $499 + $99/yr | $1,499 + $199/yr | $7,999/yr | $14,999/yr |

Recent Releases

Actively maintained and regularly updated. Full changelog on GitHub.

v4.0.0

NEXUM Platform, Multi-Node Mesh & Thermal Routing

NEXUM multi-node orchestration, mDNS peer discovery, live thermal throttle rerouting, distributed audit ledger, encrypted vault, Tauri 2.0 packaging, Llama 3.3 + Phi-4 support.

v3.0.0

Agentic Mode, RAG Pipeline & OpenAI Compatibility

Built-in tool calling, multi-step reasoning agent, RAG with local vector store, full /v1/chat/completions compatibility, SSE streaming. Works with LangChain, LlamaIndex, and Open WebUI.

v2.0.0

Apple Metal & AMD ROCm Support

Native Metal acceleration for Apple Silicon (M1–M4), ROCm backend for AMD GPUs, auto-backend detection, and improved Windows support.

v1.0.0

Initial Release

Single binary, CPU + CUDA inference, GGUF model support, basic REST API, CLI chat, license activation system.

Frequently Asked Questions

Is LINUS-AI truly private? Where does my data go?
Yes — completely private. All inference happens on your local hardware. We never receive your prompts, model outputs, or conversation history. License activation (one-time) sends only your license key and machine fingerprint. After that, the software operates fully offline with zero outbound connections.
What models are supported? Can I use my own fine-tuned model?
LINUS-AI supports any GGUF-format model: Llama 3.x, Mistral 3, Phi-4, Qwen 2.5, Gemma 3, DeepSeek-R1, Falcon, StarCoder, and many more. You can load custom fine-tuned models by pointing to the GGUF file in your config.
Do I need a GPU? What hardware do I need?
No GPU required. LINUS-AI runs on any CPU with AVX2 support (most x86 CPUs since 2013). A modern laptop with 16 GB RAM can comfortably run 7B–13B parameter models. GPUs dramatically improve performance: a 24 GB card such as an RTX 3090 runs models up to roughly 34B in Q4 quantization at 20–30 tokens/second, while 70B models need multiple GPUs via tensor parallelism or heavy CPU offload. Apple Silicon is particularly well supported via Metal.
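A rough way to size hardware yourself: multiply the parameter count by the quantization's bits per weight, then add overhead for the KV cache and activations. The bits-per-weight figures below are approximate community numbers for GGUF quantization levels, and the 1.2x overhead factor is a ballpark assumption:

```python
# Approximate bits per weight for common GGUF quantization levels.
# These are rough community figures, not exact on-disk sizes.
BITS_PER_WEIGHT = {"Q2_K": 2.6, "Q4_K_M": 4.8, "Q8_0": 8.5, "F16": 16.0}

def model_memory_gb(params_billion: float, quant: str = "Q4_K_M",
                    overhead: float = 1.2) -> float:
    """Estimate resident memory in GB: quantized weights plus a ballpark
    overhead factor for KV cache and activations."""
    weight_bytes = params_billion * 1e9 * BITS_PER_WEIGHT[quant] / 8
    return round(weight_bytes * overhead / 1e9, 1)
```

For example, `model_memory_gb(7)` lands around 5 GB, which is why a 16 GB laptop handles 7B models comfortably.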
How does it compare to Ollama, LM Studio, or llama.cpp?
LINUS-AI goes further in the enterprise direction: tensor parallelism, mesh networking, encrypted vault, access control, and commercial SLAs. If you just need a simple local chatbot, Ollama is simpler. If you need production-grade private AI infrastructure, LINUS-AI is built for that.
Is the source code open?
LINUS-AI is source-available under the LINUS-AI Source License v2.0 — not MIT or open source. Community Edition is free for personal use and companies under $100K/yr revenue. Commercial use requires a paid license.
Is there a refund policy?
Perpetual and annual licences: contact support@linus-ai.com within 7 days of purchase if you have a technical issue we cannot resolve. 90-day access plans: non-refundable once the license file has been generated and delivered. Subscriptions: cancel anytime to stop future charges; no refund for the current paid period.
What happens when the optional updates plan expires?
The software continues to run indefinitely โ€” your perpetual licence never expires. You can use the version you downloaded forever. You just won't receive new feature updates unless you renew the updates plan.

LINUS-AI Documentation

Everything you need to deploy, configure, and extend LINUS-AI. From single-node chat to multi-GPU distributed inference.

โ—ˆ

User Guide

Install, configure, and chat. Start here if you're new to LINUS-AI. Covers setup, model loading, chat modes, and the CLI reference.

Open User Guide โ†’
โš™

Admin Guide

Deploy, manage, and secure LINUS-AI in production. Covers multi-user deployments, access control, monitoring, and updates.

Open Admin Guide โ†’
โŒจ

Developer Guide

APIs, integrations, and extensions. Build on top of LINUS-AI with the REST API, WebSocket streaming, and plugin system.

Open Developer Guide โ†’
โˆ‘

Architect Guide

Distributed topologies, tensor parallelism, pipeline parallelism, and mesh networking for large-scale private AI.

Open Architect Guide โ†’
๐Ÿ“–

README

Quickstart, platform binaries, modes, API reference, CLI reference, compliance overview, and licensing โ€” the complete README.

Open README โ†’
โš—

Technical Specification

Architecture overview, module reference, security model, mesh protocol, inference pipeline, tensor parallelism, compliance layer, and changelog.

Open Tech Spec โ†’
โฌก

API Endpoint Reference

Complete REST API reference: all endpoints, request/response formats, and examples for inference, models, compliance, RAG, mesh, and billing.

Open API Ref โ†’
โ—ซ

System Diagrams

Interactive flow diagrams for every subsystem: inference pipeline, payment flow, mesh networking, shell handler, blockchain ledger, and build system.

Open Diagrams โ†’

Ready to take AI off the cloud?

Download the free Community edition and have private AI running on your machine in under 5 minutes. No account required. No credit card. No telemetry.

Available for macOS, Linux, and Windows · Source-available · LINUS-AI License v2.0