v4.0.2 — NEXUM Multi-Node Mesh & Thermal Routing

Private AI.
Your Hardware.
Zero Compromise.

LINUS-AI is a self-hosted inference engine that runs powerful language models entirely on your hardware. One binary, no cloud dependencies, no telemetry, no API keys. Your data never leaves your machine.

macOS · Linux · Windows ⚡ Single Binary 🔒 100% Offline 🦙 Llama · Mistral · Phi · Qwen 🎛 CUDA · Metal · ROCm · CPU
linus-ai — quick start
$ linus-ai --activate LNAI-XXXX-XXXX-XXXX-XXXX
✓ License activated · Professional · 1/1 seats

$ linus-ai --pull-model llama3.2 && linus-ai --serve
Pulling llama3.2 (4.1 GB) ████████████████ 100%
✓ API server running → http://localhost:9480
✓ OpenAI-compatible · Zero telemetry · Fully offline
50+ Supported Models
8 Max GPUs (Tensor Parallel)
0 External API Calls
1 Binary, No Dependencies
Source-Available (LINUS-AI License v2.0)

Everything you need for private AI

LINUS-AI packs enterprise-grade inference capabilities into a single binary. No runtime, no containers, no external services.

๐Ÿ”’

100% Private by Design

Your prompts, responses, and embeddings never leave your hardware. No telemetry, no model training on your data, no remote logging.

โšก

Single Binary

One self-contained executable. No Python, no Node, no Docker required. Just download, chmod +x, and run on macOS, Linux, or Windows.

โˆ‘

Tensor Parallelism

Split 70B+ models across up to 8 GPUs automatically. Shard weight tensors horizontally with NVLink-optimized AllReduce synchronization.
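For intuition, column-sharding with an AllReduce can be sketched in a few lines of plain Python. This is a toy illustration of the general technique on a matrix-vector product, not LINUS-AI's internals:

```python
def matvec(W, x):
    """Reference dense matrix-vector product."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]

def shard(seq, n):
    """Split a sequence into n contiguous chunks."""
    k = (len(seq) + n - 1) // n
    return [list(seq[i * k:(i + 1) * k]) for i in range(n)]

def tensor_parallel_matvec(W, x, n_devices=2):
    """Column-shard W across 'devices', then sum the partial outputs."""
    cols = list(zip(*W))                 # columns of W
    col_shards = shard(cols, n_devices)  # each device holds a slice of columns
    x_shards = shard(x, n_devices)       # ...and the matching slice of x
    # Each device computes a partial result over the full output dimension.
    partials = []
    for w_cols, x_part in zip(col_shards, x_shards):
        sub_rows = [list(r) for r in zip(*w_cols)]  # shard back to row-major
        partials.append(matvec(sub_rows, x_part))
    # AllReduce step: elementwise sum of the partial outputs.
    return [sum(vals) for vals in zip(*partials)]
```

Real implementations shard multi-dimensional weight tensors on-device and overlap the AllReduce with compute, but the arithmetic is exactly this.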

๐Ÿ•ธ

Mesh Networking

Distribute inference across machines on your private network. Peer auto-discovery via mDNS. Encrypted transport. No Kubernetes needed.

๐Ÿ›ก

Encrypted Vault

AES-256-GCM encrypted storage for conversation history, embeddings, and sensitive context. The vault key never leaves your machine.

๐Ÿ”Œ

OpenAI-Compatible API

Drop-in replacement for OpenAI's API. Works with LangChain, LlamaIndex, Open WebUI, Continue.dev, and any OpenAI SDK without code changes.
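Because the server speaks the OpenAI wire format, a stdlib-only client is enough. A minimal sketch (the `build_chat_request` helper is illustrative; the port and model name follow the quick start above):

```python
import json
from urllib import request

def build_chat_request(prompt: str, model: str = "llama3.2") -> dict:
    """Assemble an OpenAI-style /v1/chat/completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt: str, base_url: str = "http://localhost:9480/v1") -> str:
    """POST a chat completion to the local server and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode()
    req = request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:  # no API key needed for a local server
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

The official OpenAI SDKs work the same way: point `base_url` at `http://localhost:9480/v1` and pass a placeholder API key (none is checked locally).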

๐ŸŽ›

Multi-Backend Inference

Auto backend selection: CUDA (NVIDIA), Metal (Apple Silicon), ROCm (AMD), and optimized CPU with AVX2 and AVX-512 acceleration.

๐Ÿค–

50+ Models Supported

Llama 3.3, Mistral 3, Phi-4, Qwen 2.5, Gemma 3, DeepSeek-R1, and more. GGUF quantization Q2_K through F16. Custom model support.

๐Ÿ”—

Pipeline Parallelism

Spread transformer layers across multiple machines. Micro-batching hides inter-stage latency for near-linear throughput scaling.

๐Ÿ“ฆ

Agentic Mode

Built-in tool calling, function execution, RAG pipeline, and multi-step reasoning. Run autonomous agents entirely on your hardware.
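Tool definitions follow the OpenAI function-calling schema: you send the schema with the chat request, the model returns a tool call, and your code executes it locally. A minimal sketch, where the `get_time` tool and the dispatch registry are hypothetical examples rather than built-in LINUS-AI tools:

```python
import json
from datetime import datetime, timezone

# Tool schema in the OpenAI function-calling format, sent alongside the
# chat request. `get_time` is a made-up example tool.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_time",
        "description": "Return the current UTC time as an ISO 8601 string.",
        "parameters": {"type": "object", "properties": {}},
    },
}]

# Local registry mapping tool names to plain Python callables.
REGISTRY = {
    "get_time": lambda: datetime.now(timezone.utc).isoformat(),
}

def dispatch_tool_call(name: str, arguments: str) -> str:
    """Execute a model-requested tool call and return its result as text."""
    if name not in REGISTRY:
        return json.dumps({"error": f"unknown tool {name!r}"})
    # A fuller agent loop would parse `arguments` (a JSON string) into
    # keyword arguments and feed the result back as a `tool` message.
    return REGISTRY[name]()
```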

๐Ÿ”„

Continuous Batching

Serve multiple concurrent users efficiently with dynamic batching and priority queue scheduling for production deployments.

๐Ÿ“Š

Prometheus Metrics

Built-in /metrics endpoint with token throughput, latency percentiles, GPU utilization, and queue depth. Grafana dashboards included.
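The /metrics endpoint serves the standard Prometheus text exposition format, so any scraper, or a few lines of Python, can read it. A sketch with a sample payload (the metric names shown are illustrative, not LINUS-AI's actual names):

```python
# Parse the Prometheus text exposition format: one "name value" pair per
# line, comments prefixed with '#'.

def parse_metrics(text: str) -> dict:
    """Return a {metric_name: float_value} dict from exposition text."""
    out = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        try:
            out[name] = float(value)
        except ValueError:
            pass  # skip malformed lines
    return out

SAMPLE = """\
# HELP tokens_per_second Current generation throughput
# TYPE tokens_per_second gauge
tokens_per_second 42.5
queue_depth 3
"""
```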

โš–๏ธ

Compliance & Audit

HIPAA, GDPR, SOX, PCI-DSS, FINRA, EEOC, FERPA compliance tiers. HMAC-chained tamper-evident audit logs. Automatic PII scanning with blocking and redaction. Prompt injection detection.

๐Ÿ—‚

RAG Access Control

Document-level access control enforced per user, department, division, company, and clearance level. PUBLIC to TOP_SECRET classification. Full tamper-evident RAG access audit trail.

Built for teams that can't afford data exposure

Whether you're a solo developer or a security-conscious enterprise, LINUS-AI fits your deployment model.

๐Ÿฅ

Healthcare & Life Sciences

HIPAA-compliant AI for clinical notes, research analysis, and internal documentation — with zero PHI exposure risk.

โš–๏ธ

Legal & Compliance

Analyze contracts, draft documents, and research case law without sending confidential client data to third-party AI providers.

๐Ÿฆ

Financial Services

Run AI on trading data, internal reports, and customer analytics in an environment satisfying SOC 2 and regulatory requirements.

๐Ÿ›ก

Defense & Government

Air-gapped deployments on classified networks. No internet dependency after activation. Supports FIPS-adjacent encryption configs.

๐Ÿ’ป

Developer Workstation

Run a local coding assistant via Continue.dev or Cursor on your MacBook Pro or Linux workstation. Instant responses, no API costs.

๐Ÿข

Enterprise Private Chat

Deploy a company-wide ChatGPT alternative on your servers. Connect to your internal knowledge base. No data leaves your VPC.

How LINUS-AI compares

vs. cloud AI APIs and other self-hosted solutions.

โ˜๏ธ Cloud AI APIs

PrivacyNone
Data Retention30+ days
Offline UseImpossible
Per-Token Cost$0.002โ€“0.06
LatencyVariable
Setup TimeMinutes
DependenciesSDK only

โ˜… LINUS-AI

PrivacyTotal
Data RetentionYou control
Offline UseFull support
Per-Token Cost$0
LatencyLocal speed
Setup Time60 seconds
DependenciesNone

๐Ÿณ Other Self-Hosted

PrivacyGood
Data RetentionYou control
Offline UseYes
Per-Token Cost$0
LatencyLocal speed
Setup TimeHours (Docker)
DependenciesPython, Docker

Try 90 days.
Scale when you need to.

Start with 90-day access from $33, or buy a perpetual licence once and own it forever; updates are optional, never forced. Annual plans include continuous updates.

LINUS-AI is source-available, not open source. Licensed under LINUS-AI Source License v2.0. Free for personal use & orgs under $100K/yr revenue. Commercial use requires a paid tier.

Community
$0
Free forever
  • Single node inference
  • 5B model limit
  • 6 core AI profiles
  • 90-day access available from $33
  • Local OpenAI-compatible API
  • Browser control panel GUI
  Not included: Tensor Parallelism, Pipeline Parallelism, Mesh Networking, 70B+ models, all 14 profiles, paid support
Download
Team
$1,499
one-time · perpetual + $199/year optional updates
  • Everything in Professional
  • Up to 5 seats
  • Centralised licence management
  • Priority email support (<24h)
  • Licence transfer between machines
  • Federated Learning support
  • Blockchain audit ledger
  • Multi-node tensor clusters
  • Custom profile deployment
  • 1 year of updates included (licence is perpetual)
  Not included: Air-gap activation, SLA
Enterprise
$7,999
per year ยท unlimited seats
  • Everything in Team
  • Unlimited seats
  • Air-gap / offline activation
  • Custom SLA (99.9% uptime)
  • Custom model fine-tuning
  • SSO / LDAP integration
  • HIPAA / SOC 2 BAA available
  • OEM rights (up to 3 products)
  • Email support + 99% uptime SLA
  • Invoice / PO available
Need an invoice? Contact Sales โ†’
OEM ยท White-label
Enterprise Plus
$14,999
per year ยท unlimited OEM
  • Everything in Enterprise
  • Unlimited OEM products
  • Full white-label / brand removal
  • 24h SLA for critical issues
  • Named account manager
  • Annual architecture review
  • Source code escrow available
  • Custom build flags on request
Prefer to talk first? โ†’
All paid plans include no per-token costs, full offline operation after activation, and 12 months of updates from purchase. Perpetual licences are buy once, own forever; updates are optional, not forced.

Feature comparison

| Feature | Community | Professional | Team | Enterprise | Enterprise Plus |
| --- | --- | --- | --- | --- | --- |
| Max model size | 5B | 70B+ | 70B+ | Unlimited | Unlimited |
| Industry AI profiles | 6 | 14 | 14 | Custom | Custom |
| Custom system prompts | — | ✓ | ✓ | ✓ | ✓ |
| Tensor Parallelism | — | ✓ | ✓ | ✓ | ✓ |
| Pipeline Parallelism | — | ✓ | ✓ | ✓ | ✓ |
| Mesh Networking | — | ✓ | ✓ | ✓ | ✓ |
| Federated Learning | — | — | ✓ | ✓ | ✓ |
| Seats | 1 | 1 | 5 | Unlimited | Unlimited |
| Air-gap activation | — | — | — | ✓ | ✓ |
| Blockchain audit log | — | — | ✓ | ✓ | ✓ |
| SSO / LDAP | — | — | — | ✓ | ✓ |
| HIPAA / SOC 2 BAA | — | — | — | ✓ | ✓ |
| OEM / White-label | — | — | — | Up to 3 products | Unlimited |
| Support | Community | Email | Priority email | Email + 99% SLA | 24h SLA + named AM |
| Price | Free | $499 + $99/yr | $1,499 + $199/yr | $7,999/yr | $14,999/yr |

Recent Releases

Actively maintained and regularly updated. Full changelog on GitHub.

v4.0.0

NEXUM Platform, Multi-Node Mesh & Thermal Routing

NEXUM multi-node orchestration, mDNS peer discovery, live thermal throttle rerouting, distributed audit ledger, encrypted vault, Tauri 2.0 packaging, Llama 3.3 + Phi-4 support.

v3.0.0

Agentic Mode, RAG Pipeline & OpenAI Compatibility

Built-in tool calling, multi-step reasoning agent, RAG with local vector store, full /v1/chat/completions compatibility, SSE streaming. Works with LangChain, LlamaIndex, and Open WebUI.

v2.0.0

Apple Metal & AMD ROCm Support

Native Metal acceleration for Apple Silicon (M1–M4), ROCm backend for AMD GPUs, auto-backend detection, and improved Windows support.

v1.0.0

Initial Release

Single binary, CPU + CUDA inference, GGUF model support, basic REST API, CLI chat, license activation system.

Frequently Asked Questions

Is LINUS-AI truly private? Where does my data go?
Yes — completely private. All inference happens on your local hardware. We never receive your prompts, model outputs, or conversation history. License activation (one-time) sends only your license key and machine fingerprint. After that, the software operates fully offline with zero outbound connections.
What models are supported? Can I use my own fine-tuned model?
LINUS-AI supports any GGUF-format model: Llama 3.x, Mistral 3, Phi-4, Qwen 2.5, Gemma 3, DeepSeek-R1, Falcon, StarCoder, and many more. You can load custom fine-tuned models by pointing to the GGUF file in your config.
Do I need a GPU? What hardware do I need?
No GPU required. LINUS-AI runs on any CPU with AVX2 support (most x86 CPUs since 2013). A modern laptop with 16 GB RAM can comfortably run 7B–13B parameter models. GPUs dramatically improve performance: a 24 GB card such as an RTX 3090 runs models up to roughly 34B in Q4 quantization at 20–30 tokens/second, while 70B models need multiple GPUs via tensor parallelism or heavy CPU offload. Apple Silicon is particularly well supported via Metal.
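A rough way to size hardware yourself: multiply the parameter count by the quantization's bits per weight, then add overhead for the KV cache and activations. The bits-per-weight figures below are approximate community numbers for GGUF quantization levels, and the 1.2x overhead factor is a ballpark assumption:

```python
# Approximate bits per weight for common GGUF quantization levels.
# These are rough community figures, not exact on-disk sizes.
BITS_PER_WEIGHT = {"Q2_K": 2.6, "Q4_K_M": 4.8, "Q8_0": 8.5, "F16": 16.0}

def model_memory_gb(params_billion: float, quant: str = "Q4_K_M",
                    overhead: float = 1.2) -> float:
    """Estimate resident memory in GB: quantized weights plus a ballpark
    overhead factor for KV cache and activations."""
    weight_bytes = params_billion * 1e9 * BITS_PER_WEIGHT[quant] / 8
    return round(weight_bytes * overhead / 1e9, 1)
```

For example, `model_memory_gb(7)` lands around 5 GB, which is why a 16 GB laptop handles 7B models comfortably.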
How does it compare to Ollama, LM Studio, or llama.cpp?
LINUS-AI goes further in the enterprise direction: tensor parallelism, mesh networking, encrypted vault, access control, and commercial SLAs. If you just need a simple local chatbot, Ollama is simpler. If you need production-grade private AI infrastructure, LINUS-AI is built for that.
Is the source code open?
LINUS-AI is source-available under the LINUS-AI Source License v2.0 — not MIT or open source. Community Edition is free for personal use and companies under $100K/yr revenue. Commercial use requires a paid license.
Is there a refund policy?
Perpetual and annual licences: contact support@linus-ai.com within 7 days of purchase if you have a technical issue we cannot resolve. 90-day access plans: non-refundable once the license file has been generated and delivered. Subscriptions: cancel anytime to stop future charges; no refund for the current paid period.
What happens when the optional updates plan expires?
The software continues to run indefinitely โ€” your perpetual licence never expires. You can use the version you downloaded forever. You just won't receive new feature updates unless you renew the updates plan.

LINUS-AI Documentation

Everything you need to deploy, configure, and extend LINUS-AI. From single-node chat to multi-GPU distributed inference.

โ—ˆ

User Guide

Install, configure, and chat. Start here if you're new to LINUS-AI. Covers setup, model loading, chat modes, and the CLI reference.

Open User Guide โ†’
โš™

Admin Guide

Deploy, manage, and secure LINUS-AI in production. Covers multi-user deployments, access control, monitoring, and updates.

Open Admin Guide โ†’
โŒจ

Developer Guide

APIs, integrations, and extensions. Build on top of LINUS-AI with the REST API, WebSocket streaming, and plugin system.

Open Developer Guide โ†’
โˆ‘

Architect Guide

Distributed topologies, tensor parallelism, pipeline parallelism, and mesh networking for large-scale private AI.

Open Architect Guide โ†’
๐Ÿ“–

README

Quickstart, platform binaries, modes, API reference, CLI reference, compliance overview, and licensing โ€” the complete README.

Open README โ†’
โš—

Technical Specification

Architecture overview, module reference, security model, mesh protocol, inference pipeline, tensor parallelism, compliance layer, and changelog.

Open Tech Spec โ†’
โฌก

API Endpoint Reference

Complete REST API reference: all endpoints, request/response formats, and examples for inference, models, compliance, RAG, mesh, and billing.

Open API Ref โ†’
โ—ซ

System Diagrams

Interactive flow diagrams for every subsystem: inference pipeline, payment flow, mesh networking, shell handler, blockchain ledger, and build system.

Open Diagrams โ†’

Ready to take AI off the cloud?

Download the free Community edition and have private AI running on your machine in under 5 minutes. No account required. No credit card. No telemetry.

Available for macOS, Linux, and Windows · Source-available · LINUS-AI License v2.0