FAQ
Answers to the most common questions about FowyldAI.
General
What is FowyldAI?
FowyldAI is a sovereign AI inference engine that runs entirely on your own hardware. It provides multi-model orchestration, OpenAI-compatible APIs, and zero external dependencies — your data never leaves your infrastructure.
How is FowyldAI different from other local AI tools?
Most local AI tools are wrappers around a single model. FowyldAI is a complete inference platform with multi-model orchestration, automatic query routing, a built-in network guard, TLS/mTLS support, and production-grade deployment options including air-gap and Kubernetes.
Is FowyldAI open source?
FowyldAI is a commercial product with four tiers: Starter, Pro, Business, and Enterprise. All plans include unlimited local inference with zero per-token cost. See the pricing page for details.
Sovereignty & Security
Does FowyldAI send any data externally?
No. FowyldAI makes zero outbound network connections. No telemetry, no analytics, no model downloads at runtime. You can verify this with GET /sovereignty/status or by running a network monitor.
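As a quick check, the sovereignty endpoint can be queried directly. The host and port here are assumptions; match them to your own deployment:

```shell
# Query the sovereignty status endpoint (port 8000 is an assumption;
# match it to server.port in your config.yaml).
curl http://localhost:8000/sovereignty/status

# Independently, a host-level network monitor such as tcpdump can
# confirm that the process opens no outbound connections.
```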
Can I run FowyldAI in an air-gapped environment?
Yes. FowyldAI is designed for air-gap deployment. Models are shipped as sealed packages, and the Docker container runs with --network none. See the Air-Gap Deployment Guide.
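A minimal air-gapped launch might look like the following sketch; the image tarball name and volume path are placeholders, not the documented ones:

```shell
# Load the sealed image from offline media (tarball name is a placeholder).
docker load -i fowyldai-airgap.tar

# Run with all networking disabled; models are mounted from a local path.
docker run --network none \
  -v /opt/fowyldai/models:/models \
  fowyldai:latest
```

See the Air-Gap Deployment Guide for the exact image name and supported mount points.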
What compliance frameworks does it support?
FowyldAI's architecture supports HIPAA, GDPR, FedRAMP, SOC 2 Type II, and ITAR compliance. Data residency is guaranteed by architecture, not just policy. See Data Sovereignty for details.
Performance
What hardware do I need?
Minimum: 4-core CPU, 16 GB RAM. Recommended: 8+ cores, 32 GB RAM, NVIDIA GPU with 8+ GB VRAM for fast inference. See Installation for full system requirements.
How fast is inference?
Performance depends on your hardware and model size. With a modern GPU (RTX 3060+), expect 30-80 tokens/second for general queries. CPU-only mode is slower but fully functional — typically 5-15 tokens/second.
Can I run multiple models simultaneously?
Yes. FowyldAI's orchestration engine manages multiple models in memory and routes queries to the optimal model based on task type and complexity. Configure loaded models in config.yaml.
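As an illustrative sketch only (the keys and model names below are assumptions, not the documented schema), a multi-model config.yaml might look like:

```yaml
# Hypothetical example; check the FowyldAI docs for the actual schema.
models:
  - name: general-7b    # default for general queries
  - name: code-13b      # routed to for code-heavy tasks
routing:
  strategy: auto        # route by task type and complexity
```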
Integration
Can I use existing OpenAI code with FowyldAI?
Yes. Point the base_url at your FowyldAI instance and your existing OpenAI SDK code runs without further modification. See OpenAI Integration.
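As a sketch of the drop-in compatibility, the request below targets the standard OpenAI-style chat completions route using only the Python standard library. The host, port, and model name are assumptions to adapt to your instance:

```python
import json
from urllib import request

# Assumed local endpoint; match host/port to server.port in config.yaml.
BASE_URL = "http://localhost:8000/v1"

payload = {
    "model": "default",  # placeholder; your loaded model names may differ
    "messages": [{"role": "user", "content": "Hello from FowyldAI"}],
}

req = request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# With a running instance, send it with:
# body = json.loads(request.urlopen(req).read())
```

With the official OpenAI SDK the only change is the same one: pass your FowyldAI URL as base_url when constructing the client.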
Does it work with LangChain?
Yes. Use ChatOpenAI from langchain_openai with the FowyldAI base URL. See LangChain Integration.
Is there a VS Code extension?
Yes. The FowyldAI VS Code extension provides inline completions, code review, and a chat panel — all powered by your local instance. See VS Code Integration.
Licensing
What's the most affordable plan?
The Starter plan at $49/mo (or $39/mo billed annually) gives individual developers full access to the Crown Engine with unlimited local inference, basic workflows, and community support.
What do higher-tier plans include?
Pro ($149/mo) adds multi-agent workflows, Institutional Memory, and priority email support. Business ($399/mo) adds governance, audit logging, Gateway & MCP, and cloud augmentation. Enterprise offers custom integrations, compliance support, and dedicated deployment. See pricing.
Troubleshooting
FowyldAI won't start — "model not found"
Ensure models are downloaded and placed in the configured models directory. Verify paths in config.yaml match the actual file locations.
Slow inference on GPU
Check that NVIDIA drivers and CUDA toolkit are installed. Run nvidia-smi to verify GPU is detected. Ensure Docker is launched with --gpus all.
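The checks above as shell commands; the image name is a placeholder, and the flags mirror a standard Docker GPU setup rather than anything FowyldAI-specific:

```shell
# Verify the driver sees the GPU (should list your card and CUDA version).
nvidia-smi

# Relaunch the container with GPU access enabled.
docker run --gpus all fowyldai:latest
```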
Port already in use
Another process is using port 8000. Either stop the other process or change the port in config.yaml: server.port: 8001.
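For example, assuming a server block in config.yaml (the exact schema may differ in your version):

```yaml
# Move the API off the default port 8000 if it is taken.
server:
  port: 8001
```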