---
github: https://github.com/exo-explore/exo
tags:
  - opensource
  - AI
  - LocalAI
  - cluster
---

Run your own AI cluster at home with everyday devices.

Unify your existing devices into one powerful GPU: iPhone, iPad, Android, Mac, NVIDIA, Raspberry Pi, pretty much any device!

## Features

From the [README](https://github.com/exo-explore/exo#features)

### Wide Model Support

exo supports different models including LLaMA ([MLX](https://github.com/exo-explore/exo/blob/main/exo/inference/mlx/models/llama.py) and [tinygrad](https://github.com/exo-explore/exo/blob/main/exo/inference/tinygrad/models/llama.py)), Mistral, LLaVA, Qwen, and Deepseek.

### Dynamic Model Partitioning

exo [optimally splits up models](https://github.com/exo-explore/exo/blob/main/exo/topology/ring_memory_weighted_partitioning_strategy.py) based on the current network topology and the device resources available. This lets you run larger models than you could on any single device.

### Automatic Device Discovery

exo will [automatically discover](https://github.com/exo-explore/exo/blob/945f90f676182a751d2ad7bcf20987ab7fe0181e/exo/orchestration/node.py#L154) other devices using the best method available. Zero manual configuration. (A toy sketch of this style of zero-config discovery appears at the end of this note.)

### ChatGPT-compatible API

exo provides a [ChatGPT-compatible API](https://github.com/exo-explore/exo/blob/main/exo/api/chatgpt_api.py) for running models. It's a [one-line change](https://github.com/exo-explore/exo/blob/main/examples/chatgpt_api.sh) in your application to run models on your own hardware using exo.
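A minimal sketch of that one-line change, assuming an exo node running locally. The base URL, port, and model id below are placeholders, not exo's documented defaults; use whatever your node reports on startup:

```python
# Point an existing OpenAI client at a local exo node instead of api.openai.com.
# The base URL, port, and model id are placeholder assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local exo endpoint
    api_key="dummy",                      # a local node typically ignores the key
)

response = client.chat.completions.create(
    model="llama-3.1-8b",  # placeholder model id
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```

The "one line" is the `base_url`: everything else is the same OpenAI client code your application already uses.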
### Device Equality

Unlike other distributed inference frameworks, exo does not use a master-worker architecture. Instead, exo devices [connect p2p](https://github.com/exo-explore/exo/blob/945f90f676182a751d2ad7bcf20987ab7fe0181e/exo/orchestration/node.py#L161). As long as a device is connected somewhere in the network, it can be used to run models.

exo supports different [partitioning strategies](https://github.com/exo-explore/exo/blob/main/exo/topology/partitioning_strategy.py) to split up a model across devices. The default is [ring memory weighted partitioning](https://github.com/exo-explore/exo/blob/main/exo/topology/ring_memory_weighted_partitioning_strategy.py): inference runs in a ring where each device handles a number of model layers proportional to its memory.
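As a rough illustration of that idea (not exo's actual code), here is a minimal memory-weighted layer split, assuming a model with a known layer count and a map of device memory. Each device would then forward its activations to the next device in the ring:

```python
# A toy memory-weighted partition: each device gets a contiguous block of
# layers proportional to its memory, in ring order. Illustrative only.
def partition_layers(num_layers: int, device_memory: dict[str, float]) -> dict[str, range]:
    total_mem = sum(device_memory.values())
    assignments: dict[str, range] = {}
    start = 0
    devices = list(device_memory.items())
    for i, (device, mem) in enumerate(devices):
        if i == len(devices) - 1:
            end = num_layers  # last device absorbs any rounding remainder
        else:
            end = min(start + round(num_layers * mem / total_mem), num_layers)
        assignments[device] = range(start, end)
        start = end
    return assignments

# Example: a 32-layer model across three uneven devices (memory in GB).
print(partition_layers(32, {"mac-studio": 64, "macbook": 16, "raspberry-pi": 8}))
# mac-studio -> layers 0-22, macbook -> 23-28, raspberry-pi -> 29-31
```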
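And on automatic discovery, as promised above: exo picks the best method available for the platform, but the general flavor of zero-config LAN discovery can be shown with a generic UDP broadcast sketch. Again, this is not exo's implementation; the port and message format are made up for the example:

```python
# A generic illustration of zero-config LAN discovery via UDP broadcast:
# each node periodically announces itself, and every node listens for peers.
import json
import socket
import time

PORT = 50000          # arbitrary port for this sketch
NODE_ID = "node-abc"  # placeholder identity

def announce() -> None:
    """Broadcast this node's presence once."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        msg = json.dumps({"id": NODE_ID, "ts": time.time()}).encode()
        sock.sendto(msg, ("255.255.255.255", PORT))

def listen(timeout: float = 5.0) -> list[dict]:
    """Collect peer announcements for `timeout` seconds."""
    peers = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", PORT))
        sock.settimeout(timeout)
        deadline = time.time() + timeout
        while time.time() < deadline:
            try:
                data, addr = sock.recvfrom(4096)
            except socket.timeout:
                break
            info = json.loads(data)
            if info.get("id") != NODE_ID:  # ignore our own announcements
                peers.append({"addr": addr[0], **info})
    return peers
```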