DGPUNET

Distributed GPU Network

Active Cluster, Ray Framework, Multi-GPU

About DGPUNET

DGPUNET (Distributed GPU Network) is a Ray-based computing cluster that pools GPU resources from multiple machines to handle demanding AI workloads. By distributing tasks across consumer-grade GPUs, DGPUNET enables running large models and complex pipelines that would otherwise require expensive enterprise hardware or costly cloud infrastructure.

The cluster orchestrates workloads for SIIMPAF's animation pipeline, including Stable Diffusion image generation, EMAGE body motion synthesis, and PantoMatrix avatar animation. It represents a philosophical statement about accessibility in AI development - democratizing access to computational resources that would otherwise be gatekept by cloud providers.

Cluster Resources

GPU Nodes: 5
Total VRAM: 92GB
System RAM: 448GB
Orchestration: Ray
Node    System           CPU                GPU                     VRAM  RAM    Role
nv5090  Custom Tower     AMD Ryzen 9 7900X  NVIDIA RTX 5090         32GB  128GB  Head Node
nv4090  Alienware M18R2  Intel i9           NVIDIA RTX 4090 Laptop  16GB  64GB   Dev Machine
nv4080  Alienware M18R1  Intel i9           NVIDIA RTX 4080 Laptop  12GB  64GB   Worker
nv4070  Alienware M16R1  Intel i7           NVIDIA RTX 4070 Laptop  8GB   64GB   Worker
nv3090  Dell XPS Tower   Intel i7           NVIDIA RTX 3090         24GB  128GB  Worker
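
The headline totals can be checked against the per-node figures with a few lines of Python. This is only a sketch: the Node dataclass and its field names are hypothetical and used for illustration, while the numbers are copied from the node table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical record type for one row of the node table."""
    name: str
    gpu: str
    vram_gb: int
    ram_gb: int
    role: str


# Figures copied from the node table; nothing here is measured at runtime.
NODES = [
    Node("nv5090", "NVIDIA RTX 5090", 32, 128, "Head Node"),
    Node("nv4090", "NVIDIA RTX 4090 Laptop", 16, 64, "Dev Machine"),
    Node("nv4080", "NVIDIA RTX 4080 Laptop", 12, 64, "Worker"),
    Node("nv4070", "NVIDIA RTX 4070 Laptop", 8, 64, "Worker"),
    Node("nv3090", "NVIDIA RTX 3090", 24, 128, "Worker"),
]

total_vram_gb = sum(node.vram_gb for node in NODES)
total_ram_gb = sum(node.ram_gb for node in NODES)

print(f"{len(NODES)} nodes, {total_vram_gb}GB VRAM, {total_ram_gb}GB RAM")
```

The sums agree with the 92GB and 448GB figures quoted for the cluster. Note that pooled VRAM is a capacity total, not one contiguous memory space: each workload still has to fit on, or be split across, individual GPUs.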

Distributed Workloads

Image Generation

Stable Diffusion inference distributed across available GPU resources for avatar image creation.

Body Motion Synthesis

EMAGE gesture generation from audio, producing natural body movements for avatar animation.

Avatar Animation

PantoMatrix/AnimateAnyone rendering combining facial expressions with body motion.

LLM Inference

Large language model inference distributed when local VRAM is insufficient. Supports 70-100B+ parameter models.
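
A rough back-of-the-envelope estimate shows why models in this size range outgrow a single card. The sketch below counts weight storage only, and assumes 1 GB = 1e9 bytes and the nominal VRAM figures from the cluster table. The function names and the chosen quantization levels are illustrative assumptions, not DGPUNET's actual configuration. KV cache, activations, and framework overhead are ignored, so real requirements are higher.

```python
def weights_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return params_billions * bits_per_param / 8


def fits(params_billions: float, bits_per_param: int, vram_gb: float) -> bool:
    """True if the weights alone fit in the given VRAM budget."""
    return weights_gb(params_billions, bits_per_param) <= vram_gb


SINGLE_GPU_GB = 32  # nv5090, the largest single GPU in the cluster
CLUSTER_GB = 92     # pooled VRAM across all five nodes

# (parameters in billions, bits per parameter) -- illustrative cases only
for params, bits in [(70, 16), (70, 4), (100, 4)]:
    size = weights_gb(params, bits)
    print(
        f"{params}B @ {bits}-bit: {size:.0f}GB weights, "
        f"single GPU: {fits(params, bits, SINGLE_GPU_GB)}, "
        f"cluster: {fits(params, bits, CLUSTER_GB)}"
    )
```

Under these assumptions, a 70B model quantized to 4 bits needs about 35GB for weights, slightly more than the nv5090 holds alone, while the pooled 92GB leaves headroom. The same model at 16 bits (about 140GB) would exceed even the pooled total.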

Video Processing

Frame-by-frame video rendering and post-processing across worker nodes.
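
One simple way to spread frame-by-frame work is to hand each worker a contiguous block of frame indices. The following is a minimal sketch of that idea, not the scheduling DGPUNET actually uses: split_frames is a hypothetical helper, and it divides frames evenly without weighting by GPU speed.

```python
def split_frames(total_frames: int, workers: list[str]) -> dict[str, range]:
    """Assign each worker a contiguous, near-equal block of frame indices."""
    if not workers:
        raise ValueError("at least one worker is required")
    base, extra = divmod(total_frames, len(workers))
    plan: dict[str, range] = {}
    start = 0
    for index, worker in enumerate(workers):
        # The first `extra` workers take one additional frame each.
        count = base + (1 if index < extra else 0)
        plan[worker] = range(start, start + count)
        start += count
    return plan


# Worker names taken from the node table; the frame count is arbitrary.
plan = split_frames(100, ["nv4080", "nv4070", "nv3090"])
for worker, frames in plan.items():
    print(worker, frames.start, frames.stop - 1)
```

Contiguous blocks keep each worker's output easy to concatenate in order. A production setup would more likely use smaller chunks pulled from a shared queue so that faster GPUs naturally take on more frames.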

Model Training

LoRA fine-tuning and other training tasks distributed across the cluster.

Real-World Application: RPEPTFS

DGPUNET's distributed GPU architecture enables applications that are impractical on a single GPU. RPEPTFS (Role-playing Enhanced Pitch Training Feedback Simulator) is a prime example.

Without DGPUNET (Single GPU)

2 NPCs

With 32GB of VRAM, the nv5090 alone can run at most 2 AI investor NPCs simultaneously with real-time animation. That is insufficient for realistic panel scenarios.

With DGPUNET (92GB VRAM)

5-6 NPCs

Distributing LLM inference, TTS, and animation across 5 GPUs enables full investor panels - each NPC with unique personality, real-time responses, and animated avatars.

How DGPUNET Powers RPEPTFS:

  • nv5090 (Head): Orchestration, primary LLM inference, Stable Diffusion
  • nv4090 (Dev): EMAGE body motion generation, secondary LLM
  • nv4080 (Worker): PantoMatrix animation rendering
  • nv3090 (Worker): TTS voice synthesis, video encoding
  • nv4070 (Worker): Audio processing, frame composition
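
The role assignment above amounts to a routing table from task type to node. The sketch below expresses it as plain Python data. The task identifiers and the node_for helper are hypothetical names chosen for illustration; only the node-to-role mapping comes from the list.

```python
# Node-to-task mapping taken from the list above; task keys are illustrative.
ROLES = {
    "nv5090": ["orchestration", "primary_llm", "stable_diffusion"],
    "nv4090": ["emage_body_motion", "secondary_llm"],
    "nv4080": ["pantomatrix_rendering"],
    "nv3090": ["tts", "video_encoding"],
    "nv4070": ["audio_processing", "frame_composition"],
}

# Invert the mapping so a task name resolves directly to its node.
TASK_TO_NODE = {task: node for node, tasks in ROLES.items() for task in tasks}


def node_for(task: str) -> str:
    """Return the node assigned to a task, or raise if the task is unknown."""
    try:
        return TASK_TO_NODE[task]
    except KeyError:
        raise ValueError(f"no node assigned for task {task!r}") from None


print(node_for("tts"))
print(node_for("pantomatrix_rendering"))
```

In a Ray cluster, this kind of pinning is commonly expressed with custom resources: each node advertises a label when it joins, and a task requests that label so the scheduler places it there. Whether DGPUNET uses that mechanism or another one is not stated here.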

The Business Impact: Without DGPUNET, RPEPTFS could only show entrepreneurs practicing with 2 investors at a time - far from the realistic 5-6 person investor panels they'll face in real pitch meetings. DGPUNET transforms RPEPTFS from a limited demo into a genuinely useful training tool for startup founders seeking Series A funding.

AI/LLM Tools Used

Large Language Models (LLMs)

Vector Database & RAG

Image Generation

Animation Pipeline

Text-to-Speech & Speech Recognition

Core ML Libraries

PyTorch 2.9.1+cu128, Transformers, Diffusers, Ray, CUDA 12.8

Technology Stack

Ray, PyTorch, CUDA 12.8, Python 3.11, Stable Diffusion, EMAGE, PantoMatrix, vLLM, Ollama, Qdrant, FastAPI

Related Articles

The following articles document the journey and philosophy behind DGPUNET and related AI infrastructure:

Building DGPUNET: Democratizing AI Innovation Through Open Source Infrastructure

Over the past several months, I've been working on something that started as a practical necessity but evolved into a philosophical statement about accessibility in AI development. When a startup couldn't get anything better than a pitiful G10 GPU instance from their cloud provider - completely insufficient for the machine learning workloads needed - I realized I had to take matters into my own hands...

Part 1: How Role-Playing Games and Early Computing Shaped Four Decades of AI Development (1977-1990)

In 1977, a cousin introduced me to role-playing games. Two years later, another cousin gave me access to the University of Utah's computer network. At 8 or 9 years old, I was online and learning to code. These early experiences with pattern matching, NPC behaviors, and making computers feel responsive laid the foundation for four decades of AI development.

Part 2: IRC Bots, Beowulf Clusters, and Distributed Computing (1990-2005)

This era focused on infrastructure scaling - IRC bots for natural language processing, Beowulf clusters for distributed computing, building ISPs and data centers. Four significant patterns emerged: commodity over enterprise, distributed over centralized, humanizing computer interaction, and automation with reliability.

Part 3: Professional Applications and Breakthroughs (2005-2020)

Therapeutic gaming, educational technology, and real-time AI systems that outperformed commercial solutions. Applying decades of lessons to professional contexts.

Part 4: Building Distributed GPU Infrastructure When Centralization Threatens Innovation (2020-2025)

GPU scarcity, centralization concerns, and the decision to build DGPUNET - a distributed GPU infrastructure using consumer hardware and Ray clustering to democratize access to AI computational resources.

Part 5: SIIMPAF - Four Decades of Technology Lessons in One System

Bringing four decades of lessons together in SIIMPAF - a comprehensive self-hosted AI system that embodies the principles of distributed computing, open source, and computational independence.

How Role-Playing Games, Distributed Computing, and AI Development Led to an Investment Pitch Simulator

The journey from IRC bots and NPC behaviors to RPEPTFS - an AI-powered pitch training platform with QLoRA-trained investor NPCs based on real Dragons' Den and Shark Tank personalities.

Related Projects