TheYkk created this gist on Jun 8, 2025.
🚀 High-Level Goal

Support a 64 v 64 (128 total) “Hell-Let-Loose–style” FPS with Godot clients and an authoritative Rust server, while keeping latency low (< 80 ms RTT budget) and bandwidth reasonable for both clients (< 250 kbps) and the server box (< 25 Mbps).

────────────────────────────────────────
1. Core Design Pillars
────────────────────────────────────────
• Authoritative server ‑ no trust in clients
• UDP first, with a light reliability/ordering layer (think ENet/Laminar/QUIC)
• Fixed-rate server simulation tick, client-side prediction + interpolation
• Delta-compressed, relevance-filtered snapshots (a.k.a. interest management)
• Multi-threaded ECS simulation on the server; network I/O kept lock-free
• Single box for 128 players, but the layout is shard-friendly if we ever split

────────────────────────────────────────
2. Top-Level Architecture
────────────────────────────────────────
Godot Client <-UDP/QUIC-> Rust “Game-Core” (authoritative) <-TCP-> Lobby / DB

```
┌────────┐   inputs   ┌─────────────┐  events   ┌──────────┐
│ Godot  │───────────►│  Net Front  │──────────►│  Match   │
│ Client │◄───────────│  Gate (IO)  │◄──────────│  Lobby   │
└────────┘ snapshots  └──────┬──────┘           └──────────┘
                             │ lock-free channels
                             │
                       ┌─────▼─────┐
                       │ Game ECS  │
                       │  (Bevy?)  │
                       └─────┬─────┘
                             │
                       ┌─────▼─────┐
                       │  Worker   │
                       │  Threads  │
                       └───────────┘
```

Why two layers inside the server?
• Net Front Gate = purely async I/O, packet (de)frag, (de)crypt, acks.
• Game ECS = deterministic world updated at fixed Δt, batch-consumes inputs, emits snapshots.
────────────────────────────────────────
3. Transport & Packet Layout
────────────────────────────────────────
Transport: UDP (or QUIC if you want built-in encryption + congestion control).
Max safe MTU: 1200 bytes (fits inside most home NAT MTUs).

Packet Header (9 bytes):
```
uint16 seq_id
uint16 ack_of_remote
uint32 ack_bitfield   (32 earlier acks)
uint8  flags          (bit0=reliable, bit1=frag, bit2=control…)
```

Payload = 1‒N “messages” TLVed inside the datagram.

Msg-Types (1 byte id + 1 byte len if <256):
00 Heartbeat / ping
01 InputCmd (bitfield buttons 2B + 3×pos32 or delta16 + uint8 tick)
02 SnapshotDelta (compressed)
03 SnapshotBaseline (full state if delta lost)
04 Event/RPC (grenade exploded, chat, UI)
05 StreamFrag (map chunk, voice, etc.)

Reliability:
• “reliable” flag + sliding-window resends.
• Unreliable for InputCmds (they become obsolete quickly).
• Semi-reliable for SnapshotBaselines.

────────────────────────────────────────
4. Tick & Time Model
────────────────────────────────────────
Simulation tick = 60 Hz (Δt = 16.66 ms)
Networking tick = 20 Hz (every 3rd sim tick we send a snapshot)
Client render = up to 144 Hz

```
            ┌───────────────────────────────────┐
Timeline →  │I I I│I I I│I I I│ …  (inputs @ 60)│
            ├─┬─┬─┴─┬─┬─┴─┬─┬─┴─┬───────────────┤
Server Sim  │S │S │S │S │S │S │S …  (60 Hz)     │
            └────────┬────────┬────────┬────────┘
Snapshot Tx          ▲        ▲        ▲  (20 Hz)
```
Interpolation buffer ≈ 2.5 sim ticks ≈ 40 ms.

Client-side:
• Sends InputCmd every render frame (ideally capped at 60 Hz).
• Predicts locally.
• Keeps 100 ms of input history; on mismatch vs. authoritative state ⇒ smooth rewind/correct.

Server:
• Collects all inputs with tick ID ≤ current tick.
• Simulates physics, hit-scan.
• Serializes the state diff vs. the last ACKed snapshot per client.
• Runs interest mgmt: spatial hash + LOS + team filter.

────────────────────────────────────────
5. Interest / Relevance Management
────────────────────────────────────────
World split into 3-D grid cells (e.g. 32 m cubes).
For each client we only ship entities inside a radius of R = 250 m, in the front 120° FOV, plus team markers.

Typical relevant entity count:
• Players: ≈ 40
• Projectiles (bullets & tracers): ≈ 30 (fade quickly)
• Grenades / effects: ≈ 10
• Buildables / vehicles: ≈ 20
TOTAL ≈ 100 entities / player on average.

Entity state quantization per delta entry:
```
id          (uint16)         2 B
position    (x,y,z int16)    6 B   (≈3 cm accuracy inside a 2 km map)
yaw/pitch   (2×int16)        4 B
velocity    (packed int16×3) 6 B
state bits  (1 byte)         1 B
TOTAL                       ~19 B  → delta often ~10 B after XOR & RLE
```

Bandwidth per client (down): 100 entities × 10 B × 20 Hz = 20 kB/s ≈ 160 kbps
Bandwidth per client (up): InputCmd 8 B × 60 Hz = 480 B/s ≈ 4 kbps

Server aggregate:
Down: 20 kB/s × 128 ≈ 2.5 MB/s ≈ 20 Mbps
Up: 0.48 kB/s × 128 ≈ 61 kB/s ≈ 0.5 Mbps
Well within a single gig-E NIC.

────────────────────────────────────────
6. Server Threading & Scaling
────────────────────────────────────────
CPU budget (per tick):
• Physics + ECS: ~100 µs per player → 128 × 100 µs = 12.8 ms
• Overhead / pathing / extras → 2.0 ms
Total → 14.8 ms < 16.6 ms budget 💚

Implementation:
• 1 async thread (Tokio/Quinn) for recv/send (zero-copy to/from an mpsc channel).
• N−1 worker threads (rayon or the Bevy schedule) own the ECS; partition by entity or system.
• End of tick = barrier; the snapshot builder runs, then pushes bytes back to the net thread.

Memory: baseline entity (archetype) ~256 B; 5 000 live entities → ~1.3 MB. Plenty of headroom; a 32 GB RAM box is luxurious.

────────────────────────────────────────
7. Will It Still Work for 128 Players?
────────────────────────────────────────
We already designed for 128 total. Stress-test scenario: everybody in one courtyard.
• Entity count might double to 200 relevant.
• Bandwidth per client → 40 kB/s (≈ 320 kbps), still OK.
• Server outbound → 5 MB/s (≈ 40 Mbps), still < 1/20th of gig-E.
• CPU: bullet spam could spike physics to 25 ms → mitigations:
  – cap projectile simulation (hitscan on the server, clients draw fake tracers)
  – off-thread async jobs for explosions etc.

So yes, still viable on one modern 8-core (Ryzen 5 7600, Xeon E-2288G, etc.). For > 128 players, you’d shard or open “region servers” (same exe, different port).

────────────────────────────────────────
8. Special Topics / Trade-offs
────────────────────────────────────────
Anti-cheat:
• Server validates hits; the client only raycasts for FX.
• CRC on resources, obfuscation of packet opcodes.
• Optional: kernel driver (not covered here).

Matchmaking & persistence:
• Separate micro-service; the game server receives a “SpawnBlob” (loadout, cosmetics).
• At end of match, flush stats via TCP to the DB.

Voice: don’t mix it into the main data path; use a separate SFU or Vivox-like relay.

Tick- vs. event-driven alternative? If you want to ditch the fixed 60 Hz, you could go “snapshot-based” with variable Δt (the Apex approach), but you’ll complicate determinism & physics. For indie scope, the classic fixed tick is safer.

────────────────────────────────────────
9. Checklist Summary
────────────────────────────────────────
✅ UDP + reliability layer (seq/acks)
✅ 60 Hz sim / 20 Hz snapshots, client interp 100 ms
✅ Entity relevance + delta compression to keep < 320 kbps per user
✅ Rust: ECS (Bevy/Legion/Shipyard) + Tokio/Quinn net I/O
✅ 8-core box, 40 Mbps peak outbound, < 16 ms/frame CPU
✅ Scales to 128 players; > 128 ⇒ shard or stream sections to sub-servers

You now have a concrete yet implementation-agnostic blueprint for building the server & protocol. Happy fragging! 🔫