Part 3: A Symphony of Performance and Efficiency



Unleashing the Power of Asynchronous I/O

Within TERA, each server runs an entire world with approximately 2,000 concurrent players. To handle this scale, the server coordinates its threads carefully so that gameplay stays smooth for every player. We implemented asynchronous I/O using Windows I/O completion ports (IOCP). A completion port queues completed I/O operations, which gives us a ready-made task queue of client packets that a dedicated pool of worker threads processes in parallel. This change significantly improved server performance.
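The queue-plus-worker-pool shape described above can be sketched in portable form. This is not IOCP itself: a plain queue.Queue stands in for the completion port, and NUM_WORKERS, handle_packet, and run_pool are hypothetical names chosen for illustration. On Windows, the workers would instead block on the completion port (GetQueuedCompletionStatus) rather than on a Python queue.

```python
import queue
import threading

NUM_WORKERS = 4  # assumption: the article does not state the pool size


def handle_packet(packet):
    # Hypothetical packet handler; a real server would decode and dispatch here.
    return packet.upper()


def worker(completions, results):
    # Each worker blocks until a packet is available, then processes it.
    while True:
        packet = completions.get()
        if packet is None:  # sentinel: no more work for this worker
            break
        results.put(handle_packet(packet))


def run_pool(packets):
    completions = queue.Queue()  # stand-in for the completion port's queue
    results = queue.Queue()
    workers = [
        threading.Thread(target=worker, args=(completions, results))
        for _ in range(NUM_WORKERS)
    ]
    for t in workers:
        t.start()
    for packet in packets:
        completions.put(packet)
    for _ in workers:
        completions.put(None)  # one sentinel per worker
    for t in workers:
        t.join()
    # Workers finish in no particular order, so sort for a stable result.
    return sorted(results.get() for _ in range(len(packets)))
```

Calling run_pool(["move", "attack", "chat"]) hands the three packets to whichever workers are free; the order in which they complete is not deterministic, which is why the sketch sorts the results before returning them.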

The Pursuit of Contention-Free Execution

To further optimize performance, we needed to address contention on reads and updates of region data. Conventional locks left threads waiting on one another for too long, which limited overall throughput. Our solution was to replicate the world in each thread, eliminating the need for locks on region data. Each thread now operates on its own copy of the world, so region searches proceed without contention.

Coordinated Synchronization for Consistency

While the previous design removed contention during the search phase, it introduced a new challenge in the update phase: when a thread executes a region task, its region data can diverge from the copies held by other threads. To address this, we added a shared queue that stores the region tasks generated by all threads. When the server receives a client movement packet, one thread processes it and generates a corresponding region task. The task is appended to the shared queue along with an execution counter initialized to the number of threads. Each thread independently works through the shared queue, applying each task to its own copy of the world and decrementing the task's counter afterward. Only when the counter reaches zero is the task removed from the queue. This guarantees that every thread applies the same set of region tasks, keeping the replicas consistent across the server.
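A minimal sketch of the counter-based shared queue, under stated assumptions: the class and function names (RegionTask, SharedRegionQueue, drain, move, demo) are hypothetical, each replica is modeled as a plain dict, and a single lock guards only the queue's bookkeeping, not the world data. The article does not say how the real queue is synchronized internally.

```python
import threading
from collections import deque


class RegionTask:
    def __init__(self, apply, remaining):
        self.apply = apply          # function that mutates one world replica
        self.remaining = remaining  # threads that still have to apply it


class SharedRegionQueue:
    def __init__(self, num_threads):
        self.num_threads = num_threads
        self._lock = threading.Lock()  # guards queue bookkeeping only
        self._tasks = deque()
        self._head_seq = 0             # sequence number of self._tasks[0]

    def push(self, apply):
        with self._lock:
            self._tasks.append(RegionTask(apply, self.num_threads))

    def drain(self, cursor, replica):
        """Apply every task this thread has not seen yet; return the new cursor."""
        while True:
            with self._lock:
                index = cursor - self._head_seq
                if index >= len(self._tasks):
                    return cursor
                task = self._tasks[index]
            task.apply(replica)  # touches only this thread's replica
            cursor += 1
            with self._lock:
                task.remaining -= 1
                # Remove tasks that every thread has now applied.
                while self._tasks and self._tasks[0].remaining == 0:
                    self._tasks.popleft()
                    self._head_seq += 1

    def __len__(self):
        with self._lock:
            return len(self._tasks)


def move(player, position):
    # Hypothetical region task produced from a client movement packet.
    return lambda world: world.update({player: position})


def demo(num_threads=3):
    shared = SharedRegionQueue(num_threads)
    replicas = [dict() for _ in range(num_threads)]  # one world copy per thread
    shared.push(move("player42", (10, 20)))
    shared.push(move("player7", (3, 4)))
    threads = [
        threading.Thread(target=shared.drain, args=(0, replicas[i]))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return replicas, len(shared)
```

After demo() returns, every replica holds the same two player positions and the shared queue is empty, because each task's counter has been decremented once by each of the three threads. Each thread tracks its own cursor, so a slow thread never blocks a fast one; it only delays removal of tasks it has not applied yet.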

Unlocking Performance: Efficient Creature Updates

Frequent updates to specific creatures caused lock contention. To avoid it, we gave each creature a task queue within each thread's world replica. When a thread encounters a packet or timer task that requires a creature update, it pushes the corresponding function call (e.g., Creature.receiveDamage) with its arguments (e.g., 200) to the creature's task queue. If another thread is already iterating over that queue, the current thread returns to its main loop immediately, leaving the task for the other thread to run. Otherwise, the thread iterates over the queue itself, processing its own task and any tasks pushed by other threads in the meantime, until the queue is empty. This asynchronous execution of creature tasks follows the active object pattern: callers never wait on a creature, and each creature's updates run one at a time.
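The push-then-maybe-drain behavior can be sketched as follows. This is an illustration under assumptions, not the server's code: it uses a single Creature object shared by all threads, a short lock that protects only the queue and a "draining" flag, and hypothetical names (post, receive_damage, demo). The key detail is that the empty check and the flag reset happen under the same lock as the push, so a task cannot be stranded between a drainer leaving and a pusher arriving.

```python
import threading
from collections import deque


class Creature:
    def __init__(self, hp):
        self.hp = hp
        self._lock = threading.Lock()  # protects _tasks and _draining only
        self._tasks = deque()
        self._draining = False

    def receive_damage(self, amount):
        # Runs only inside the drain loop, so at most one thread at a time.
        self.hp -= amount

    def post(self, method, *args):
        """Queue a call; drain the queue unless another thread already is."""
        with self._lock:
            self._tasks.append((method, args))
            if self._draining:
                return  # the draining thread will run this task for us
            self._draining = True
        while True:
            with self._lock:
                if not self._tasks:
                    self._draining = False
                    return
                queued_method, queued_args = self._tasks.popleft()
            queued_method(self, *queued_args)  # executed outside the lock


def demo(num_threads=8, hits_per_thread=50):
    creature = Creature(hp=100_000)

    def attacker():
        for _ in range(hits_per_thread):
            creature.post(Creature.receive_damage, 200)

    threads = [threading.Thread(target=attacker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return creature.hp
```

With the defaults, demo() posts 8 × 50 calls of 200 damage each, so the creature ends at 100,000 − 80,000 = 20,000 hit points. The result is exact because only the current drainer ever mutates hp, while posting threads hold the lock just long enough to append a task.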

