In my previous post I wrote about different approaches and platforms for multi-agent system simulation. Those were platforms I have used before, but today I want to move one level of abstraction up and talk a little about software and system architectures.

Why "software" and "system" architectures? Let's take the case of NetLogo and RoboCup2D. NetLogo runs as a single process with threads, and all the agents live inside it. RoboCup2D on the other hand is composed of several units: simulator, monitors and players, each one potentially running as a single process with threads. We can talk about RoboCup2D as a system.

BWAPI can be considered a hybrid (pun intended): it can be used as a single unit or as a system. In the latter case we have at least two processes: one running the world, i.e. the StarCraft game, and the other simulating the player inputs. The second one can in turn be either a single process controlling all the units or a set of processes, each controlling a single unit.

Before proceeding, for this post we will define an agent as an autonomous unit that can sense the world in some dimensions, and can execute actions to modify the world. Keep in mind that the world is not necessarily a physical world like ours.

At the most basic level, each simulator runs in discrete time in a game loop. A game loop is an infinite loop running, at a defined frequency, a series of ordered actions. Most of the time these actions are: fetch player commands, update the world, and draw to the screen. The frequency, or clock, controls time in the simulation and affects how the agents interact with the world through their sensors and actions.
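To make this concrete, here is a minimal sketch of such a loop in Python. The `world` object, with its `fetch_commands`, `update`, and `draw` methods, is hypothetical; it only illustrates the ordering of the actions and the fixed frequency.

```python
import time

TICK_HZ = 30              # assumed simulation frequency
TICK_DT = 1.0 / TICK_HZ

def game_loop(world):
    """Fixed-frequency loop: fetch commands, update the world, draw."""
    while True:
        start = time.monotonic()
        commands = world.fetch_commands()   # player/agent inputs
        world.update(commands, TICK_DT)     # advance the simulation one discrete step
        world.draw()                        # render to the screen, if there is a GUI
        # sleep whatever is left of the tick so the clock stays at TICK_HZ
        time.sleep(max(0.0, TICK_DT - (time.monotonic() - start)))
```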

How the agents interact among themselves and with the world depends on the simulator architecture. We can define at least two different approaches.

Single process simulation

Using a game loop we can define an architecture for software similar to NetLogo. In NetLogo we have an observer and two types of agents: turtles and patches. In each loop iteration the observer asks each agent to perform a defined task, and each agent reacts accordingly. Our architecture can allocate an array to hold all the agents (both turtles and patches) and start a game loop that, on each iteration: asks each agent to perform an action defined by predefined rules, updates the world from the actions taken by the agents, and finally updates the GUI (if any).
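A minimal sketch of that observer/agent structure could look like this. The `world.view_for`, `world.apply`, and `world.draw` methods are assumptions used only to show where each responsibility lives, not NetLogo's actual API.

```python
class Agent:
    """Turtle- or patch-like agent: reacts to an 'ask' with a rule-based action."""
    def __init__(self, rules):
        self.rules = rules                    # list of (condition, action) pairs

    def act(self, world_view):
        # return the action of the first rule whose condition holds
        for condition, action in self.rules:
            if condition(world_view):
                return action
        return None

class Observer:
    def __init__(self, world, agents):
        self.world = world
        self.agents = agents                  # single array holding turtles and patches

    def step(self):
        actions = [a.act(self.world.view_for(a)) for a in self.agents]
        self.world.apply(actions)             # update the world from the agents' actions
        self.world.draw()                     # update the GUI, if any
```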

This first type of simulation can work either in a single thread or in a multi-threaded environment. In the single-threaded case, the observer and all the agents run in one thread and the next iteration starts after all the updates are done. Only one agent can be updated at a time.

In the multi-threaded case, the observer and each agent can run in dedicated threads, or in a thread pool where each agent posts tasks to be run by a worker thread. Several agents can be updated in parallel, and it is up to the world model to decide how to gather and merge all the actions taken by the agents.
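Sketched with Python's `concurrent.futures`, and reusing the hypothetical `world` and `agents` from the previous sketch, one parallel iteration could look like this:

```python
from concurrent.futures import ThreadPoolExecutor

def step_parallel(world, agents, pool):
    """One iteration where the agents compute their actions on worker threads."""
    futures = [pool.submit(agent.act, world.view_for(agent)) for agent in agents]
    actions = [f.result() for f in futures]   # wait for every agent to finish
    world.apply(actions)                      # the world model decides how to merge them

# usage
# with ThreadPoolExecutor(max_workers=8) as pool:
#     for _ in range(1000):
#         step_parallel(world, agents, pool)
```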

In this approach, the clock is shared among all the threads. Usually the clock is driven by the game loop: either a time unit is added when an iteration (that is, when all the agents) has finished, or a dedicated thread sets the update frequency with a sleep call. In the latter case it is possible that some agents don't complete their updates in time and perform no action on that iteration.
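A sketch of that second variant, where the loop enforces a fixed deadline and simply ignores the agents that miss it, again using the hypothetical `world`, `agents`, and thread pool from above:

```python
from concurrent.futures import wait

def step_with_deadline(world, agents, pool, tick_dt=0.1):
    """Advance the clock at a fixed frequency; late agents perform no action this tick."""
    futures = [pool.submit(agent.act, world.view_for(agent)) for agent in agents]
    done, not_done = wait(futures, timeout=tick_dt)
    for f in not_done:
        f.cancel()                            # results from late agents are ignored this tick
    actions = [f.result() for f in done]
    world.apply(actions)
    world.clock += 1                          # one time unit per loop iteration
```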

Multiple processes simulation

RoboCup2D follows a distributed approach. In this architecture there is a centralized server that maintains the current, exact world model; each agent connects to this server, starts receiving updates from its sensors, and starts requesting commands to be executed by the server. Each agent is a separate process and need not even run on the same device. All the messages are sent as UDP datagrams over the network. The server also works as the communication channel: every message passes through it, and the server routes it to the respective agents.
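The agent side of such a system can be sketched with a plain UDP socket. The message format and port below are made up for illustration; the real RoboCup2D protocol is considerably more elaborate.

```python
import socket

SERVER = ("127.0.0.1", 6000)                    # hypothetical server address

def decide(sensor_message: bytes) -> bytes:
    return b""                                  # placeholder policy: do nothing

def run_agent():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(b"(init agent-1)", SERVER)      # announce the agent to the server
    while True:
        data, _ = sock.recvfrom(4096)           # sensor update pushed by the server
        command = decide(data)                  # agent-specific decision logic
        if command:
            sock.sendto(command, SERVER)        # ask the server to execute an action
```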

In this approach the clock is usually managed by the server, which sends the current time to each agent at the beginning of a new simulation iteration. In the specific case of RoboCup2D, the sensor updates can be asynchronous or synchronous: they can be sent at defined intervals (which may differ from the game loop frequency) or sent along with the clock.

Until now we have considered only push servers: the server sends the sensor measurements to the agents at a defined frequency. But it could also be interesting to study a pull server, where each agent must query the server for the current status of its sensors.

On a pull server, the agents could send a sensor model to the server, and the latter would evaluate that sensor against the current world model. The server can then limit what the agent can sense, and even add noise to the measurements. In RoboCup2D, the sensor measurements are sent from the server to the agents; each agent knows about the world model only through these sensors. In BWAPI, each agent can query the world model, but it is up to the server to decide how much data to send. For example, if fog of war is active then the agents cannot query the world model outside a certain radius.
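On the server side, a pull query could be handled roughly like this. The `world` and `sensor` objects are assumptions; the point is that the range limit (fog of war) and the noise are applied by the server, not by the agent.

```python
import random

def handle_sensor_query(world, agent_id, sensor):
    """Evaluate an agent's sensor model against the server's world model."""
    agent_pos = world.position_of(agent_id)
    readings = []
    for obj in world.objects():
        distance = world.distance(agent_pos, obj.position)
        if distance > sensor.max_range:                         # fog of war: out of range
            continue
        noisy = distance + random.gauss(0.0, sensor.noise_std)  # server-side noise
        readings.append((obj.id, noisy))
    return readings
```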

Other types of simulations

A distributed simulation can also run fully decentralized, without a central server. In this approach, each agent has a world model that it has constructed with its own sensors, and it interacts with other agents that also have their own world models. A better world model could be constructed by sharing and merging the world models of all the agents.

The real challenge in this setup is how to distribute the true world model. Remember we are in a simulation, a piece of software. In robotics this is much easier to set up because the real physical world already surrounds us. Each agent could run locally a version of the world model limited to that agent's current state. Another challenge is agent discovery. In a centralized simulation the server knows all the agents connected to it, but in a decentralized simulation the only way to learn about other agents in the world is to query the communication channel.
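A decentralized agent could be sketched like this; the `channel` object with `publish` and `poll` methods stands in for whatever communication medium the simulation uses (a message bus, multicast, etc.) and is purely an assumption.

```python
class DecentralizedAgent:
    """Keeps its own world model and merges whatever its peers broadcast."""
    def __init__(self, agent_id, channel):
        self.id = agent_id
        self.channel = channel          # shared communication channel (assumed interface)
        self.world_model = {}           # built locally from the agent's own sensors
        self.known_peers = set()

    def broadcast(self):
        self.channel.publish({"sender": self.id, "model": self.world_model})

    def listen(self):
        for message in self.channel.poll(self.id):
            self.known_peers.add(message["sender"])     # agent discovery via the channel
            self.world_model.update(message["model"])   # naive merge of a peer's model
```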

One variant could be the Internet of Things (IoT), where a central server receives data from the sensors in the world and tries to build the world model from this information. The server doesn't need to know all the agents in the world; the important part is that it keeps receiving information.

A final approach is to simulate a set of agents as a population. It is not a multi-agent simulation per se, but it can provide quick results. One example is the video game Democracy (disclaimer: I don't know the internals of the game and I am just sharing my thoughts about how it possibly works). In Democracy, the population is segmented into several sub-populations along a common dimension: there is a capitalist and a socialist sub-population, a poor and a rich sub-population, blue collar and white collar, etc. A single person can belong to several sub-populations at the same time, and people are affected by the government's actions according to their dimensions. At the end of the game the whole population decides whether the current government is re-elected for a new term.

In this approach, one possibility is that the population is itself simulated by another multi-agent simulation that just returns a metric on how the system is behaving to the next level in the simulation hierarchy. Usually, though, the population is modeled with a statistical model: an average, a median, or a known distribution.
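A toy version of this population-level approach might look like the following; the segment names, sizes, and policy effects are invented for illustration and have nothing to do with Democracy's actual internals.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    name: str
    size: int             # number of people represented by this segment
    happiness: float      # the whole sub-population reduced to one average value

def apply_policy(segments, effects):
    """Shift each segment's average happiness by the policy's effect on that dimension."""
    for seg in segments:
        seg.happiness = min(1.0, max(0.0, seg.happiness + effects.get(seg.name, 0.0)))

def approval(segments):
    """The single metric returned to the next level of the simulation hierarchy."""
    total = sum(s.size for s in segments)
    return sum(s.size * s.happiness for s in segments) / total

# usage: a policy that pleases the rich segment and displeases the poor one
# segments = [Segment("rich", 200, 0.5), Segment("poor", 800, 0.5)]
# apply_policy(segments, {"rich": 0.2, "poor": -0.1})
# print(approval(segments))
```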

Final thoughts

There are several dimensions to a software simulation architecture; we have only covered how to distribute communication among several agents or programs. We also need to consider how the environment will be implemented in software: how do we simulate discrete and continuous environments? Which programming languages are suitable for a multi-agent simulation? Which software architectures or programming paradigms offer the best solutions? We will cover most of these questions in future posts, where we will even test some real multi-agent simulators.