Research Article  Open Access
Gabriele Valentini, Heiko Hamann, Marco Dorigo, "GlobaltoLocal Design for SelfOrganized Task Allocation in Swarms", Intelligent Computing, vol. 2022, Article ID 9761694, 12 pages, 2022. https://doi.org/10.34133/2022/9761694
GlobaltoLocal Design for SelfOrganized Task Allocation in Swarms
Abstract
Programming robot swarms is hard because system requirements are formulated at the swarm level (i.e., globally) while control rules need to be coded at the individual robot level (i.e., locally). Connecting global to local levels or vice versa through mathematical modeling to predict the system behavior is generally assumed to be the grand challenge of swarm robotics. We propose to approach this problem by programming directly at the swarm level. Key to this solution is the use of heterogeneous swarms that combine appropriate subsets of agents whose hardcoded agent behaviors have known global effects. Our novel globaltolocal design methodology allows to compose heterogeneous swarms for the example application of selforganized task allocation. We define a large but finite number of local agent controllers and focus on the global dynamics of behaviorally heterogeneous swarms. The user inputs the desired global task allocation for the swarm as a stationary probability distribution of agents allocated over tasks. We provide a generic method that implements the desired swarm behavior by mathematically deriving appropriate compositions of heterogeneous swarms that approximate these global user requirements. We investigate our methodology over several task allocation scenarios and validate our results with multiagent simulations. The proposed globaltolocal design methodology is not limited to task allocation problems and can pave the way to formal approaches to design other swarm behaviors.
1. Introduction
A primary challenge that complicates the spread of applications of large collections of embodied agents [1, 2] is how to design individual agent controllers for a given desired collective behavior. The canonical, localtoglobal approach [3] includes a trial and error refinement of individual agent control rules followed by a macroscopic analysis of resulting swarm behaviors [4, 5] or a formal verification of specific properties of interest [6–9]. Designing agent controllers for a target swarm behavior in a noniterative way without continuous refinements has proven challenging. At present, only a few methods exist and they are tailored to specific scenarios (e.g., task allocation [10–12], formation control [13], selfassembly [14–16], and collective construction [17]). The challenge is to find a generic method to automate globaltolocal programming [18].
To tackle this challenge, we propose a novel globaltolocal design approach. Our key idea is to compose a heterogeneous swarm [19–22] using groups of behaviorally different agents such that the resulting swarm behavior approximates a user input representing the desired behavior of the entire swarm. This idea is arguably related to the concept of population coding from neurosciences where certain cognitive phenomena (e.g., perception of directional movement) result from the average of different individual contributions from populations of neurons [23]. Analogously to population coding [24], we derive a globaltolocal design method that performs a function approximation of the user input as a linear combination of basis vectors, each representing the global influence of a different agent controller.
We illustrate this idea by defining a prescriptive design method for selforganized task allocation [25–27]. Specifically, we tackle variants of the singletask robots, multirobot tasks problem (STMR) [28] with static, sequential, and periodic swarm allocations. This problem is generally known as the coalition formation problem and has been extensively studied in the multiagent community [29, 30]. Standard multiagent approaches, however, require complex cooperation strategies with a priori negotiation or bidding for task assignments and unconstrained global communication. These approaches are not suitable for largescale swarms of unreliable agents with high scalability requirements that need to avoid communication bottlenecks.
The swarm intelligence community developed a rich framework of algorithms for selforganized task allocation that are suitable for unreliable, embodied agents. Instead of forming a priori coalitions, selforganized multiagent systems achieve task allocation as a result of the continuous interaction among agents and between agents and the environment. Popular approaches include thresholdbased algorithms [26, 31] and recruitment strategies inspired by the foraging behavior of ant colonies [25, 27]. More recently, CastilloCagigal et al. [32] investigated periodic binary task allocation and proposed a selforganized synchronization strategy whose macroscopic behavior results in a bimodal distribution of robots over two tasks—a scenario we consider here, too. All these studies, however, employ localtoglobal design approaches (i.e., bottomup with challenges in anticipating global behaviors). Although the proposed strategies are often supported by descriptive macroscopic models [33], these studies do not provide a prescriptive design method as ours here.
A notable exception is the method proposed by Berman et al. [10] for the design of task allocation. In their work, the authors consider problems with more than two tasks. Each agent decides to switch tasks independently of other agents, and therefore, agents do not interact with each other. Their method, based on a linear continuous model, optimizes a set of transition rates between pairs of tasks that define a unique agent controller with the aim to converge as quickly as possible to a given swarm allocation. Differently from the probability distribution over swarm allocations considered here, they assume a single swarm allocation as input corresponding to the mode of our unimodal scenario (see Section 3.1). Due to the lack of interactions among agents, their method cannot achieve the nonlinear, oscillatory dynamics of the swarm that we describe in our multimodal scenarios (see Section 3.2). The approach of Berman et al. [10] has also been extended to incorporate feedback gathered from the environment by individual agents [11, 12]. Agents keep track of the number of successfully completed tasks and report this information to a centralized authority (called the hive) that, in turn, updates the parameters of their stochastic control policy. Differently, our design approach provides a completely selforganized solution that does not require any centralized authority.
We consider a simple scenario where a user wants to design a swarm that allocates its members to only two tasks. Despite the simplicity of this task allocation problem, a few variations are possible. Let us say we have task and task and we want to design a swarm with 80% of agents working on task and 20% working on task . In a trivial approach, we could statically assign agents to tasks before deployment. However, it is generally beneficial to require the swarm to allocate agents to tasks after deployment so as to increase robustness to individual agent failures. Agents have only local perception and cannot accurately estimate the number of agents currently assigned to either task. Hence, the swarm behavior is inherently stochastic. Even for a good design of the agent controller, we can only hope to have 80% of the agents assigned to task on average over time due to the variance introduced by each agent accuracy in assessing the current state of the swarm. A variant of this scenario arises when the user wants to define the variance of the swarm allocation (i.e., increasing it over the accuracylimited value), for example, to increase the swarm’s potential for exploration over exploitation. Another variant is represented by sequential task allocation. For instance, we initially want the 80/20% allocation as above but followed in a later phase by a 30/70% allocation, possibly triggered by external factors. For example, in a surveillance task, a swarm may need to monitor the inside and outside of a facility allocating agents in different proportions during the day and night. Yet another variant is periodic task allocation. We allow the swarm to autonomously decide when and how often to switch from 80/20% to 30/70%.
In more formal terms, a swarm allocation corresponds to a partitioning of the agents into two working groups, one for each of the two tasks. The user provides a description of the desired swarm allocations as a probability distribution over the space of all possible swarm allocations. To define a specific swarm behavior, the user manipulates the number and positions of the distribution’s modes (i.e., local probability maxima) with each mode corresponding to a target swarm allocation (e.g., as above with the two modes 80/20% and 30/70%). The user can specify a static task allocation scenario by means of a unimodal distribution. A sequential task allocation scenario is defined through a sequence of unimodal distributions and a criterion to trigger a switch in swarm allocation. Finally, a user can define a periodic task allocation scenario using a distribution with two or more modes. The swarm periodically and autonomously changes the allocation of agents as specified by the modes in a stochastic manner. While this approach may seem unnecessarily contrived for this simple case of binary task allocation, we believe it can be extended to more complex task allocation problems as well as other swarm problems that can be approached using probabilistic finite state machines (see Figure 1).
Our globaltolocal design method hinges on three main ideas and assumptions: (a) nonprogrammable agents, (b) means of predicting global system behavior from local agent behavior, and (c) accepting and leveraging the probabilistic nature of swarm behavior. We give up the freedom of having programmable agents as we assume hardwired behaviors of predefined controller types. However, we regain that freedom at the global level by composing heterogeneous swarms with wisely chosen doses of several agent controller types. For these predefined local agent controllers, we know their global swarm effect that we can model via the abovementioned basis vectors. By appreciating the probabilistic nature of swarms, we can model individual behaviors using probabilistic finite state machines (PFSMs), generalizing our approach to a wide range of scenarios representable by PFSM, and similarly also understand global swarm behavior via population models. We perceive the swarm as a stochastic dynamical system with the swarm making probabilistic autonomous decisions switching between global states. We define an arbitrarily large number of agent controllers, that is, sets of predefined control rules. For each agent controller, we derive a basis vector that models its globallevel contribution to the swarm dynamics. Specifically, each basis vector describes the transient dynamics of a homogeneous swarm where all agents run the same controller. The probability distribution over swarm allocations given by the user as input defines the desired asymptotic behavior of the swarm. From this input, we mathematically derive a response vector to describe the desired transient dynamics of the swarm. These transient dynamics are such that the swarm will asymptotically converge to the stationary distribution in input from the user. We then use the response vector as a reference to select the necessary agent controllers through a linear combination of our initial set of basis vectors. Finally, we systematically search for a proper composition of a heterogeneous swarm by estimating the coefficients in a lasso regression [34] between the response vector and a linear combination of basis vectors. We use penalized regression to limit the set of selected controller types to a few that are indeed required (i.e., basis vector with strictly positive coefficients) and the value of the coefficients to define the proportions of agents executing each of the selected controllers.
2. Materials and Methods
2.1. Experimental Design
We build on the idea of behavioral heterogeneity to define a globaltolocal design method for STMR task allocation problems. We consider the problem of designing a swarm of agents that allocates its members to a pair of tasks (task and task ) as defined by a user input. The user input, formally defined in Section 2.2, prescribes a desired swarm allocation by means of a stationary probability distribution defined over the space of all possible allocations of agents to 2 tasks. We leverage the degrees of freedom that can be gained at the global level by mixing different agent controllers at the local level. Contrary to localtoglobal approaches that manually explore a possibly infinite space of design solutions to obtain a single agent controller for an homogeneous swarm, we restrict our designspace to a large but finite number of alternatives and systematize our search to obtain a combinatorial solution. To do so, we consider a set of parameterized control rules as the basis of the individual agent controller and obtain a large albeit finite number of different controllers by varying these parameters (see Section 2.3). The outcome of this search—called swarm composition—is a heterogeneous swarm formed of groups of agents with different controllers.
Let represent a swarm composition with agent controllers. represents the th agent controller whereas gives the number of agents in the swarm with that controller. For each agent controller, parameters and define, respectively, the number of task allocations that an agent needs to observe from other agents before applying a decision rule and the enumeration index identifying the specific set of decision rules associated to the th controller (see Section 2.3). We consider the problem of finding a swarm composition that approximates the stationary distribution defined by the user (see Section 2.2). We tackle this problem with a prescriptive modeldriven approach by defining a macroscopic model that, given the local agent controllers, describes both the transient and the stationary behavior of the swarm, and a method to derive, from the desired stationary behavior , a (reference) model of a transient swarm behavior that converges to . The first model, introduced in Section 2.4, provides a space of basis vectors, each describing the behavior of an homogeneous swarm executing a specific agent controller. The second model, introduced in Section 2.5, gives a response vector that describes the behavior of the heterogeneous swarm as required by the user (i.e., the behavior of the target swarm composition). Finally, we can formulate a linear combination of basis vectors. As described in Section 2.6, the key is to find appropriate coefficients to select basis vectors. They represent the desired swarm composition via .
2.2. User Input
Let represent a swarm allocation, where is the number of agents allocated to task (respectively, to task ), and is the set of all possible macroscopic states of the swarm (i.e., all possible distributions of agents over the two tasks). The user inputs a desired stationary probability distribution , , over the macroscopic state space. Entries , , give the probability that the swarm allocation is ; each mode of (i.e., local probability maxima) defines a desired swarm allocation by virtue of representing those allocations that are most likely to realize at any given time. The number of modes of the user input determines the particular variant of the task allocation scenario. A distribution with one unique mode corresponds to a single and static swarm allocation; given a sequence of such distributions, we can design swarms for sequential task allocation by including a triggering criterium for agents to change their control rules. When the user input is a multimodal distribution, the swarm behavior requested by the user is periodic task allocation. With a certain frequency, the swarm changes the allocation of its members according to the different swarm allocations determined by the modes of .
2.3. Agent Controllers
We consider agents with only local perception of their environment and local agenttoagent communication. By building on these limited capabilities, we define a recipe to enumerate finitely many different agent controllers. We achieve this by considering a template of a control rule that can be instantiated with different configurations and that allows us to enumerate different agent controllers. We abstract from any domainspecific actions that an agent would need to execute in a particular task and application. Instead, we focus on the agent interactions and the decisionmaking necessary to fulfill the swarm allocations desired by the user. We consider a system where tasks are uniformly distributed in a closed environment in which agents (a) move randomly while working on either of the two tasks, (b) have the ability to perceive the allocation of their neighbors, and (c) to trigger a change in their allocation (one at a time).
Agents act stochastically and asynchronously. They repeatedly apply two control rules: selfswitching and switchorrecruit. When executing the selfswitching rule, an agent changes its current allocation to the alternative task. The global effect of the selfswitching rule can be assimilated to the spontaneous switching behavior of unreliable agents subject to internal noise [35, 36]. Using the switchorrecruit rule, an agent has a greater influence on the current swarm allocation. It can decide to either increase or decrease the number of agents allocated to a task by one unit. As a function of its current allocation and those of its neighbors, the agent either selfswitches to the other task or recruits a neighbor from those with the alternative allocation. When the agent acts as a recruiter, the recruited neighbor always switches its task allocation and it does so independently of its internal state and of its actual agent controller. That is, passively recruited agents always switch their task allocation without objections. Control rules are executed randomly by individual agents: selfswitching with rate and switchorrecruit with rate (respectively, with probabilities and ).
All agents only have local perception of the current global task allocation. We say that each agent perceives the currently assigned tasks from its neighbors (i.e., agents within proximity, for example, within communication range). Note that all agents move at all time, and neighborhoods are subject to change, that is, the underlying network is dynamic. Each agent knows a number of neighbors currently assigned to task , a number of neighbors currently assigned to task , and its own currently assigned task forming a set of information from agents ( neighbors plus the considered agent). We define agent controllers , , that differ from each other by the logical function . We use function to define the local taskswitching behavior of an agent and to determine the global effect of the switchorrecruit rule. Function takes as input a group of task allocations of size . This group includes the task allocation of the agent applying the switchorrecruit rule and that of its neighbor agents. Parameter is an index that encodes a particular taskswitching behavior and ranges over all possible agent controllers based on the same group size . For a group of task allocations of size , we have possible group compositions. We do not assign any action to homogeneous groups (i.e., groups with either or agents allocated to task ). Therefore, has possible inputs and three possible outputs (i.e., switch allocation, recruit a neighbor, no action). Moreover, since the noaction is fixed in each agent controller, we obtain possible functions . Function is defined as . gives the change of agents allocated to task when an agent applies the switchorrecruit rule over a group of allocations that contains entries for task . Given a particular choice of values for parameters and , we set if the th bit of (expressed in the binary numeral system) equals ; otherwise, we set . Table 1 shows an example of an agent controller defined by ; in this case, the switchorrecruit rule corresponds to the majority rule often used in swarm robotics research [37, 38]. By enumerating all for increasing values of , we obtain an arbitrary large set of different agent controllers.

Note three properties that result from the above defined controllers. (a) Agent controllers are independent of the controllers of neighboring agents and only require to know their task allocations. (b) The interplay between selfswitching and direct recruitment eases the mixing of task allocations among agents with different controllers. (c) Due to selfswitching, the resulting decisionmaking process is ergodic which prevents its absorption at extreme states where all agents are allocated to one of the two tasks [39]. These properties are fundamental for the definition of our globaltolocal design method because they allow us to predict the global behavior of a heterogeneous swarm.
We have developed a simple microscopic multiagent simulator to validate our design method. In our simulations, agents consist of situated massless points moving within an environment of space units with a velocity of 2 space units per time step. We consider a time step of seconds, and, at every time step, we update the position in space of each agent in the swarm. Since agents are asynchronous, we update their task allocation only when they execute a control rule. Agents are always assumed to work on one of the two tasks and to periodically execute their agent controller. Independently of the controller , agents perform a random walk. They do not collide with each other but can collide with the boundaries of the environment. In the case of a collision with a boundary, the agent bounces back with a mirrored angle of incidence. When executing the switchorrecruit rule, agents note their own task allocation and sample the task allocations of their closest neighbors. In the following, we show the average of simulations each lasting seconds.
2.4. Basis Vectors
For each agent controller, we use a discretetime Markov chain to describe the global dynamics of a homogeneous swarm of agents executing the same controller. In the derivation of the Markov chain, we assume for simplicity that at each time step, one agent in the swarm executes a control rule. Note that this assumption does not lead to a slowdown of the allocation dynamics nor to any other loss in performance. Since agents act in real time—which is continuous—and are not synchronized, we have that the probability of two or more concurrent executions of control rules by different agents is zero (as a consequence of the time step of seconds set in our multiagent simulations and of the choice of values for parameters and , we rarely observed concurrent executions of control rules by two or more agents). The discretization of the time in terms of the number of control rule executions allows us to simplify our mathematical derivations without introducing approximations. In Section 3.2, we provide means to recover the time as a continuous entity from the number of control rule executions. A direct consequence of this assumption is that the number of agents in the swarm allocated to task 0 changes by units per time step. Therefore, the resulting Markov process is described by a tridiagonal matrix of size .
For a given agent controller with function , the transition matrix is defined as
Probability models the transition of winning one more agent being assigned to task . It is the sum of two contributions: the probability that an agent currently allocated to task selfswitches its allocation to task ; and the probability that any agent increases the value of by applying the switchorrecruit rule with a change . Occurrences of certain group compositions (i.e., assumed current task allocations of a considered agent and its current neighbors forming a set of size ) are modeled using the hypergeometric distribution. Equation (2) models the transition without effect. results from those agents that do not execute any action as a result of the application of the switchorrecruit rule over a homogeneous group with either or allocations for task . Equation (3) is derived similarly to Equation (1).
Equations (1)–(3) define an ergodic Markov chain . Based on , we derive a basis vector that gives the expected change of the swarm allocation resulting from the next agent executing a control rule. We obtain
Given a system state , the function returns the expected change of . We obtain an arbitrary large space by considering all basis vectors , , for groups of increasing sizes .
Figure 2(a) shows examples of basis vectors (dashed lines) and their linear combination (solid line) over all possible system states (i.e., possible task allocations). For intervals on with , the transient behavior of the swarm drives the task allocation process in the direction of the extreme allocation to the right, towards . For intervals on with , there is a push to the left, towards . Zeros would indicate stable and unstable fixed points in a deterministic system but need to be interpreted here as random dynamical attractors [40] due to the system’s stochasticity. These points represent task allocations that either attract or repel the swarm allocation process. Attraction points identify the modes of the stationary distribution of . Basis vectors are pairwise symmetric with each other around , ; that is, for each , there exists such that . Therefore, an equal number of agents with controllers and cancel each other’s effect of the switchorrecruit rule and leave only the contribution of the selfswitching rule.
2.5. Response Vector
In our globaltolocal design method, we obtain the response vector , which represents the expected change of the userdesired swarm, from the stationary distribution (see Section 2.2). We first construct a Markov chain that converges to itself and then computes from with Equation (4).
The stationary distribution of an ergodic Markov chain with transition matrix can be uniquely determined by solving the system of equations [41]. The inverse problem, however, is less trivial, and its solution is in general not unique. In the case of our tridiagonal transition matrix, this problem implies the exploration of a manifold characterized by dimensions. This number of dimensions is due to the sparse structure of tridiagonal matrices and to the fact that transition matrices are rowstochastic (i.e., row entries are nonnegative and sum up to 1). As a consequence, in order to construct our response vector , we need to find a set of additional constraints.
The stationary distribution defined by the user imposes a set of linear constraints on this manifold through equation
The intuitive interpretation is that the probability of state has to be the sum of all influxes from any state to (including ). Due to the linear relation , one of these constraints is redundant and the stationary distribution reduces the number of dimensions of from to . Therefore, a general transition matrix that converges to can be parameterized by constant values referred to as . By constraining the transition matrix to be rowstochastic, we obtain the set of inequalities
Any choice of values for parameters that satisfies the above set of inequalities defines a transition matrix that satisfies . Since probabilities , , are nonnegative by definition, all entries in the parameter vector can always be chosen to be sufficiently small to satisfy the set of inequalities in Equation (6). Using Equation (6), we have obtained of constraints necessary to determine a transition matrix that asymptotically converges to .
In order to uniquely determine a transition matrix , we still require additional constraints. By inspecting Equation (2), we see that all agent controllers , , have equal diagonal entries . Furthermore, the probabilities converge for increasing group sizes as indicated by example group sizes shown in Figure 2(b). This implies that, by making a simple initial guess for parameters , , and , we can easily impose an additional set of linear constraints and uniquely determine a matrix . As we will see in the following, this initial guess of parameters is not binding and can be revised during the application of the method.
For a desired stationary distribution and initial parameters , , and , we can solve and obtain the transition matrix . The solution of the system of equations is subject to two constraints: the diagonal entries of are constant and equal to (for any choice of ); all rows of are nonnegative and sum up to 1. Since the first and last rows of have only two nonzero entries, these two constraints suffice to compute and . We compute all remaining entries and recursively following the sequence
Finally, the response vector is obtained from by computing its expected change as in Equation (4).
2.6. Regression Problem
Starting from an arbitrary set of agent controllers, Equation (4) allows us to define our search space using a matrix whose columns are the transposed basis vectors , . The response vector is derived from the stationary distribution using Equations (7)–(10) and (4). In order to determine our swarm composition , we need to find a column vector of regression coefficients that satisfies
Coefficients are required to form a conical combination. Therefore, we require, so that results in a nonnegative number of agents with controller .
In general, the accuracy of a solution to the regression problem in Equation (11) increases with the number of basis vectors whose coefficient is greater than zero (i.e., a greater number of involved basis vectors helps to finetune the result). However, that would mean to use many different agent controllers in rather small subpopulations. This increased heterogeneity would, for example, complicate production and handling of robot swarms. More importantly, it might compromise the robustness of the designed swarm. In fact, a swarm composition based on many different agent controllers is more affected by agent failures because each agent controller is likely to be represented by only a few agents in the swarm. As a result, in the case of agent failures, the actual swarm composition might soon depart from the designed one. In contrast, a swarm with few agent controllers but big subpopulations for each suffers less from the loss of agents because these losses are more likely to be homogeneously distributed across agent controllers. The swarm is more robust as it will still approximately allocate its agents as specified by the user input. To design for robustness, we seek to maximize the number of agents using each of the selected agent controller and therefore to minimize the number of different agent controllers used in the designed swarm composition.
We therefore search for a solution that minimizes the number of nonzero coefficients . The perfectly suited method for this objective is to define the regression problem as a lasso problem [34] with positivity constraints
The first summand implements the actual minimization. The regularization coefficient in the second summand determines the weight of the penalization term and controls the sparsity of the solution . Given a solution of the lasso problem (12), we normalize each coefficient according to so that the coefficients sum to 1 and satisfy the physical conservation of swarm size. The final swarm composition is obtained by computing the number of agents with controller as and rounding these values in order to have integer numbers of agents for each controller and a swarm of any desired size .
3. Results
We apply our method to design heterogeneous swarms for both unimodal and multimodal user inputs . As discussed above, heterogeneous swarms formed by many different agent controllers might not be robust to failures. Hence, we minimize the number of agent controllers and give priority to the robustness of the designed solution. We prefer qualitative over quantitative accuracy in the approximation of . In the following, we design swarms with agents. Since is independent of the magnitude of and but only depends on probabilities and , we set and vary in . In the multiagent simulations, and are divided by a factor of .
3.1. Unimodal User Input
Figures 3(a) and 3(b) show the results of the proposed method applied to a unimodal user input. The red solid line in Figure 3(a) represents the user input which defines the desired allocation . From , we derive a response vector by first constructing an equivalent Markov chain as in Equations (7)–(10) and successively applying Equation (4). We initially set parameters to and . The resulting response vector (red solid line in Figure 3(b)) shows sudden jumps for values of . These jumps can be reduced by tuning the initial values of and . However, we observe that tuning is not necessary and might even worsen the accuracy of our design method.
We consider asymmetric agent controllers for and solve the lasso problem (12) for . We obtain the swarm composition that consists of three agent controllers with . Due to the requirement of sparsity, the expected change computed from using the Markov chain does not accurately fit the response vector (see Figure 3(b)). This also applies to the expected change that we measured empirically in multiagent simulations. The designed solution fulfills the essential requirements, such as the zeros and the signs of in the region of interest (). This suffices to design a composition that closely meets the user input as shown in Figure 3(a) by the distribution predicted using both the Markov chain model (blue circles) and the distribution resulting from multiagent simulations (histograms).
Similarly to the solution proposed by Berman et al. [10], our method can also be used to implement sequential task allocation. Let us consider a series of user inputs . By applying our method to each user input, we can derive a set of swarm compositions . Individual agents in the swarm could be programmed to change their controller over time according to . Depending on the scenario, the change of agent controllers can be coupled to external signals broadcasted by the designer, a predefined time schedule, or changing environmental cues. We performed a simple experiment where the agents in the swarm change their agent controller after a certain predefined time. Initially, the swarm is required to allocate its agents around the swarm allocation as specified by the distribution given in Figure 3(a) and uses the swarm composition . In a second time period, the swarm is required to change the distribution. The second distribution over swarm allocations (not shown here) defines the swarm allocation and is obtained by the swarm composition . A video recording of this simulation is provided in Movie 1 (see supplementary material). Agents are initialized using the swarm composition and readily converge to . At time seconds, the agents in the swarm change their agent controllers from the initial swarm composition to the second swarm composition . Soon after the change of agent controllers, the swarm updates its allocation and converges to .
3.2. Multimodal User Input
A multimodal stationary distribution defines a task allocation scenario characterized by multiple swarm allocations with one for each mode of . The result of such a user input is a swarm that periodically switches between different swarm allocations. Contrary to the above discussed case of sequential tasks, switches between pairs of swarm allocations are stochastic and characterized by a certain mean period of time (see [42] for a bimodal example). Thus, multimodal user inputs define a periodic task allocation scenario.
Figures 4(a) and 4(b) show an example application of our method to a bimodal user input. The red solid line in Figure 4(a) defines a scenario with two swarm allocations: and . The response vector (red line in Figure 4(b)) has been derived using initial parameters and . We consider all agent controllers resulting from group sizes and solve the lasso problem for . The solution of Equation (12) yields the swarm composition characterized by 3 agent controllers with group size . With respect to the unimodal scenario, we increased the value of the regularization parameter to obtain a sparse solution . Both the stationary distribution predicted using the Markov chain and the distribution computed from multiagent simulations (shown in Figure 4(a)) qualitatively match the user input with only a small deviation of the distribution around . A video recording of a multiagent simulation of the bimodal scenario can be found in Movie 2 (see supplementary material).
Additionally, the user might also express requirements over the mean switching time , that is, the time necessary for the swarm to reallocate its agents from to . Using the Markov chain model resulting from , we can compute the mean and the variance of the number of control rule executions necessary for this purpose. By multiplying these statistics by the mean duration between the execution of two control rules, we can obtain as a function of the rates and . We recover the time in its continuous form from the discrete number of executions of control rules. Figure 4(c) shows the prediction of the Markov chain (, shaded area) compared to multiagent simulations ( boxplots) when and . We obtain a good agreement of both means (dashed line versus diamonds symbols) and variances of the two models.
Finally, we apply our globaltolocal design method to a trimodal user input . The stationary distribution (red line in Figure 4(d)) defines a task allocation scenario where the swarm alternates its workforce among three possible allocations: , with the majority of agents working on task ; with agents equally allocated to both tasks; and with the majority of agents working on task . We compute the response vector using initial parameters and (data not shown). We define the minimization problem (12) by considering only asymmetric agent controllers with (i.e., the first 32 basis vectors). The solution of the lasso problem for gives the swarm composition . The distribution predicted using the Markov chain and the distribution resulting from simulations qualitatively suit the user requirements (blue circles and histograms).
4. Discussion
We have shown that our method can be used to design swarms that allocate their agents as defined by a user input. The user input is a stationary probability distribution over swarm allocations and defines the probability of any possible swarm allocation. Our method allows the user to specify scenarios with a single swarm allocation using a unimodal distribution and scenarios where the swarm alternates between different swarm allocations using a multimodal distribution.
A unimodal stationary distribution defines a task allocation scenario with a single swarm allocation . In principle, this scenario could be tackled by a static assignment of agents to each of the two tasks. However, such an approach is not robust to individual agent failures. Consider, for example, a collective construction scenario where task and task require, respectively, to dig and to remove the excavation material from a construction site. Due to workload disparity between tasks, agents are likely to experience uneven failure rates. Over time, the swarm might significantly depart from the desired allocation . Without complete knowledge of individual agent failures, a designer would be prevented from restoring the initial static allocation (e.g., by deploying new agents). Our design method is robust to such situations. Since agents repeatedly switch tasks, the workload is shared equally among agents. Agents are thus equally subject to wear as well as failures and the desired proportions of agents with each controller is preserved. Over the system’s lifetime, an operator can add new agents to the system with the same proportion of agent controllers as originally designed to counter degrading swarm performance. Note that the addition of new agents in the swarm can even be used as a means to reprogram the swarm behavior by considering the original swarm composition as an additional constraint in our design method.
As discussed in Section 3.1, we can use a chronological sequence of unimodal distributions to design sequential task allocation scenarios. This is achieved by letting agents change their agent controllers over time and results in a swarm that switches from a swarm allocation to the next in the sequence. Our method can be used to design a swarm composition for each distribution in the sequence. However, this approach to sequential task allocation requires agents with a mechanism (e.g., based on an external signal or fixed time scheduling) that triggers changes of agent controllers.
Alternatively, the user might provide a multimodal distribution as input. In this case, with a single swarm composition that does not change over time, we obtain a swarm behavior that naturally oscillates between the swarm allocations defined by the modes of the user input. This type of selforganizing swarm behavior is similar to the one investigated by Silk et al. [43] for the design of selforganizing networks. Periodic task allocation offers an alternative approach to implement sequential task allocation. It might be useful in extreme applications where hardware limitations prevent agents from perceiving external signals or from being programmable (e.g., hardwired controllers in nanorobotics applications [44]). This alternative approach to sequential task allocation could be useful, for example, to increase the penetration of nanobots into tumors [45, 46]. Nanobots with multifunctional capabilities (e.g., sensing, imaging, and therapy) [47] could be designed to initially perform tissue penetration and diagnosis and later to deliver the drugs in their payloads.
Through the example application of task allocation, we introduced the idea that swarms that achieve userspecified objectives can be designed by leveraging on behavioral heterogeneity. This idea was inspired by the concept of population coding from neurosciences [24], where a population of neurons performs a function approximation by combining different heterogeneous contributions. Leveraging behavioral heterogeneity substantially differs from most existing globaltolocal design approaches. Largely focused on selfassembly [14] and formation control problems [13], existing methods tend to be tailored for their respective applications. The method proposed by Klavins, for example, makes use of graph grammars and optimizes their execution rates to design a system that selfassembles into simple controlled shapes [15]. Rubenstein et al. [16] proposed a distributed algorithm for selfassembly and experimented with a swarm of more than a thousand robots. In their study, robots are given a blueprint of the desired shape and follow only local cues to incrementally position themselves according to the blueprint. In collective construction, Werfel et al. [17] propose a compilation method that decomposes a userspecified structure into a set of construction paths that robots follow to build a desired artifact. Similarly to our approach, all these studies use optimization methods to explore a (possibly constrained) design space.
The potential of behavioral heterogeneity has been previously investigated in an aggregation scenario [48]. In this study, the authors show, by means of evolutionary computation techniques, that heterogeneous swarms can outperform their homogeneous counterparts. Our method has some similarities with the approach used by Hamann et al. [49] to analyze collective motion in locust swarms; they use a linear combination of polynomials to fit a network model to macroscopic measurements of simulations. In their approach, the regression coefficients provide information about the spatial distribution of agents in the swarm. We have previously published an approach that is conceptually similar to the one we present here [50, 51]. One of the main differences is that we used evolutionary computation to select controller types instead of the more sophisticated and efficient optimization technique used in this paper. In addition, we also made use of simulations to estimate global effects in contrast to the formal approach we propose in this study.
In future work, we plan to extend the method and to apply it to task allocation scenarios with more than two tasks. This extension will require the definition of other linear constraints in addition to those defined in Section 2.5 that are necessary to uniquely derive a response vector from the user input. This could be achieved, for example, by considering different priorities among the tasks to be executed. We note that the total number of agent controllers is an exponential function of the number of tasks. However, penalized regression techniques allow us to consider highdimensional search spaces and to investigate a reasonable range of application scenarios. We also plan to perform a thorough algebraic characterization of our basis vectors and response vectors with the aim to improve the performance of the design method. We believe that our design idea of behaviorally heterogeneous agents has potential for a wider range of applications beyond task allocation. Our primary goal is therefore to deepen our understanding of the fundamental principles of behavioral heterogeneity. We want to extend our approach to many different swarm scenarios, such as collective decisionmaking and spatially organizing tasks.
We believe that our proposed approach is a fundamentally novel paradigm for the design of robot swarms and that the idea of programming the swarm at a global level by following a recipe that describes how to put together the right amounts of different robot controller types, almost as if they were ingredients of a cake, is particularly intriguing. With our approach, the swarm can be reprogrammed on a global level at runtime by adding robots of different robot controller types, without the need for the individual robots to be programmable.
Data Availability
Video recordings of the experiments are available on figshare at 10.6084/m9.figshare.19688809.
Conflicts of Interest
The authors declare that there is no conflict of interest regarding the publication of this article.
Authors’ Contributions
G. Valentini and H. Hamann conceived the idea and designed the experiments. G. Valentini conducted the experiments, and all authors contributed to their analysis and discussion. All authors contributed equally to the writing of the manuscript.
Acknowledgments
This work has been partially supported by the European Research Council through the ERC Advanced Grant “ESWARM: Engineering Swarm Intelligence Systems” (contract 246939). Marco Dorigo acknowledges support from the Belgian F.R.S.–Fonds De La Recherche Scientifique  FNRS.
Supplementary Materials
Movie 1: unimodal scenario. Movie 2: bimodal scenario. (Supplementary Materials)
References
 M. Dorigo, M. Birattari, and M. Brambilla, “Swarm robotics,” Scholarpedia, vol. 9, no. 1, p. 1463, 2014. View at: Publisher Site  Google Scholar
 E. Şahin, “Swarm robotics: from sources of inspiration to domains of application,” in International workshop on swarm robotics, pp. 10–20, Springer, Berlin, Heidelberg, 2004. View at: Google Scholar
 M. Brambilla, E. Ferrante, M. Birattari, and M. Dorigo, “Swarm robotics: a review from the swarm engineering perspective,” Swarm Intelligent, vol. 7, no. 1, pp. 1–41, 2013. View at: Publisher Site  Google Scholar
 K. Lerman, A. Martinoli, and A. Galstyan, “A review of probabilistic macroscopic models for swarm robotic systems,” in International workshop on swarm robotics, pp. 143–152, Springer, Berlin, Heidelberg, 2004. View at: Google Scholar
 H. Hamann and H. Wörn, “A framework of space–time continuous models for algorithm design in swarm robotics,” Swarm Intelligence, vol. 2, no. 24, pp. 209–239, 2008. View at: Publisher Site  Google Scholar
 M. Massink, M. Brambilla, D. Latella, M. Dorigo, and M. Birattari, “On the use of BioPEPA for modelling and analysing collective behaviours in swarm robotics,” Swarm Intelligence, vol. 7, no. 2, pp. 201–228, 2013. View at: Publisher Site  Google Scholar
 M. Brambilla, A. Brutschy, M. Dorigo, and M. Birattari, “Propertydriven design for robot swarms: a design method based on prescriptive modeling and model checking,” ACM Transactions on Autonomous and Adaptive Systems (TAAS), vol. 9, no. 4, pp. 1–28, 2014. View at: Google Scholar
 P. Kouvaros and A. Lomuscio, “A counter abstraction technique for the verification of robot swarms,” in Proceedings of the TwentyNinth AAAI Conference on Artificial Intelligence, vol. 29, no. 1, pp. 2081–2088, AAAI Press, 2015. View at: Publisher Site  Google Scholar
 P. Kouvaros and A. Lomuscio, “Verifying emergent properties of swarms,” in TwentyFourth International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, June 2015. View at: Google Scholar
 S. Berman, A. Halasz, M. Hsieh, and V. Kumar, “Optimized stochastic policies for task allocation in swarms of robots,” IEEE Transactions on Robotics, vol. 25, no. 4, pp. 927–937, 2009. View at: Publisher Site  Google Scholar
 S. Berman, R. Nagpal, and A. Halász, “Optimization of stochastic strategies for spatially inhomogeneous robot swarms: a case study in commercial pollination,” in 2011 IEEE/RSJ international conference on intelligent robots and systems, pp. 3923–3930, San Francisco, CA, USA, September 2011. View at: Publisher Site  Google Scholar
 K. Dantu, S. Berman, B. Kate, and R. Nagpal, “A comparison of deterministic and stochastic approaches for allocating spatially dependent tasks in microaerial vehicle collectives,” in 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 793–800, VilamouraAlgarve, Portugal, October 2012. View at: Publisher Site  Google Scholar
 J. Cheng, W. Cheng, and R. Nagpal, “Robust and selfrepairing formation control for swarms of mobile agents,” in Proceedings of the Twentieth AAAI Conference on Artificial Intelligence, vol. 5, pp. 59–64, AAAI Press, 2005. View at: Google Scholar
 R. Nagpal, “Programmable selfassembly using biologicallyinspired multiagent control,” in Proceedings of the First International Conference on Autonomous Agents and Multiagent Systems, ser. AAMAS’02, pp. 418–425, Bologna, Italy, 2002. View at: Google Scholar
 E. Klavins, “Programmable selfassembly,” IEEE Control Systems, vol. 27, no. 4, pp. 43–56, 2007. View at: Publisher Site  Google Scholar
 M. Rubenstein, A. Cornejo, and R. Nagpal, “Programmable selfassembly in a thousandrobot swarm,” Science, vol. 345, no. 6198, pp. 795–799, 2014. View at: Publisher Site  Google Scholar
 J. Werfel, K. Petersen, and R. Nagpal, “Designing collective behavior in a termiteinspired robot construction team,” Science, vol. 343, no. 6172, pp. 754–758, 2014. View at: Publisher Site  Google Scholar
 D. Yamins and R. Nagpal, “Automated globaltolocal programming in 1d spatial multiagent systems,” in Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems  Volume 2, ser. AAMAS ‘08, pp. 615–622, International Foundation for Autonomous Agents and Multiagent Systems, Estoril, Portugal, 2008. View at: Google Scholar
 M. Dorigo, G. Theraulaz, and V. Trianni, “Reflections on the future of swarm robotics,” Science Robotics, vol. 5, no. 49, article eabe4385, 2020. View at: Publisher Site  Google Scholar
 M. Dorigo, G. Theraulaz, and V. Trianni, “Swarm robotics: past, present, and future,” Proceedings of the IEEE, vol. 109, no. 7, pp. 1152–1165, 2021. View at: Publisher Site  Google Scholar
 G. Notomista, S. Mayya, Y. Emam et al., “A resilient and energyaware task allocation framework for heterogeneous multirobot systems,” IEEE Transactions on Robotics, vol. 38, no. 1, pp. 159–179, 2022. View at: Publisher Site  Google Scholar
 L. Wang, A. Ames, and M. Egerstedt, “Safety barrier certificates for heterogeneous multirobot systems,” in 2016 American control conference (ACC), pp. 5213–5218, Boston, MA, USA, 2016. View at: Publisher Site  Google Scholar
 A. Georgopoulos, A. Schwartz, and R. Kettner, “Neuronal population coding of movement direction,” Science, vol. 233, no. 4771, pp. 1416–1419, 1986. View at: Publisher Site  Google Scholar
 A. Pouget, P. Dayan, and R. Zemel, “Information processing with population codes,” Nature Reviews Neuroscience, vol. 1, no. 2, pp. 125–132, 2000. View at: Publisher Site  Google Scholar
 M. J. B. Krieger, J. B. Billeter, and L. Keller, “Antlike task allocation and recruitment in cooperative robots,” Nature, vol. 406, no. 6799, pp. 992–995, 2000. View at: Publisher Site  Google Scholar
 W. Agassounon and A. Martinoli, “Efficiency and robustness of thresholdbased distributed allocation algorithms in multiagent systems,” in Proceedings of the first international joint conference on Autonomous agents and multiagent systems part 3  AAMAS '02, pp. 1090–1097, 2002. View at: Publisher Site  Google Scholar
 T. H. Labella, M. Dorigo, and J.L. Deneubourg, “Division of labor in a group of robots inspired by ants' foraging behavior,” ACM Transactions on Autonomous and Adaptive Systems, vol. 1, no. 1, pp. 4–25, 2006. View at: Publisher Site  Google Scholar
 B. P. Gerkey and M. J. Matarić, “A formal analysis and taxonomy of task allocation in multirobot systems,” The International Journal of Robotics Research, vol. 23, no. 9, pp. 939–954, 2004. View at: Publisher Site  Google Scholar
 O. Shehory and S. Kraus, “Methods for task allocation via agent coalition formation,” Artificial Intelligence, vol. 101, no. 1–2, pp. 165–200, 1998. View at: Publisher Site  Google Scholar
 M. Dias, R. Zlot, N. Kalra, and A. Stentz, “Marketbased multirobot coordination: a survey and analysis,” Proceedings of the IEEE, vol. 94, no. 7, pp. 1257–1270, 2006. View at: Publisher Site  Google Scholar
 K. H. Low, W. K. Leow, and M. H. Ang Jr., “Task allocation via selforganizing swarm coalitions in distributed mobile sensor network,” in Proceedings of the Nineteenth AAAI Conference on Artificial Intelligence, vol. 4, pp. 28–33, AAAI Press, 2004. View at: Google Scholar
 M. CastilloCagigal, A. Brutschy, A. Gutiérrez, and M. Birattari, “Temporal task allocation in periodic environments,” in Swarm Intelligence, ser. LNCS, vol. 8667, pp. 182–193, Springer, 2014. View at: Google Scholar
 W. Agassounon, A. Martinoli, and K. Easton, “Macroscopic modeling of aggregation experiments using embodied agents in teams of constant and timevarying sizes,” Autonomous Robots, vol. 17, no. 2/3, pp. 163–192, 2004. View at: Publisher Site  Google Scholar
 R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society: Series B (Methodological), vol. 58, no. 1, pp. 267–288, 1996. View at: Publisher Site  Google Scholar
 A. Dussutour, M. Beekman, S. Nicolis, and B. Meyer, “Noise improves collective decisionmaking by ants in dynamic environments,” Proceedings of the Royal Society B: Biological Sciences, vol. 276, no. 1677, pp. 4353–4361, 2009. View at: Publisher Site  Google Scholar
 C. A. Yates, R. Erban, C. Escudero et al., “Inherent noise can facilitate coherence in collective swarm motion,” Proceedings of the National Academy of Sciences, vol. 106, no. 14, pp. 5464–5469, 2009. View at: Publisher Site  Google Scholar
 G. Valentini, H. Hamann, and M. Dorigo, “Efficient decisionmaking in a selforganizing robot swarm: on the speed versus accuracy tradeoff,” in Proceedings of the 14th International Conference on Autonomous Agents and Multiagent Systems, ser. AAMAS’15, pp. 1305–1314, IFAAMAS, Istanbul, Turkey, 2015. View at: Google Scholar
 G. Valentini, E. Ferrante, and M. Dorigo, “The bestofn problem in robot swarms: formalization, state of the art, and novel perspectives,” Frontiers in Robotics and AI, vol. 4, 2017. View at: Publisher Site  Google Scholar
 G. Valentini, M. Birattari, and M. Dorigo, “Majority rule with differential latency: an absorbing Markov chain to model consensus,” in Proceedings of the European Conference on Complex Systems 2012, ser. Springer Proceedings in Complexity, pp. 651–658, Springer, 2013. View at: Publisher Site  Google Scholar
 L. Arnold, Random Dynamical Systems, Springer, 1998. View at: Publisher Site
 J. G. Kemeny and J. L. Snell, Finite Markov Chains, Springer, 1976.
 J. Buhl, D. J. T. Sumpter, I. D. Couzin et al., “From disorder to order in marching locusts,” Science, vol. 312, no. 5778, pp. 1402–1406, 2006. View at: Publisher Site  Google Scholar
 H. Silk, M. Homer, and T. Gross, “Design of selforganizing networks: creating specified degree distributions,” IEEE Transactions on Network Science and Engineering, vol. 3, no. 3, pp. 147–158, 2016. View at: Publisher Site  Google Scholar
 D. Bray, “Protein molecules as computational elements in living cells,” Nature, vol. 376, no. 6538, pp. 307–312, 1995. View at: Publisher Site  Google Scholar
 S. Hauert, S. Berman, R. Nagpal, and S. N. Bhatia, “A computational framework for identifying design guidelines to increase the penetration of targeted nanoparticles into tumors,” Nano Today, vol. 8, no. 6, pp. 566–576, 2013. View at: Publisher Site  Google Scholar
 S. Hauert and S. N. Bhatia, “Mechanisms of cooperation in cancer nanomedicine: towards systems nanotechnology,” Trends in Biotechnology, vol. 32, no. 9, pp. 448–455, 2014. View at: Publisher Site  Google Scholar
 J.S. Choi, Y. W. Jun, S. I. Yeon, H. C. Kim, J. S. Shin, and J. Cheon, “Biocompatible heterostructured nanoparticles for multimodal biological detection,” Journal of the American Chemical Society, vol. 128, no. 50, pp. 1598215983, 2006. View at: Publisher Site  Google Scholar
 D. Kengyel, H. Hamann, P. Zahadat, G. Radspieler, F. Wotawa, and T. Schmickl, “Potential of heterogeneity in collective behaviors: a case study on heterogeneous swarms,” in PRIMA 2015: Principles and Practice of MultiAgent Systems, ser. LNCS, Q. Chen, P. Torroni, S. Villata, J. Hsu, and A. Omicini, Eds., vol. 9387, pp. 201–217, Springer, 2015. View at: Publisher Site  Google Scholar
 H. Hamann, G. Valentini, Y. Khaluf, and M. Dorigo, “Derivation of a micromacro link for collective decisionmaking systems: uncover network features based on drift measurements,” in Parallel Problem Solving from Nature–PPSN XIII, ser. LNCS, vol. 8672 of English, pp. 181–190, Springer, 2014. View at: Publisher Site  Google Scholar
 H. Hamann, G. Valentini, and M. Dorigo, “Population coding: a new design paradigm for embodied distributed systems,” in Swarm Intelligence: 10th International Conference, ANTS 2016, LNCS 9882, pp. 173–184, Springer, 2016. View at: Publisher Site  Google Scholar
 M. Niess and H. Hamann, “Selforganized construction by population coding,” in 2019 IEEE 4th International Workshops on Foundations and Applications of Self* Systems (FAS*W), pp. 219–224, Umea, Sweden, June 2019. View at: Publisher Site  Google Scholar
Copyright
Copyright © 2022 Gabriele Valentini et al. Exclusive Licensee Zhejiang Lab, China. Distributed under a Creative Commons Attribution License (CC BY 4.0).