api.config.reward_function

Classes

RewardFunction

The reward function. This class is responsible for computing the reward for each agent in the environment.

Module Contents

class api.config.reward_function.RewardFunction

Bases: Generic[api.typing.AgentID, api.typing.StateType, api.typing.RewardType]

The reward function. This class is responsible for computing the reward for each agent in the environment.

abstract reset(agents: List[api.typing.AgentID], initial_state: api.typing.StateType, shared_info: Dict[str, Any]) → None

Function to be called each time the environment is reset. This is meant to enable users to design stateful reward functions that maintain information about the game throughout an episode to determine a reward.

Parameters:
  • agents – List of AgentIDs for which this RewardFunction will return a reward.

  • initial_state – The initial state of the reset environment.

  • shared_info – A dictionary with shared information across all config objects.
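A minimal sketch of how reset can support a stateful reward function. The class name ProgressReward, the use of str as the AgentID, and the "distance_to_goal" key in the state are hypothetical choices made for illustration; the real StateType is whatever the environment defines. In real use the class would subclass RewardFunction rather than stand alone.

```python
from typing import Any, Dict, List

# Hypothetical type choices for this sketch only.
AgentID = str
State = Dict[str, Dict[str, float]]


class ProgressReward:
    """Stateful sketch: remembers each agent's starting distance for the episode.

    In real use this would subclass RewardFunction[AgentID, State, float].
    """

    def __init__(self) -> None:
        self.start_distance: Dict[AgentID, float] = {}

    def reset(self, agents: List[AgentID], initial_state: State,
              shared_info: Dict[str, Any]) -> None:
        # Rebuild the per-episode state from scratch so nothing leaks over
        # from the previous episode.
        self.start_distance = {
            agent: initial_state["distance_to_goal"][agent] for agent in agents
        }


reward_fn = ProgressReward()
reward_fn.reset(
    ["blue-0", "orange-0"],
    {"distance_to_goal": {"blue-0": 10.0, "orange-0": 12.5}},
    {},
)
```

Replacing the dictionary on every reset, rather than updating it in place, keeps agents from an earlier episode out of the stored state.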

abstract get_rewards(agents: List[api.typing.AgentID], state: api.typing.StateType, is_terminated: Dict[api.typing.AgentID, bool], is_truncated: Dict[api.typing.AgentID, bool], shared_info: Dict[str, Any]) → Dict[api.typing.AgentID, api.typing.RewardType]

Function to compute the rewards for a list of agents. This function is given the agents to evaluate along with the current state, and it is expected to return one reward for each of those agents.

Parameters:
  • agents – List of AgentIDs for which this RewardFunction should return a reward.

  • state – The current state of the game.

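  • is_terminated – A dictionary mapping each AgentID to a bool indicating whether that agent's episode has terminated.

  • is_truncated – A dictionary mapping each AgentID to a bool indicating whether that agent's episode has been truncated.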

  • shared_info – A dictionary with shared information across all config objects.

Returns:

A dict of rewards, one for each AgentID in agents.
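As an illustration of the expected return shape, the sketch below implements get_rewards with a small per-step penalty and a bonus on termination. The class name StepPenaltyReward, the constant values, and the str AgentID are hypothetical. It also assumes the common reinforcement-learning convention that termination marks a natural end of the episode while truncation marks an external cutoff such as a time limit; the source does not confirm this. In real use the class would subclass RewardFunction.

```python
from typing import Any, Dict, List

# Hypothetical type choice for this sketch only.
AgentID = str


class StepPenaltyReward:
    """Stateless sketch: small penalty per step, bonus when an agent terminates."""

    STEP_PENALTY = -0.01   # illustrative value
    TERMINAL_BONUS = 1.0   # illustrative value

    def reset(self, agents: List[AgentID], initial_state: Any,
              shared_info: Dict[str, Any]) -> None:
        # Nothing to track between steps in this sketch.
        pass

    def get_rewards(self, agents: List[AgentID], state: Any,
                    is_terminated: Dict[AgentID, bool],
                    is_truncated: Dict[AgentID, bool],
                    shared_info: Dict[str, Any]) -> Dict[AgentID, float]:
        rewards: Dict[AgentID, float] = {}
        for agent in agents:
            reward = self.STEP_PENALTY
            # Assumption: only termination earns the bonus; a truncated
            # episode receives just the step penalty.
            if is_terminated[agent]:
                reward += self.TERMINAL_BONUS
            rewards[agent] = reward
        return rewards


step_reward_fn = StepPenaltyReward()
step_rewards = step_reward_fn.get_rewards(
    ["blue-0", "orange-0"],
    None,
    {"blue-0": True, "orange-0": False},
    {"blue-0": False, "orange-0": True},
    {},
)
```

The returned dictionary has exactly one entry per AgentID in agents, which matches the documented return value.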