api.config.reward_function
==========================

.. py:module:: api.config.reward_function

Classes
-------

.. autoapisummary::

   api.config.reward_function.RewardFunction

Module Contents
---------------

.. py:class:: RewardFunction

   Bases: :py:obj:`Generic`\ [\ :py:obj:`api.typing.AgentID`\ , :py:obj:`api.typing.StateType`\ , :py:obj:`api.typing.RewardType`\ ]

   The reward function. This class is responsible for computing the reward for each agent in the environment.

   .. py:method:: reset(agents: List[api.typing.AgentID], initial_state: api.typing.StateType, shared_info: Dict[str, Any]) -> None
      :abstractmethod:

      Called each time the environment is reset. This lets users design stateful reward functions that track information about the game throughout an episode in order to determine a reward.

      :param agents: List of AgentIDs for which this RewardFunction will return a reward.
      :param initial_state: The initial state of the reset environment.
      :param shared_info: A dictionary with shared information across all config objects.

   .. py:method:: get_rewards(agents: List[api.typing.AgentID], state: api.typing.StateType, is_terminated: Dict[api.typing.AgentID, bool], is_truncated: Dict[api.typing.AgentID, bool], shared_info: Dict[str, Any]) -> Dict[api.typing.AgentID, api.typing.RewardType]
      :abstractmethod:

      Compute the reward for each agent. This function is given a list of agents and is expected to return a reward for each of them.

      :param agents: List of AgentIDs for which this RewardFunction should return a reward.
      :param state: The current state of the game.
      :param is_terminated: A dict mapping each AgentID to a bool indicating whether that agent has been terminated.
      :param is_truncated: A dict mapping each AgentID to a bool indicating whether that agent has been truncated.
      :param shared_info: A dictionary with shared information across all config objects.
      :return: A dict of rewards, one for each AgentID in agents.
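
The stateful pattern described for ``reset`` and ``get_rewards`` can be sketched as below. This is a minimal illustration, not the library's own code: the ``RewardFunction`` base class here is a local stand-in that only mirrors the documented signatures so the snippet runs on its own, and ``ScoreDeltaReward`` together with the ``"scores"`` state layout are hypothetical names invented for the example.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Generic, List, TypeVar

AgentID = TypeVar("AgentID")
StateType = TypeVar("StateType")
RewardType = TypeVar("RewardType")


class RewardFunction(ABC, Generic[AgentID, StateType, RewardType]):
    """Local stand-in mirroring the documented interface (assumption)."""

    @abstractmethod
    def reset(self, agents: List[AgentID], initial_state: StateType,
              shared_info: Dict[str, Any]) -> None:
        ...

    @abstractmethod
    def get_rewards(self, agents: List[AgentID], state: StateType,
                    is_terminated: Dict[AgentID, bool],
                    is_truncated: Dict[AgentID, bool],
                    shared_info: Dict[str, Any]) -> Dict[AgentID, RewardType]:
        ...


# Hypothetical state layout: {"scores": {agent_id: score}}.
ScoreState = Dict[str, Dict[str, float]]


class ScoreDeltaReward(RewardFunction[str, ScoreState, float]):
    """Hypothetical reward: change in an agent's score since the last call."""

    def reset(self, agents, initial_state, shared_info):
        # Remember each agent's starting score for the new episode.
        self._last_score = {a: initial_state["scores"][a] for a in agents}

    def get_rewards(self, agents, state, is_terminated, is_truncated,
                    shared_info):
        rewards = {}
        for agent in agents:
            score = state["scores"][agent]
            # Reward is the score gained since the previous step.
            rewards[agent] = score - self._last_score[agent]
            self._last_score[agent] = score
        return rewards


agents = ["blue", "orange"]
no_flags = {a: False for a in agents}

reward_fn = ScoreDeltaReward()
reward_fn.reset(agents, {"scores": {"blue": 0.0, "orange": 0.0}}, {})
rewards = reward_fn.get_rewards(
    agents, {"scores": {"blue": 1.0, "orange": 0.0}}, no_flags, no_flags, {}
)
```

Because the per-agent bookkeeping is rebuilt in ``reset``, nothing carries over from one episode to the next. The returned dict has exactly one entry per AgentID in ``agents``, matching the documented return value.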