Agent.learn()

The Agent.learn() method has two forms: learning from a history, and learning from a single state-action-reward tuple. The current typing only covers the history form, so calls that use the tuple form are not typed correctly.
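
One possible way to type both forms is with overloads. The sketch below is only an illustration, written with Python's `typing.overload`. The `State`, `Action`, `Transition`, and `History` aliases, the single `data` parameter, and the `seen` attribute are all assumptions made for the example and are not taken from the actual Agent implementation.

```python
from typing import List, Tuple, Union, overload

# Hypothetical aliases; the real state and action types may differ.
State = List[float]
Action = int
Transition = Tuple[State, Action, float]  # (state, action, reward)
History = List[Transition]


class Agent:
    """Stand-in class that only records what it was asked to learn from."""

    def __init__(self) -> None:
        self.seen: List[Transition] = []

    @overload
    def learn(self, data: History) -> None: ...

    @overload
    def learn(self, data: Transition) -> None: ...

    def learn(self, data: Union[History, Transition]) -> None:
        # A tuple is treated as one (state, action, reward) transition;
        # a list is treated as a history of such transitions.
        if isinstance(data, tuple):
            self.seen.append(data)
        elif isinstance(data, list):
            self.seen.extend(data)
        else:
            raise TypeError("learn() expects a history list or a transition tuple")
```

With this shape, both `agent.learn(([0.0, 1.0], 1, 0.5))` and `agent.learn([([0.0, 1.0], 1, 0.5)])` pass a type checker. If the tuple form actually takes three separate arguments, the second overload would need a different signature.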