You are in control of a reinforcement learning agent that is learning to maximize reward in an object-oriented MDP. However, you cannot see the MDP directly, so your task is to translate a set of natural language pieces of advice about the MDP into RLang statements, given an RLang vocabulary. By translating natural language into RLang, you can affect the behavior of the learning agent in various ways.

RLang is a formal language for specifying information about every element of a Markov Decision Process (S, A, R, T). Each RLang object refers to one or more elements of an MDP, and RLang objects are built from primitive Pythonic expressions and RLang primitives. Below are the relevant RLang objects, their descriptions, and paired examples of natural language advice and RLang syntax.

Here is an example RLang Policy you can generate given a set of primitives and a piece of language advice:
Advice = "If the yellow door is open, go through it and walk to the goal. Otherwise open the yellow door if you have the key."
Primitives = ['yellow_door', 'step_towards', 'goal', 'in_inventory', 'yellow_key', 'toggle', 'at']
Policy main:
    if yellow_door.is_open:
        Execute step_towards(goal)
    elif in_inventory(yellow_key) and at(yellow_door) and not yellow_door.is_open:
        Execute toggle
This policy uses 'dot' syntax to access the is_open property of the yellow_door. It also uses a parameterized predicate, in_inventory, to check whether the agent has the yellow key. Lastly, it uses a parameterized action, step_towards, to take a step toward the goal.

Here is an example of an RLang Plan you can generate:
Advice = "Open the door using the yellow key and go to the goal"
Primitives = ['yellow_door', 'goal', 'pickup', 'yellow_key', 'toggle', 'go_to']
Plan main:
    Execute go_to(yellow_key)
    Execute pickup
    Execute go_to(yellow_door)
    Execute toggle
    Execute go_to(goal)
Plans are much simpler than policies, and they are always executed at the start of an environment episode. This plan uses a go_to sub-plan, which stops executing once the agent is at the target. Importantly, RLang Plans do not support conditional logic, and Policies do not support sub-plan executions.

Here is an example of an RLang Effect you can generate:
Advice = "Don't step into the lava. Going forward while you're facing right will increment your x value, leaving the rest of the world unchanged."
Primitives = ['Lava', 'toggle', 'forward', 'pointing_right', 'at_any', 'agent']
Effect main:
    if A == forward and at_any(Lava):
        Reward -1
    if A == forward and agent.dir == pointing_right:
        agent.x' -> agent.x + 1
        S' -*> S
This Effect communicates a transition and reward function. It contains a new piece of syntax: -> and -*>, which specify predictions (partial transition functions). The predictions above state that the agent's x coordinate in the next state will be its value in the current state plus 1, and that the rest of the state will otherwise remain unchanged.

Again, your goal is to translate multiple pieces of advice about a single environment into one or more RLang Policies, Plans, and Effects. Using the advice and primitives below, generate an RLang program. Importantly, the reward the agent receives from the environment is between 0 and 1.
