- name: "sample_frame"
  description: |
    Samples frames from the video within the given time range.
    Returns a list of file paths to frames sampled at 1 fps.
    Each frame file is identified by its timestamp in seconds.
    At most 30 frames are returned; if the requested range would yield more, the sampling rate is reduced below 1 fps to stay within the limit.
  args:
    time_range:
      type: "array"
      description: |
        the time range to sample frames from, given as [start_time, end_time], each in mm:ss format.
      items:
        type: "string"
    angle:
      type: "string"
      description: |
        camera angle of the video.
      enum:
        - "center"
        - "top"
        - "right-bottom"
        - "right-center"
        - "right-top"
        - "left-bottom"
        - "left-center"
        - "left-top"
  required:
    - "time_range"
    - "angle"
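# Illustrative only: a hypothetical argument set for "sample_frame", assuming
# arguments are passed as a mapping keyed by the names under `args`. The values
# are made up, and the call format of the consuming system is not defined here.
#   time_range: ["00:10", "00:25"]
#   angle: "center"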
- name: "zoom_in"
  description: |
    Zooms in on a region of a single frame.
    Specify the region to zoom in on with a normalized bounding box in the format [x1,y1,x2,y2], where 0 < x1 < x2 < 1 and 0 < y1 < y2 < 1.
    (x1,y1) is the top-left corner, and (x2,y2) is the bottom-right corner.
  args:
    frame_id:
      type: "integer"
      description: "the ID of the frame to zoom in on"
    angle:
      type: "string"
      description: |
        camera angle of the video
      enum:
        - "center"
        - "top"
        - "right-bottom"
        - "right-center"
        - "right-top"
        - "left-bottom"
        - "left-center"
        - "left-top"
    bounding_box:
      type: "array"
      description: "normalized bounding box in the format of [x1,y1,x2,y2]"
      items:
        type: "number"
  required:
    - "frame_id"
    - "angle"
    - "bounding_box"
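# Illustrative only: a hypothetical argument set for "zoom_in", assuming
# arguments are passed as a mapping keyed by the names under `args`. The
# frame_id value is made up; the box satisfies 0 < x1 < x2 < 1 and
# 0 < y1 < y2 < 1 and covers a region toward the top left of the frame.
#   frame_id: 12
#   angle: "center"
#   bounding_box: [0.1, 0.1, 0.5, 0.5]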
- name: "check_instruction"
  description: |
    Accesses the instructions as text or as an image.
    The instructions are represented as a directed acyclic graph (a partial order), where each node is a step and each edge is an ordering constraint between two steps.
    For instance, if there is a directed edge from node A to node B (A -> B), A must be done before B is performed.
    Instructions can be checked as either text or image:
    - text: the instructions are given as text in the DOT format.
    - image: the instructions are given as a figure of the graph.
  args:
    mode:
      type: "string"
      description: "either text or image"
      enum:
        - "text"
        - "image"
  required:
    - "mode"
- name: "check_final_picture"
  description: |
    Accesses the image of the final picture and the parts of the target toy car.
    The image may also contain an exploded view of the car.
  args: null
  required: null
