Given a prompt, imagine an image captioned by that prompt and perform the following actions:
- Identify as many objects and their corresponding attributes (color, action, etc.) as possible from the prompt, and output each one as a phrase combining the attribute and the object. If an object appears multiple times, repeat it once per instance and label the instances distinctly.
- Classify each object into exactly one of three categories: 1. General Objects. 2. Text in the Image. 3. Objects that specifically refer to the Proper Noun of the Real World.
- Identify as many relations as possible among the objects from Step 2 and output them as a list in the format [ENTITY 1, TYPE of ENTITY 1, RELATION, ENTITY 2, TYPE of ENTITY 2], where each type is the object's category.
- Based on the relation analysis, determine the objects' locations in the image in three steps: (1) Treat the result of Step 3 as a knowledge graph in which entities are nodes and relations are edges, and find the node with the maximum degree. (2) Place that node on the canvas first and use it as the anchor. (3) Using the anchor and the relations in the knowledge graph, determine the locations of the other objects, both relative to one another and within the whole canvas. Remember that all mentioned objects are foreground and that no object should cover the whole image.
- Based on the location analysis, and taking the objects' scale and spatial coordinates into account, generate a bounding box for each object, striving for realistic proportions wherever feasible. Format each bounding box as an array [x1, y1, x2, y2], with (x1, y1) marking the upper-left corner and (x2, y2) the lower-right corner. Keep coordinate values between 0 and 1, accurate to three decimal places. The width and height of each bounding box must be greater than 0.105 and less than 0.895.
- Convert the result of Step 5 into JSON that strictly adheres to the following structure:{{"object1": {{"phrase": "the name of object","coordinates": [x1, y1, x2, y2]}},"object2": {{"phrase": "the name of object","coordinates": [x1, y1, x2, y2]}},...}}
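
The anchor selection in Step 4 and the box limits in Step 5 can be checked mechanically. The following is a minimal Python sketch for reference only, not part of the expected output. The function names `max_degree_node` and `box_problems` are hypothetical, and breaking degree ties by first appearance is an assumption, since the steps above do not say how ties should be resolved.

```python
import json
from collections import Counter

# Size limits taken from the bounding-box rule in Step 5.
MIN_SIDE, MAX_SIDE = 0.105, 0.895


def max_degree_node(triplets):
    """Return the entity that takes part in the most relations.

    Each triplet is (entity1, type1, relation, entity2, type2).
    Ties go to the entity seen first (an assumption, see the lead-in).
    """
    degree = Counter()
    for entity1, _type1, _relation, entity2, _type2 in triplets:
        degree[entity1] += 1
        degree[entity2] += 1
    return max(degree, key=degree.get)


def box_problems(layout_json):
    """Return a list of rule violations found in a Step 6 JSON string."""
    problems = []
    for name, obj in json.loads(layout_json).items():
        x1, y1, x2, y2 = obj["coordinates"]
        # Every coordinate must lie on the unit canvas.
        if not all(0.0 <= v <= 1.0 for v in (x1, y1, x2, y2)):
            problems.append(f"{name}: coordinate outside [0, 1]")
        # Width and height must fall strictly between the two limits.
        for side, size in (("width", x2 - x1), ("height", y2 - y1)):
            if not MIN_SIDE < size < MAX_SIDE:
                problems.append(f"{name}: {side} {size:.3f} is out of range")
    return problems
```

A caller could run `box_problems` on the model's final JSON and re-prompt when the returned list is non-empty; whether to reject or clamp an offending box is a design choice left open here.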

Here are some examples:

prompt:A blue basketball jersey with the Golden State Warriors logo and 'Stephen Curry' written on it.
output:
Step 1
Objects:a blue Basketball Jersey, the text of 'Stephen Curry', Golden State Warriors Logo
Step 2
General Objects: a blue Basketball Jersey
Text in the Image: the text of 'Stephen Curry'
Objects that specifically refer to the Proper Noun of the Real World: Golden State Warriors Logo
Step 3
triplets:[a blue Basketball Jersey, General Objects, has, the text of 'Stephen Curry', Text in the Image][a blue Basketball Jersey, General Objects, has, Golden State Warriors Logo, Objects that specifically refer to the Proper Noun of the Real World][the text of 'Stephen Curry', Text in the Image, belongs to, a blue Basketball Jersey, General Objects][Golden State Warriors Logo, Objects that specifically refer to the Proper Noun of the Real World, is on, a blue Basketball Jersey, General Objects]
Step 4
(1)Node with max degree: a blue Basketball Jersey. (2)Location of a blue Basketball Jersey: [The mid-center of the canvas]. (3)Location of the text of 'Stephen Curry', Golden State Warriors Logo: the text of 'Stephen Curry': [Near the lower-center region of the Basketball Jersey]; Golden State Warriors Logo: [Towards the upper-mid region of the Basketball Jersey].
Step 5
Bounding Boxes of the Objects:a blue Basketball Jersey, General Objects: [0.300, 0.100, 0.800, 0.900] - The jersey is the anchor and occupies most of the canvas without covering the whole image.
the text of 'Stephen Curry', Text in the Image: [0.200, 0.500, 0.800, 0.600] - The name is usually printed on the back of the jersey and centrally aligned.
Golden State Warriors Logo, Objects that specifically refer to the Proper Noun of the Real World: [0.300, 0.100, 0.700, 0.300] - The logo is often positioned on the front, in the top half of the jersey.
Step 6
{"Basketball Jersey": {"phrase": "a blue Basketball Jersey","coordinates": [0.300, 0.100, 0.800, 0.900]},"Stephen Curry": {"phrase": "the text of 'Stephen Curry'","coordinates": [0.200, 0.500, 0.800, 0.600]},"Golden State Warriors Logo": {"phrase": "Golden State Warriors Logo","coordinates": [0.300, 0.100, 0.700, 0.300]}}


prompt:A young girl dressed in pink standing on a chair stirring a bowl.
output:
Step 1
Objects:Young Girl, Chair, Bowl, Pink Dress, Stirring Utensil
Step 2
General Objects: Young Girl, Chair, Bowl, Pink Dress, Stirring Utensil
Text in the Image: N/A
Objects that specifically refer to the Proper Noun of the Real World: N/A
Step 3
triplets:[Young Girl, General Objects, is wearing, Pink Dress, General Objects][Young Girl, General Objects, is using, Stirring Utensil, General Objects][Young Girl, General Objects, is standing on, Chair, General Objects][Stirring Utensil, General Objects, is in, Bowl, General Objects][Chair, General Objects, is supporting, Young Girl, General Objects][Bowl, General Objects, is being stirred by, Stirring Utensil, General Objects][Pink Dress, General Objects, is worn by, Young Girl, General Objects]
Step 4
(1) Node with max degree: Young Girl.(2) Location of Young Girl: [Center right of the canvas].(3) Location of Pink Dress, Chair, Bowl, Stirring Utensil:Pink Dress: [Overlaying the Young Girl];Chair: [Directly below the Young Girl, indicating she is standing on it];Bowl: [at the table on the left side of the Young Girl, at waist level, to show she is stirring it];Stirring Utensil: [Within the Bowl, extending into Young Girl's hands which are placed above the Bowl]
Step 5
Bounding Boxes of the Objects:Young Girl, General Objects: [0.263, 0.017, 0.797, 0.993] - The young girl should be center right in the image, occupying most of the space.
Chair, General Objects: [0.486, 0.557, 0.792, 0.995] - The chair is under the girl, so it sits towards the middle bottom of the image.
Pink Dress, General Objects: [0.469, 0.146, 0.777, 0.626] - The pink dress, being worn by the girl, falls within the area the girl occupies in the image.
Stirring Utensil, General Objects: [0.261, 0.296, 0.343, 0.428] - As the girl is using the utensil, it should be located in the middle upper area of the image, held in her hand.
Bowl, General Objects: [0.288, 0.407, 0.460, 0.513] - The bowl would likely be on the left side of the girl and visible in the middle of the image.
Step 6
{"Young Girl": {"phrase": "Young Girl","coordinates": [0.263, 0.017, 0.797, 0.993]},"Chair": {"phrase": "Chair","coordinates": [0.486, 0.557, 0.792, 0.995]},"Pink Dress": {"phrase": "Pink Dress","coordinates": [0.469, 0.146, 0.777, 0.626]},"Stirring Utensil": {"phrase": "Stirring Utensil","coordinates": [0.261, 0.296, 0.343, 0.428]},"Bowl": {"phrase": "Bowl","coordinates": [0.288, 0.407, 0.460, 0.513]}}

prompt:Five sheep are grazing in a field.
output:
Step 1
Objects: A grazing sheep(1), A grazing sheep(2), A grazing sheep(3), A grazing sheep(4), A grazing sheep(5), Field.
Step 2
General Objects: A grazing sheep(1), A grazing sheep(2), A grazing sheep(3), A grazing sheep(4), A grazing sheep(5), Field
Text in the Image: N/A
Objects that specifically refer to the Proper Noun of the Real World: N/A
Step 3
triplets: [A grazing sheep(1), General Objects, is in, Field, General Objects][A grazing sheep(2), General Objects, is in, Field, General Objects][A grazing sheep(3), General Objects, is in, Field, General Objects][A grazing sheep(4), General Objects, is in, Field, General Objects][A grazing sheep(5), General Objects, is in, Field, General Objects]
Step 4
(1) Node with max degree: Field. (2) Location of Field: [Spanning the entire lower part of the canvas]. (3) Location of A grazing sheep(1), A grazing sheep(2), A grazing sheep(3), A grazing sheep(4), A grazing sheep(5): A grazing sheep(1): [Lower left of the canvas]; A grazing sheep(2): [Lower center-left of the canvas]; A grazing sheep(3): [Lower center of the canvas]; A grazing sheep(4): [Lower center-right of the canvas]; A grazing sheep(5): [Lower right of the canvas].
Step 5
Bounding Boxes of the Objects:A grazing sheep(1), General Objects: [0.100, 0.550, 0.300, 0.850] - Located towards the lower left part of the canvas.
A grazing sheep(2), General Objects: [0.225, 0.550, 0.425, 0.850] - Located towards the lower center-left part of the canvas.
A grazing sheep(3), General Objects: [0.350, 0.550, 0.550, 0.850] - Centrally located in the canvas but in the lower part.
A grazing sheep(4), General Objects: [0.475, 0.550, 0.675, 0.850] - Located towards the lower center-right part of the canvas.
A grazing sheep(5), General Objects: [0.600, 0.550, 0.800, 0.850] - Located towards the lower right part of the canvas.
Field, General Objects: [0.000, 0.500, 1.000, 1.000] - The field spans the entire lower portion of the canvas.
Step 6
{"Sheep1": {"phrase": "A grazing sheep","coordinates": [0.100, 0.550, 0.300, 0.850]},"Sheep2": {"phrase": "A grazing sheep","coordinates": [0.225, 0.550, 0.425, 0.850]},"Sheep3": {"phrase": "A grazing sheep","coordinates": [0.350, 0.550, 0.550, 0.850]},"Sheep4": {"phrase": "A grazing sheep","coordinates": [0.475, 0.550, 0.675, 0.850]},"Sheep5": {"phrase": "A grazing sheep","coordinates": [0.600, 0.550, 0.800, 0.850]},"Field": {"phrase": "Field","coordinates": [0.000, 0.500, 1.000, 1.000]}}

prompt:
output: