E-Anlia committed on
Commit fe5839c · verified
1 Parent(s): 2d3bcfc

Upload XML Prompt.json

Files changed (1)
  1. XML Prompt.json +140 -0
XML Prompt.json ADDED
@@ -0,0 +1,140 @@
+ prompt = f"""
+ Please annotate each character in the image and provide image tag information in JSON format, along with a detailed description of the scene.
+
+ 【Character Identification and Annotation】
+ 1. Create a bounding box (bbox) for each character, formatted as [bottom-left x coordinate, bottom-left y coordinate, top-right x coordinate, top-right y coordinate]
+ 2. The bounding box should precisely contain the entire character, neither too large nor too small
+ 3. Character names are not yet known; use placeholders such as $character_1$, $character_2$, etc.
+
+ 【Overall Image Analysis】
+ 1. After analyzing the positions of all characters, provide an overall description of the image, including both tags and caption sections
+ 2. Tags section:
+ - Reorganize the original tags provided in the <tags> block
+ - Group tags by character using structured XML format:
+
+ <character_1>
+ <n>$character_1$</n>
+ <gender>1girl/1boy</gender>
+ <appearance>facial features, hair color, hair style, eye color, skin tone, age appearance, etc.</appearance>
+ <clothing>clothing type, color, style, accessories, footwear, etc.</clothing>
+ <body_type>height, build, physical characteristics, etc.</body_type>
+ <expression>facial expression, emotional state, mood, etc.</expression>
+ <action>current pose, movement, gesture, activity, etc.</action>
+ <interaction>interaction with other characters, objects, or environment</interaction>
+ <position>precise position in image (center, left, right, foreground, background, etc.)</position>
+ </character_1>
+
+ - Use structured XML format for general tags in <general_tags>:
+ * <count>: Overall character count (1girl, 2girls, 3girls, 1boy, multiple boys, etc.)
+ * <artists>: Artist name, art style attribution, creator information
+ * <style>: Art style (anime style, watercolor, oil painting, digital art, realistic, etc.)
+ * <background>: Background type (indoor, outdoor, landscape, cityscape, abstract, etc.)
+ * <environment>: Specific environment (room, forest, city, beach, school, office, etc.)
+ * <perspective>: Viewpoint (from above, from below, side view, close-up, wide shot, etc.)
+ * <atmosphere>: Mood and atmosphere (dark, bright, moody, cheerful, romantic, mysterious, etc.)
+ * <lighting>: Lighting conditions (natural light, artificial light, sunset, candlelight, neon, etc.)
+ * <quality>: Image quality tags (high resolution, masterpiece, best quality, detailed, etc.)
+ * <objects>: Important objects (furniture, decorations, tools, vehicles, weapons, etc.)
+ * <other>: Any other scene-related tags not covered in above categories
+
+ - ALWAYS include gender tags (1girl, 1boy, etc.) for each character in their specific gender section
+ - Character information should be precise and detailed; if there are conflicting tags, select the most accurate ones based on your judgment
+ - Add detailed information about character positions in the image - this is EXTREMELY important
+ - Use professional prompt word format for tags, not natural language
+ - If you observe important elements in the image that are not included in the <tags>, add appropriate tags
+ - All tags must be in XML format for consistency and easy parsing
+ - **IMPORTANT: If any XML tag is not applicable or not visible in the image, omit that tag entirely. Only include tags that are relevant and accurately describe what you can observe in the image.**
+
+ 3. Caption section (this is the most important part):
+ - Use natural language to describe the entire image content in extreme detail
+ - Refer to each character by name (using placeholders like $character_1$, etc.)
+ - Clearly indicate each character's gender
+ - Describe each character's clothing in detail, including clothing type, color, style, accessories, etc.
+ - Describe each character's actions and poses in detail
+ - Describe each character's expressions and emotional states in detail
+ - Describe each character's precise position in the image (e.g., "$character_1$ is positioned in the center-left of the frame")
+ - Describe the relative positioning between characters in extreme detail (e.g., "$character_1$ is standing to the left of $character_2$, slightly in front")
+ - Describe how characters interact with scene elements in detail (e.g., "$character_1$ is sitting on the bed")
+ - Describe the background and environmental details in detail (furniture, decorations, etc.)
+ - Describe the overall atmosphere, lighting, color tones, perspective, artistic style, and other visual effects of the image in detail
+ - Make sure to describe the entire scene in a coherent, flowing narrative rather than simply listing points
+ - Ensure the description covers all important visual elements in the image
+ - If adult content exists, it is also very important to accurately analyze the adult content! Describe in detail the sexual activities currently taking place, such as positions, oral sex, M-legs, etc., as well as actions such as holding the penis, spreading the vagina, anus, urethra, etc., and other descriptions such as a toy inserted into the vagina, a vibrator inserted into the vagina, and fondling the vagina. These can be found in <tags>. your description should be very detail on this content!!!!!!!!
+ - Your description should cover every detail in the picture and be accurate and objective. Describe only the elements that are present; do not interpret them or comment on how the image feels.
+
+ 【Format Requirements and Flexibility】
+ 1. The JSON format must be maintained correctly
+ 2. While maintaining consistent format, you can flexibly supplement tags and descriptions
+ 3. Tag format must remain consistent, but content can be adjusted according to the actual situation in the image
+ 4. **XML Tag Flexibility: You can omit any XML tags that are not applicable, not visible, or not relevant to the specific image. For example:**
+ - If a character's expression is not clearly visible, omit the <expression> tag
+ - If there are no notable objects in the scene, omit the <objects> tag
+ - If the artist is unknown or not identifiable, omit the <artists> tag
+ - Only include tags that accurately describe observable elements in the image
+ 5. The caption must be extremely detailed and comprehensive, with a length of at least 200 words
+ 6. You must ensure your output is entirely in English!
+ **Your output should cover as many elements of the image as possible. The more detailed, the better!**
+
+ 【Output Format】
+ Replace the placeholder text inside each tag and JSON value with content for the actual image; keep the tag names themselves unchanged
+ Please output strictly according to the following JSON format, without adding any other content:
+ Note: You can omit any XML tags that are not applicable or not clearly observable in the image.
+ {{
+ "character_1": {{
+ "bbox": [x1, y1, x2, y2],
+ "name": "$character_1$"
+ }},
+ "character_2": {{
+ "bbox": [x3, y3, x4, y4],
+ "name": "$character_2$"
+ }},
+ // Add more characters in this format as needed
+ "image": {{
+ "tags": "
+ <character_1>
+ <n>$character_1$</n>
+ <gender>1girl</gender>
+ <appearance>detailed appearance description</appearance>
+ <clothing>detailed clothing description</clothing>
+ <body_type>body type description</body_type>
+ <expression>expression description</expression>
+ <action>action description</action>
+ <interaction>interaction description</interaction>
+ <position>position description</position>
+ </character_1>
+
+ <character_2>
+ <n>$character_2$</n>
+ <gender>1boy</gender>
+ <appearance>detailed appearance description</appearance>
+ <clothing>detailed clothing description</clothing>
+ <body_type>body type description</body_type>
+ <expression>expression description</expression>
+ <action>action description</action>
+ <interaction>interaction description</interaction>
+ <position>position description</position>
+ </character_2>
+
+ <general_tags>
+ <count>1girl, 1boy, multiple characters, etc.</count>
+ <artists>artist name, art style attribution, etc.</artists>
+ <style>anime style, watercolor, oil painting, digital art, etc.</style>
+ <background>indoor, outdoor, landscape, cityscape, etc.</background>
+ <environment>room, forest, city, beach, school, etc.</environment>
+ <perspective>from above, from below, side view, close-up, etc.</perspective>
+ <atmosphere>dark, bright, moody, cheerful, romantic, etc.</atmosphere>
+ <lighting>natural light, artificial light, sunset, candlelight, etc.</lighting>
+ <quality>high resolution, masterpiece, best quality, etc.</quality>
+ <objects>furniture, decorations, tools, vehicles, etc.</objects>
+ <other>any other scene-related tags not covered above</other>
+ </general_tags>
+ // Note: Omit any tags above that are not applicable to the specific image
+ ",
+ "caption": "Extremely detailed description of the scene, including all characters' names, genders, appearances, clothing, actions, expressions, precise positions, relative positions, as well as scene background, environment, atmosphere, lighting, objects, perspective, artistic style, etc. The description should be extremely thorough, with vivid details, at least 200 words. Output in English."
+ }}
+ }}
+
+ <tags>
+ {tags}
+ </tags>
+ """
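
The template's Output Format asks for JSON, but it also shows `//` comment lines and a multi-line `"tags"` string, neither of which a strict JSON parser accepts. The sketch below is one possible way to read a response shaped like that template; it is not part of this commit. The function name `parse_annotation` and the returned dictionary layout are hypothetical, and the code assumes the model echoes the template's structure, that the tag text contains no unescaped `&` or `<` characters, and that a bbox only needs a basic sanity check (the prompt does not say which way the y axis points, so only the x order is enforced).

```python
import json
import xml.etree.ElementTree as ET


def parse_annotation(raw: str) -> dict:
    """Parse a response shaped like the template's Output Format (hypothetical helper)."""
    # Drop whole-line `//` comments: the template shows them, but JSON forbids them.
    kept = [ln for ln in raw.splitlines() if not ln.lstrip().startswith("//")]
    # strict=False lets the multi-line "tags" string contain raw newlines.
    data = json.loads("\n".join(kept), strict=False)

    characters = {}
    for key, value in data.items():
        if key == "image":
            continue
        x1, y1, x2, y2 = value["bbox"]
        # Bottom-left corner first, then top-right, as the prompt specifies.
        # The y-axis direction is an open assumption, so only reject equal y values.
        if not (x1 < x2 and y1 != y2):
            raise ValueError(f"{key}: malformed bbox {value['bbox']}")
        characters[key] = {"name": value["name"], "bbox": [x1, y1, x2, y2]}

    # The tags string holds several sibling elements, so wrap it in a
    # synthetic root before handing it to the XML parser.
    root = ET.fromstring(f"<root>{data['image']['tags']}</root>")
    tags = {
        group.tag: {child.tag: (child.text or "").strip() for child in group}
        for group in root
    }
    return {
        "characters": characters,
        "tags": tags,
        "caption": data["image"]["caption"],
    }
```

Because omitted tags are allowed by the prompt, callers would likely want `result["tags"]["character_1"].get("expression")` rather than direct indexing when reading optional fields.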