Add files via upload by youkuaz2 · Pull Request #259 · FlagAI-Open/OpenSeek

youkuaz2 · 2026-05-29T08:21:32Z

No description provided.

gemini-code-assist

Code Review

This pull request introduces task-specific prompt routing, Chain-of-Thought (CoT) prompts, and specialized example selection strategies (M05, M09, M10, M11, M19, M20) to optimize the long-context in-context learning annotation pipeline. It also implements a mixed context length strategy and task-differentiated configurations for LLM generation. However, several critical issues were identified: caching examples_str outside the loop in main.py defeats dynamic example selection and the mixed context length strategy; an incorrect HTML entity (&rt; instead of >) in the regex pattern prevents proper cleaning of the thinking process for Task 8; and repeatedly loading the tokenizer from disk across multiple example selection functions creates a severe performance bottleneck.

gemini-code-assist · 2026-05-29T08:23:34Z

        if examples_str is None:
-            examples_str = select_examples(icl_examples, task_description, text2annotate)
+            # M11 optimization: use the Task 7 Jeopardy clue-decomposition strategy
+            examples_str = select_examples_M11(icl_examples, task_description, text2annotate, task_id, sample_idx)


🚨 Logic Error: Caching examples_str defeats dynamic example selection

In the current implementation, examples_str is initialized to None outside the loop and cached after the first iteration:

if examples_str is None: examples_str = select_examples_M11(...)

Because of this, select_examples_M11 (or any other dynamic selection method) is only called once for the very first sample (sample_idx = 0). For all subsequent samples, the exact same examples are reused.

This completely breaks:

Dynamic Retrieval: Examples are not selected based on the current sample's text2annotate similarity or keywords.

Mixed Context Length Strategy: The sample_index < 50 check in select_examples_M11 will always evaluate to True (since it only runs for sample_idx = 0), meaning a 30k context is used for all samples, which is highly inefficient and defeats the 8k fallback optimization.

Specialized Routing: The other specialized strategies (select_examples_M09 for Task 5, select_examples_M10 for Task 6, select_examples_M20 for other tasks) are imported but never used.

Recommendation: Remove the if examples_str is None: check and route the example selection dynamically based on task_id.
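The effect of the cache can be reproduced in isolation. The sketch below is not the PR's code: `select_examples_stub` is a hypothetical stand-in for `select_examples_M11` that only models the `sample_index < 50` threshold described above, returning the labels "30k" and "8k" in place of real example strings.

```python
def select_examples_stub(sample_idx):
    """Hypothetical stand-in for select_examples_M11.

    Models only the mixed context length rule from the review:
    30k context for the first 50 samples, 8k afterwards.
    """
    return "30k" if sample_idx < 50 else "8k"


def run_cached(num_samples):
    """Mirrors the PR's loop: examples_str is computed once, then reused."""
    examples_str = None
    used = []
    for sample_idx in range(num_samples):
        if examples_str is None:
            examples_str = select_examples_stub(sample_idx)
        used.append(examples_str)
    return used


def run_per_sample(num_samples):
    """Recomputes the selection for every sample, as the review recommends."""
    return [select_examples_stub(sample_idx) for sample_idx in range(num_samples)]


cached = run_cached(60)
fresh = run_per_sample(60)
print(cached.count("30k"), cached.count("8k"))  # 60 0
print(fresh.count("30k"), fresh.count("8k"))    # 50 10
```

With the cache in place the 8k branch is never reached, which matches the behaviour described in the comment.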

Suggested change

-        if examples_str is None:
-            examples_str = select_examples(icl_examples, task_description, text2annotate)
-            # M11 optimization: use the Task 7 Jeopardy clue-decomposition strategy
-            examples_str = select_examples_M11(icl_examples, task_description, text2annotate, task_id, sample_idx)
+        # Dynamic example selection based on task type (M09, M10, M11, M20)
+        if task_id == 5:
+            examples_str = select_examples_M09(icl_examples, task_description, text2annotate, task_id, sample_idx)
+        elif task_id == 6:
+            examples_str = select_examples_M10(icl_examples, task_description, text2annotate, task_id, sample_idx)
+        elif task_id == 7:
+            examples_str = select_examples_M11(icl_examples, task_description, text2annotate, task_id, sample_idx)
+        else:
+            examples_str = select_examples_M20(icl_examples, task_description, text2annotate, task_id, sample_idx)
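An alternative to the if/elif chain is a dispatch table keyed by `task_id`. This is only a sketch of that design choice, not something the PR contains: the four selectors here are hypothetical stubs that share the call signature used in the suggestion, and `SELECTORS` and `route_examples` are illustrative names.

```python
def make_selector(name):
    """Build a hypothetical stub selector that reports which strategy ran."""
    def selector(icl_examples, task_description, text2annotate, task_id, sample_idx):
        return f"{name}(task={task_id}, sample={sample_idx})"
    return selector


# Stubs standing in for the real selection functions named in the review.
select_examples_M09 = make_selector("M09")
select_examples_M10 = make_selector("M10")
select_examples_M11 = make_selector("M11")
select_examples_M20 = make_selector("M20")

# Task-specific strategies; any task not listed falls back to M20.
SELECTORS = {
    5: select_examples_M09,
    6: select_examples_M10,
    7: select_examples_M11,
}


def route_examples(icl_examples, task_description, text2annotate, task_id, sample_idx):
    """Pick the selector for this task and run it for the current sample."""
    selector = SELECTORS.get(task_id, select_examples_M20)
    return selector(icl_examples, task_description, text2annotate, task_id, sample_idx)


print(route_examples([], "", "", 7, 3))  # M11(task=7, sample=3)
print(route_examples([], "", "", 2, 0))  # M20(task=2, sample=0)
```

Adding a new task-specific strategy then means adding one dictionary entry instead of another branch.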

gemini-code-assist · 2026-05-29T08:23:34Z

+        # Step 1: Filter out <think&rt; tag content (strip the chain of thought)
+        cleaned_result = re.sub(r'<think&rt;.*?</think&rt;', '', whole_result, flags=re.DOTALL)


🐛 Bug: Incorrect HTML entity &rt; in regex pattern

The regex pattern uses <think&rt; and </think&rt; to clean the thinking process:

cleaned_result = re.sub(r'<think&rt;.*?</think&rt;', '', whole_result, flags=re.DOTALL)

However:

The LLM output (whole_result) contains plain text tags like <think> and </think>, not HTML-escaped entities.

Even if it were escaped, the correct HTML entity for > is &gt; (greater than), not &rt;.

Because of this, the regex will fail to match and remove the thinking process, which defeats the "M12 optimization" and can corrupt the extracted code.

Recommendation: Update the regex to match plain text <think> and </think> tags.

Suggested change

-        # Step 1: Filter out <think&rt; tag content (strip the chain of thought)
-        cleaned_result = re.sub(r'<think&rt;.*?</think&rt;', '', whole_result, flags=re.DOTALL)
+        # Step 1: Filter out <think> tag content (strip the chain of thought)
+        cleaned_result = re.sub(r'<think>.*?</think>', '', whole_result, flags=re.DOTALL)
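The difference between the two patterns is easy to check on a small input. In this sketch the sample `whole_result` string is invented for illustration, and `strip_thinking` is a hypothetical helper name; the trailing `.strip()` is an addition that the suggested change does not include.

```python
import re

# Plain-text tags, non-greedy body, DOTALL so "." also matches newlines.
THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_thinking(text):
    """Remove every <think>...</think> block, then trim surrounding whitespace."""
    return THINK_BLOCK.sub("", text).strip()


# Invented model output with a multi-line thinking block before the answer.
whole_result = "<think>\nplan the answer\n</think>\nprint('hello')"

# The pattern from the PR never matches, so the text comes back unchanged.
broken = re.sub(r"<think&rt;.*?</think&rt;", "", whole_result, flags=re.DOTALL)
print(broken == whole_result)        # True

# The corrected pattern removes the thinking block.
print(strip_thinking(whole_result))  # print('hello')
```

`re.DOTALL` matters here because thinking blocks normally span several lines; without it `.` would stop at the first newline and the block would survive.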

gemini-code-assist · 2026-05-29T08:23:35Z

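    # Initialize the Qwen3-4B tokenizer (automatically downloads/loads the Qwen3-4B tokenizer)
    # If the model is already downloaded locally, replace this with a local path, e.g. "./qwen3-4b"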
-    tokenizer = AutoTokenizer.from_pretrained("/share/project/wuhaiming/spaces/data_agent/OpenSeek-main/openseek/competition/LongContext-ICL-Annotation/src/Qwen3-4B", trust_remote_code=True)
+    tokenizer = AutoTokenizer.from_pretrained("/root/Qwen3-4B", trust_remote_code=True)


⚡ Performance Bottleneck: Repeatedly loading tokenizer from disk

In select_examples (and similarly in select_examples_M05, select_examples_M19, select_examples_M09, select_examples_M10, and select_examples_M11), the tokenizer is loaded from disk on every single function call:

tokenizer = AutoTokenizer.from_pretrained("/root/Qwen3-4B", trust_remote_code=True)

Since example selection is now dynamic and runs for every test sample, this will reload the tokenizer thousands of times, causing a massive performance bottleneck and extremely slow evaluation.

You already implemented a singleton helper get_tokenizer_m20() to cache the tokenizer:

def get_tokenizer_m20():
    global _tokenizer_m20
    if _tokenizer_m20 is None:
        _tokenizer_m20 = AutoTokenizer.from_pretrained("/root/Qwen3-4B", trust_remote_code=True)
    return _tokenizer_m20

Recommendation: Use get_tokenizer_m20() across all select_examples functions to reuse the cached tokenizer instance.

Suggested change

-    # Initialize the Qwen3-4B tokenizer (automatically downloads/loads the Qwen3-4B tokenizer)
-    # If the model is already downloaded locally, replace this with a local path, e.g. "./qwen3-4b"
-    tokenizer = AutoTokenizer.from_pretrained("/share/project/wuhaiming/spaces/data_agent/OpenSeek-main/openseek/competition/LongContext-ICL-Annotation/src/Qwen3-4B", trust_remote_code=True)
-    tokenizer = AutoTokenizer.from_pretrained("/root/Qwen3-4B", trust_remote_code=True)
+    # Use a global singleton to avoid reloading the tokenizer
+    tokenizer = get_tokenizer_m20()
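The same load-once behaviour can also be expressed with `functools.lru_cache` instead of a module-level global. The sketch below makes that assumption and does not touch the real tokenizer: `load_tokenizer_from_disk` is a hypothetical stand-in for `AutoTokenizer.from_pretrained(path, trust_remote_code=True)` that just counts how often it runs, and `get_tokenizer` is an illustrative name.

```python
from functools import lru_cache

load_count = 0  # how many times the (simulated) disk load actually ran


def load_tokenizer_from_disk(path):
    """Hypothetical stand-in for AutoTokenizer.from_pretrained(path, ...)."""
    global load_count
    load_count += 1
    return {"path": path}


@lru_cache(maxsize=None)
def get_tokenizer(path="/root/Qwen3-4B"):
    """Load the tokenizer on the first call and return the cached object after that."""
    return load_tokenizer_from_disk(path)


# Simulate example selection running once per test sample.
for _ in range(1000):
    tokenizer = get_tokenizer()

print(load_count)  # 1
```

Either form works; the point is that every `select_examples` variant goes through one accessor rather than calling the loader itself.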

Add files via upload

857b84a

gemini-code-assist Bot reviewed May 29, 2026

youkuaz2 wants to merge 1 commit into FlagAI-Open:main from youkuaz2:main
