Add files via upload #246

Open
rainbow2526 wants to merge 1 commit into FlagAI-Open:main from rainbow2526:main

@rainbow2526

No description provided.

gemini-code-assist (bot) left a comment

Code Review

This pull request switches the default annotation backend from Nvidia to Huawei Ascend, reduces the number of few-shot examples from 100 to 50, and updates various file and model paths to absolute /root/ paths. Feedback focuses on avoiding hardcoded absolute paths to improve portability, and removing the unused task_id parameter from the annotate_ascend call and definition to prevent signature mismatch and runtime errors when switching backends.

Comment on lines +13 to +20
+ 1: '/root/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data/openseek-1_closest_integers.json',
+ 2: '/root/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data/openseek-2_count_nouns_verbs.json',
+ 3: '/root/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data/openseek-3_collatz_conjecture.json',
+ 4: '/root/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data/openseek-4_conala_concat_strings.json',
+ 5: '/root/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data/openseek-5_semeval_2018_task1_tweet_sadness_detection.json',
+ 6: '/root/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data/openseek-6_mnli_same_genre_classification.json',
+ 7: '/root/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data/openseek-7_jeopardy_answer_generation_all.json',
+ 8: '/root/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data/openseek-8_kernel_generation.json',

high

Hardcoding absolute paths like /root/OpenSeek/... makes the codebase non-portable and prone to failure when run in different environments or by other users. It is highly recommended to resolve these paths dynamically relative to the script's location using os.path utilities.

Suggested change
- 1: '/root/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data/openseek-1_closest_integers.json',
- 2: '/root/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data/openseek-2_count_nouns_verbs.json',
- 3: '/root/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data/openseek-3_collatz_conjecture.json',
- 4: '/root/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data/openseek-4_conala_concat_strings.json',
- 5: '/root/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data/openseek-5_semeval_2018_task1_tweet_sadness_detection.json',
- 6: '/root/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data/openseek-6_mnli_same_genre_classification.json',
- 7: '/root/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data/openseek-7_jeopardy_answer_generation_all.json',
- 8: '/root/OpenSeek/openseek/competition/LongContext-ICL-Annotation/data/openseek-8_kernel_generation.json',
+ 1: os.path.join(os.path.dirname(__file__), '../data/openseek-1_closest_integers.json'),
+ 2: os.path.join(os.path.dirname(__file__), '../data/openseek-2_count_nouns_verbs.json'),
+ 3: os.path.join(os.path.dirname(__file__), '../data/openseek-3_collatz_conjecture.json'),
+ 4: os.path.join(os.path.dirname(__file__), '../data/openseek-4_conala_concat_strings.json'),
+ 5: os.path.join(os.path.dirname(__file__), '../data/openseek-5_semeval_2018_task1_tweet_sadness_detection.json'),
+ 6: os.path.join(os.path.dirname(__file__), '../data/openseek-6_mnli_same_genre_classification.json'),
+ 7: os.path.join(os.path.dirname(__file__), '../data/openseek-7_jeopardy_answer_generation_all.json'),
+ 8: os.path.join(os.path.dirname(__file__), '../data/openseek-8_kernel_generation.json'),
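
The relative-path idea in this suggestion could also be sketched with `pathlib`, building the whole mapping from one base directory. This is only an illustration: the helper name `build_task_paths` is hypothetical, and the assumption that the data directory sits at `../data` relative to the script comes from the suggested change, not from the repository itself.

```python
from pathlib import Path

# File names copied from the diff; the directory layout is an assumption.
TASK_FILES = {
    1: "openseek-1_closest_integers.json",
    2: "openseek-2_count_nouns_verbs.json",
    3: "openseek-3_collatz_conjecture.json",
    4: "openseek-4_conala_concat_strings.json",
    5: "openseek-5_semeval_2018_task1_tweet_sadness_detection.json",
    6: "openseek-6_mnli_same_genre_classification.json",
    7: "openseek-7_jeopardy_answer_generation_all.json",
    8: "openseek-8_kernel_generation.json",
}


def build_task_paths(script_file: str) -> dict[int, Path]:
    """Resolve every task file against <script directory>/../data."""
    data_dir = Path(script_file).resolve().parent.parent / "data"
    return {task_id: data_dir / name for task_id, name in TASK_FILES.items()}


if __name__ == "__main__":
    for task_id, path in build_task_paths(__file__).items():
        print(task_id, path)
```

Keeping the file names in one dictionary and the base directory in one expression means a layout change touches a single line rather than eight.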

Comment on lines +82 to +84
# prediction = annotate(input_prompt, task_id)
# test_record['prediction'] = prediction
prediction = annotate(input_prompt)
prediction = annotate(input_prompt, task_id)

high

Passing task_id to annotate breaks compatibility with annotate_nvidia, which only accepts a single argument (input_prompt). If a user switches to Nvidia GPU by uncommenting line 9, the code will raise a TypeError. Since task_id is not used inside annotate_ascend either, it should be removed from the call.

Suggested change
- # prediction = annotate(input_prompt, task_id)
- # test_record['prediction'] = prediction
- prediction = annotate(input_prompt)
- prediction = annotate(input_prompt, task_id)
+ # prediction = annotate(input_prompt)
+ # test_record['prediction'] = prediction
+ prediction = annotate(input_prompt)
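
To make the failure mode concrete, here is a small sketch with stand-in backends. The function bodies are placeholders, not the project's real implementations, and the helper `run` is hypothetical; only the two signatures mirror what the review describes.

```python
def annotate_nvidia(input_prompt: str) -> list[str]:
    # Stand-in for the Nvidia backend: accepts only the prompt.
    return [f"nvidia:{input_prompt}"]


def annotate_ascend(input_prompt: str, task_id: int = None) -> list[str]:
    # Stand-in mirroring the PR's signature: task_id is accepted but unused.
    return [f"ascend:{input_prompt}"]


def run(annotate, input_prompt: str, task_id: int) -> str:
    """Call the backend with two arguments, as the PR does, and report the outcome."""
    try:
        annotate(input_prompt, task_id)
        return "ok"
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    print(run(annotate_ascend, "2 + 2 = ?", 1))
    print(run(annotate_nvidia, "2 + 2 = ?", 1))
```

The Ascend stand-in accepts the extra argument, while the Nvidia stand-in rejects it with a `TypeError`, which is the breakage the comment warns about.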

return prediction

- def annotate_ascend(input_prompt:str)->list[str]:
+ def annotate_ascend(input_prompt:str, task_id:int=None)->list[str]:

high

The task_id parameter is unused in annotate_ascend and breaks signature compatibility with annotate_nvidia. It should be removed to maintain consistency and prevent runtime errors when switching between GPU backends.

Suggested change
- def annotate_ascend(input_prompt:str, task_id:int=None)->list[str]:
+ def annotate_ascend(input_prompt:str)->list[str]:
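
One possible guard against the two backends drifting apart again is a quick signature comparison, for example in a unit test. The sketch below uses `inspect.signature` from the standard library; the helper name `same_parameters` and the empty function bodies are hypothetical.

```python
import inspect


def annotate_nvidia(input_prompt: str) -> list[str]:
    return []  # placeholder body


def annotate_ascend(input_prompt: str) -> list[str]:
    return []  # placeholder body


def same_parameters(*funcs) -> bool:
    """Return True when all functions take the same parameter names in the same order."""
    names = [tuple(inspect.signature(f).parameters) for f in funcs]
    return all(n == names[0] for n in names)


if __name__ == "__main__":
    print(same_parameters(annotate_nvidia, annotate_ascend))
```

This compares parameter names only; it does not check annotations or defaults.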

  help='Prefix path to save the evaluation logs.')
  parser.add_argument('--tokenizer_path', type=str,
- default='/share/project/wuhaiming/spaces/data_agent/OpenSeek-main/openseek/competition/LongContext-ICL-Annotation/src/Qwen3-4B')
+ default='/root/Qwen3-4B')

medium

Hardcoding /root/Qwen3-4B as the default tokenizer path limits portability. Using an environment variable with a fallback makes it easier to configure across different environments.

Suggested change
- default='/root/Qwen3-4B')
+ default=os.getenv("TOKENIZER_PATH", "/root/Qwen3-4B"))
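
As a self-contained sketch of how this default behaves, the snippet below wires the environment fallback into an `argparse` parser. The wrapper `build_parser` and the help text are illustrative assumptions; the flag name, the `TOKENIZER_PATH` variable, and the fallback value come from the suggestion.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Precedence: --tokenizer_path flag, then TOKENIZER_PATH, then the PR's default.
    parser.add_argument(
        "--tokenizer_path",
        type=str,
        default=os.getenv("TOKENIZER_PATH", "/root/Qwen3-4B"),
        help="Path to the tokenizer directory (illustrative help text).",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.tokenizer_path)
```

The environment variable is read when the parser is built, so it must be set before the script starts; an explicit command-line flag still overrides it.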

  # Initialize the Qwen3-4B tokenizer (automatically downloads/loads the Qwen3-4B tokenizer)
  # If the model is already downloaded locally, this can be replaced with a local path, e.g. "./qwen3-4b"
- tokenizer = AutoTokenizer.from_pretrained("/share/project/wuhaiming/spaces/data_agent/OpenSeek-main/openseek/competition/LongContext-ICL-Annotation/src/Qwen3-4B", trust_remote_code=True)
+ tokenizer = AutoTokenizer.from_pretrained("/root/Qwen3-4B", trust_remote_code=True)

medium

Hardcoding /root/Qwen3-4B as the tokenizer path limits portability. Using an environment variable with a fallback allows for easier configuration.

Suggested change
- tokenizer = AutoTokenizer.from_pretrained("/root/Qwen3-4B", trust_remote_code=True)
+ import os
+ tokenizer_path = os.getenv("TOKENIZER_PATH", "/root/Qwen3-4B")
+ tokenizer = AutoTokenizer.from_pretrained(tokenizer_path, trust_remote_code=True)

  openai.api_key = "EMPTY"
  openai.base_url = "http://localhost:9010/v1/"
- model = "Qwen3-4B-ascend-flagos"
+ model = "/root/Qwen3-4B"

medium

Hardcoding /root/Qwen3-4B as the model name/path limits portability. Using an environment variable with a fallback allows for easier configuration.

Suggested change
- model = "/root/Qwen3-4B"
+ import os
+ model = os.getenv("MODEL_PATH", "/root/Qwen3-4B")
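
Since the same fallback pattern now appears for both the tokenizer path and the model name, one option is to read them in a single place. The sketch below is an assumption about how that could look, not something the PR contains: `Settings` and `load_settings` are hypothetical names, while the variable names and defaults are taken from the suggestions in this review.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    tokenizer_path: str
    model: str


def load_settings(env=None) -> Settings:
    """Read both values from one mapping so the defaults live in one place."""
    env = os.environ if env is None else env
    return Settings(
        tokenizer_path=env.get("TOKENIZER_PATH", "/root/Qwen3-4B"),
        model=env.get("MODEL_PATH", "/root/Qwen3-4B"),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dictionary as `env` makes the lookup easy to test without touching the real process environment.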
