(patch): return entire list #3539
base: main
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
hanouticelina
left a comment
@ErikKaum Which model were you using?
I tried:

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="hf-inference",
    api_key=os.environ["HF_TOKEN"],
)
result = client.text_classification(
    "I like you. I love you",
    model="tabularisai/multilingual-sentiment-analysis",
    top_k=3,
)
print(result)
# [TextClassificationOutputElement(label='Very Positive', score=0.6660197973251343), TextClassificationOutputElement(label='Positive', score=0.23012897372245789), TextClassificationOutputElement(label='Neutral', score=0.061766646802425385)]
```

The output isn't truncated: the API returns a list of lists, so we use `[0]` to get the inner list, where each element is a `TextClassificationOutputElement`.

Same for summarization and translation (we don't expect a list as output for these tasks).
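To make the shape being described concrete, here is a hypothetical sketch (with made-up scores, not the actual client internals): the raw JSON body is a list with one inner list per input text, and the client indexes `[0]` to pull out the scores for the single input.

```python
# Hypothetical sketch of the list-of-lists shape described above;
# the labels and scores are illustrative, not real API output.
raw_response = [
    [
        {"label": "Very Positive", "score": 0.666},
        {"label": "Positive", "score": 0.230},
        {"label": "Neutral", "score": 0.062},
    ]
]

# The outer list has one entry per input text, so the client takes [0]
# to get the top_k results for the single input.
scores = raw_response[0]
print(len(scores))  # 3: nothing is truncated, all top_k results are present
```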
Ah that's gnarly, in that case it might be that the API serves it in the wrong format 😓 I used this model: emotion-english-distilroberta-base, with the default container in Inference Endpoints. Did you use TEI or something else to run the model?
No, it's calling https://endpoints.huggingface.co/hf-inference/endpoints/auto-multilingual-sentiment-ana directly. btw, even with emotion-english-distilroberta-base, I'm getting the expected output (i.e. a list of `TextClassificationOutputElement`):

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="hf-inference",
    api_key=os.environ["HF_TOKEN"],
)
result = client.text_classification(
    "I like you. I love you",
    model="j-hartmann/emotion-english-distilroberta-base",
    top_k=3,
)
print(result)
# [TextClassificationOutputElement(label='joy', score=0.9762780666351318), TextClassificationOutputElement(label='sadness', score=0.006413786672055721), TextClassificationOutputElement(label='neutral', score=0.0055558281019330025)]
```
Okay this is kinda weird. Even with the same model, this is what I'm using:

```python
from huggingface_hub import InferenceClient


def main():
    client = InferenceClient(
        token="token"
    )
    test_text = "I love this product! It works great and exceeded my expectations."
    result = client.text_classification(
        text=test_text,
        model=url,  # url points at my inference endpoint
        top_k=3,
    )
    print(result)


if __name__ == "__main__":
    main()
```

It outputs:

```
TextClassificationOutputElement(label='joy', score=0.9758541584014893)
```

Either there's something weird in my client, or the inference endpoint that's served through the API has some added trick to it 🤔
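One way to paper over the discrepancy between the two replies above would be a client-side normalizer that accepts both observed shapes. This is a hypothetical helper, not part of huggingface_hub:

```python
def normalize_text_classification(payload):
    """Return a flat list of label/score dicts, whether the server sent
    a flat list (the endpoint behaviour seen above) or a list of lists
    (the HF Inference API behaviour). Hypothetical helper, not part of
    huggingface_hub."""
    if payload and isinstance(payload[0], list):
        return payload[0]
    return payload


# Both shapes observed in this thread normalize to the same flat list:
flat = [{"label": "joy", "score": 0.976}]
nested = [[{"label": "joy", "score": 0.976}]]
print(normalize_text_classification(flat) == normalize_text_classification(nested))  # True
```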
@ErikKaum yes indeed, just managed to reproduce that. pinging @oOraph if you have an idea? do we use specific pipelines for endpoints that are served through HF Inference? (TL;DR for @oOraph: with HF Inference, the API returns a list of lists of `TextClassificationOutputElement`)
(might be good to have the same repro example in curl / raw, i.e.:

```shell
curl -X POST \
  -H "authorization: Bearer $HF_TOKEN" \
  -H "content-type: application/json" \
  -d '{
    "inputs": "I love this product! It works great and exceeded my expectations.",
    "parameters": {
      "top_k": 3
    }
  }' \
  https://router.huggingface.co/hf-inference/models/j-hartmann/emotion-english-distilroberta-base
```

and the same with an Inference Endpoints URL)
^ yes, sorry, here is the Python repro I used:

```python
import os

import requests


def query(payload):
    endpoint_url = ...
    headers = {
        "Accept": "application/json",
        "Authorization": f"Bearer {os.getenv('HF_TOKEN')}",
        "Content-Type": "application/json",
    }
    response = requests.post(endpoint_url, headers=headers, json=payload)
    return response.json()


output = query({"inputs": "I like you. I love you", "parameters": {}})
print(output)
print(type(output))
print(type(output[0]))
```
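When debugging shapes like this, a tiny recursive helper (purely a debugging sketch, not from any library) can summarize the nesting of the decoded JSON more readably than chained `type()` calls:

```python
def describe_shape(obj):
    """Summarize the nesting of a decoded JSON value, e.g. 'list[list[dict]]'.
    Hypothetical debugging helper."""
    if isinstance(obj, list):
        inner = describe_shape(obj[0]) if obj else "?"
        return f"list[{inner}]"
    if isinstance(obj, dict):
        return "dict"
    return type(obj).__name__


# The two response shapes discussed in this thread:
print(describe_shape([[{"label": "joy", "score": 0.98}]]))  # list[list[dict]]
print(describe_shape([{"label": "joy", "score": 0.98}]))    # list[dict]
```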
I'll dig asap. I did not read everything yet so I might be totally wrong, but I guess the problem comes from here: -> I had to add this at some point for the Hub widgets to work correctly on text-classification. Honestly I don't remember why anymore and it might be irrelevant now (at the time, the widgets were in the middle of being reworked to use the huggingface.js lib if I'm not mistaken, so the issue may not be there anymore :))
OK so I looked.
Hence the registry.internal.huggingface.tech/hf-endpoints/inference-pytorch-cpu:api-inference-6.5.0 image + the API_INFERENCE_COMPAT=true env var (the output tweak mentioned above, to make the bridge between pipeline and widget). More details on the difference between the endpoints' classical output and the "tweaked" HF Inference output: endpoints return the raw pipeline output from transformers, but it varies depending on the body.
-> I just made a test to know whether the output tweak was still needed or not: still needed, otherwise we hit the following
(side note, script to see the varying output depending on the input cases: )
Okay, super nice detective work 😄 So to make sure I understood:
Honestly, I personally prefer that the output shape never changes based on the input, so I'd be all for that option here. I think on the inference endpoint side we'd just need to make sure that the new
Does that make sense?
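A sketch of that option (hypothetical bridge code, names invented, not actual server code): always batch the pipeline output so the response carries one inner list per input, regardless of whether a single string or a list of strings was posted:

```python
def to_batched_output(pipeline_output, inputs):
    """Always return a list of lists: one inner list of label/score dicts
    per input, so the output shape never depends on how the request was
    batched. Hypothetical sketch, not actual server code."""
    if isinstance(inputs, str):
        inputs = [inputs]
    # transformers pipelines can return a flat list for a single input;
    # wrap it so the nesting is the same in both cases
    if pipeline_output and not isinstance(pipeline_output[0], list):
        pipeline_output = [pipeline_output]
    assert len(pipeline_output) == len(inputs)
    return pipeline_output


single = [{"label": "joy", "score": 0.98}]
print(to_batched_output(single, "I love this"))      # [[{'label': 'joy', 'score': 0.98}]]
print(to_batched_output([single], ["I love this"]))  # [[{'label': 'joy', 'score': 0.98}]]
```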
Issue reported by a user. Calling text classification like so:
returns only 1 result despite setting `top_k` to 2. I think `top_k` is sent in correctly, but the result is truncated to return only the first element.
I noticed that a few other tasks had the same pattern, so I omitted the `[0]` there as well.
Note that I'm not 100% sure this is the correct fix, especially since the async client is auto-generated. Maybe you would prefer not to edit it directly?
Lemme know 🙌