feat: Support overriding language ID of the text emitted #1837
cqjjjzr wants to merge 1 commit into rime:master
Pull request overview
This PR adds a mechanism to override the TSF language ID applied to committed / composing text, so apps can use the correct font/spellcheck language even when the active keyboard layout would otherwise force a different LANGID.
Changes:
- Introduces a commit_langid config value transported over IPC and stored per session.
- Applies GUID_PROP_LANGID on TSF ranges (composition start, inline preedit updates, and committed text insertion).
- Removes reliance on the previously-unused Config::inline_preedit field and uses UI style instead.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| WeaselTSF/WeaselTSF.h | Adds _SetRangeLanguage API and _textLangId storage for TSF language override. |
| WeaselTSF/EditSession.cpp | Reads commit_langid from IPC response and switches inline-preedit decision to style. |
| WeaselTSF/DisplayAttribute.cpp | Implements setting GUID_PROP_LANGID on a TSF range. |
| WeaselTSF/Composition.cpp | Applies the language override to composition/preedit/commit ranges. |
| WeaselIPC/Configurator.cpp | Parses config.commit_langid from IPC messages. |
| RimeWithWeasel/RimeWithWeasel.cpp | Loads locale-based override from configs and emits config.commit_langid over IPC. |
| include/WeaselIPCData.h | Updates IPC Config struct to carry commit_langid. |
| include/RimeWithWeasel.h | Extends session status to store commit_langid and adds loader method declaration. |

    if (ok) {
      bool inline_preedit = _cand->style().inline_preedit;
      _textLangId = static_cast<LANGID>(config.commit_langid);
      if (!commit.empty()) {

_textLangId is stored as a mutable WeaselTSF member, but the actual language assignment happens later in separately requested (potentially async) edit sessions (_StartComposition/_InsertText/_ShowInlinePreedit). If multiple edit sessions are queued, _textLangId can be overwritten before those sessions run, causing the wrong LANGID to be applied to the range. Consider capturing the langid per edit-session instance (store it in the edit session object) or applying the GUID_PROP_LANGID value within the same edit session that sets the text.
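The difference between reading a shared member late and capturing the value per edit session can be sketched without any TSF dependency. Everything below is hypothetical and only models the concern: `EditSessionQueue`, `LangId`, and the method names are illustrative stand-ins, not Weasel or TSF APIs, and the queue merely imitates edit sessions that run some time after they are requested.

```cpp
#include <cstdint>
#include <functional>
#include <queue>
#include <vector>

// Stand-in for the Windows LANGID type (a 16-bit unsigned value).
using LangId = std::uint16_t;

// Hypothetical model of edit sessions that run after they are requested.
class EditSessionQueue {
 public:
  void SetSharedLangId(LangId id) { shared_lang_id_ = id; }

  // Racy variant: the session reads the shared member only when it runs,
  // so a later SetSharedLangId() call changes what it applies.
  void QueueUsingSharedState() {
    pending_.push([this] { applied_.push_back(shared_lang_id_); });
  }

  // Safer variant: the session copies the value at request time.
  void QueueWithCapturedValue() {
    const LangId captured = shared_lang_id_;
    pending_.push([this, captured] { applied_.push_back(captured); });
  }

  // Runs every queued session in request order.
  void RunAll() {
    while (!pending_.empty()) {
      pending_.front()();
      pending_.pop();
    }
  }

  // Language IDs in the order the sessions applied them.
  const std::vector<LangId>& applied() const { return applied_; }

 private:
  LangId shared_lang_id_ = 0;
  std::queue<std::function<void()>> pending_;
  std::vector<LangId> applied_;
};
```

If the shared value is changed between queuing and running, the first variant applies the newer value while the second still applies the one that was current when the session was requested.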

    if (ok) {
      bool inline_preedit = _cand->style().inline_preedit;
      _textLangId = static_cast<LANGID>(config.commit_langid);

Casting config.commit_langid (int) to LANGID will silently truncate out-of-range values. Please validate/clamp to the valid LANGID range (0..0xFFFF) and treat invalid values as “no override” (or log) to avoid setting an unintended language ID.
Suggested change:

    - _textLangId = static_cast<LANGID>(config.commit_langid);
    + if (config.commit_langid >= 0 && config.commit_langid <= 0xFFFF) {
    +   _textLangId = static_cast<LANGID>(config.commit_langid);
    + }

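One way to keep the range check in a single place is a small helper that returns an empty optional for anything that is not a usable override. This is a sketch only: `LangId` and `ToLangIdOverride` are hypothetical names, and treating 0 as "no override" is an assumption borrowed from the later comment on TryGetLangIdFromConfig rather than something the PR defines.

```cpp
#include <cstdint>
#include <optional>

// Stand-in for the Windows LANGID type (a 16-bit unsigned value).
using LangId = std::uint16_t;

// Returns the override only when `raw` is non-zero and fits in 16 bits.
// Assumption: 0 means "no override", so it is rejected like invalid input.
std::optional<LangId> ToLangIdOverride(int raw) {
  if (raw <= 0 || raw > 0xFFFF) {
    return std::nullopt;
  }
  return static_cast<LangId>(raw);
}
```

Without such a check, a plain cast keeps only the low 16 bits, so a value like 0x10411 would silently become 0x0411.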

    if (SUCCEEDED(pContext->GetProperty(GUID_PROP_LANGID, &pLangIdProperty))) {
      VARIANT var;
      var.vt = VT_I4;
      var.lVal = _textLangId;
      hr = pLangIdProperty->SetValue(ec, pRange, &var);
      pLangIdProperty->Release();

VARIANT var is used without VariantInit(). Even though only VT_I4 is assigned, initializing the VARIANT avoids uninitialized padding/fields and aligns with COM best practices (same pattern also exists in _SetCompositionDisplayAttributes).

      void Configurator::Store(Deserializer::KeyType const& key,
                               std::wstring const& value) {
    -   if (!m_pTarget->p_context || key.size() < 2)
    +   if (!m_pTarget->p_config || key.size() < 2)
          return;
    -   bool bool_value = (!value.empty() && value != L"0");
    -   if (key[1] == L"inline_preedit") {
    -     m_pTarget->p_config->inline_preedit = bool_value;
    +   if (key[1] == L"commit_langid") {
    +     m_pTarget->p_config->commit_langid = _wtoi(value.c_str());
        }

Parsing commit_langid with _wtoi provides no error reporting and accepts negative / out-of-range values, which then flow into TSF language property setting. Prefer a checked parse (e.g., wcstol with endptr) and clamp/reject values outside the valid LANGID range to keep IPC robust against malformed inputs.
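A checked parse along these lines might look like the sketch below. `ParseLangId` is a hypothetical helper, not part of WeaselIPC; it assumes the value is sent as a decimal string (matching what _wtoi would read) and that 0 should count as "not set".

```cpp
#include <cerrno>
#include <cstdint>
#include <cwchar>
#include <optional>
#include <string>

// Parses a decimal language ID, rejecting anything _wtoi would quietly accept:
// empty input, non-numeric text, trailing junk, overflow, and values outside
// 1..0xFFFF. Leading whitespace is still skipped by wcstol.
std::optional<std::uint16_t> ParseLangId(const std::wstring& text) {
  if (text.empty()) {
    return std::nullopt;
  }
  errno = 0;
  wchar_t* end = nullptr;
  const long parsed = std::wcstol(text.c_str(), &end, 10);
  if (end == text.c_str() || *end != L'\0') {
    return std::nullopt;  // no digits consumed, or characters left over
  }
  if (errno == ERANGE) {
    return std::nullopt;  // value does not fit in a long
  }
  if (parsed <= 0 || parsed > 0xFFFF) {
    return std::nullopt;  // outside the 16-bit LANGID range, or "not set"
  }
  return static_cast<std::uint16_t>(parsed);
}
```

For example, the string "1041" parses to 0x0411, while "1041x", "-1", and "65536" are all rejected.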

    if (!lcid)
      return false;

    *commit_langid = LANGIDFROMLCID(lcid);

TryGetLangIdFromConfig writes the resulting LANGID into an int without any bounds/type enforcement. Since this value is later serialized over IPC and cast to LANGID, it’s safer to explicitly constrain it to 0..0xFFFF (and treat 0 as “not set”) before storing/sending.
Suggested change:

    - *commit_langid = LANGIDFROMLCID(lcid);
    + const unsigned long langid = static_cast<unsigned long>(LANGIDFROMLCID(lcid));
    + if (langid == 0 || langid > 0xFFFFul)
    +   return false;
    + *commit_langid = static_cast<int>(langid);

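For readers without the Windows headers at hand, the LCID to LANGID step can be modelled portably. In this sketch `Lcid`, `LangId`, `LangIdFromLcid`, and `TryStoreLangId` are hypothetical stand-ins; the only fact relied on is that LANGIDFROMLCID keeps the low 16 bits of the LCID.

```cpp
#include <cstdint>

// Stand-ins for the Windows LCID (32-bit) and LANGID (16-bit) types.
using Lcid = std::uint32_t;
using LangId = std::uint16_t;

// Portable equivalent of LANGIDFROMLCID: keep the low 16 bits, dropping the
// sort ID stored in the upper bits.
constexpr LangId LangIdFromLcid(Lcid lcid) {
  return static_cast<LangId>(lcid & 0xFFFFu);
}

// Writes the language ID and returns true only when it is non-zero.
// On failure, *commit_langid is left untouched.
bool TryStoreLangId(Lcid lcid, int* commit_langid) {
  const LangId langid = LangIdFromLcid(lcid);
  if (langid == 0) {
    return false;
  }
  *commit_langid = static_cast<int>(langid);
  return true;
}
```

Because the extracted value is already 16 bits wide, the zero check is the one that does the work in this sketch; an upper-bound check is kept in the suggestion above mainly as a defensive measure.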
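Reference:
I found that the field in weasel::Config is not actually used (style can replace it), so I removed it and replaced it with a LANGID. This relies on a TSF feature: when writing text into an ITfRange, GUID_PROP_LANGID can specify the language of that text. The override prevents text in other languages typed through RIME from being tagged as Chinese because of the current keyboard setting, which causes wrong fonts and spellcheck errors. This adds a new config option, which may need to be documented.
The flickering of pre-edit text is also fixed, as shown in the image below (note that the Japanese text automatically switches to Yu Mincho, while the Chinese text uses the default DengXian).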