feat(ui): global search service#6695
Conversation
a84dd79 to
501b594
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #6695 +/- ##
==========================================
+ Coverage 89.91% 89.94% +0.03%
==========================================
Files 396 397 +1
Lines 110609 110760 +151
Branches 110609 110760 +151
==========================================
+ Hits 99454 99624 +170
+ Misses 7378 7369 -9
+ Partials 3777 3767 -10 ☔ View full report in Codecov by Harness. |
Merging this PR will improve performance by 35.18%
Performance Changes
Tip Curious why this is faster? Comment Comparing |
Introduce a reactive `GlobalSearch` service that will act as an entry point for results aggregated from multiple sources. The only source for now is the SDK's per-room message search, merged across the filtered rooms by relevance score, with each hit's content and sender resolved via the timeline machinery. The design leaves room for further `SearchResult` kinds (people, rooms etc.) and for swapping in server-side sources without changing the interface.
Replace the iterator-based message search FFI with a thin wrapper around the new `matrix_sdk_ui::search::GlobalSearch`.
501b594 to
f36d52f
Compare
Hywan
left a comment
There was a problem hiding this comment.
The PR description and commits mention GlobalSearch while the code has a SearchService. Who's the lier?
| } | ||
|
|
||
| /// A global, reactive, paginated search across all the user's data. | ||
| pub struct SearchService { |
There was a problem hiding this comment.
I think this design can be improved: we should not involve two locks here.
Ideally, you want:
SearchService::new(client: Client)so you get theSearchServiceand you can run queries with it,SearchService::query(&self, query: String) -> ObservableQuery
And then, on ObservableQuery, you can have the following methods:
ObservableQuery::subscribe(&self) -> QueryResultSubscriberto subscribe to results, whereQueryResultSubscriberis a thin type aroundeyeball_im::VectorSubscriber<QueryResult>exposing the following methods:QueryResultSubscriber::values(&self) -> Vector<QueryResult>to get all current values,QueryResultSubscriber::into_values_and_stream(self) -> (Vector<QueryResult>, eyeball_im::VectorSubscriberBatchedStream<QueryResult>), which is the equivalent ofeyeball_im::VectorSubscriber::into_values_and_batched_stream.
ObservableQuery::paginate(&self)to paginate the query further moreObservableQuery::pagination_state(&self) -> QueryPaginationStateto get the pagination status of this query.
Now we have a design, how would you implement this? I initially wrote many lines of code but I ultimately got stuck on the same problem as you are I believe: paginating requires calling Stream::next(), which implies you have to store the stream somewhere.
Maybe we can loop over the stream in a task, and have a channel between the task and ObservableQuery. When calling ObservableQuery::paginate(), a message is sent over the channel, and the loop over the stream does a new iteration, such as:
fn task(
client: Client,
query_results: Arc<Mutex<ObservableVector<QueryResult>>>,
pagination: Arc<Notify>,
pagination_state: SharedObservable<QueryPaginationState>
) {
let stream = client.search_messages(query).build_events();
while let Some(query_results) = stream.next() {
// add `query_results` on the shared observable vector
pagination.notified().await;
}
}
pub struct ObservableQuery {
task: AbortOnDrop<()>,
pagination: Arc<Notify>,
pagination_state: SharedObservable<QueryPaginationState>,
query_results: Arc<Mutex<ObservableVector<QueryResult>>>,
}
impl ObservableQuery {
fn new(client: Client, query: String) -> Self {
let pagination = Arc::new(Notify::new());
let pagination_state = SharedObservable::new(QueryPaginationState::default());
let query_results = Arc::new(Mutex::new(ObservableVector::new()));
let task = task(client, query_results.clone(), pagination.clone());
Self { … }
}
pub async fn paginate(&self) -> bool {
match self.pagination_state.read().await {
PaginationState::Idle { end_reached } if end_reached.not()) => {
self.pagination.notify_one();
true
}
_ => false,
}
}
}That way, you don't have to store the stream. You only have a single lock around the shared ObservableVector of QueryResults. You also no longer have an Option around the stream! And finally, the Client is passed directly to and owned by the task.
Also, notify_one (called by paginate) adds one permit every time it is called, ensuring the task will call next() exactly paginate has been called. You already have this guarantee now.
Note that we can check the pagination state in paginate to avoid calling notify_one.
There was a problem hiding this comment.
We had a chat about this yesterday and agreed to start with a simplified version in which the streams are driven from background tasks and the task moves the pagination forward by listening to a Notify object.
I've done so in f0f9f27 but after seeing it in action I'm not so sure it made anything better:
- we still have 2 mutexes as the task itself needs one.
resultsalso gained an Arc - paginate is now fire and forget and errors need to be surfaced from the listener. They also stop the stream so the recovery path now involves setting a new query, which isn't particularly ergonomic
- and generally there's just more things to think and reason about now for honestly not clear gains
Did I perhaps completely misunderstood your vision here? 🤔
435c6c1 to
cc4a954
Compare
There was a problem hiding this comment.
A couple of feedback.
To address the fact the pagination can return an error, we could do something a bit different: instead of using Notify, we can use a channel where the pagination sends ControlFlow::Continue(()) and then immediately waits to receive a ControlFlow too. If it receives a ControlFlow::Continue(()), then no error; otherwise if it receives a ControlFlow::Break(error), then it has to return the error. It will require 2 channels for that 🤔.
Edit: note that the design you adopted looks fine to me.
123ee8c to
e540702
Compare
Hywan
left a comment
There was a problem hiding this comment.
I won't lie: this is NOT your fault, but I'm not super happy with the final result: more refactoring of the lower layers would be required to make it better. That's probably okay-ish with the current APIs though. It's still experimental and we can break it whenever we want. Let's iterate based on that!
Thanks for tackling this and for having tried a different approach!
Introduce a reactive
GlobalSearchservice that will act as an entry point for results aggregated from multiple sources.The only source for now is the SDK's per-room message search, merged across the filtered rooms by relevance score, with each hit's content and sender resolved via the timeline machinery. The design leaves room for further
SearchResultkinds (people, rooms etc.) and for swapping in server-side sources without changing the interface.