Add multi-label support to Find box for number fields#6964
rvisser7 wants to merge 19 commits into LMFDB:main
if not labels_input or not hasattr(table, "_label_col"):
    return

labels, seen = [], set()
I think you could just do labels = list(set(label.strip() for label in labels_input.split(","))).
Perhaps this doesn't matter too much, but would it maybe make more sense for parse_labels to instead be in the utils/search_parsing.py file?
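To make the trade-off in this thread concrete, here is a minimal sketch of the two ways of deduplicating the labels before building the {"$in": [...]} query. The function name parse_labels_sketch, its signature and the label_col argument are made up for illustration and are not the code in this PR. The loop version keeps the order in which labels were typed and drops empty entries; the set one-liner is shorter but does neither.

```python
def parse_labels_sketch(labels_input, label_col="label"):
    """Hypothetical stand-in: build a {"$in": [...]} query from a comma-separated string."""
    labels, seen = [], set()
    for raw in labels_input.split(","):
        label = raw.strip()
        # Skip empty pieces (e.g. from ",,") and anything already seen.
        if label and label not in seen:
            seen.add(label)
            labels.append(label)
    return {label_col: {"$in": labels}}


def dedupe_with_set(labels_input):
    """The one-liner suggested in review: same labels, but order is not guaranteed."""
    return list(set(label.strip() for label in labels_input.split(",")))


raw = "2.2.5.1, 2.2.8.1 ,2.2.5.1,, 2.2.12.1"
print(parse_labels_sketch(raw))
# {'label': {'$in': ['2.2.5.1', '2.2.8.1', '2.2.12.1']}}
print(sorted(dedupe_with_set(raw)))  # note the empty string that survives
```

For an "$in" query the order probably doesn't matter, so either approach should give the same search results; the difference is mainly whether empty entries get filtered out.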
not_parsed, not_found = 0, 0
for entry in entries:
    try:
        label = parse_entry(entry)
@jwj61 expressed concern about polredabs getting called many times. I wonder if you could also add a progressive timeout here, where you stop if the total amount of parsing time surpasses an amount determined by a keyword.
Ah, of course, thanks - I forgot about this!
I've added a timer to this for loop. If the timer hits the value set by time_limit (I've put a default of 30 seconds), then it flashes an error and returns the index page. At the moment, the timer is only checked between each entry being parsed, so I am assuming that parsing at least a single entry won't take too long.
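A rough sketch of the timer described above, for anyone following along. This is not the code in the PR: the function name parse_entries_with_budget, the timed_out return value, the injectable clock argument and the choice of ValueError as the parse failure are all assumptions made for the example. As in the description, the budget is only checked between entries, so a single slow entry can still overrun it.

```python
import time


def parse_entries_with_budget(entries, parse_entry, label_exists,
                              time_limit=30.0, clock=time.monotonic):
    """Parse entries one by one, stopping once the total time exceeds time_limit.

    Returns (labels, not_parsed, not_found, timed_out).
    """
    labels, not_parsed, not_found = [], 0, 0
    start = clock()
    for entry in entries:
        # The budget is checked between entries only, never mid-parse.
        if clock() - start > time_limit:
            return labels, not_parsed, not_found, True
        try:
            label = parse_entry(entry)
        except ValueError:  # assumed failure mode of the section-specific parser
            not_parsed += 1
            continue
        if label_exists(label):
            labels.append(label)
        else:
            not_found += 1
    return labels, not_parsed, not_found, False


def toy_parse(entry):
    """Toy parser: accept anything starting with a digit, reject the rest."""
    if not entry[:1].isdigit():
        raise ValueError(entry)
    return entry


known = {"2.2.5.1", "2.2.8.1"}
print(parse_entries_with_budget(["2.2.5.1", "banana", "2.2.99.1"],
                                toy_parse, known.__contains__))
```

In the real handler the caller would flash the error and return the index page when timed_out is set; the sketch just returns the flag so it can be tested without Flask.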
- ``index_endpoint`` -- the URL for the index homepage for this section
- ``input_key`` -- the dictionary key for the jump search box (default: "jump")
- ``labels_key`` -- the dictionary key for the labels search query (default: "labels")
- ``sep`` -- a string used as the separator for parsing the jump box input (default: ",")
Maybe you can provide sep as a function, defaulting to lambda x: re.split(",", x). Then you can allow Q(sqrt2,sqrt3) for a field name even though it has a comma in it by making a more complicated splitting function.
Thanks, this is a great suggestion!
I've had a go at writing a function split_top_level_commas, currently placed just above multi_entry_jump_search. This takes some input string and returns a list of substrings which only splits on commas which are not inside any parentheses/brackets/braces. Since this would probably give the intended behaviour for most sections (not just number fields), I've made this the default separator function.
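For reference, a minimal sketch of the splitting behaviour described above. It is an illustration only, not the function in the PR: the name split_top_level_commas_sketch is made up, and the choices to strip whitespace, drop empty pieces and tolerate unbalanced closing brackets are assumptions.

```python
def split_top_level_commas_sketch(text):
    """Split on commas that are not inside (), [] or {}."""
    pieces, current, depth = [], [], 0
    for ch in text:
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            # Never go below zero, so a stray closer doesn't break later splits.
            depth = max(depth - 1, 0)
        if ch == "," and depth == 0:
            pieces.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    pieces.append("".join(current).strip())
    # Drop empty pieces such as those produced by a trailing comma.
    return [piece for piece in pieces if piece]


print(split_top_level_commas_sketch("2.2.5.1, Q(sqrt2,sqrt3), x^3 - 2"))
# ['2.2.5.1', 'Q(sqrt2,sqrt3)', 'x^3 - 2']
```

The sketch only tracks nesting depth and doesn't check that bracket types match, which should be enough for deciding where to split.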
- ``info`` -- the info dictionary passed in from front end
- ``parse_entry`` -- a custom function which parses a string (e.g. polynomial, equation, nickname, etc.) into a label
- ``label_exists`` -- a custom function which determines whether a given label exists in the database
- ``index_endpoint`` -- the URL for the index homepage for this section
technically this is the input to url_for, not a url itself.
…update multi_entry_jump_search to use it as default separator
…r sep and time_limit
Just for fun, maybe I can also mention another nice consequence of this PR: it provides a convenient way to get a search page for essentially any parametrised family of fields directly via the Find box (or the Labels box). So in particular, I think this also at least gives some partial progress towards issue #6948 🙂
E.g. to obtain a search results page for the first few cyclotomic fields (ordered by degree), we can just copy-paste the Python output of
Just for convenience, I've given some links to search pages for some of the families mentioned in #6948 below. To estimate the polredabs cost, I've also given some rough estimates of the time taken for each page to load using the "jump" link (measured using the Legendre server):
I've had a go at implementing issue #6882 in this PR, just for number fields for now. It essentially follows the SneakyBox approach suggested by @roed314.
As always, any comments/feedback are very welcome! 🙂 At present, this is only implemented for number fields, but I'll hopefully extend this to other sections soon.
How it works:
A function multi_entry_jump_search has been added to search_wrapper.py. This is meant to be a generic handler which parses a comma-separated list of entries (e.g. labels/names/polynomials/equations) given as an input string in the jump box. Each entry is processed using a custom section-specific parser function parse_entry (e.g. for number fields, this is just nf_string_to_label).
If any of the entries cannot be parsed, it flashes an info message giving the number of invalid entries. If all entries are invalid, it flashes an error message and returns the usual home search page.
Also in search_wrapper.py, a function parse_labels has been added to convert a "?labels=..." URL query into a database query of the form {"$in": [...]} on the label column.
I've tried to keep the section-specific code to a minimum. In particular, the only update to number_field.py is some code at the beginning of number_field_jump which first runs the generic multi_entry_jump_search parser, and sets the label_knowl argument (used to define the labels SneakyBox).
Examples for testing:
If there's no comma in the jump box, then it treats the input as usual. E.g.
http://localhost:37777/NumberField/?jump=2.2.5.1
http://localhost:37777/NumberField/?jump=Qsqrt7
http://localhost:37777/NumberField/?jump=x%5E2+-+3
E.g. a given list of labels:
http://localhost:37777/NumberField/?jump=2.2.5.1%2C+2.2.8.1%2C+2.2.12.1
E.g. a mix of labels, nicknames, polynomials:
http://localhost:37777/NumberField/?jump=2.2.5.1%2C+Qsqrt11%2C+x%5E3+-+2%2C+Qzeta10
E.g. a mix of labels and invalid entries:
http://localhost:37777/NumberField/?jump=3.3.49.1%2C+banana%2C+x%5E2+-+7
E.g. some random nonsense:
http://localhost:37777/NumberField/?jump=1234%2C+banana%2C+%21%21%21%2C+asdfghjk
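In case it's useful for testing other inputs, the URLs above can be generated with a few lines of standard-library Python. This is just a convenience sketch: the helper name jump_url is made up, and the base URL assumes a local server on port 37777 as in the examples.

```python
from urllib.parse import urlencode

# Assumes a local LMFDB instance on the same port as the examples above.
BASE = "http://localhost:37777/NumberField/"


def jump_url(entries):
    """Join the entries with ', ' and URL-encode them as the 'jump' parameter."""
    return BASE + "?" + urlencode({"jump": ", ".join(entries)})


print(jump_url(["2.2.5.1", "2.2.8.1", "2.2.12.1"]))
# http://localhost:37777/NumberField/?jump=2.2.5.1%2C+2.2.8.1%2C+2.2.12.1
print(jump_url(["3.3.49.1", "banana", "x^2 - 7"]))
# http://localhost:37777/NumberField/?jump=3.3.49.1%2C+banana%2C+x%5E2+-+7
```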
For now, I've just put this as a draft, just to get any preliminary feedback on whether this looks ok, or whether this should maybe be implemented in a different way. If the editors are happy with the above implementation, I can then implement this for all other sections of the LMFDB where we'd like to support a multi-label search in the "Find" jump box. :)