-
-
Notifications
You must be signed in to change notification settings - Fork 184
Improve wikilink parsing and checking #1799
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
e9983af to
f6f7814
Compare
|
@jrfnl I am having trouble understanding the MediaWiki links with Could you please clarify this? |
ec15bc2 to
3a919db
Compare
|
Guess you can rebase on top of |
mre
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good work! I've added a bunch of comments, but don't worry.
Most of them are about very minor paper cuts. Most of it can be fixed pretty easily by clicking on the "accept suggestion" button, I hope. 😉
b251e14 to
f6303a9
Compare
c6c566f to
589e1f3
Compare
fcdf77c to
e0912ab
Compare
8f60582 to
589e1f3
Compare
|
@JayJayArr, any chance you could fix the merge conflicts? I would love to get this great work merged. 😊 |
Feat: strip Potholes and Headings from wikilinks adjust fixture to contain Headers and Potholes add integration test for fixture split fixture into obsidian and mediawiki
switch to tokio mutex
cleanup fix: merge conflicts
Co-authored-by: Matthias Endler <[email protected]>
589e1f3 to
a016b0a
Compare
include span in tests fix: allow too many lines fix merge conflicts fix merge conflicts
a016b0a to
341e27f
Compare
|
There we go. Lets call it okay for now 😅 |
| // Initializes the index of the wikilink checker | ||
| fn setup_wikilinks(&self) -> Result<(), ErrorKind> { | ||
| match &self.wikilink_checker { | ||
| Some(checker) => checker.setup_wikilinks_index(), | ||
| None => Err(ErrorKind::WikilinkCheckerInit( | ||
| "Initialization failed, no checker instantiated".to_string(), | ||
| )), | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use a let-else
/// Initializes the index of the wikilink checker
fn setup_wikilinks(&self) -> Result<(), ErrorKind> {
let Some(checker) = &self.wikilink_checker else {
return Err(ErrorKind::WikilinkCheckerInit(
"Initialization failed, no checker instantiated".to_string(),
));
};
checker.setup_wikilinks_index()
}I also changed the comment into a doc-comment, which is helpful even for non-public methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking a little more about it, I'm not sure if it needs to return an error at all. Maybe we can just silently ignore the error or return a bool?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two cases where this could fail is either because of
a. a poisoned Mutex or
b. the user given base-url is not local.
I just realized, that the FileChecker itself is never called if the base-url is Remote so it would be a viable option to return a bool.
Just out of curiosity: Do you see any downsides in returning an Error except cluttering up the ErrorKinds?
| // Tries to resolve a link by looking up the filename in the wikilink index | ||
| fn apply_wikilink_check(&self, path: &Path, uri: &Uri) -> Result<PathBuf, ErrorKind> { | ||
| let mut path_buf = path.to_path_buf(); | ||
| for ext in &self.fallback_extensions { | ||
| path_buf.set_extension(ext); | ||
| if let Some(checker) = &self.wikilink_checker { | ||
| match checker.contains_path(&path_buf) { | ||
| None => { | ||
| trace!("Tried to find wikilink {} at {}", uri, path_buf.display()); | ||
| } | ||
| Some(resolved_path) => return Ok(resolved_path), | ||
| } | ||
| } | ||
| } | ||
|
|
||
| Err(ErrorKind::InvalidFilePath(uri.clone())) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I still have a bunch of issues with that code. 😅
- I don't love the method name because the method does more than just applying the check. It looks like it resolves the path.
- The
if let Some(checker)is inside the loop but the checker doesn't change - The match on
contains_pathwithNone => trace!()is awkward - You're iterating extensions but modifying the same path_buf
Here's an attempt to fix the above:
/// Resolves a wikilink reference to its actual file path.
///
/// Iterates through configured fallback extensions (e.g., `.md`, `.txt`) and checks
/// the wikilink index for a matching file. Returns the first match found.
///
/// # Errors
///
/// Returns `ErrorKind::InvalidFilePath` if no matching file is found in the index.
fn resolve_wikilink(&self, path: &Path, uri: &Uri) -> Result<PathBuf, ErrorKind> {
let Some(checker) = &self.wikilink_checker else {
return Err(ErrorKind::InvalidFilePath(uri.clone()));
};
for ext in &self.fallback_extensions {
let mut candidate = path.to_path_buf();
candidate.set_extension(ext);
if let Some(resolved) = checker.contains_path(&candidate) {
return Ok(resolved);
}
trace!("Wikilink not found: {} at {}", uri, candidate.display());
}
Err(ErrorKind::InvalidFilePath(uri.clone()))
}There are a few more issues. For example, if the wikilink_checker is None, we should return a different type of error. Or ideally, we should not even be able to call that method on it.
If wikilink_checker can be None, you're doing defensive checks everywhere instead of encoding the constraint in the type system. I'm thinking that what we really want is a WikiLinkResolver. Then this will become nicer. 😉
Here's what I mean:
struct WikilinkResolver {
checker: WikilinkChecker,
fallback_extensions: Vec<String>,
}
impl WikilinkResolver {
/// Resolves a wikilink by searching the index with fallback extensions.
fn resolve(&self, path: &Path, uri: &Uri) -> Result<PathBuf, ErrorKind> {
for ext in &self.fallback_extensions {
let mut candidate = path.to_path_buf();
candidate.set_extension(ext);
if let Some(resolved) = self.checker.contains_path(&candidate) {
return Ok(resolved);
}
trace!("Wikilink not found: {} at {}", uri, candidate.display());
}
Err(ErrorKind::WikilinkNotFound(uri.clone()))
}
}Then we don't have to deal with the None check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment, the wikilink functionality is scattered across multiple places and would be much cleaner as its own module.
Just one idea:
lychee-lib/src/checker/
├── mod.rs
├── file.rs
└── wikilink/
├── mod.rs
├── resolver.rs # WikilinkResolver - the main resolution logic
└── index.rs # WikilinkIndex - the filename indexing data structure
This has the added advantage that we also have all the tests in one place and can write a proper module documentation.
We don't have to do this right away. It's okay if we keep it as is and only fix the other comments. But feel free to move stuff around as well. 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I REALLY appreciate this remark! ❤️
Right now I am still a little afraid to screw around too much in other peoples projects 😅
But I will have another look into it when I have the time to attack the WikilinkResolver
Co-authored-by: Matthias Endler <[email protected]>
--fallback-extensions