Skip to content

Conversation

@JayJayArr
Copy link
Contributor

@JayJayArr JayJayArr commented Aug 14, 2025

  • Extract the correct filename from a Wikilink by removing potholes and fragments
  • Create a new Checker for wikilinks, make it clear to the users that a base-url is required
    • Check Obsidian style wikilinks without headers
    • strip fragments & potholes from wikilinks
    • traverse the wiki-directory and check if any of the Filenames exist with any of the specified --fallback-extensions

@JayJayArr JayJayArr changed the title Improve wikilinks, change to opt-in Improve wikilink parsing and checking Aug 18, 2025
@JayJayArr JayJayArr force-pushed the 1788-spaced-wikilinks branch from e9983af to f6f7814 Compare August 21, 2025 06:03
@JayJayArr
Copy link
Contributor Author

@jrfnl I am having trouble understanding the MediaWiki links with () and :.
Are there really markdown files named Help:Links or where do they point to?
Is [[Extension:DynamicPageList (Wikimedia)]] different to [[Extension:DynamicPageList]]

Could you please clarify this?

@JayJayArr JayJayArr force-pushed the 1788-spaced-wikilinks branch 3 times, most recently from ec15bc2 to 3a919db Compare September 30, 2025 10:24
@JayJayArr JayJayArr marked this pull request as ready for review September 30, 2025 12:28
@mre
Copy link
Member

mre commented Sep 30, 2025

Guess you can rebase on top of master now to fix CI.

Copy link
Member

@mre mre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work! I've added a bunch of comments, but don't worry.
Most of them are about very minor paper cuts. Most of it can be fixed pretty easily by clicking on the "accept suggestion" button, I hope. 😉

@JayJayArr JayJayArr force-pushed the 1788-spaced-wikilinks branch 2 times, most recently from b251e14 to f6303a9 Compare October 3, 2025 17:10
@JayJayArr JayJayArr force-pushed the 1788-spaced-wikilinks branch 3 times, most recently from c6c566f to 589e1f3 Compare October 13, 2025 12:45
@thomas-zahner thomas-zahner force-pushed the master branch 2 times, most recently from fcdf77c to e0912ab Compare October 21, 2025 12:53
@JayJayArr JayJayArr force-pushed the 1788-spaced-wikilinks branch from 8f60582 to 589e1f3 Compare October 22, 2025 21:08
@mre
Copy link
Member

mre commented Nov 12, 2025

@JayJayArr, any chance you could fix the merge conflicts? I would love to get this great work merged. 😊

@JayJayArr JayJayArr force-pushed the 1788-spaced-wikilinks branch from 589e1f3 to a016b0a Compare November 13, 2025 20:52
include span in tests

fix: allow too many lines

fix merge conflicts

fix merge conflicts
@JayJayArr JayJayArr force-pushed the 1788-spaced-wikilinks branch from a016b0a to 341e27f Compare November 13, 2025 20:59
@JayJayArr
Copy link
Contributor Author

There we go. Lets call it okay for now 😅

Comment on lines +336 to +344
// Initializes the index of the wikilink checker
fn setup_wikilinks(&self) -> Result<(), ErrorKind> {
match &self.wikilink_checker {
Some(checker) => checker.setup_wikilinks_index(),
None => Err(ErrorKind::WikilinkCheckerInit(
"Initialization failed, no checker instantiated".to_string(),
)),
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use a let-else

/// Initializes the index of the wikilink checker
fn setup_wikilinks(&self) -> Result<(), ErrorKind> {
    let Some(checker) = &self.wikilink_checker else {
        return Err(ErrorKind::WikilinkCheckerInit(
            "Initialization failed, no checker instantiated".to_string(),
        ));
    };
    checker.setup_wikilinks_index()
}

I also changed the comment into a doc-comment, which is helpful even for non-public methods.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking a little more about it, I'm not sure if it needs to return an error at all. Maybe we can just silently ignore the error or return a bool?

Copy link
Contributor Author

@JayJayArr JayJayArr Nov 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two cases where this could fail is either because of
a. a poisoned Mutex or
b. the user given base-url is not local.

I just realized, that the FileChecker itself is never called if the base-url is Remote so it would be a viable option to return a bool.
Just out of curiosity: Do you see any downsides in returning an Error except cluttering up the ErrorKinds?

Comment on lines +346 to +362
// Tries to resolve a link by looking up the filename in the wikilink index
fn apply_wikilink_check(&self, path: &Path, uri: &Uri) -> Result<PathBuf, ErrorKind> {
let mut path_buf = path.to_path_buf();
for ext in &self.fallback_extensions {
path_buf.set_extension(ext);
if let Some(checker) = &self.wikilink_checker {
match checker.contains_path(&path_buf) {
None => {
trace!("Tried to find wikilink {} at {}", uri, path_buf.display());
}
Some(resolved_path) => return Ok(resolved_path),
}
}
}

Err(ErrorKind::InvalidFilePath(uri.clone()))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I still have a bunch of issues with that code. 😅

  • I don't love the method name because the method does more than just applying the check. It looks like it resolves the path.
  • The if let Some(checker) is inside the loop but the checker doesn't change
  • The match on contains_path with None => trace!() is awkward
  • You're iterating extensions but modifying the same path_buf

Here's an attempt to fix the above:

/// Resolves a wikilink reference to its actual file path.
///
/// Iterates through configured fallback extensions (e.g., `.md`, `.txt`) and checks
/// the wikilink index for a matching file. Returns the first match found.
///
/// # Errors
///
/// Returns `ErrorKind::InvalidFilePath` if no matching file is found in the index.
fn resolve_wikilink(&self, path: &Path, uri: &Uri) -> Result<PathBuf, ErrorKind> {
    let Some(checker) = &self.wikilink_checker else {
        return Err(ErrorKind::InvalidFilePath(uri.clone()));
    };

    for ext in &self.fallback_extensions {
        let mut candidate = path.to_path_buf();
        candidate.set_extension(ext);
        
        if let Some(resolved) = checker.contains_path(&candidate) {
            return Ok(resolved);
        }
        trace!("Wikilink not found: {} at {}", uri, candidate.display());
    }
    
    Err(ErrorKind::InvalidFilePath(uri.clone()))
}

There are a few more issues. For example, if the wikilink_checker is None, we should return a different type of error. Or ideally, we should not even be able to call that method on it.

If wikilink_checker can be None, you're doing defensive checks everywhere instead of encoding the constraint in the type system. I'm thinking that what we really want is a WikiLinkResolver. Then this will become nicer. 😉

Here's what I mean:

struct WikilinkResolver {
    checker: WikilinkChecker,
    fallback_extensions: Vec<String>,
}

impl WikilinkResolver {
    /// Resolves a wikilink by searching the index with fallback extensions.
    fn resolve(&self, path: &Path, uri: &Uri) -> Result<PathBuf, ErrorKind> {
        for ext in &self.fallback_extensions {
            let mut candidate = path.to_path_buf();
            candidate.set_extension(ext);
            
            if let Some(resolved) = self.checker.contains_path(&candidate) {
                return Ok(resolved);
            }
            trace!("Wikilink not found: {} at {}", uri, candidate.display());
        }
        
        Err(ErrorKind::WikilinkNotFound(uri.clone()))
    }
}

Then we don't have to deal with the None check.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, the wikilink functionality is scattered across multiple places and would be much cleaner as its own module.

Just one idea:

lychee-lib/src/checker/
├── mod.rs
├── file.rs
└── wikilink/
    ├── mod.rs
    ├── resolver.rs    # WikilinkResolver - the main resolution logic
    └── index.rs       # WikilinkIndex - the filename indexing data structure

This has the added advantage that we also have all the tests in one place and can write a proper module documentation.

We don't have to do this right away. It's okay if we keep it as is and only fix the other comments. But feel free to move stuff around as well. 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I REALLY appreciate this remark! ❤️
Right now I am still a little afraid to screw around too much in other peoples projects 😅
But I will have another look into it when I have the time to attack the WikilinkResolver

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants