How does it work?
This search engine is designed to be an addon to HydePHP documentation sites.
It works by loading a precompiled JSON search index that contains the searchable content
of every page, along with a link to each page. The current version treats the entire
page content as searchable, but the content could just as well be a string
of keywords, excerpts, or something else.
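To make the shape of the index concrete, here is a minimal sketch of what two entries might look like. The field names (`title`, `content`, `destination`) and the file names are assumptions chosen for illustration; the generated index may use different keys.

```javascript
// Hypothetical index entries. The field names are assumptions for illustration.
const exampleIndex = [
  {
    title: "Down the Rabbit-Hole",
    content: "Alice was beginning to get very tired of sitting by her sister on the bank.",
    destination: "chapter-1.html",
  },
  {
    title: "The Pool of Tears",
    content: "Curiouser and curiouser! cried Alice.",
    destination: "chapter-2.html",
  },
];

// Using keywords instead of the full text only changes what goes into `content`.
const keywordEntry = {
  title: "Down the Rabbit-Hole",
  content: "alice rabbit waistcoat watch rabbit-hole",
  destination: "chapter-1.html",
};

// Round-trip through JSON, the format the precompiled index is stored in.
const parsedIndex = JSON.parse(JSON.stringify(exampleIndex));
```

Because the search only ever reads the searchable text and the link, the rest of the script should not need to care which of these strategies produced the entry.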
Here's a walk-through of the process:
- During the Hyde build process, a JSON search index is generated.
- This search index is loaded by the HydeSearch script using AJAX.
- As you type in the search field, the results are filtered in real time,
and then sorted by the number of matches.
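The filtering and sorting step can be sketched roughly as follows. This is not the actual HydeSearch source: the function names, the `content` field, and the case-insensitive substring matching are assumptions made for the example.

```javascript
// Count non-overlapping, case-insensitive occurrences of `term` in `text`.
function countMatches(text, term) {
  const haystack = text.toLowerCase();
  const needle = term.toLowerCase();
  if (needle === "") return 0;
  let count = 0;
  let pos = haystack.indexOf(needle);
  while (pos !== -1) {
    count += 1;
    pos = haystack.indexOf(needle, pos + needle.length);
  }
  return count;
}

// Keep only entries with at least one match, ordered with the most matches first.
function search(index, term) {
  return index
    .map((entry) => ({ entry, matches: countMatches(entry.content, term) }))
    .filter((result) => result.matches > 0)
    .sort((a, b) => b.matches - a.matches);
}

// Small in-memory stand-in for the loaded JSON index (hypothetical data).
const sampleIndex = [
  { title: "Chapter 1", content: "Alice saw a rabbit.", destination: "chapter-1.html" },
  { title: "Chapter 2", content: "The rabbit ran. Alice followed the Rabbit.", destination: "chapter-2.html" },
  { title: "Chapter 3", content: "A caucus race.", destination: "chapter-3.html" },
];

const results = search(sampleIndex, "rabbit");
```

In a browser, `search` would be called from an input event handler on the search field, so the result list updates on every keystroke.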
For the context section for each result, HydeSearch finds the first
occurrence of the search term in the page content, extracts the
whole sentence, and highlights the matching word.
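One possible way to build that context snippet is shown below. It is a sketch under simplifying assumptions: sentences are taken to end at `.`, `!`, or `?`, the match is wrapped in a `<mark>` element, and the helper names are hypothetical.

```javascript
// Escape HTML so page text can be placed safely inside markup.
function escapeHtml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Return the sentence containing the first occurrence of `term`,
// with the match wrapped in <mark>. Returns null when there is no match.
function extractContext(content, term) {
  const matchStart = content.toLowerCase().indexOf(term.toLowerCase());
  if (term === "" || matchStart === -1) return null;
  const matchEnd = matchStart + term.length;

  // Walk backwards to just after the previous sentence terminator (or the start).
  let sentenceStart = matchStart;
  while (sentenceStart > 0 && !".!?".includes(content[sentenceStart - 1])) {
    sentenceStart -= 1;
  }

  // Walk forwards to the next sentence terminator (or the end) and include it.
  let sentenceEnd = matchEnd;
  while (sentenceEnd < content.length && !".!?".includes(content[sentenceEnd])) {
    sentenceEnd += 1;
  }
  if (sentenceEnd < content.length) sentenceEnd += 1;

  const before = content.slice(sentenceStart, matchStart);
  const match = content.slice(matchStart, matchEnd);
  const after = content.slice(matchEnd, sentenceEnd);
  return (escapeHtml(before) + "<mark>" + escapeHtml(match) + "</mark>" + escapeHtml(after)).trim();
}

const sampleText = "Alice was tired. The White Rabbit ran past her! She followed.";
const snippet = extractContext(sampleText, "rabbit");
```

Slicing the match out of the original text, rather than inserting the typed term, keeps the page's own capitalization in the highlighted result.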
About the dataset
Since I love working with real data, I'm using the entire text of
Alice's Adventures in Wonderland, which is in the public domain.
Each chapter has been split into a Markdown page file.
The files were then placed in the _docs folder of a Hyde installation to be processed by HydePHP.
- It took an average of 1,265.51 ms to generate the search index JSON.
- Each entry in the index contains the entire chapter in plain text.
- The file weighs 148 kB. When testing in production, only 55.2 kB is sent over the network.
- Since the browser caches the file automatically, subsequent loads take only ~2 ms on my device.
- Since the index is loaded asynchronously, and only once per page, that's fine with me.
To conserve space and improve performance on sites with many pages,
alternative strategies could be used instead of indexing the entire content of each page.
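As an illustration of one such strategy, the sketch below trims each entry's searchable text to a short excerpt before the index is written. The function names, the `content` field, and the length limit are all hypothetical, and this is only one of several possible approaches.

```javascript
// Trim `content` to at most `maxLength` characters, cutting at the last space
// inside that window so that no word is split in half.
function toExcerpt(content, maxLength) {
  if (content.length <= maxLength) return content;
  const cut = content.slice(0, maxLength);
  const lastSpace = cut.lastIndexOf(" ");
  return lastSpace > 0 ? cut.slice(0, lastSpace) : cut;
}

// Return a copy of the index whose entries carry excerpts instead of full text.
function shrinkIndex(index, maxLength) {
  return index.map((entry) => ({ ...entry, content: toExcerpt(entry.content, maxLength) }));
}

// Hypothetical single-entry index used to demonstrate the trimming.
const fullIndex = [
  {
    title: "Chapter 1",
    content: "Alice was beginning to get very tired of sitting by her sister on the bank",
    destination: "chapter-1.html",
  },
];

const smallIndex = shrinkIndex(fullIndex, 20);
```

The trade-off is that terms appearing only after the cut-off can no longer be found, so a keyword list may suit some sites better than a plain excerpt.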