Full-text search added to the Rogue Scholar science blog archive

In January I started the Rogue Scholar blog archive with the slogan "science blogging on steroids", promising to enhance science blogs in important ways. Earlier this month I began DOI registrations for blog posts, and I am well on track to complete this in the next few weeks for the 35 included blogs, which together have more than 1,000 blog posts. Another promise was full-text search of blog posts, a feature that blogging platforms typically lack or implement only in limited form.

Today, I am happy to announce the first version of full-text search for all Rogue Scholar content. Full-text search works for specific blogs, where it does a much better job of finding relevant content than blogging platforms or generic web searches, e.g. this blog post describing the work of a group of researchers from the University of Geneva.

Full-text search also works across all blogs included in the Rogue Scholar, something that would be much harder to accomplish otherwise. Good examples are topics widely discussed in the blogosphere such as COVID, climate change, or ChatGPT, but also more obscure content where we don't remember the source, for example, a blog post about the Tasmanian Devil (a carnivorous marsupial from Tasmania that is severely affected by a transmissible facial tumor that threatens the survival of the species).

The first implementation of full-text search of course has some limitations, mainly:

  • Author names not yet included (unless they also appear in the full-text)
  • No relevance sorting of results (they are always sorted by reverse publication date)
  • The search user interface still needs improvement, either with a faceted search interface powered by Elasticsearch, or with the floating modal search window made popular by Algolia and the InstantSearch open source library

The Rogue Scholar full-text search is implemented with the full-text search built into the Postgres database, which is a nice alternative to a dedicated search index, particularly if you don't need to search millions of documents. Full-text search was only possible because all blogs participating in the Rogue Scholar agreed to a Creative Commons Attribution (CC-BY 4.0) license for all their posts.
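To give a rough idea of what a Postgres full-text query can look like, here is a minimal sketch in Python that only builds the SQL and its parameters. It is not the actual Rogue Scholar code: the table name `posts`, the columns `content_text`, `blog_id` and `published_at`, and the function `build_search_query` are all assumptions made up for illustration. The Postgres pieces it relies on (`to_tsvector`, `websearch_to_tsquery` and the `@@` match operator) are built into the database.

```python
from typing import List, Optional, Tuple


def build_search_query(
    terms: str, blog_id: Optional[str] = None, limit: int = 10
) -> Tuple[str, List[object]]:
    """Build a parameterized Postgres full-text query.

    Table and column names (posts, content_text, blog_id, published_at)
    are hypothetical and chosen only for this sketch.
    """
    # to_tsvector normalizes the post text, websearch_to_tsquery parses
    # search-engine style input (Postgres 11 or later), and @@ matches them.
    conditions = [
        "to_tsvector('english', content_text) "
        "@@ websearch_to_tsquery('english', %s)"
    ]
    params: List[object] = [terms]

    # Optional filter: search a single blog instead of the whole archive.
    if blog_id is not None:
        conditions.append("blog_id = %s")
        params.append(blog_id)

    # Results are sorted by reverse publication date, not by relevance.
    sql = (
        "SELECT id, title, published_at FROM posts WHERE "
        + " AND ".join(conditions)
        + " ORDER BY published_at DESC LIMIT %s"
    )
    params.append(limit)
    return sql, params


# Search across all blogs, then within one (hypothetical) blog.
sql_all, params_all = build_search_query("tasmanian devil")
sql_one, params_one = build_search_query("covid", blog_id="example-blog", limit=5)
```

The returned string and parameter list could be passed to any Postgres driver that uses `%s` placeholders. For a larger archive, storing the `tsvector` in its own indexed column would likely be preferable to computing it in every query.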

Copyright © 2023 Martin Fenner. Distributed under the terms of the Creative Commons Attribution 4.0 License.