
Topical Custom Search Engines

colin

29 January 2025

3 min


In a previous post we showed how the Mojeek API can be used to create and deploy automated custom search engines. With these, a set of sites to search across is generated based on the search query. Sometimes you may wish to search across the web about a particular topic, and so we show here how you can also do that in a similar fashion.

Outline Steps

The steps are very similar to those in the previous post, but with an additional first step in which the user defines a topic:

  • Get or define the ‘topic’ to go with a search query. This will be a word or a few words describing the topic, theme or context that relates to the search intention.
  • Send the Mojeek search query plus the ‘topic’ to an LLM (Large Language Model), along with a system prompt asking it to return a list of relevant websites to search across.

The remaining steps are unchanged:

  • Parse the returned site list into a form suitable for the Mojeek API.
  • Call the Mojeek API with the query and this site list to search across.

Detailed Steps

These steps are now illustrated with TypeScript code snippets.

First we define the variables to be used:

Screenshot of a code snippet defining automated custom search engine variables in TypeScript. Variables include focusPrompt, focusInput, numberSites set to '100', query, and topic.
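Since the snippet is only available as an image, here is a minimal sketch of what such definitions could look like. The variable names come from the description above; the values for query and topic are borrowed from the example later in this post, and the role given to focusInput (the user message sent to the LLM) is an assumption.

```typescript
// Illustrative values only; the contents of the original snippet are not recoverable.
const query: string = "class";        // the search query
const topic: string = "programming";  // user-defined topic, theme or context
const numberSites: string = "100";    // maximum number of sites to request from the LLM

// Assumption: focusPrompt holds the system prompt (built in the next step)
// and focusInput holds the user message passed to the LLM.
let focusPrompt: string = "";
let focusInput: string = query;
```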

We adjust the previous system prompt to use the user defined topic, as well as the search query, as follows:

Screenshot of a code snippet setting up the LLM system prompt for an automated custom search engine in TypeScript. Variables include focusPrompt, numberSites, topic, and query.
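As a rough illustration of this step, the sketch below builds a system prompt from the topic, the query and the site limit. The wording of the prompt is hypothetical and is not the prompt used in the original snippet; buildFocusPrompt is a helper name introduced here for illustration only.

```typescript
// Hypothetical helper: combines topic, query and site limit into a system prompt.
// The prompt text is an assumption, not the wording used in the original post.
function buildFocusPrompt(topic: string, query: string, numberSites: string): string {
  return [
    `You are helping to build a custom search engine on the topic "${topic}".`,
    `For the search query "${query}", list up to ${numberSites} websites that are relevant to this topic.`,
    "Reply with domain names only, one per line, with no commentary.",
  ].join("\n");
}

const focusPrompt: string = buildFocusPrompt("programming", "class", "25");
```

Asking for bare domain names, one per line, keeps the later parsing step simple, although an LLM may not always follow such formatting instructions.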

The remaining steps are the same as before, but are shown again here. First we bundle the model choice, system prompt and query into a payload, and then call the LLM API.

Screenshot of a code snippet setting up a payload for, and then subsequently a POST API call to, an LLM in TypeScript.
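A minimal sketch of this step follows, assuming an LLM service that accepts an OpenAI-style chat completions request; the post does not say which LLM API was used. The endpoint, the model name, the payload shape and the response shape are therefore all assumptions, and buildPayload and callLlm are hypothetical helper names.

```typescript
// Assumed request shape: an OpenAI-style chat completions payload.
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface ChatPayload {
  model: string;
  messages: ChatMessage[];
}

// Bundle the model choice, system prompt and user input into one payload.
function buildPayload(model: string, focusPrompt: string, focusInput: string): ChatPayload {
  return {
    model,
    messages: [
      { role: "system", content: focusPrompt },
      { role: "user", content: focusInput },
    ],
  };
}

// POST the payload to the LLM endpoint and return the text of the first reply.
// Assumes a runtime with a global fetch (for example Node 18+ or a browser)
// and an OpenAI-style response body; adjust both for the service actually used.
async function callLlm(endpoint: string, apiKey: string, payload: ChatPayload): Promise<string> {
  const response = await fetch(endpoint, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(payload),
  });
  if (!response.ok) {
    throw new Error(`LLM request failed with status ${response.status}`);
  }
  const data = (await response.json()) as { choices: { message: { content: string } }[] };
  return data.choices[0].message.content;
}
```

Keeping the payload construction separate from the network call makes it possible to inspect or log the payload before anything is sent.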

Let’s take a look at an example, where the search query is ‘class’ and the topic is ‘programming’. Using the above we got the following site list from an LLM, noting that in this case we requested up to 25 sites rather than 100.

sitelist=.docs.oracle.com,.docs.python.org,.developer.mozilla.org,.java.com,.cplusplus.com,.w3schools.com,.stackoverflow.com,.learn.microsoft.com,.geeksforgeeks.org,.tutorialspoint.com,.realpython.com,.khanacademy.org,.codecademy.com,.coursera.org,.udacity.com,.edx.org

By adding the context we have narrowed Mojeek to search only across relevant programming-related sites. This will give much more relevant results than simply searching with the query ‘class programming’.
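The parsing step that turns the LLM reply into a site list of this form could be sketched as follows. This assumes the LLM returns domains or URLs separated by newlines, spaces or commas; stripping the scheme, any path and a leading "www." is a choice made for this illustration, and toSiteList is a hypothetical helper name.

```typescript
// Hypothetical helper: convert a raw LLM reply into the comma-separated,
// dot-prefixed format shown in the example site list above.
function toSiteList(raw: string, maxSites: number = 25): string {
  const hosts: string[] = [];
  for (const token of raw.split(/[\s,]+/)) {
    const host = token
      .toLowerCase()
      .replace(/^[a-z]+:\/\//, "") // drop a scheme such as https://
      .replace(/\/.*$/, "")        // drop any path
      .replace(/^www\./, "")       // drop a leading www.
      .replace(/^\.+/, "");        // drop any leading dots already present
    // Keep only tokens that look like hostnames, skipping list markers and prose.
    const looksLikeHost = /^[a-z0-9-]+(\.[a-z0-9-]+)+$/.test(host);
    if (looksLikeHost && hosts.indexOf(host) === -1) {
      hosts.push(host);
    }
    if (hosts.length >= maxSites) {
      break;
    }
  }
  return hosts.map((h) => "." + h).join(",");
}

const siteList: string = toSiteList(
  "https://docs.python.org/3/\nwww.stackoverflow.com, developer.mozilla.org, docs.python.org"
);
// siteList is ".docs.python.org,.stackoverflow.com,.developer.mozilla.org"
```

Duplicates are dropped and the list is capped at maxSites, so a reply that is longer than requested still yields a list of the expected size.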

We call the Mojeek API with this site list and query:

https://api.mojeek.com/search?api_key=API_KEY&fmt=json&si=1&q=query&fi=sitelist

In this case we used &si=1, so that just one result from each domain is returned. The response was then as follows:

A Mojeek JSON response to the query
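The request URL shown above can be assembled as in the sketch below. It uses only the parameters that appear in this post (api_key, fmt, si, q and fi); buildMojeekUrl is a hypothetical helper name. Values are percent-encoded, so commas in the site list appear as %2C; this is standard URL encoding and the sketch assumes the API decodes it as usual.

```typescript
// Hypothetical helper: build a Mojeek search URL from the parameters used in this post.
function buildMojeekUrl(apiKey: string, query: string, siteList: string): string {
  const params: [string, string][] = [
    ["api_key", apiKey],
    ["fmt", "json"],  // JSON response format
    ["si", "1"],      // one result per domain, as in the example above
    ["q", query],
    ["fi", siteList], // the site list to search across
  ];
  const queryString = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `https://api.mojeek.com/search?${queryString}`;
}

const url: string = buildMojeekUrl("API_KEY", "class", ".docs.python.org,.stackoverflow.com");
// url is "https://api.mojeek.com/search?api_key=API_KEY&fmt=json&si=1&q=class&fi=.docs.python.org%2C.stackoverflow.com"
```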

You can find further details of API functionality and available plans on the Mojeek Search API page.
