To Track, or Not to Track? 15 Years Striving for Search Engine Independence
How can I get my website to rank high on Google? That was the challenge that confronted Marc Smith, like many other developers, back in 2001. It’s a question tackled by hundreds of millions in digital marketing nowadays. It has spawned new branches of marketing such as SEO and content marketing. In Marc’s case he was developing a shareware game for his friend as a favour, and as a way to hone his programming skills. Little did he know the lifelong quest it would draw him into. Figuring out how to rank highly on Google led Marc down a fateful path and to the creation of his own search engine. That search engine, Mojeek, is one of very few genuine alternatives to Google.
Marc’s journey will be familiar to startup founders and yet is unusual. He has dared to be different and avoided building things fast. Using the web services of the digital ecosystem establishment would have meant accepting undesirable dependencies and practices.
By sticking to his principles Marc, the founder and lead developer of Mojeek, has created the only general search engine crawling the web that does not track its users. The big two from the USA, Google and Bing (joined by Yandex of Russia plus Baidu and Sogou from China) track their users and hoover up all the data they can. Mojeek stands alone in not doing that and has maintained its independence since inception.
How is it that Marc recognised surveillance capitalism way before that became a thing? And what has driven him in taking a contrarian view of how search should be done? This is no David versus Goliath story. No, it’s the story of a long struggle to offer an alternative for consumer web search. Search which provides liberty from state and corporate surveillance.
Privacy as a Right
To answer these questions we need to go back to March 18, 2006, when Marc took a prescient and public position on privacy. On that day he fully recognised the value and significance of what he had been building with Mojeek: a search engine without personal tracking. This position, of no tracking, holds to this day and underpins the company he founded. And it drives him to keep developing Mojeek with an intensity and determination few of us can imagine, never mind sustain over two decades.
So what happened on that day, and what led up to it?
“The searcher probably felt safe because they entered a search query about a sensitive medical matter. Of course, I had no idea who this was, nor was I interested.”
Marc explained, “I was watching the search queries come in on Mojeek and where they came from. Back then, the search rate was a trickle. I noticed a referral coming from Google. A search immediately followed so was presumably the same person. The searcher probably felt safe, because they entered a search query about a sensitive medical matter. Of course, I had no idea who this was, nor was I interested. Mojeek had never included any tracking of anything personal. But this instance got me thinking more about what it meant for people in general, and particularly vulnerable people, to live in fear of being watched over. Being tracked as they sought information and places to go online. By this time I knew pretty well how Google collected data and how its search engine worked. So I watched the queries, for a few more hours that day, and saw more referrals from Google immediately followed by what looked like very personal queries. I wanted people to know, rather than guess, that they were safe using Mojeek so I put up a page clarifying the position on personal tracking. It was a natural commitment to make, the right thing to do and respectful of a basic human right.” The privacy statement which was Mojeek's first privacy policy can be seen here on the web archive. Its principles hold today and are expressed in what is, tellingly, one of the shortest privacy policies you will see, and readable in 2 minutes.
Mojeek Started as a Hobby Project
Marc’s understanding of how Google worked in detail grew in 2002 when he started working on Mojeek. He was thinking about a project to improve his programming skills in C and figured writing a search engine would be interesting. Being curious about how his website got ranked on Google, he had previously read the paper published by Brin and Page in 1998, "The Anatomy of a Large-Scale Hypertextual Web Search Engine".
Marc, like many talented developers, found learning by reading and doing was more natural than being taught by others. He had been learning this way since childhood and from his teens was applying knowledge and skills (also in BASIC, Perl, HTML, JavaScript and PHP) in projects for himself and clients. Turning back to C, Marc reconnected with his deep affinity for that language. An affinity that turned his newfound hobby project into something of an obsession. Whilst developing games had been a passion, and developing websites for clients kept paying the bills, Marc always felt hampered in those pursuits by his colour blindness. He felt more at home with the logic, numbers, algorithms and challenges of a search engine. You can see that in the simple green Mojeek UI of today and from its first presence on the internet in October 2004.
During 2002 and through to 2009, Marc’s paid client work kept him going and his talents meant work came by word-of-mouth. Crucially, he was never overwhelmed, so could happily spend much of his time working on Mojeek. That was just as well since Marc developed almost every line of code from scratch, initially to increase his learning rate. Later that became crucial in maintaining independence from what have become the tentacles of Big Tech. Back then, and still today, the only exception to this was to use libcurl, an important and still active open source project. Still, without Marc’s determination and talent Mojeek might have been abandoned. A first version simply didn’t work very well and Marc had to do a complete rewrite. That second version was made available to friends who encouraged him along.
Bedroom Data Centre
One friend, Guy, thought Mojeek should be available to others and so after some discussion bought Marc a computer to use as a server. Shortly after, in October 2004, Marc launched his independent search engine at mojeek.com from his tiny bedroom. Whilst the new server performed searches, Marc’s laptop crawled the Web. The index was relatively small back then but reached 100 million pages after a second server was added.
Marc kept working on the technology stack for Mojeek, most notably the indexing technology that he found was key to developing a fast and scalable service. He would do small consulting projects with just enough income to survive, completing them as fast as possible, so he could devote most of his energy and time to Mojeek.
After taking a public privacy stand on 18 March 2006, the desire to fulfil the need he’d seen drove Marc harder. But as the index size grew so did the need for more servers and thus cash, which Marc didn’t have. At four servers, all donated by friends and family, it was manageable. But it was clear that this would need to scale up significantly to meet the needs of users and for decent coverage of the rapidly-growing Web.
Guy, always keen to encourage him, said, “Why don’t you talk to my mate Bill?” Bill was a guitarist who had had some success in the punk era and was thinking of making a few small investments. The realisation that Mojeek might have to become a business started to bed in for Marc. So Marc agreed to meet Bill and they immediately found a deep connection. A connection that lasted until Bill sadly passed away in December 2019. Unfortunately it took two years for Bill to sell assets he held and release the cash he had promised to Marc. It was thus not until May 2009 that Mojeek Limited was incorporated with Marc, Guy, and Bill as shareholders.
Scaling Up and Struggling On
With this first investment Marc bought 14 servers and set them up in a local data centre. He knew that Mojeek needed improved indexing technology to be able to scale to a viable size. And with less than 1% of the funding raised by rivals around that time, such as Blekko ($60m, 2007 to 2015) and Cuil ($33m, 2008 to 2010), Marc had to make his technology incredibly efficient. He decided therefore to rewrite Mojeek again, and went live with this third version in 2012.
Marc discovered the hard way that building a search engine for hundreds of millions of webpages was a big challenge. But building a scalable search engine suitable for handling billions of webpages was on another level. Doing this in an efficient way, and being able to serve up search results in less than a second for users, was a big hurdle. With the new third generation of Mojeek working, Marc could start to contemplate building up the index to billions of pages, and that breakthrough underlies the fast scalable service provided by Mojeek today.
By early 2013, despite a minimal burn rate, Mojeek was running short of funds. Marc was determined to fight on but not sure how to keep going whilst maintaining independence. He decided to turn to a lifelong friend, Jill, to ask for help. Jill said she could help most by introducing Marc to business contacts she knew and, months later with some of her own money, helped to put together an angel funding round of £250k in 2014.
This round enabled Marc to scale up the servers, hire someone to help with marketing for a while, and to seek a web developer. As it happens, finding a web developer who met his high standards proved impossible, and so it was not until 2017 that this post was filled. The expansion in servers also coincided with a move to a new data centre, Custodian, which, importantly for Marc, was an independent company that prioritised minimising its carbon footprint.
Marc struggled on, pretty much alone between 2013 and 2017, until James, who heads the Mojeek website team, joined him. Marc raised a small follow-on round from existing investors in 2016 to keep things going and so he could keep 100% focus on developing Mojeek.
A Genuine Alternative to Google and Bing
Marc’s growing knowledge and concern about the rise of tracking by Google only added to his drive. And the rise in popularity of the DuckDuckGo search service, which started in 2008 and later became more focussed on privacy, was vindication of his vision. DuckDuckGo’s approach was, and remains, totally different though. They predominantly use Microsoft’s Bing search engine and ad network to deliver search results and ads to the DuckDuckGo web service. Partnering with another Big Tech company to do search and advertising was not an option Marc wanted to take.
“A real alternative in search needs its own index and crawler, and autonomy over ranking.”
By 2018, with an index size at 2 billion webpages, and growing confidence, Marc decided to ramp up marketing and expanded the team with another backend developer. Still, he kept the cash burn-rate low whilst burning the midnight oil. Whilst other peers such as Blekko and Cuil had come and gone, he still felt there was a place and need for alternatives. And as the only crawler search engine practising no tracking he knew there was a fundamental point of differentiation. Mojeek maintains a hugely important distinction from search services and marketing companies like DuckDuckGo, Startpage, Qwant and Ecosia that depend on the search engines of Microsoft and Google.
In late 2019, after nearly two decades of struggle, fortune dealt Marc the good hand he needed. Custodian had other notable customers, such as Brandwatch, and significantly a newspaper publisher. That publisher, like many others, was suffering from the loss of revenue to Google and Facebook. And in a conversation with Custodian staff they had bemoaned the lack of choice in search engines, notably in the UK. So it was with surprise that they learnt, from Custodian staff, about Mojeek. This small exchange led to an introduction for Marc to the publisher’s owner, Edward Iliffe. And that led to discussions about how to build Mojeek as a sustainable business.
Go to Market
At the end of 2019, Edward committed to investing a total of £2.3m into Mojeek. The conditions of investment were a commitment to a serious go-to-market effort, with investment tranches linked to milestones. One of these was the hiring of a CEO, Colin Hayhurst, in July 2020. Marc had long realised Mojeek needed somebody with a different set of skills from his own to take it forward, so was very happy about this.
With Marc, Colin has expanded the team to seven staff, all working remotely. Mojeek has always been remote and intends to remain so. Their marketing efforts have been ramped up with a full-time Head of Marketing, Josh Long, from October 2020. Significantly, the number of servers was doubled in early 2020 from 100 to 200, and today Mojeek has its own dedicated room in the Custodian data centre and is currently expanding from 200 to 300 servers.
The focus now is on building a sustainable business, but without compromising Mojeek’s principles on tracking. Colin holds similar values to Marc about surveillance and privacy. He is an experienced software entrepreneur, who has commercialised technology globally from the UK and California.
“Ads are not the fundamental problem. The fundamental problem is tracking.”
The team led by Marc and Colin is exploring different revenue stream opportunities. It has, most notably, developed its own search ad technology stack. And so, as of this month, customers are using it to place search ads on Mojeek, in what is currently a closed programme.
At a time when society is questioning the usage of personalised advertising, there is finally an emerging alternative and now an option for search ads without tracking. As Mojeek put it when describing their new search advertising: “Ads are not the fundamental problem. The fundamental problem is tracking.”
Marc’s long journey to provide search without surveillance has reached a pivotal stage, just when awareness of search with surveillance has started to spread widely in the public consciousness.