Aditya Dwivedi recently did an interview with me about searchcode. You can view it here over on yourstory.com and it covers how the idea came about, some of the issues and where I think the future lies for the project.
Going to include an extract below in case bit rot sets in.
How this Aussie coder is making life easy for techies around the world with searchcode
You are busy building your product and then someone comes and tells you that what you’ve been doing has already been done by someone. Most of the times you just need to do some modification or add some functionality to customize the product. What do you do in that case? Reinventing the wheel is the last option you might want to take.
So what is the solution?
Looking for the source code of a similar kind of application be it for learning purpose or to reuse the code (only after seeking permission from the owner), is the right thing to do. And SearchCode helps you do exactly that. Searchcode is a search engine for source code, which indexes all the open source code which is hosted on major sites such as Github, Bitbucket, and Google Code.
It was started as a side project at a startup camp by Benjamin Boyter who is based in Sydney. Focusing on why it had to be a search engine and not anything else, Benjamin says, “I had always wanted to build a general purpose web search engine but that’s cost prohibitive. So I started building the documentation search. After a while I started integrating the results with DuckDuckGo after getting in touch with its Founder Gabriel. Around this time, Google announced that they were going to end Google Code Search, and Gabriel asked me if I was going to work on a replacement. I hadn’t considered it at the time, but said yes and started work. Several years later it’s a side project that is big enough to never end, and complex enough to keep me interested.”
At present, Searchcode has indexed over 200,000 projects and over 18 billion lines of code, which can be searched in over 90 languages. Benjamin has been working on this since the past three years. He says it is a problem big enough which can be solved by a single person.
Initially, Benjamin placed ads on the results page to help him pay the bills, but soon removed them after moving to a secure server. He says he doesn’t mind running it at a loss. He has decided to donate 10% of the money he earned from ads to support projects which are used by Searchcode.
Since they started, Searchcode handles about 200,000 searches every month on an average, a big part of which comes from API calls. Since they have partnered with DuckDuckGo they handle thousands of queries every day.
Searchcode tends to do about 200,000 searches a month currently. A lot of those are going through the API, however as it is hooked to DuckDuckGo as an instant answer direct queries tend reach up to several thousand a day. At present, they have about 5000 unique visitors during weekdays.
Benjamin says the whole thing is designed to scale very easily but this hasn’t been a requirement so far. With an index spanning 800 gigabytes in size and the whole thing powered by only two i7 servers with 16 and 32 gigabytes of RAM and 3 terabytes of hard disk space, one cannot fail to notice the efficiency and code optimization done by Benjamin in the entire project.
While easy access to code makes sure that one doesn’t have to reinvent the wheel, it also contributes to the fact that people have become lethargic when it comes to learning programming and fundamentally weak as everything is just a search away. Benjamin disagrees, he says, “There will always be those who copy and paste code and glue together something that just barely works, but sometimes that’s actually enough. The code is there to solve a problem, and there is a lot more ugly code out there solving problems than beautiful code. I personally think that to gain a true mastery of a programming language is to make every possible mistake and be able to understand the errors. If copy pasting helps you get there faster that’s a good thing. People go crazy about understanding fundamentals, but a lot of the time you don’t need them.”
Talking about the future, Benjamin says he has a lot of ideas which are still to be implemented in Searchcode. At present, he is enjoying the freedom it brings as a hobby project. However, he is still skeptical about the market for a source code search engine to give it his full commitment to go the startup route.