Sinister searching

by John Evelyn on June 20, 2007

How search engines help hackers as well as the rest of us

One of the most useful weapons in a hacker’s armoury is the everyday search engine. I use them to research my articles, but hackers can use them to find private information. Google is my example in this article because of its familiarity and popularity, but most of these tricks can be done using any search engine.

I’ve been to the Googleplex in Silicon Valley and met a lot of the people who work there. They seemed like good guys to me – the food in their café was vegetarian and the sun was shining. But it’s the old story: if you invent something smart and powerful, bad people will find a way to exploit it.

Search hacks

Here’s how hackers use Google to break into websites and find private information:

  • A hacker can list every page that Google has indexed for a given site. This can reveal files the owner never meant to link to.
  • Google indexes Microsoft Office documents as well as web pages, and it is easy to search for specific file types or even file names.
  • A search for directory index pages can bypass the normal way web pages are displayed and expose the actual files that make up a website.
  • A search can identify the software used to host a website and which version it is running. This helps hackers work out what tools they need to break in.

Sometimes hackers can even search for hidden password files that will give them immediate access to a site. Combining these kinds of searches with automated tools makes it easy for the bad guys to examine thousands of websites quickly and find vulnerable sites. Of course, Google isn’t the only search tool on the web and hackers can find ways to exploit any of them.
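You can turn the same tricks on your own site to see what a stranger would find. The sketch below builds a handful of queries using Google's site:, filetype: and intitle: operators. The function name, the labels and the example.com domain are my own placeholders, and the list is a starting point rather than a complete audit.

```python
def build_audit_queries(domain):
    """Return search queries a site owner could paste into a search engine.

    The labels and the choice of file types are illustrative assumptions.
    """
    return {
        # Everything the search engine has indexed for the domain
        "all indexed pages": f"site:{domain}",
        # Office documents that may have been published by accident
        "Word documents": f"site:{domain} filetype:doc",
        "Excel spreadsheets": f"site:{domain} filetype:xls",
        # Raw directory listings generated by the web server
        "directory listings": f'site:{domain} intitle:"index of"',
    }


if __name__ == "__main__":
    # example.com is a placeholder; substitute a domain you own
    for label, query in build_audit_queries("example.com").items():
        print(f"{label}: {query}")
```

Paste each query into the search box and look through the results for anything you did not expect to be public.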

Protect your website and protect your data

The most basic step to protect your website is also the most effective: don’t put private information on it. Review the information you have there already, including hidden files. Is there anything that shouldn’t be there? Be especially careful about personal information such as names, addresses, passwords and so on.
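If you keep a copy of your website's files on your own computer, a short script can help with that review. This is a minimal sketch: the function name and the lists of suspicious file names and extensions are assumptions of mine, not an authoritative catalogue, so adjust them to suit your site.

```python
import os

# Hypothetical examples of files that rarely belong on a public website
RISKY_SUFFIXES = (".bak", ".old", ".sql", ".log")
RISKY_NAMES = {".htpasswd", ".env", "passwords.txt"}


def find_risky_files(web_root):
    """Walk a local copy of the site and flag files worth a second look."""
    flagged = []
    for folder, _subfolders, filenames in os.walk(web_root):
        for name in filenames:
            lowered = name.lower()
            if (
                name.startswith(".")  # hidden files
                or lowered in RISKY_NAMES
                or lowered.endswith(RISKY_SUFFIXES)
            ):
                flagged.append(os.path.join(folder, name))
    return sorted(flagged)
```

Call find_risky_files with the path to your local copy of the site and go through the list by hand. A flagged file is not necessarily a problem, and a file that is not flagged is not necessarily safe.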

Consider excluding individual pages or branches of your website from search engines. Each search engine handles exclusion slightly differently, so check its webmaster documentation for the details. If you don’t want people to find your site by searching at all, you can remove it from Google and other search engines completely. There is some information on search engine exclusion here – just remember that a public website is still public even if it isn’t being searched.
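One common exclusion mechanism is a robots.txt file at the top level of the site. The sketch below uses Python's standard urllib.robotparser module to check how a well-behaved crawler would read a simple rule set. The /private/ path, the ExampleBot name and the example.com addresses are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt asking all crawlers to stay out of /private/
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Report whether a crawler that obeys robots.txt would fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/index.html",
        "https://example.com/private/report.doc",
    ):
        print(url, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

The rules are only a request. A crawler that chooses to ignore robots.txt can still fetch everything under /private/, so use exclusion to tidy up search results and not to protect secrets.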

 

By Matthew Stibbe. Originally posted on Microsoft’s bCentral website. Reproduced with permission.

ken meade July 19, 2007 at 10:08 am

Search engine exclusion techniques like nofollow and robots.txt give a dangerous false sense of security.

Why assume that someone who wants to index your whole site is going to play by non-mandatory rules? It’s not even the law that a search engine has to respect those rules.

And a criminal would not respect the law, even if it was the law, and could easily ignore robots and nofollow directives if they wrote their own spider.
