Tag Archive for 'Search Engines'

Google — Search, Plus Your World

If you are a Google+ user, then you now have a new search tool (the encrypted site is https://www.google.com/insidesearch/plus.html). When you are signed into your Google+ account, your search engine results will be sorted for relevance in a different fashion: they will be ranked by what your Google+ friends say about the search term. This process assumes that what your friends say is more important than other content.

This personalised search relevance is a boon for advertisers that want your attention. Google isn’t the first to do this. In 2010 Bing began ranking sites in search results based upon how many of your Facebook friends “like” the site.

The search engines and advertisers have decided that people want to search for other people and their opinions over other content. How convenient for the search engines and advertisers!

If you want a full explanation of the impact this will have for the Investigator, then read Phil Bradley’s article titled Why Google Search Plus is a disaster for search. Google is no longer my first choice. I start with Bing, then DuckDuckGo, and last but not least, I search Blekko.

Real Time Bot Search Engine

RTBot (Real Time Bot) is a real-time information service where you can enter a topic title and get results from multiple sources (e.g. Wikipedia, YouTube, Twitter, Facebook, Flickr, Books, Newspapers, Magazines) all at once. This may sound like a normal search engine, but it isn’t.

RTBot provides content only for specific topics such as concepts, subjects, personalities, events, places, companies, products, etc., but not for broader, unspecific searches.

If you use this properly, you often get a lot of video in the results that would require separate searches to find. This can be quite useful when searching by a company or person name.

 

Copernic Agent & Google

I have used Copernic for years, and just accepted its lack of a Google search.  I just got used to it, and never sought a way to add Google.

At a recent conference, I mentioned to Kevin Ripa that Copernic didn’t search Google, and he told me that a registry entry would solve the problem. If you’re going to feel like an idiot, it’s good to be shown up by a really smart guy like Kevin.

Go to the registry key:

[HKEY_CURRENT_USER\Software\Copernic\Agent\System]

and insert the following string value:

EngineUpdateAddress

with the value http://updates.copernic.com/k2upd/agentex
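For anyone who would rather not type the entry by hand, here is a minimal sketch that writes the same key, name and value out as the text of a .reg file. The helper name build_reg_file is my own invention for illustration, and I am assuming the standard "Windows Registry Editor Version 5.00" file layout; check the output before importing anything into your registry.

```python
def build_reg_file(key: str, name: str, value: str) -> str:
    """Return the text of a .reg file that sets one string value under `key`."""
    # Backslashes and double quotes must be escaped inside .reg string data.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        f'"{name}"="{escaped}"',
        "",
    ]
    return "\n".join(lines)


# Key, value name and value data are taken from the post above.
COPERNIC_KEY = r"HKEY_CURRENT_USER\Software\Copernic\Agent\System"
reg_text = build_reg_file(
    COPERNIC_KEY,
    "EngineUpdateAddress",
    "http://updates.copernic.com/k2upd/agentex",
)
print(reg_text)
```

Save the printed text to a file with a .reg extension and review it in a text editor first. Export the existing key as a backup before you merge it.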

 

123people Social Media Search Engine

I have been using 123people to uncover an individual’s social media presence. It isn’t the only search engine I use for this, but I have found it to be a sound performer.

iSeek Search Engine

iSeek is a good search engine to use when you are searching by a person’s name. It clusters search results by topic, people, places, and organisations.

Zanran for Numerical & Graphical Data

An excellent article about a beta search engine with promise.

Zanran – a new data search engine

“… a new data search engine called Zanran – that focuses on finding numerical and graphical data.

Zanran focuses on finding what it calls ‘semi-structured’ data on the web. This is defined as numerical data presented as graphs, tables and charts – and these could be held in a graph image or table in an HTML file, as part of a PDF report, or in an Excel spreadsheet. This is the key differentiator – essentially, Zanran is not looking for text but for formatted numerical data.”

Searching AROUND(x) Google

The AROUND(x) Operator

A common complaint about Google was that there was no proximity search. Most people think that you cannot find thisword within x words of thatword.  Wrong!

Google supports an undocumented search operator called AROUND(x) that works as a proximity search, where x is the maximum number of words allowed between the two terms. To make the operator work properly, you must write it in all capitals and place it between the words. It will return results with variants of the words, such as plurals, as is normal for Google. It may be combined with other operators within normal Google search syntax; for example, you might add the site: operator.
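If you run many proximity searches, a small helper keeps the syntax consistent. This is only a sketch: the function names around_query and search_url are hypothetical, the search terms and example.com are placeholders, and because the operator is undocumented there is no guarantee Google will continue to honour it.

```python
from typing import Optional
from urllib.parse import urlencode


def around_query(first: str, second: str, distance: int,
                 site: Optional[str] = None) -> str:
    """Build a query that asks for `first` within `distance` words of `second`."""
    if distance < 1:
        raise ValueError("distance must be at least 1")
    # The operator must be upper case and sit between the two terms.
    query = f"{first} AROUND({distance}) {second}"
    if site:
        # Optionally restrict the search to one site with the site: operator.
        query += f" site:{site}"
    return query


def search_url(query: str) -> str:
    """Return a Google search URL with the query safely encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = around_query("fraud", "lawsuit", 5, site="example.com")
print(query)
print(search_url(query))
```

Building the query as a string first means you can paste it into the search box or hand it to the URL builder, whichever suits the job.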

Implications of Organised Spam Taking Over Google

Manipulated Search Terms

Huge amounts of money are being spent to manipulate highly competitive search terms in Google. I’m not talking about the normal link-building or link-buying and other normal efforts. The trend is related to criminal organizations trying to sell counterfeit goods through the US search results, and to a lesser degree, the results for the UK, France and Germany.

The spammers do this through keyworded anchor-text heavy links provided by automated forum and blog spam along with hacked websites. These gangs create such large numbers of these sites and links that Google is having quite a hard time catching up with the spam. The Caffeine update that ranks sites faster may be degrading overall search quality as this trend seems to go back only 7 months or so.

Lessons

  • Don’t trust what you read on the Internet; it may be planted data
  • If you don’t find anything interesting on the first few pages of Google, then you’re doing it wrong! Set up the search preferences properly and make them persistent.
  • Press releases and promotional websites are not a source of reliable data
  • The Internet is not a “neutral” source. Fact-check and evaluate everything you find.
  • The Internet is only one research venue

DIY Research is Not Practical

Over the last two years we have seen a DIY trend really take hold due to shrinking budgets, particularly in the areas of Due Diligence and Background Investigations. This is a false economy because the DIY Researcher doesn’t recognise changes like those described above, let alone know what to do about such a distortion of the results.

The solution may be as simple as using OptimizeGoogle directed at a version of the search engine that does not implement Google Instant, or it may mean conducting the search using a proxy in another country. If you don’t understand how to do this and why you should do this, then don’t give money to somebody based upon your research.

Social Media Meta-Search Engines

Meta-search for Social Media Sites

The following social media meta-search engines let you search social networking sites by a person’s name, nickname, phone number, email address and more. Here are some of these search sites and my notes on their utility.

Kgbpeople.com

This searches social networks, search engines, photo/video/audio sites, and personal/professional reference sites. Canada isn’t in the country selection drop-down list, which may be a problem when searching for a Canadian with a common name. Nothing special, but consistently useful results.

Kurrently.com

This real-time search engine instantly combines results from Twitter and Facebook in an easy-to-read format organized by date stamp. It doesn’t help much if the person doesn’t have an account on either one, or if his name isn’t associated with the Twitter account. The best search is for the Twitter @name, such as mine, @LocusCommunis. Otherwise, you often get nothing.

SocialMention

I have written about this one before.  It’s a real-time search engine searching over a hundred sites from blogs and comments to images and video. When searching names you really must put the name in quotation marks or you get useless results.  Of the three, this is the Investigator’s best choice in my opinion.

Twitter Searching

This Twitter thing has become a necessity to the connected. It is also an evolving search problem for Investigators.

Searching Twitter isn’t as straightforward as I would like. Content disappears in a short time in many search facilities and search results differ depending on which search facility you use.

18 Useful Twitter-related Sites

Here are 18 Twitter-related sites that I have found useful:

Synonym Searches in Google

The tilde (~) helps you find synonyms of words in a Google search. This is usually done by preceding the term with a ~.  For example, searching using the term ~investigator will yield results with synonyms for investigator. It is also an excellent search to do in Google RealTime when searching social media to ensure you are using the right search terms.

The tilde search is excellent for search term discovery and variance testing.
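For variance testing it helps to generate the plain and tilde versions of a query side by side. The following is a rough sketch; the function name with_synonyms and the sample query are made up for illustration.

```python
def with_synonyms(query: str, terms: set) -> str:
    """Prefix each word of `query` that appears in `terms` with a tilde."""
    targets = {t.lower() for t in terms}
    words = []
    for word in query.split():
        # Only the chosen terms get the ~ prefix; other words pass through.
        words.append("~" + word if word.lower() in targets else word)
    return " ".join(words)


plain = "private investigator toronto"
expanded = with_synonyms(plain, {"investigator"})
print(plain)
print(expanded)
```

Run both versions and compare the results to see which alternative terms are worth adding to your search plan.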

Scroogle

Anonymous Searching

In the past I have written about hiding your tracks as you search the Internet and about the Google SSL search interface.

Scroogle via SSL

Now let me introduce you to the SSL version of Scroogle.  Like the SSL Google, it hides your search terms from IP logging.  No one snooping between your browser and Scroogle can figure out what you were looking for, because the information is encrypted.  Unlike the SSL version of Google, your IP address is dropped before your search terms are sent to Google. Therefore, Google has no idea who is conducting the search.

When you click on any of the links on the secure Scroogle results page, SSL does not allow the browser to record the address of that secure page and attach it to any outgoing non-SSL links. Using SSL blanks out this referrer, so that any non-SSL site you click through to from a Scroogle SSL page won’t even know that you arrived from Scroogle or anywhere else.

Using Scroogle

In practice, Scroogle isn’t the greatest for finding video and clicking on a link does not open a new window in Firefox. This makes it somewhat awkward when doing high-volume searching, but it offers excellent security.

Google – Getting more than 10 results

Open the Search settings at the top right of the Google Search page. This brings you to the Preferences page. In the Number of Results section, select 100. Next, go to the last section, for Google Instant, and select the second option, “Do not use Google Instant”.

By disabling Instant, the full 100 search results should appear.

Murder in Google

This tool rules out some choices for vacation locales:  Murder Captured By Google Street View Car

Search Results Dominated by One Domain

The following two articles are required reading for anyone who must search by company or product name.

Furthermore, the Official Google Blog post titled Showing More Results from a Domain indicates that their algorithm is intended to show searchers more results from a single domain where evidence exists that there is a “strong user interest in a particular domain.” They also note that the last few results (on a search results page set to show 10 results) are from other domains to preserve diversity in the results.

This has serious implications for anybody doing due diligence research as many derogatory entries in the search engine database will not appear without additional search terms.  It also means that search results set to 10, 20, 30, 50, and 100 per page may give radically different proportions of search results when sorted by domain.
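One way to see how lopsided a results page is would be to tally the results by domain. Below is a minimal sketch that assumes you have already copied the result URLs into a list; the function name domain_shares and the example.com and example.org addresses are placeholders, not real search results.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_shares(result_urls: list) -> dict:
    """Return each host's share of the results, largest share first."""
    hosts = [urlparse(url).hostname or "" for url in result_urls]
    total = len(hosts)
    if total == 0:
        return {}
    counts = Counter(hosts)
    return {host: n / total for host, n in counts.most_common()}


# Placeholder URLs standing in for one page of search results.
page = [
    "https://www.example.com/products",
    "https://www.example.com/about",
    "https://www.example.com/news",
    "https://www.example.org/complaints",
]
for host, share in domain_shares(page).items():
    print(f"{host}: {share:.0%}")
```

Running the same tally on pages set to 10 and 100 results per page makes the change in proportions easy to compare.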