Tag Archive for 'Search Strategies'

Google — Search, Plus Your World

If you are a Google+ user, then you now have a new search tool (the encrypted site is https://www.google.com/insidesearch/plus.html). When you are signed into your Google+ account, your search engine results will be sorted for relevance in a different fashion: by what your Google+ friends say about the search term. This process assumes that what your friends say is more important than other content.

This personalised search relevance is a boon for advertisers that want your attention. Google isn’t the first to do this. In 2010 Bing began ranking sites in search results based upon how many of your Facebook friends “like” the site.

The search engines and advertisers have decided that people want to search for other people and their opinions over other content. How convenient for the search engines and advertisers!

If you want a full explanation of the impact this will have for the Investigator, then read Phil Bradley’s article titled Why Google Search Plus is a disaster for search. Google is no longer my first choice. I start with Bing, then DuckDuckGo, and last but not least, I search Blekko.

Real Time Bot Search Engine

RTBot (Real Time Bot) is a real-time information service where you can enter a topic title and get results from multiple sources (e.g. Wikipedia, YouTube, Twitter, Facebook, Flickr, Books, Newspapers, Magazines) all at once. This may sound like a normal search engine, but it isn’t.

RTBot provides content only for specific topics such as concepts, subjects, personalities, events, places, companies, products, etc., but not for broader, unspecific searches.

If you use this properly, you often get a lot of video in the results that would otherwise require separate searches to find. This can be quite useful when searching by a company or person name.

 

Copernic Agent & Google

I have used Copernic for years, and just accepted its lack of a Google search.  I just got used to it, and never sought a way to add Google.

At a recent conference, I mentioned that it didn’t search Google, and Kevin Ripa told me that a registry entry would solve the problem. If you’re going to feel like an idiot, it’s good to be shown up by a really smart guy like Kevin.

Go to the registry key:

[HKEY_CURRENT_USER\Software\Copernic\Agent\System]

and insert the following string:

EngineUpdateAddress=

with the value http://updates.copernic.com/k2upd/agentex
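
If you would rather import the setting than type it into the Registry Editor, here is a minimal sketch in Python that produces the text of an equivalent .reg file. The key, the value name, and the address are the ones given above. The function name build_reg_file is my own illustrative choice, and I am assuming the entry is an ordinary string (REG_SZ) value.

```python
def build_reg_file(key: str, name: str, value: str) -> str:
    """Return the text of a .reg file that sets one string (REG_SZ) value."""

    def escape(text: str) -> str:
        # Inside quoted .reg strings, backslashes and double quotes are escaped
        return text.replace("\\", "\\\\").replace('"', '\\"')

    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{key}]\n"
        f'"{escape(name)}"="{escape(value)}"\n'
    )


# Key, value name, and address as given in the post (assumed to be REG_SZ)
REG_TEXT = build_reg_file(
    r"HKEY_CURRENT_USER\Software\Copernic\Agent\System",
    "EngineUpdateAddress",
    "http://updates.copernic.com/k2upd/agentex",
)

print(REG_TEXT)
```

Save the printed text to a file with a .reg extension and import it on the Windows machine that runs Copernic Agent.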

 

iSeek Search Engine

iSeek is a good search engine to use when you are searching by a person’s name. It clusters search results by topic, people, places, and organisations.

Zanran for Numerical & Graphical Data

An excellent article about a beta search engine with promise.

Zanran – a new data search engine

“… a new data search engine called Zanran – that focuses on finding numerical and graphical data.

Zanran focuses on finding what it calls ‘semi-structured’ data on the web. This is defined as numerical data presented as graphs, tables and charts – and these could be held in a graph image or table in an HTML file, as part of a PDF report, or in an Excel spreadsheet. This is the key differentiator – essentially, Zanran is not looking for text but for formatted numerical data.”

Searching AROUND(x) Google

The AROUND(x) Operator

A common complaint about Google was that there was no proximity search. Most people think that you cannot find thisword within x words of thatword.  Wrong!

Google supports an undocumented search operator called AROUND(x) that works as a proximity search, where x is the maximum number of words allowed between the two terms. To make the operator work properly, you must write it in all capitals and place it between the words. It will return results with variants of the words such as plurals, etc., as is normal for Google. It may be used with other operators within normal Google search syntax; for example, you might add the site: operator.
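
To show the syntax, here is a minimal sketch in Python that assembles such a query string. The function name around_query and the domain example.com are illustrative assumptions of mine. The operator itself is undocumented, so treat the output as a query to test rather than a guarantee of how Google will behave.

```python
from typing import Optional


def around_query(first: str, second: str, distance: int,
                 site: Optional[str] = None) -> str:
    """Build a query asking for `first` within `distance` words of `second`."""
    if distance < 1:
        raise ValueError("distance must be at least 1")
    # The operator must be in capitals and sit between the two terms
    query = f"{first} AROUND({distance}) {second}"
    if site:
        # Optionally combine with the site: operator
        query += f" site:{site}"
    return query


print(around_query("thisword", "thatword", 5))
print(around_query("thisword", "thatword", 5, site="example.com"))
```

Paste the resulting string into the Google search box as you would any other query.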

Implications of Organised Spam Taking Over Google

Manipulated Search Terms

Huge amounts of money are being spent to manipulate highly competitive search terms in Google. I’m not talking about the normal link-building or link-buying and other normal efforts. The trend is related to criminal organizations trying to sell counterfeit goods through the US search results, and to a lesser degree, the results for UK, France and Germany.

The spammers do this through keyworded anchor-text heavy links provided by automated forum and blog spam along with hacked websites. These gangs create such large numbers of these sites and links that Google is having quite a hard time catching up with the spam. The Caffeine update that ranks sites faster may be degrading overall search quality as this trend seems to go back only 7 months or so.

Lessons

  • Don’t trust what you read on the Internet; it may be planted data
  • If you don’t find anything interesting on the first few pages of Google, then you’re doing it wrong! Set up the search preferences properly and make them persistent.
  • Press releases and promotional websites are not a source of reliable data
  • The Internet is not a “neutral” source. Fact-check and evaluate everything you find.
  • The Internet is only one research venue

DIY Research is Not Practical

Over the last two years we have seen a DIY trend really take hold due to shrinking budgets. This has appeared particularly in the areas of Due Diligence and Background Investigations. This is a false economy because the DIY Researcher doesn’t recognise changes like those described above, let alone know what to do about such a distortion of the results.

The solution may be as simple as using OptimizeGoogle directed at a version of the search engine that does not implement Google Instant or it may mean conducting the search using a proxy in another country.  If you don’t understand how to do this and why you should do this, then don’t give money to somebody based upon your research.

Case-Sensitive Google Search

This application is  particularly useful for searching for a person’s name in Google as it returns results in the same case as given by the user.  The Query Box supports phrase search (quotes) but no other advanced search options.  If you want to use advanced search options, then type your advanced Google query in the Query Box and use the second input box to provide case sensitive filter terms as in this example.

The user can set the maximum number of Google results that will be scanned through in the “Limit” drop down box. This is an upper limit for the depth of a search, and its maximum value is 1000 because Google does not serve more than 1000 results for any query. In practice, the search stops as soon as 10 case-matching results have been found. The user can click on the “Next” button to get the next page (that is, to continue scanning through the Google results).
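
The scanning behaviour described above can be sketched in a few lines of Python. This is not the application’s own code, only my illustration of the idea: the function name, its parameters, and the sample result titles are all hypothetical.

```python
from typing import Iterable, List


def case_sensitive_filter(results: Iterable[str], term: str,
                          limit: int = 1000, page_size: int = 10) -> List[str]:
    """Scan up to `limit` results and keep those containing `term` in exact case."""
    matches: List[str] = []
    for scanned, text in enumerate(results):
        if scanned >= limit:
            break  # upper bound on how deep the scan goes
        if term in text:  # the `in` test on Python strings is case-sensitive
            matches.append(text)
            if len(matches) == page_size:
                break  # one full page of case-matching results found
    return matches


# Hypothetical result titles standing in for what a search would return
sample = [
    "Bill Gates speaks",
    "pay your bill online",
    "BILL of rights",
    "Meet Bill today",
]
print(case_sensitive_filter(sample, "Bill"))
```

Only the titles containing “Bill” in exactly that case survive the filter, which is the reason this approach helps with names that are also common words.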

Social Media Meta-Search Engines

Meta-search for Social Media Sites

The following social media meta-search engines let you search social networking sites by a person’s name, nickname, phone number, email address and more. Here are some of these search sites and my notes on their utility.

Kgbpeople.com

This searches social networks, search engines, photo/video/audio sites, and personal/professional reference sites. Canada isn’t in the country selection drop-down list, which may be a problem when searching for a Canadian with a common name. Nothing special, but consistently useful results.

Kurrently.com

This real-time search engine instantly combines results from Twitter and Facebook in an easy-to-read format organized by date stamp. It doesn’t help much if the person doesn’t have one of the above, or if his name isn’t associated with the Twitter account. The best search is for the Twitter @name, such as mine, @LocusCommunis. Otherwise, you often get nothing.

SocialMention

I have written about this one before.  It’s a real-time search engine searching over a hundred sites from blogs and comments to images and video. When searching names you really must put the name in quotation marks or you get useless results.  Of the three, this is the Investigator’s best choice in my opinion.

Twitter Searching

This Twitter thing has become a necessity to the connected. It is also an evolving search problem for Investigators.

Searching Twitter isn’t as straightforward as I would like. Content disappears in a short time in many search facilities and search results differ depending on which search facility you use.

18 Useful Twitter-related Sites

Here are 18 Twitter-related sites that I have found useful (listed in the full post, ‘Twitter Searching’).

27 Mohammeds

Identity

In conducting Internet research we encounter the problem of persona isolation. In national security circles this is called the “27 Mohammeds problem”.  Essentially, how do we know that the John Smith mentioned in a blog is the specific John Smith we are researching?

Reputation Evaluation

This leads to another difficulty. An Internet reputation may not reflect reality. The Internet reputation may be fabricated out of malice. We must evaluate a conviction in the august Internet Court and determine if we believe it enough to not take a risk on the subject firm or person.

Related Articles

The following related articles may help you deal with this problem:

Synonym Searches in Google

The tilde (~) helps you find synonyms of words in a Google search. This is usually done by preceding the term with a ~.  For example, searching using the term ~investigator will yield results with synonyms for investigator. It is also an excellent search to do in Google RealTime when searching social media to ensure you are using the right search terms.

The tilde search is excellent for search term discovery and variance testing.

Google – Getting more than 10 results

Open the Search settings at the top right of the Google Search page. This brings you to the Preferences Page. In the Number of Results section select 100. Next go to the last section for Google Instant and select the second option, “Do not use Google Instant”.

By disabling Instant, the full 100 search results should appear.

Facial Recognition for the Masses

Facial recognition software

Enter a photo at http://developers.face.com/tools/#faces/detect and locate all photos of the same individual on Facebook. This is limited to your friends at this point, but some developers are putting this on iPhone apps. You can snap a photo on the street and get all their info through Facebook and other services this way. As of May 2010, they state that their Facebook apps have scanned over 7 billion photos in total and identified no fewer than 52 million faces.

This is something to watch as it has some interesting applications for the Investigator.  Of course some people will think the sky is falling due to the  mere existence of this app, but the technological genie was let out of the bottle a long time ago.

Search Results Dominated by One Domain

The following two articles are required reading for anyone who must search by company or product name.

Furthermore, the Official Google Blog post titled Showing More Results from a Domain indicates that their algorithm is intended to show searchers more results from a single domain where evidence exists that there is a “strong user interest in a particular domain.” They also note that the last few results (on a search results page set to show 10 results) are from other domains to preserve diversity in the results.

This has serious implications for anybody doing due diligence research as many derogatory entries in the search engine database will not appear without additional search terms.  It also means that search results set to 10, 20, 30, 50, and 100 per page may give radically different proportions of search results when sorted by domain.
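
One way to see the effect is to tally the result URLs by domain at each page-size setting and compare the proportions. Here is a minimal sketch in Python, assuming you have already copied the result URLs into a list. The function name domain_shares and the sample URLs are hypothetical.

```python
from collections import Counter
from typing import Dict, List
from urllib.parse import urlparse


def domain_shares(urls: List[str]) -> Dict[str, float]:
    """Return each domain's share of the result list, largest first."""
    # netloc is the host part of the URL, e.g. "www.example.com"
    domains = [urlparse(url).netloc.lower() for url in urls]
    counts = Counter(domains)
    total = len(domains)
    return {domain: count / total for domain, count in counts.most_common()}


# Hypothetical result URLs copied from one page of search results
sample_urls = [
    "http://www.example.com/products",
    "http://www.example.com/about",
    "http://www.example.com/press",
    "http://news.example.org/story",
]
print(domain_shares(sample_urls))
```

Running the same tally on results collected at 10 and at 100 per page shows how much of each page one domain occupies.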