Archive for the 'Search Engines' Category

Getting to Know the Neighbourhood — SocialMention

SocialMention allows you to easily track and measure what people are saying about you, your company, a new product or any topic across the web’s social media landscape in real time. SocialMention monitors over 100 social media sites. I have written about SocialMention several times before; it is an old favorite.

We use the search plugin that lets us search SocialMention from the browser’s search box. Once we have a search statement that provides useful results, we subscribe to the RSS feed for that search to monitor changes in the results.
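
The monitoring step can be sketched in a few lines of Python. This is only an illustration: the feed content below is made up, the function name `new_items` is my own, and I am assuming the saved search returns ordinary RSS 2.0 with `item`, `title` and `link` elements.

```python
import xml.etree.ElementTree as ET


def new_items(feed_xml, seen_links):
    """Return (title, link) pairs for RSS items whose link has not been seen yet."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        if link and link not in seen_links:
            fresh.append((title, link))
            seen_links.add(link)  # remember it so the next poll skips it
    return fresh


# Hypothetical feed content standing in for what a saved search might return.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Saved search</title>
  <item><title>First mention</title><link>http://example.com/a</link></item>
  <item><title>Second mention</title><link>http://example.com/b</link></item>
</channel></rss>"""

seen = {"http://example.com/a"}      # links already reviewed
print(new_items(SAMPLE_FEED, seen))  # only the unseen item is reported
```

Keeping the set of seen links between polls is what turns a feed into a change monitor: each run reports only what is new since the last one.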

Getting to Know the Neighbourhood — Searching Google Buzz

Google Buzz API Search

Any information you see below is visible to anyone on the Internet through normal use of the Google Buzz API. Websites or applications that you authorize might see more. Search using this API at http://zesty.ca/buzz/.

Getting to Know the Neighbourhood — Searching Facebook

Facebook Graph API Search

You may search publicly available information on Facebook via its Graph API at http://zesty.ca/facebook/. The Graph API provides access to Facebook objects (people, photos, events, and so on) and the connections between them (friends, tags, shared content, and so on), each addressed by a uniform and consistent Uniform Resource Identifier (URI). Every object can be accessed at the URL https://graph.facebook.com/ID, where ID stands for the unique ID of the object in the social graph.
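
As a small illustration of that URL pattern, here is a Python sketch that builds the address for a given object ID. The helper name `graph_object_url` and the sample ID are my own placeholders, and the sketch only builds the URL; it does not make a request.

```python
from urllib.parse import quote

GRAPH_BASE = "https://graph.facebook.com/"  # base URL as given in the text


def graph_object_url(object_id):
    """Build the Graph API URL for one object ID (escaped so it stays one path segment)."""
    return GRAPH_BASE + quote(str(object_id), safe="")


print(graph_object_url("12345"))  # "12345" is a made-up placeholder ID
```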

The New Neighbourhood

In the past, most investigations included ‘neighbourhood inquiries’ where neighbours were questioned regarding the subject’s activities and lifestyle.

We still do neighbourhood inquiries, but over the last three decades this has produced less and less information of value, to the point that we now consider this an extraordinarily expensive investigative process.

Neighbours rarely share derogatory information or observations about the subject, and fewer still even know the subject, as most urban neighbourhoods are too transient and social contact is minimal.

Today’s neighbourhood isn’t defined by geography, but rather by Internet connectivity. The advent of virtual media has created virtual neighbourhoods that the Investigator must be adept at navigating and interrogating.

This new neighbourhood may reveal inappropriate pictures, drug and alcohol abuse, bad-mouthing of employers, co-workers, clients, and organisations. It may reveal poor communication skills and much worse – much of which is found exclusively online.

Unfortunately, inexpert interrogation and navigation of this neighbourhood has caused issues.

The ubiquity of Internet search engines and a lack of training and guidelines may put the Investigator in contravention of some laws if the resulting information creates a record of personally identifying information that is subsequently mishandled. Possession of Internet search results may impose either declared or implied responsibilities regarding the handling of the data in some jurisdictions.

A casual and undisciplined approach to Internet and social media searching raises questions regarding the competence, handling, fairness, storage, and analysis of the data. The role of the Investigator doing the searching should be clear from the outset. The sources and methods employed should also be clear throughout the search process and its reporting.

Virtual Identities

The subjects of an investigation do not line up to tell the Investigator all their screen names and related email addresses.

The Investigator must find the screen names and related email addresses, starting from what he already knows at the beginning of the investigation, to build an online profile of the subject.

The Investigator must also recognise that a screen name is often used by more than one person, and that a screen name may be used maliciously.

As the old New Yorker cartoon said, “On the Internet, nobody knows you’re a dog.”

Navigation & Interrogation

The unstructured nature of data available on the Internet, and its density, creates problems for the searcher.

Google may say it found three million hits, but it will only show one thousand. The results will change depending on which version of Google is searched and where the search originates.

When searching for information about a person or company, the Investigator shouldn’t get bogged down by search engine hits, but rather go straight to databases that have the right category of data for his purposes. This may mean searching sources not indexed by the search engines.

Google isn’t a substitute for knowledge and experience.

Zanran for Numerical & Graphical Data

An excellent article about a beta search engine with promise.

Zanran – a new data search engine

“… a new data search engine called Zanran – that focuses on finding numerical and graphical data.

“Zanran focuses on finding what it calls ‘semi-structured’ data on the web. This is defined as numerical data presented as graphs, tables and charts – and these could be held in a graph image or table in an HTML file, as part of a PDF report, or in an Excel spreadsheet. This is the key differentiator – essentially, Zanran is not looking for text but for formatted numerical data.”

Search Engines of the Past

AltaVista is gone. HotBot is gone. Now AlltheWeb is gone.

Getting a Phone Number from an Email Address

You have an email address, and need the subject’s phone number. No repository exists that correlates an email address with a phone number. This requires some investigative work. First, use the free reverse email look-ups to help in your search. To find these, use the search term email reverse lookup in your favorite search engine. Normally, these are of little use, especially for anyone who lives outside the U.S.A.

The following represents my usual process before resorting to confidential resources.

  • Check the email address in Google, using it as a reverse email search. You may find an associated cell phone number that is still in service.
  • Do a reverse email search using Pipl.com, which finds content that other web crawlers miss. Go to Pipl, click the “Email” link, and enter the email address. The results may display online sites and documents where that email address appears, and you may find an associated telephone number at one of those sites.
  • Search social networks using Kgbpeople.com and SocialMention. In Kgbpeople, enter the email address in the “Name:” field at the top of the page, select the country in the pull-down menu and press the “Search” button. Select one of the four tabs at the top of the screen (Social networks, Search engines, Photo and video, or Personal), then review the results for a cell phone number associated with that email address. Do a similar search using SocialMention.
  • Search Craigslist ads using AllofCraigs and Search All Craig’s. Craigslist is a handy place to conduct a reverse email search. Enter the email address in the field and run the search. Hopefully, you will find some ads that reveal a phone number connected to that email address.
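
Once one of the steps above turns up a page that mentions the email address, the review for a nearby phone number can be roughed out in Python. This is a sketch under my own assumptions: the pattern only covers North American style numbers, the 200-character window is arbitrary, and the sample text and the name `phones_near_email` are invented for illustration.

```python
import re

# Loose pattern for numbers such as 416-555-0123, 416.555.0123 or (416) 555-0123.
PHONE_RE = re.compile(r"\(?\b\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b")


def phones_near_email(text, email, window=200):
    """Return phone-like strings found within `window` characters of each mention of `email`."""
    found = []
    lowered = text.lower()
    target = email.lower()
    start = lowered.find(target)
    while start != -1:
        lo = max(0, start - window)
        hi = start + len(target) + window
        for match in PHONE_RE.finditer(text[lo:hi]):
            number = match.group(0)
            if number not in found:  # keep first-seen order, drop repeats
                found.append(number)
        start = lowered.find(target, start + 1)
    return found


# Made-up ad text for illustration only.
page = "For sale: bike. Contact jdoe@example.com or call 416-555-0123 after 6pm."
print(phones_near_email(page, "jdoe@example.com"))
```

Anything this returns is only a lead. The number still has to be verified as belonging to the subject.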

DuckDuckGo

Google-Free Wednesday

Our Google-Free Wednesdays create familiarity with the new, specialised, and often more relevant search engines. It’s been a while since I have come across a new and worthy candidate for this honor. Today, the honor goes to DuckDuckGo (DDG).

DuckDuckGo

I like this search engine because it eliminates a lot of the spam sites that have twisted and manipulated the Google results lately. I have previously written about encrypted search engines like Scroogle Scraper and the Encrypted Google search.

DDG goes further to protect your privacy. If properly set up, DDG (Redirect setting) doesn’t send your search terms in the HTTP referrer header to the sites you click on. Your search terms may reveal your interest to the sites you visit, and this may compromise an investigation. It also uses a version of the HTTPS Everywhere Firefox add-on for its secure site connection. However, to ensure your first search is secure, you may have to first enter a “dummy” search to get to the HTTPS version.

DuckDuckGo also operates a Tor exit enclave, which means you can get end-to-end anonymous and encrypted searching by using Tor and DDG together. That means if you’re on Tor, and you access DDG, you’ll likely exit through the DDG relay and get service much faster. Tor can be slow, but this should speed it up a bit if you’re searching using DDG. Only DDG traffic exits from the DDG relay.

The lack of persistent settings requires the use of URL settings like this: “http://duckduckgo.com/?kh=1&kn=1&kp=-1”. Once you are at the properly set-up DDG homepage, drag the URL to the bookmarks toolbar. Use the bookmark to launch DDG with your settings. When you click on the bookmark you will find that you are at the normal HTTP homepage. Enter a dummy search to be certain all your searches are encrypted (HTTPS) and not leaking data to the sites you visit through the referrer header.
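
If you would rather generate that settings URL than type it, a Python sketch follows. The three settings are copied unchanged from the URL above. The `q` query parameter, the `scheme` argument with its HTTPS default, and the function name `ddg_url` are my own additions and should be treated as assumptions.

```python
from urllib.parse import urlencode

# Setting values copied from the URL in the text and passed through unchanged.
DDG_SETTINGS = {"kh": "1", "kn": "1", "kp": "-1"}


def ddg_url(query=None, settings=DDG_SETTINGS, scheme="https"):
    """Build a DuckDuckGo URL that carries the settings as URL parameters."""
    params = dict(settings)
    if query:
        # Assumption: the search terms travel in the "q" parameter.
        params = {"q": query, **params}
    return f"{scheme}://duckduckgo.com/?{urlencode(params)}"


print(ddg_url(scheme="http"))  # reproduces the bookmark URL from the text
print(ddg_url("dummy search"))  # the same settings with a first search attached
```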

Searching AROUND(x) Google

The AROUND(x) Operator

A common complaint about Google was that there was no proximity search. Most people think that you cannot find thisword within x words of thatword. Wrong!

Google supports an undocumented search operator called AROUND(x) that works as a proximity search. To make the operator work properly, you must write it in all capitals and place it between the words. It will return results with variants of the words such as plurals, etc., as is normal for Google. This may be used with other operators within normal Google search syntax; for example, you might add the site: operator.
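
To keep the syntax straight (capitals, operator between the words), here is a small Python sketch that assembles such a query string. The function name `around_query` and the sample words are mine; the sketch only builds the text you would paste into the search box.

```python
def around_query(first, second, distance, site=None):
    """Build a proximity query: `first` within `distance` words of `second`."""
    if distance < 1:
        raise ValueError("distance must be a positive integer")
    # The operator goes between the two words and must be in capitals.
    query = f"{first} AROUND({distance}) {second}"
    if site:
        query += f" site:{site}"  # optional combination with the site: operator
    return query


print(around_query("fraud", "director", 5))
# → fraud AROUND(5) director
print(around_query("fraud", "director", 5, site="example.com"))
# → fraud AROUND(5) director site:example.com
```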

Implications of Organised Spam Taking Over Google

Manipulated Search Terms

Huge amounts of money are being spent to manipulate highly competitive search terms in Google. I’m not talking about the normal link-building or link-buying and other normal efforts. The trend is related to criminal organizations trying to sell counterfeit goods through the US search results, and to a lesser degree, the results for UK, France and Germany.

The spammers do this through keyworded anchor-text heavy links provided by automated forum and blog spam along with hacked websites. These gangs create such large numbers of these sites and links that Google is having quite a hard time catching up with the spam. The Caffeine update that ranks sites faster may be degrading overall search quality as this trend seems to go back only 7 months or so.

Lessons

  • Don’t trust what you read on the Internet; it may be planted data
  • If you don’t find anything interesting on the first few pages of Google, then you’re doing it wrong! Set up the search preferences properly and make them persistent.
  • Press releases and promotional websites are not a source of reliable data
  • The Internet is not a “neutral” source. Fact-check and evaluate everything you find.
  • The Internet is only one research venue

DIY Research is Not Practical

Over the last two years we have seen a DIY trend really take hold due to shrinking budgets. This has appeared particularly in the areas of Due Diligence and Background Investigations. This is a false economy because the DIY Researcher doesn’t recognise changes like those described above, let alone know what to do about such a distortion of the results.

The solution may be as simple as using OptimizeGoogle directed at a version of the search engine that does not implement Google Instant or it may mean conducting the search using a proxy in another country.  If you don’t understand how to do this and why you should do this, then don’t give money to somebody based upon your research.

Case-Sensitive Google Search

This application is particularly useful for searching for a person’s name in Google as it returns results in the same case as given by the user. The Query Box supports phrase search (quotes) but no other advanced search options. If you want to use advanced search options, then type your advanced Google query in the Query Box and use the second input box to provide case-sensitive filter terms.

The user can set the maximum number of Google results that will be scanned in the “Limit” drop-down box. This is an upper limit for the depth of a search and its maximum value is 1000, because Google does not serve more than 1000 results for any query. In practice, the search stops as soon as 10 case-matching results have been found. The user can click on the “Next” button to get the next page (that is, to continue scanning through the Google results).
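
The scanning behaviour described above can be sketched in Python. This is my reading of the description, not the application’s actual code: the function name `case_sensitive_page`, its parameters, and the sample results are all hypothetical.

```python
from itertools import islice


def case_sensitive_page(results, term, limit=1000, page_size=10):
    """Scan at most `limit` results; stop once `page_size` contain `term` in exact case.

    Returns the matches and how many results were scanned, so a caller can
    resume from that point (the "Next" button in the description).
    """
    matches = []
    scanned = 0
    for text in islice(results, limit):
        scanned += 1
        if term in text:  # Python's substring test is case-sensitive
            matches.append(text)
            if len(matches) == page_size:
                break
    return matches, scanned


# Made-up result snippets for illustration.
results = ["John Smith, director", "john smith blog", "JOHN SMITH", "Meet John Smith"]
print(case_sensitive_page(results, "John Smith", page_size=2))
# → (['John Smith, director', 'Meet John Smith'], 4)
```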

Social Media Meta-Search Engines

Meta-search for Social Media Sites

The following social media meta-search engines let you search social networking sites by a person’s name, nickname, phone number, email address and more. Here are some of these search sites and my notes on their utility.

Kgbpeople.com

This searches social networks, search engines, photo/video/audio sites, and personal/professional reference sites. Canada isn’t in the country selection drop-down list, which may be a problem when searching for a Canadian with a common name. Nothing special, but consistently useful results.

Kurrently.com

This real-time search engine instantly combines results from Twitter and Facebook in an easy-to-read format organized by date stamp. It doesn’t help much if the person doesn’t have one of the above, or if his name isn’t associated with the Twitter account. The best search is for the Twitter @name, such as mine, @LocusCommunis. Otherwise, you often get nothing.

SocialMention

I have written about this one before.  It’s a real-time search engine searching over a hundred sites from blogs and comments to images and video. When searching names you really must put the name in quotation marks or you get useless results.  Of the three, this is the Investigator’s best choice in my opinion.

Twitter Searching

This Twitter thing has become a necessity to the connected. It is also an evolving search problem for Investigators.

Searching Twitter isn’t as straightforward as I would like. Content disappears in a short time in many search facilities and search results differ depending on which search facility you use.

18 Useful Twitter-related Sites

The full post lists 18 Twitter-related sites that I have found useful.

UK Company Director Search

I found a new site indexing UK company records based on a snapshot taken on 4th March 2010, which includes the names of directors but not their addresses. This is searchable by the person’s name.

The people behind it explain:

“we bought the Companies House appointment snapshot and dropped it into a quick little searchable symfony app so you can browse the data – it’s the directors and secretaries of every UK company, cross-linked.”

Synonym Searches in Google

The tilde (~) helps you find synonyms of words in a Google search. This is usually done by preceding the term with a ~.  For example, searching using the term ~investigator will yield results with synonyms for investigator. It is also an excellent search to do in Google RealTime when searching social media to ensure you are using the right search terms.

The tilde search is excellent for search term discovery and variance testing.
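
For variance testing, it helps to run the plain term and the tilde form side by side. A minimal Python sketch of that pairing follows, assuming the operator behaves as described above; the function name `synonym_variants` and the sample terms are my own.

```python
def synonym_variants(term, extra=""):
    """Return the plain and tilde-prefixed forms of a single-word term, for comparison."""
    base = term.strip()
    variants = [base, f"~{base}"]
    if extra:
        # Append any other search terms to both forms so the comparison is like-for-like.
        variants = [f"{v} {extra}" for v in variants]
    return variants


print(synonym_variants("investigator"))
# → ['investigator', '~investigator']
print(synonym_variants("investigator", extra="fraud"))
# → ['investigator fraud', '~investigator fraud']
```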