Archive for the 'Search Strategies' Category

Training for Investigative Internet Research (IIR)

IIR is a very competitive sport. If you don’t find the needed data, then the opposition wins.

Now you might ask, “how does one train for the ongoing IIR competition?” My answer to this question comes in two parts.

First, read about IIR and read the manuals for the software that you use to produce your end product. You must learn about sources and the methods used to produce a report that is fit for decision-making.

Second, one must practice using these sources and methods.

You can get a sound grasp of the first requirement from my book, Sources and Methods for Investigative Internet Research, and from this and other blogs. I will share some secrets about the second requirement right now.

Practice finding more details about obscure news items that you see on TV or Twitter. You must collect the full story, write the story in report format, and preserve all the supporting material. Time yourself for completing the overall task. Also time your wasted effort. It is important to do both if you want to improve your performance. You can also set a time limit for the task using a countdown timer like XNote Stopwatch. For a timer that allows you to log wasted time, you can use Time Stamp.
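
If you would rather keep the timing record in a script, here is a minimal sketch of the idea. It is not how XNote Stopwatch or Time Stamp work; the class name, the field names, and the minute values are all made up for illustration. It only shows the two numbers worth tracking: total time against the limit, and the portion that was wasted.

```python
from dataclasses import dataclass, field


@dataclass
class ExerciseLog:
    """Records how long each step of one training exercise took."""
    limit_minutes: int
    entries: list = field(default_factory=list)  # (label, minutes, wasted)

    def add(self, label: str, minutes: int, wasted: bool = False) -> None:
        # Mark a step as wasted when it produced nothing usable.
        self.entries.append((label, minutes, wasted))

    def total(self) -> int:
        return sum(minutes for _, minutes, _ in self.entries)

    def wasted(self) -> int:
        return sum(minutes for _, minutes, flag in self.entries if flag)

    def over_limit(self) -> bool:
        return self.total() > self.limit_minutes


# Illustrative numbers only, not measurements from a real exercise.
log = ExerciseLog(limit_minutes=20)
log.add("set up notebook", 2)
log.add("search engines with poor results", 5, wasted=True)
log.add("date-restricted search", 6)
log.add("write memo", 6)
print(log.total(), log.wasted(), log.over_limit())
```

Comparing the wasted figure from one exercise to the next is what tells you whether the practice is paying off.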

Consider the following training exercise. A TV news item reported a Spitz dog found near death on a trash heap in California during the week of 9 Dec 13. I knew the breed from the TV news item and the approximate date from the date of the broadcast. My training task was to get the basic 5 W’s on paper in twenty minutes. Could you do the same thing? If not, then here’s how.

I had the basic when and where, but only in a vague sense. I know that search engines are not very good at handling calendar dates. I know my basic search statement will be dog trash California, and I am certain that most reports won’t give the breed accurately. That leaves me with the date and the search statement and, as it was a TV news item, there will be images and video. Where do I start to get it done in twenty minutes?

I know that only Google handles calendar dates in a usable manner and that it has excellent news content. I should also search Bing, Yahoo!, DDG, and Devilfinder. Time is not on my side.

I set up a OneNote notebook with two tabs: one for research material collected from the web and one for the 5 W’s. Under the 5 W’s tab, I create a subpage for each W. I will use the 5 W’s material to create my report in Word as I would any other report.

Fagan Finder to the rescue. It organises search engines into usable groups and gives you easy-to-use interfaces, such as the Google Ultimate Interface and the Google Search By Date Interface.

For the search term dog trash California, Google had excellent results, while Bing, DDG, and Yahoo! had poor results. The problem was that there were two similar stories: one involving a poodle and the one that was the subject of this exercise. Google eliminated the poodle stories when searched by date. Devilfinder produced excellent results as well.
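
A date-bounded search like the one in this exercise can also be written as a plain query string. The sketch below is not the Fagan Finder interface. It assumes Google's after: and before: search operators, which may not behave exactly as shown and may change, and the function names are my own invention. It only builds the URL; you still paste it into a browser.

```python
from datetime import date, timedelta
from urllib.parse import urlencode


def dated_query(terms: str, start: date, end: date) -> str:
    """Append date bounds to a search statement (assumed after:/before: operators)."""
    return f"{terms} after:{start.isoformat()} before:{end.isoformat()}"


def google_url(query: str) -> str:
    """Encode the query into a Google search URL."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# The week of 9 Dec 13 from the exercise.
week_start = date(2013, 12, 9)
week_end = week_start + timedelta(days=7)

query = dated_query("dog trash California", week_start, week_end)
url = google_url(query)
print(url)
```

Widen the window by a day or two on each side if the first pass misses the story, since the broadcast date and the publication date do not always match.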

From Devilfinder, along with the Google Ultimate Interface and the Google Search By Date Interface, I was able to provide all the W’s and complete a short reporting memo in twenty minutes while maintaining the proper citations and source material in OneNote.

Train hard.

Geo-locating Images

MyPicsMap.com allows viewing Flickr photos on a fullscreen Google map. To view photos of a particular Flickr user, just enter the username.

loc.alize.us provides the geo-location of photographs uploaded to Flickr. You can search by username or tags and sort the results by date. It uses satellite imagery provided by Google.

Connect the Dots and the Dox

You don’t need to hack into a computer to learn about someone. Today, most people that I investigate leave a revealing online profile — I just have to connect the dots or the publicly available dox (documents).

Online malefactors try to do their misdeeds anonymously through an alias. They tend to reuse their aliases, and it only takes one obscure use connected to the miscreant’s real name. Once I have the real name, I run it through the usual searches, which reveal other aliases, Facebook pages, and Twitter accounts, all of which yield titbits of useful information.
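
Connecting the dots is, in effect, walking a graph of identifiers. Here is a minimal sketch, assuming you record each pair of identifiers that you saw together in one public source. Every alias, name, and account below is invented for illustration, and the function name is hypothetical.

```python
from collections import defaultdict, deque


def connected_identifiers(links, start):
    """Return every identifier reachable from `start` through observed links."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)

    # Breadth-first walk outward from the starting alias.
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


# Invented observations: each pair appeared together in one public source.
links = [
    ("darkfox99", "J. Example"),       # one obscure post signed with a real name
    ("J. Example", "@jexample"),       # Twitter account found by searching the name
    ("@jexample", "nightfox"),         # second alias listed in the Twitter profile
    ("unrelated_user", "Someone Else"),
]

found = connected_identifiers(links, "darkfox99")
print(sorted(found))
```

A link here is only as good as the source behind it, so keep the citation for each pair alongside the pair itself.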

Getting Advance Knowledge of New Products

Companies operating in the U.S. often file ‘Intent-To-Use’ applications for trademarks and thereby disclose the names and descriptions of forthcoming products and services six months before the product launch. Extensions of up to two years are sometimes granted if the launch process becomes bogged down.

Searching the Trademark Electronic Search System (TESS) of the U.S. Patent & Trademark Office will find the ‘Intent-To-Use’ applications.

How to Get More Relevant Google Results

Did you know that you can improve your Google results by changing the order of the words in your search statement? Try searches for “civil society” or “society civil”, with and without double quotes. Do you notice any difference in the search results?
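
To run this comparison systematically, you can generate every word order with and without quotes and try each one in turn. This is only a sketch; the function name is my own, and it does nothing more than build the search statements.

```python
from itertools import permutations


def query_variants(words):
    """List each word order as a bare search statement and as a quoted phrase."""
    variants = []
    for order in permutations(words):
        phrase = " ".join(order)
        variants.append(phrase)         # without double quotes
        variants.append(f'"{phrase}"')  # with double quotes
    return variants


for statement in query_variants(["civil", "society"]):
    print(statement)
```

Keep the list short. Three words already give six orders and twelve search statements.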

Did you know that you can make your Google search results more relevant by changing the reading level? If your search statement is complex or the topic is complex then selecting the advanced reading level may yield more relevant sites. To make this selection, click on Search tools then All Results and click on Reading level. The results will then be annotated with reading levels as well as a percentage breakdown of results by reading level. To filter by a reading level, click on the desired reading level. To go back to all results, click on View results for all.

Carrot Search

I use clustering search engines to build the most specific search statement possible for use in the large search engines. Carrot Search is a clustering search engine that I have added to my stable of tools. It uses Lingo3G — the third generation document clustering engine that features multilingual and hierarchical clustering, synonyms, and advanced tuning capabilities. This produces good results that are properly clustered with tabs to cluster results from different search engines, except Google.

ICANN Wants to Close Whois

A working group for Internet regulators at ICANN wants to close all Whois databases. They want to force anybody needing this data to grovel before them before access is granted. They are trying to centralize global control over a key component of the Internet. Whois allows you to find out who owns a domain name. Without this data, fraud and other crimes will become easier to commit and harder to solve.

Google Drops Synonym Search

Google eliminated the synonym search feature in June. If you wanted to search your search term and its synonyms, you placed the tilde sign (“~”) immediately in front of your search term. They said nobody used this feature. I guess my new name is ‘Nobody’.

With this gone, an alternative called Google synonym Search Tool has appeared as a usable replacement.

Tim Hortons & Investigative Internet Research

An article titled “Tim Hortons apologizes for blocking gay and lesbian news website”, published by The Canadian Press on Friday, July 19, 2013, caught my attention. Tim Hortons is a popular Canadian coffee shop chain.

The online site of a popular paper that caters to the gay community was blocked by the coffee shop chain as “not appropriate for all ages viewing in a public environment”. Once the outrage got going, Tim Hortons relented and changed its WiFi network policy.

What has all this got to do with Investigative Internet Research (IIR), you ask? Well, think about it. We often work while on the road and that means doing some aspects of IIR in places like coffee shops.

When you do IIR outside your normal work environment, different rules apply. How do you know what the WiFi network allows and what it doesn’t? How do you know if some things are censored and others are not? How do you know that your results are complete?

Now do you understand the dangers that doing this presents? I haven’t even mentioned the security issues.

Canadian Government Documents

The Canadian Government Documents Google Custom Search Engine covers over 775 core domains at the Federal, Provincial and Municipal levels of government. Unfortunately, the search engine was last updated 10 Aug 12. A lot can change in one year.

Searching for Hacked Accounts

I always use the subject’s known email addresses as search terms. I assume that any good Investigator would do the same. However, where you search matters.

Have you ever searched an email address and found that it was compromised? Groups like Anonymous and Lulzsec sometimes post lists of compromised email addresses along with the associated passwords. Do you know where to search for this and how to report it?

“I didn’t post that! My account was hacked!” is a common ‘Weinergate’ inspired excuse. If the Investigator doesn’t make a reasonable effort to search for the possibility of a compromised account, then he may be judged incompetent or negligent.

Without the co-operation of the subject, the Investigator must start an organised search for indications that the email account has been compromised.

Always search for the name of the email service provider and the words ‘hacked’ and ‘compromised’ along with ‘accounts’ and ‘email’. If you find something, then compare the date of the security breach to the time of your own Weinergate.
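
The keyword combinations above are easy to miss by hand, so here is a minimal sketch that lists them and makes the date comparison explicit. The provider name and both dates are invented, and the function names are hypothetical. It only builds search statements; it does not query any site.

```python
from datetime import date


def breach_queries(provider: str) -> list:
    """Combine the provider name with each event word and each object word."""
    events = ["hacked", "compromised"]
    objects = ["accounts", "email"]
    return [f'"{provider}" {event} {obj}' for event in events for obj in objects]


def breach_could_explain(breach_date: date, incident_date: date) -> bool:
    """True when the reported breach happened on or before the disputed post."""
    return breach_date <= incident_date


# Invented provider and dates, for illustration only.
for statement in breach_queries("Example Mail"):
    print(statement)

print(breach_could_explain(date(2013, 3, 1), date(2013, 4, 15)))
```

A breach reported after the disputed post does not rule out a compromise, but it does not support the excuse either, so record both dates in the report.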

Next, search shouldichangemypassword.com, pwnedlist.com, and hacknotifier.com. The first two only tell you if the account might be compromised, while the last one sometimes links the searcher to online information about the security breach.

Of course the Investigator should document the search and explain the sources that were searched.

What’s on Your Wishlist?

The Boston Marathon incident is somewhat instructive from an Investigative Internet Research (IIR) perspective.

News reporters are skilled at IIR — some to the exclusion of real journalistic skills if the preponderance of churnalism in the popular media is any measure. However, one instance of a reporter finding the terrorist’s Amazon Wish List is interesting. The reporter was drawing conclusions about the terrorist from the contents of the wish list.

The default Amazon Wish List setting is ‘Public’. The other settings are ‘Shared’ and ‘Private’, which seems to defeat the purpose. The default setting is the most common.

Social Search — Namechk.com

Knowem is probably the most comprehensive search site for finding user names & screen names.

NameChk is similar, but it doesn’t search as many sites (158). Be warned: this site doesn’t like Firefox, so it is better to use Chrome as a browser.

The advantage of this username search is that it tells you which sites have the username available for use. Conversely, the sites that don’t have the username available might have the user that you are seeking. The sites where the name is taken are the ones that you should investigate further.

Google-Free Wednesday — Alerts

During the recent apparent demise of Google Alerts, I turned to using Talkwalker and Mention.

I found Talkwalker to be better than the broken-down Google Alerts. Mention seemed interesting, but the Web interface was not confidence inspiring, and the need to download an app always makes me suspicious of the security risks it might introduce.

Now that Google Alerts is working better, I am finding that it is almost keeping up with Talkwalker and finding new material in each set of results.

With the reawakening of Google Alerts, I am not going to abandon Talkwalker and Mention. I am just going to add them to my toolkit.

Social Search — Delicious.com

Delicious is a social bookmarking site. Social bookmarking is storing and sharing the sites that the user finds interesting. This site has over 6 million users. That makes it a huge catalog of what interests the registered users.

By searching for a topic, you will find users interested in that topic. Topics to search could be a protest, scandal, political movement, or a distinct event. Delicious will identify all the users who bookmarked the same site or sites about the topic. You may also find links to related meet-ups and groups interested in the topic.

Once you have matched a Delicious user-name to a real person, you can see all the sites he or she has bookmarked starting with the most recent. The bookmarks are dated. This will tell a lot about the subject’s interests, skills, plans, education, and employment. The URL of the user’s bookmarks will be http://delicious.com/user-name/.
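
If you are working through a list of user-names, a short script can build each bookmarks URL from the pattern above and keep your hand-recorded bookmarks in date order. This is a sketch only. The function names are my own, the recorded entries are invented, and nothing here fetches data from Delicious.

```python
from datetime import date
from urllib.parse import quote


def delicious_bookmarks_url(username: str) -> str:
    """Build the public bookmarks URL using the pattern given in the post."""
    return f"http://delicious.com/{quote(username, safe='')}/"


def newest_first(bookmarks):
    """Sort hand-recorded (date, url) pairs with the most recent first."""
    return sorted(bookmarks, key=lambda item: item[0], reverse=True)


# Invented entries, as you might copy them into your notes.
recorded = [
    (date(2013, 5, 2), "http://example.com/protest-planning"),
    (date(2013, 7, 19), "http://example.com/meet-up"),
    (date(2013, 6, 11), "http://example.com/job-posting"),
]

url = delicious_bookmarks_url("user-name")
timeline = newest_first(recorded)
print(url)
```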

All of the foregoing allows you to start building a map of the social network surrounding the topic and the associated people.