Archive for the 'Search Strategies' Category

Forgotten But Not Gone

The European Union “right to be forgotten” law that allows individuals to demand the removal of links from Google’s EU search sites is starting to come into play.

The EU “Right to be Forgotten” is clearly a form of censorship in the 28 member nations and 4 other European countries, which together encompass over 500 million people. Google has 90% of the search engine market there.

Demanding the removal of an indexed item only renews interest in the story. As the law only applies to Google and not the pages themselves or other search engines, traffic to the articles in question increases thanks to journalists calling attention to them once they receive notification that the article was removed from the EU sites. This is known as The Streisand Effect.

European Google search results for any name display the disclaimer that, “Some results may have been removed under data protection law in Europe,” even if nobody requested the removal of anything.

Of course, people will soon tire of writing about the removed articles and people will stop demanding the removal of indexed items.

Certainly, free speech enthusiasts will start to collate all the missing search results and make them available. This has already started with Hidden From Google. This site archives articles that Google must remove from European Union search results. I’m certain a Twitter account like @gdnvanished will also appear to provide similar content.

The easiest way to circumvent this censorship is to search using the Google.com site instead of the local EU search sites—or better yet, use other search engines like DuckDuckGo, Yandex, and blekko.

Saving Bozo Eruptions for Posterity

During research projects I sometimes come across astounding levels of stupidity posted for all to see. Sometimes this occurs in obscure corners of the interweb, sometimes it’s done on Twitter.

If I think an instance of stupidity might become important in the future, I manually archive the web page or Tweet by submitting it to the Wayback Machine using the Save Page Now option.

This doesn’t work with all sites, but when it works, the “Bozo Eruption” will be available on an authoritative site in the future. There won’t be any question that the eruption occurred if someone has second thoughts and removes it from the site.
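If you archive pages often, it can help to build the Save Page Now address rather than paste into the form each time. The following is a minimal sketch, assuming the Wayback Machine still accepts a capture request at web.archive.org/save/ followed by the page address. The function name is my own invention, and the code only builds the address; it does not contact the archive.

```python
from urllib.parse import quote

# Assumed Save Page Now entry point; check that it still behaves this way before relying on it.
SAVE_ENDPOINT = "https://web.archive.org/save/"


def save_page_now_url(page_url: str) -> str:
    """Return the address that asks the Wayback Machine to capture page_url."""
    # Leave normal URL punctuation alone and percent-encode the rest (spaces, for example).
    return SAVE_ENDPOINT + quote(page_url, safe=":/?&=#%~+,;@")


if __name__ == "__main__":
    # Open the printed address in a browser to request the capture.
    print(save_page_now_url("https://example.com/some post"))
```

As noted above, not every site allows itself to be captured, so always confirm that the snapshot actually exists afterwards.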

How to Use Boolean to Improve Social Media Monitoring

Twitter and Boolean Searching

Twitter has a robust search facility that includes Boolean search operators. Twitter Support provides the following table of search operators.

Twitter defaults to the AND operator when you add search terms to the search statement. Don’t forget to use the minus sign (-) for NOT to eliminate search terms and OR to broaden the search. To get the results that you really want, you can filter the search results using the selections on the left side of the results page, or you can start your search on the Advanced search page. Always search for variations of hashtags, spellings, and sentiment words in order to capture the largest number of tweets possible.
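When a search statement has many hashtag and spelling variations, I find it easier to assemble it from lists than to type it by hand. The following is a rough sketch only. The function name and the sample terms are made up for illustration, and the use of parentheses to group OR terms and double quotes for exact phrases is an assumption you should confirm against Twitter's own operator list.

```python
def build_query(all_of=(), any_of=(), none_of=()):
    """Combine terms into one search statement.

    all_of  -> joined with spaces (the implicit AND)
    any_of  -> joined with OR inside parentheses
    none_of -> each prefixed with a minus sign (NOT)
    """
    def quote_phrase(term):
        # Wrap multi-word terms in double quotes so they are searched as a phrase.
        return f'"{term}"' if " " in term else term

    parts = [quote_phrase(t) for t in all_of]
    if any_of:
        parts.append("(" + " OR ".join(quote_phrase(t) for t in any_of) + ")")
    parts.extend("-" + quote_phrase(t) for t in none_of)
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical brand-monitoring search: variations of a name, minus contest spam.
    print(build_query(
        all_of=["recall"],
        any_of=["#acme", "#acmecorp", "acme corp"],
        none_of=["contest"],
    ))
```

Keeping the variations in a list also gives you a record of exactly what you searched for, which helps when you must explain your method later.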

Unearthing a GeoSocial Footprint

I try to learn something every day. Today, I learned about GeoSocial Footprints. A geosocial footprint is the combined bits of location information that a user divulges through social media. Now I had to learn an easy way to unearth someone’s geosocial footprint.

First, I had to find an easy way to uncover which social media (SM) a person uses. To do that, I found an add-on for Firefox called Identify. This extension used to help you explore an individual’s web identity across SM sites. However, it is not compatible with Firefox version 26 or later. It was also not compatible with Comodo IceDragon.

That left me with trying Hoverme. This is an add-on for Chrome that provides a SM profile when you mouse over a name on SM sites. You will supposedly be able to view the social web profile of the subject by mousing over the profile picture in Facebook, etc. It should provide links to the person’s profiles on sites such as Facebook, LinkedIn, Delicious, etc.

I tried installing it in Comodo Dragon, which is built on the open source Chrome browser and doesn’t phone home to Google like Chrome. Unfortunately, Hoverme needs the Kynetx browser extension that many apps require. It’s like Greasemonkey for Firefox, but to install it you need to set up an account or use Facebook or Google to sign in. This means I might be giving away too much information. This also means that to collect evidence safely, I will have to install it on a sandbox machine or in a VM and then do my main collection on another machine. I would do this because I don’t know what Kynetx might be doing to the machine that is collecting the evidence and I don’t know what information this might be giving away to unknown parties.

I guess it’s back to good old-fashioned Investigative Internet Research to uncover which SM sites someone uses. From there, I will have to figure out how to collect, collate, validate, and explain all this geosocial footprint stuff.

Veracity of Online Images & Video

My mother’s advice not to believe everything I read remains as true today as it was 50 years ago. Today, this advice extends to online video and images.

Hoax imagery and video abounds online. A fake video of an eagle trying to fly off with an infant in a Montreal park is only one example. Students at the National Animation and Design Centre created this ‘Golden Eagle Snatches Kid’ video. Their skill was impressive. It took a frame-by-frame analysis to uncover the fake. Frames that lacked the eagle’s shadow revealed it to be a hoax.

Free editing software like VLC Media Player or Avidemux Video Editor can help split video into frames, but locating and investigating the person who posted the video proves more productive in most cases. The following is a short outline of how I approach this problem.

First, start listing the places you find the item and user names that posted it. Look for the first instance of the item by filtering by date. Try to find the first instance as this may be the original and the original poster of the item. Compare video thumbnails to find the earliest and largest as that may be the original. Search the thumbnails in Google Image Search, TinEye, and Bing. However, searching TinEye, et al, will require an image with high contrast and distinctive colour combinations.
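Once the list of sightings grows, sorting it by hand gets tedious. The following is a small sketch of the "earliest, then largest" rule described above. The record layout, the function name, and the sample entries are all hypothetical; use whatever fields you actually record.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Sighting:
    url: str           # where the item was found
    user: str          # user name that posted it
    posted: date       # posting date shown on the site
    thumb_width: int   # thumbnail size in pixels
    thumb_height: int


def likely_original(sightings):
    """Pick the earliest sighting; among same-day sightings, prefer the largest thumbnail."""
    return min(
        sightings,
        key=lambda s: (s.posted, -(s.thumb_width * s.thumb_height)),
    )


if __name__ == "__main__":
    # Made-up entries for illustration only.
    found = [
        Sighting("https://example.com/repost", "copycat", date(2012, 12, 20), 320, 180),
        Sighting("https://example.org/small", "someone", date(2012, 12, 18), 120, 90),
        Sighting("https://example.net/first", "uploader", date(2012, 12, 18), 480, 360),
    ]
    print(likely_original(found).url)
```

The result is only a lead. The earliest copy you found may still be a duplicate of something posted elsewhere.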

Next, try to identify the person who first posted it. Sometimes, discovering the creator of the item is easy because it was posted on a Facebook page or on YouTube, but usually it was just duplicated there and originates elsewhere. Search all text associated with the item: tags, descriptions, user names. Use everything as search terms. Search all the user names to identify the people. Use sites such as LinkedIn, Facebook, etc., to get a feel for the background of the people you may later contact.

Once you have found the likely source of the item, examine and question the source to establish his reliability. You need to engage this person to establish that he created the video or image and that it isn’t a hoax or an altered version of something he still possesses.

WebMii

I have written about pipl.com before and often find it useful when I am trying to track down people. Unfortunately, its usefulness is limited if the subject person lives outside the U.S.A.

When searching for people outside the US, I turn to WebMii. It has data sets for specific countries, which you can select individually, or you can search them all by selecting ‘International’ as the region.

You may also search by keywords to get a list of people associated with the keywords. However, this has never worked for anything I have searched. Searching by company or brand name often returns useful results, but selecting a region failed to change the results in any search that I have done.

New Bing Image Search

Images that appear on a web site offer many insights into the people who created the site. They tell you if they have the money to buy copyrighted content, or that they took the time to create their own imagery to get across their message. The imagery may also tell you that they don’t respect copyright law. The use of the same image on several sites may indicate a relationship between the sites that use the image.

Bing now offers an image search facility that allows you to paste the specific image URL into the search box at Bing.com/images. If you have a picture that you want to match, then you may upload it directly to Bing.com/images and Bing will search for matches. To match an image, whether by submitting a URL or uploading a file, just click on Image Match.

When you come across an image on a site you find in the Bing Web results, go to Bing Image search and clear the search box. That will make the Image Match link appear next to the search box. When using this, the best approach is to have Bing Web open in one tab and Bing Images in another. As you click on Web results, they will open in a new tab between Bing Web and Bing Images. To isolate the images you wish to search, in Firefox, right click the image and click on view image. This will take you to the image itself and its unique URL. This makes it easier for Bing to isolate the image it is trying to match.

Chrome is Listening

So you want to use Chrome as your browser. Are you aware that it has recently been reported that a Chrome Bug Allows Sites to Listen to Your Private Conversations?

The best way to avoid this threat is as follows:

  • Go to chrome://settings/content
  • Scroll down to Media
  • Select “Do not allow any sites to access my camera and microphone.”

This will disable Google’s Conversational Search, etc. but security will be increased.

I never liked the way Chrome ‘phoned home’ to Google with user tracking, bug tracking etc. I have also found extensions that had malware-filled updates. However, it is faster than Firefox, which over the course of a research project may save hours of extra time. I resisted using Chrome due to security & privacy issues.

I now use Comodo Dragon, which is based on the open-source Chrome browser but is more private and secure if used properly. I disable the camera & mic as SOP, so I haven’t investigated how Dragon responds to this exploit. The setting change that I outlined was in reference to the actual Chrome browser and this particular exploit. There may be more that I don’t know about.

I am very careful about exposing myself to the internet. My outward-facing computers don’t have cameras or mics to entirely circumvent malicious software like this and the likes of Finspy.

Exif Viewers

In a past article, I explained Exchangeable Image File or Exif data and pointed you to www.regex.info, an easy to use exif viewer with a geo-locator. The regex.info Exif viewer allows you to enter the image URL or to upload an image for analysis. It doesn’t require JavaScript and it doesn’t have any widgets.

Another easy to use online exif viewer may be found at www.fotoforensics.com, but you must enable JavaScript to use it. You can use the URL of the picture instead of uploading the image.

The online exif viewer at www.gbimg.org has a lot of widgets on it.

My last discovery was the Exif site at http://www.findpicturelocation.com. Just upload the picture and it will show the location where it was taken. It only works with .jpg or .tif files. You must upload the image to the site, so who knows where it might end-up. This uses the Google API for the mapping. Not all pictures have the GPS coordinates in them.
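Most Exif viewers show the GPS position as degrees, minutes, and seconds plus a hemisphere letter, while mapping sites usually want decimal degrees. The following is a minimal sketch of that conversion. It assumes you have already copied the four standard Exif GPS fields (GPSLatitude, GPSLatitudeRef, GPSLongitude, GPSLongitudeRef) into a dictionary; the function names and the sample values are hypothetical.

```python
def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus a hemisphere letter to decimal degrees."""
    degrees, minutes, seconds = (float(part) for part in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative in decimal notation.
    return -value if ref in ("S", "W") else value


def gps_to_decimal(gps):
    """Return (latitude, longitude), or None when the picture carries no GPS data."""
    try:
        lat = dms_to_decimal(gps["GPSLatitude"], gps["GPSLatitudeRef"])
        lon = dms_to_decimal(gps["GPSLongitude"], gps["GPSLongitudeRef"])
    except KeyError:
        return None
    return lat, lon


if __name__ == "__main__":
    # Made-up values as an Exif viewer might display them.
    sample = {
        "GPSLatitude": (45, 30, 0), "GPSLatitudeRef": "N",
        "GPSLongitude": (73, 34, 12), "GPSLongitudeRef": "W",
    }
    print(gps_to_decimal(sample))
    print(gps_to_decimal({}))  # no GPS fields present
```

Doing the conversion yourself means you can plot a location without uploading the picture to yet another site.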

Trolling RSS Feeds

RSS (Rich Site Summary) is a format for delivering regularly changing web content. Many news-related sites, blogs and other online publishers syndicate their content as an RSS Feed to whoever wants it.
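Under the hood, a feed is just an XML file listing items with titles and links. The following is a minimal sketch, using only Python's standard library, of pulling those out of an RSS 2.0 feed. The sample feed and the function name are made up for illustration, and Atom or RDF feeds use different element names that this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up RSS 2.0 feed for illustration.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item>
      <title>First post</title>
      <link>https://example.com/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/second</link>
    </item>
  </channel>
</rss>"""


def list_items(feed_xml):
    """Return a list of (title, link) pairs, one per item in the feed."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    for title, link in list_items(SAMPLE_FEED):
        print(title, "-", link)
```

Only parse feeds you trust this way. A dedicated reader does the same job with far more safeguards.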

I have written quite a lot about RSS in the past. The following are my choices for both installation on a PC and for a web-based reader.

RSSOwl

RSSOwl is cross-platform as it’s Java-based. It handles RSS, Atom and RDF in terms of feed formats. You must have Java installed, no matter where you run it. It cooperates with Firefox to add feeds to RSSOwl from the browser. Just go to the feed and copy the URL, then go to RSSOwl and click on add feed, and it knows where to find the feed. You can also drag and drop feeds from Firefox into RSSOwl. RSSOwl has an embedded web browser, so you don’t have to open up a separate browser window to view links or to view the full version of feed items that are shortened. You do have to set this up under “Browser” in the Preferences menu option. Choose to Default to the Embedded Browser. To get the RSSOwl embedded browser to work properly with OneNote so that it includes the URL in pasted items, you must enable JavaScript. I do not recommend doing this except on an isolated machine; otherwise, malicious JavaScript code could cause serious problems.

RssBandit

When I need to collect video and podcasts from RSS feeds, I turn to RssBandit. Its embedded browser is MS Internet Explorer, so it includes the pertinent URL when you copy to OneNote.

This is my favorite RSS reader overall, though I have experienced occasional problems with exporting feeds for another implementation of the reader. This problem seems to stem from differences in the underlying OS on the importing computer. It can be an irritation when starting a project with tight deadlines.

RSSOwl has an edge for a group of researchers working in a collaborative environment as it is easier to set up and distribute to the group.

Web-based RSS Reader

The two most popular seem to be Feedly and Inoreader, which offer similar features and options.

Inoreader offers secure HTTPS access and over 40 different customization options. If I must use a web-based reader this is the one.

I refuse to use Feedly because extensions like NoScript, Adblock, HTTPS Everywhere, etc. prevent the site from loading. I never use sites infested with stuff that my normal suite of extensions prevents from loading. You only have to encounter one ad with malicious code to cost you many hours of work to purge the problem code from your machine.

Incognito Searching

Your search and browsing behaviour allows Google to personalise your search results. To escape this filtering of your results, use a private browser window, which Chrome calls incognito. Google will then ignore tracking and search cookies and stop personalising your results. To get a private browser or incognito window, use the following key combinations:

  • Chrome – Ctrl+Shift+N
  • Firefox – Ctrl+Shift+P
  • Internet Explorer – Ctrl+Shift+P

I have found that this approach doesn’t work with Bing.

Google-Free Wednesday–Metasearch

Metasearch for the Big Guys

Dogpile returns results from Google, Yahoo!, and Yandex. The Russian engine, Yandex, is the fourth largest search engine in the world and Yahoo! is really the Bing search engine database.

Dogpile is only good for short and simple search statements; however, it is good for a quick look at what you are likely to get from the largest search engines.

Copernic Agent

Copernic has stopped selling its professional version metasearch tool and discontinued all support for both the professional and free personal versions of Copernic Agent. It only searches five of the 15 search engines it purports to search (Google, Bing, Yahoo, Dogpile, and Open Directory Project).

Copernic is Windows only.

iMetaSearch

iMetaSearch is a possible replacement for Copernic. It is now in version 5.03, so it isn’t a new kid on the block. The paid version searches Google and purports to search 11 other search engines.

The program groups search results by concept; click a group that interests you and the search results will be revised. This is an effective method to refine search results and get the most relevant results. It’s very effective for ambiguous search terms.

Unfortunately, iMetaSearch has a steep learning curve, but if you frequently conduct Investigative Internet Research it is worth the effort to learn how to use this advanced web search tool.

iMetaSearch is Windows only.

Google Free Wednesday—DDG Site Search Command

The DuckDuckGo (DDG) search engine aggregates content to provide search results while offering significant privacy features. My favorite search shortcut in DDG is its version of the Google site: command. Place an exclamation point before the site you want to search–for example, “private investigator” !facebook. The exclamation point directs the search to a specific site. In this case, you will have to login to your Facebook account to see the results.
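If you run the same site-directed search regularly, you can build the search address ahead of time and bookmark it. The following is a minimal sketch, assuming DDG still accepts the search statement in the q parameter of its address; the function name is my own.

```python
from urllib.parse import urlencode


def ddg_search_url(terms, site_shortcut=None):
    """Build a DuckDuckGo search address, optionally directed at a site with '!'."""
    query = f"{terms} !{site_shortcut}" if site_shortcut else terms
    # urlencode takes care of the quotes, spaces, and the exclamation point.
    return "https://duckduckgo.com/?" + urlencode({"q": query})


if __name__ == "__main__":
    print(ddg_search_url('"private investigator"', "facebook"))
```

The printed address can be pasted into any browser; DDG then hands the search over to the site named after the exclamation point.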

Google Free Wednesday — Yahoo! Alerts

The apparent demise of Google Alerts forced me to turn to Talkwalker and Mention for alerts. However, Yahoo! Alerts offer some utility for keeping up with the world. In the past Yahoo! Alerts was only good for news. It now extends into the full web as catalogued by the Bing database. If you don’t already know it, Microsoft swallowed Yahoo! search whole in 2009. Perhaps we should call it Microhoo.

You need a Yahoo! account for Yahoo! Alerts. The results cannot be pushed to an RSS feed; they only arrive via email, Yahoo Messenger, or mobile device, depending on what you have set up in your Yahoo! account. Not all alerts allow for delivery using all three of the above delivery options.

To create an alert, select Y! Search from the drop-down list on the right side of the opening page or select Y! Search from the list on the initial screen. Next, sign in to your Yahoo! account. In the Search keyword field, add the search terms as you would in the normal Yahoo! search box. In the next drop-down list, select what you want searched; I normally select Web or News. Finally, select the frequency of the search. The search preview will only show anything added to the database in the last 24 hours.

Windows Error Reporting Risk

Windows Error Reporting (WER) is a crash reporting technology introduced by Microsoft with Windows XP. However, we now know that it may send Microsoft unencrypted personally identifiable information contained in the memory and application data, which may make you vulnerable to attack. WER is turned on by default. From Windows 8 onward, WER may use TLS encryption.

The Snowden leaks described how the U.S. National Security Agency intercepts the unencrypted WER logs to fingerprint machines, much as some malware does, and to identify potential system, network and application weaknesses that can be used to execute attacks that move through an enterprise network. WER reports on more than Windows crashes. It reports hardware changes, such as the first-time use of a new USB device and mobile devices. It sends time-stamp data, device manufacturer, identifier and revision, along with host computer information such as default language, operating system service pack and update version, hardware manufacturer, model and name, as well as BIOS version and unique machine identifier. This creates a blueprint of the applications running on a network to help an attacker develop or execute attacks with little chance of detection.

This is only one example of the OS, applications, browsers, etc. leaking information that the investigator must be aware of when conducting investigative internet research.

To shut off WER in Windows 7, go to Control Panel>System and Security>Action Center>Change Action Center settings>Related settings>Problem reporting settings. The selections for “Each time a problem occurs, ask me before checking for solutions” and “Never check for solutions” disable WER. Choosing “Never check for solutions” will fully disable error reporting in Windows 7.