Archive for the 'Search Strategies' Category

Finding Deleted Tweets

Paper.li is a web service that lets members create a daily newspaper of sorts containing their favorite material, which they then share with their followers. Here are some points that the investigator should note:

  • A lot of content of these papers comes from Twitter.
  • These papers are archived.
  • Twitter users sometimes delete Tweets.
  • Tweets deleted on Twitter are not deleted on sites like Paper.li.

Paper.li is a content curation service. A Content Curator is someone who continually finds, groups, organizes and shares the best and most relevant content on a specific issue online. These sites are a good place to find content deleted from the originating social networking site.

If you go to Paper.li and use their search feature, you won’t find anything unless your search is for the title of a paper. Their search doesn’t look within individual articles.

To find mentions of content from Twitter, or any other content, use the site: operator (for example, site:paper.li) in a general search engine. When using this search strategy, search by the Twitter account’s name and the user name (@username) along with any keywords that might apply to what you are looking for.
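As a rough illustration of the search statement described above, the following Python sketch assembles one. The function name, the account name, and the keyword are all hypothetical placeholders, and the exact quoting rules vary between search engines.

```python
def build_site_query(site, account_name, username, keywords=()):
    """Compose a query restricted to one site (illustrative helper only)."""
    # Normalise the handle so it always carries exactly one leading "@".
    handle = "@" + username.lstrip("@")
    # Quote the display name and the handle so each is matched as a phrase.
    parts = [f"site:{site}", f'"{account_name}"', f'"{handle}"']
    parts.extend(keywords)
    return " ".join(parts)


# Hypothetical account and keyword, used only to show the shape of the query.
print(build_site_query("paper.li", "Jane Example", "jane_example", ["merger"]))
```

Paste the resulting string into the search engine of your choice.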

Operative Research

Operative research is the process of learning how things work in a particular area. As an investigator, I often have to learn how something works or the nature of the skills used in a certain area of human endeavour.

I sometimes start by interviewing people who are in the field, but more often, I do a literature search of the topic before conducting interviews. That leaves me with the task of locating relevant published material that will give me an overview of the topic and allow me to formulate a list of questions to ask during interviews.

The first task in this is to understand how the subject matter is indexed. That means understanding who might have a use for this material. For example, many military topics are also useful to engineers, construction companies, outdoorsmen, miners, sailors, and many more individuals and organisations. Another example would be the topic of physical security.

Once you know who might collect and catalog the subject material that interests you, learn what terms they might use to describe the material. Now add the words “library” and “subject guide” to your search. What you are looking for is a targeted collection of material. Once you find such a collection, search the site using the site: operator.
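When a topic goes by several names, it can help to generate one discovery query per term. This is only a sketch: the function name and the sample terms are my own placeholders, not a fixed recipe.

```python
def discovery_queries(terms):
    """Return one query per descriptive term, each aimed at curated collections."""
    # Each term is quoted as a phrase, then paired with the two extra words.
    return [f'"{term}" library "subject guide"' for term in terms]


# Hypothetical vocabulary that different communities might use for one topic.
for query in discovery_queries(["urban evacuation", "emergency egress"]):
    print(query)
```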

Using the above search strategy in a recent search for information on evacuation of urban areas, I found urbansurvivalsite.com and its library of ebooks. Searching for data on electrical wiring led me to the Pole Shift Survival Information site and its library of publications about wire, where I found tables of wire-gauge sizes. When trying to decipher old shorthand notes in a deceased lawyer’s file, I found a library of publications about shorthand.

The focus of each of these ‘library’ sites is far removed from my interests. However, the people who created these sites had their own use for the information, and that made my job easier.

Site Investigation Tools

When you start to investigate a particular Internet site, I suggest you begin with these resources.

Domain Dossier  Investigate domains and IP addresses. Get registrant information, DNS records, and more, all in one report.

InterNIC  Public Information Regarding Internet Domain Name Registration Services

Network Solutions’ Whois

DomainSearch.com  Search multiple top level domains at once to see if the domain name is in use. I use it to find the domain name in other top level domains.

Convert Host/Domain Name to IP Address and vice versa  Find the IP address of a host machine or domain name, or find the name of one of the hosts at an IP address.

Using Traceroute  Learn how to use and interpret traceroute results.

Additions thanks to Kirby:

hostcabi.net  Provides a lot of information, but most importantly, it identifies other users of the same Google Analytics account and all the sites using that account.

sitedossier.com  Sometimes shows older servers, which is useful when a website has moved to a cloud service or CloudFlare.
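The host-to-IP conversion mentioned in the list above can also be done locally. The following is a minimal sketch using Python's standard library. The function names are my own, and the results depend on whatever DNS resolver your machine uses, so they may differ from what the web tools report.

```python
import ipaddress
import socket


def is_ip_address(value):
    """Return True if the text parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def lookup(value):
    """Reverse lookup for an IP address, forward (IPv4) lookup for a name."""
    if is_ip_address(value):
        # gethostbyaddr returns (hostname, alias list, address list).
        hostname, _aliases, _addresses = socket.gethostbyaddr(value)
        return hostname
    return socket.gethostbyname(value)
```

Calling lookup() with a domain name returns an IPv4 address, and calling it with an address returns a host name if a reverse record exists. Either call raises an OSError subclass when the lookup fails.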

Motherpipe

Do you want a search engine that does the following:

  • doesn’t keep details on what you are searching for
  • doesn’t store your IP address
  • doesn’t use cookies
  • doesn’t track you
  • doesn’t send your search term to the site you clicked on
  • doesn’t store or share your search history
  • doesn’t share your personal information
  • doesn’t have servers in the U.S.A.
  • doesn’t hide the search results amongst a deluge of ads

Try Motherpipe. It operates privacy-oriented search engines at motherpipe.com, motherpipe.co.uk, motherpipe.de and motherpipe.se that don’t do things I don’t want done.

It gets its data from Yahoo!Bing. It offers the search operator “site:” and the Boolean operators “AND” and “OR”. It also searches Twitter anonymously.

Forgotten But Not Gone

The European Union “right to be forgotten” law that allows individuals to demand the removal of links from Google’s EU search sites is starting to come into play.

The EU “Right to be Forgotten” is clearly a form of censorship in the 28 member nations and 4 other European countries, which together encompass over 500 million people. Google has 90% of the search engine market there.

Demanding the removal of an indexed item only renews interest in the story. As the law only applies to Google and not the pages themselves or other search engines, traffic to the articles in question increases thanks to journalists calling attention to them once they receive notification that the article was removed from the EU sites. This is known as The Streisand Effect.

European Google search results for any name display the disclaimer that, “Some results may have been removed under data protection law in Europe,” even if nobody requested the removal of anything.

Of course, people will soon tire of writing about the removed articles and people will stop demanding the removal of indexed items.

Certainly, free speech enthusiasts will start to collate all the missing search results and make them available. This has already started with Hidden From Google. This site archives articles that Google must remove from European Union search results. I’m certain a Twitter account like @gdnvanished will also appear to provide similar content.

The easiest way to circumvent this censorship is to search using the Google.com site instead of the local EU search sites—or better yet, use other search engines like DuckDuckGo, Yandex, and blekko.

Saving Bozo Eruptions for Posterity

During research projects I sometimes come across astounding levels of stupidity posted for all to see. Sometimes this occurs in obscure corners of the interweb, sometimes it’s done on Twitter.

If I think an instance of stupidity might become important in the future, I manually archive the web page or Tweet by submitting it to the Wayback Machine using the Save Page Now option.

This doesn’t work with all sites, but when it works, the “Bozo Eruption” will be available on an authoritative site in the future. There won’t be any question that the eruption occurred if someone has second thoughts and removes it from the site.
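If you archive pages often, the submission address can be built in a script and opened in your browser. The sketch below assumes the Wayback Machine still accepts the https://web.archive.org/save/ prefix followed by the page URL. The function name is my own, and the example address is a placeholder.

```python
# Assumed Save Page Now address pattern; verify it against the live site.
WAYBACK_SAVE_PREFIX = "https://web.archive.org/save/"


def save_page_now_url(page_url):
    """Return the address that asks the Wayback Machine to capture page_url."""
    if not page_url.startswith(("http://", "https://")):
        raise ValueError("expected an absolute http(s) URL")
    return WAYBACK_SAVE_PREFIX + page_url


# Placeholder address, used only to show the resulting link.
print(save_page_now_url("https://example.com/post/1"))
```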

How to Use Boolean to Improve Social Media Monitoring

Twitter and Boolean Searching

Twitter has a robust search facility that includes Boolean search operators. Twitter Support provides the following table of search operators.

Twitter defaults to the AND operator when you add search terms to the search statement. Don’t forget to use the minus sign (-) for NOT to eliminate search terms and OR to broaden the search. To get the results that you really want, you can filter the search results using the selections on the left side of the results page or you can start your search on the Advanced search page. Always search for variations of hashtags, spellings, and sentiment words in order to capture the largest number of tweets possible.
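To keep hashtag and spelling variations manageable, a search statement can be assembled from three lists. This is a sketch only: the function and parameter names are hypothetical, the sample terms are invented, and it assumes Twitter accepts parentheses to group OR terms.

```python
def twitter_query(all_of=(), any_of=(), none_of=()):
    """Build a search statement from required, alternative and excluded terms."""
    parts = list(all_of)  # terms separated by spaces are ANDed by default
    if any_of:
        # Group the alternatives so OR applies only to them.
        parts.append("(" + " OR ".join(any_of) + ")")
    # A leading minus sign excludes a term (NOT).
    parts.extend("-" + term for term in none_of)
    return " ".join(parts)


# Invented terms, used only to show the shape of the statement.
print(twitter_query(["recall"], ["#acme", "#acmecorp"], ["giveaway"]))
```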

Unearthing a GeoSocial Footprint

I try to learn something every day. Today, I learned about GeoSocial Footprints. A geosocial footprint is the combined bits of location information that a user divulges through social media. Now I had to learn an easy way to unearth someone’s geosocial footprint.

First, I had to find an easy way to uncover which social media (SM) a person uses. To do that, I found an add-on for Firefox called Identify. This extension used to help you explore an individual’s web identity across SM sites. However, it is not compatible with V. 26 or later. It was also not compatible with Comodo IceDragon.

That left me with trying Hoverme. This is an add-on for Chrome that provides a SM profile when you mouse over a name on SM sites. You will supposedly be able to view the social web profile of the subject by mousing over the profile picture in Facebook, etc. It should provide links to the person’s profiles on sites such as Facebook, LinkedIn, Delicious, etc.

I tried installing it in Comodo Dragon, which is built on the open source Chrome browser and doesn’t phone home to Google like Chrome. Unfortunately, Hoverme needs the Kynetx browser extension that many apps require. It’s like Greasemonkey for Firefox, but to install this you need to set up an account or use Facebook or Google to sign in. This means I might be giving away too much information. This also means that to collect evidence safely, I will have to install it on a sandbox machine or in a VM and then do my main collection on another machine. I would do this because I don’t know what Kynetx might be doing to the machine that is collecting the evidence and I don’t know what information this might be giving away to unknown parties.

I guess it’s back to good old-fashioned Investigative Internet Research to uncover which SM sites someone uses. From there, I will have to figure out how to collect, collate, validate, and explain all this geosocial footprint stuff.

Veracity of Online Images & Video

My mother’s advice not to believe everything I read remains as true today as it was 50 years ago. Today, this advice extends to online video and images.

Hoax imagery and video abound online. A fake video of an eagle trying to fly off with an infant in a Montreal park is only one example. Students at the National Animation and Design Centre created this ‘Golden Eagle Snatches Kid’ video. Their skill was impressive. It took a frame-by-frame analysis to uncover the fake. Frames that lacked the eagle’s shadow revealed it to be a hoax.

Free editing software like VLC Media Player or Avidemux Video Editor can help split video into frames, but locating and investigating the person who posted the video proves more productive in most cases. The following is a short outline of how I approach this problem.

First, start listing the places you find the item and user names that posted it. Look for the first instance of the item by filtering by date. Try to find the first instance as this may be the original and the original poster of the item. Compare video thumbnails to find the earliest and largest as that may be the original. Search the thumbnails in Google Image Search, TinEye, and Bing. However, searching TinEye, et al, will require an image with high contrast and distinctive colour combinations.
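Once the list of sightings grows, sorting it makes the likely original easier to spot. The sketch below treats the earliest posting as the candidate and prefers the larger image when dates tie. The record layout, the function name, and the sample entries are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Sighting:
    url: str
    user: str
    posted: date
    width: int
    height: int


def likely_original(sightings):
    """Pick the earliest sighting; among same-day postings, the largest image."""
    return min(sightings, key=lambda s: (s.posted, -(s.width * s.height)))


# Invented records, used only to show how the comparison behaves.
found = [
    Sighting("https://example.com/a", "user_a", date(2014, 3, 2), 640, 360),
    Sighting("https://example.org/b", "user_b", date(2014, 3, 1), 480, 270),
    Sighting("https://example.net/c", "user_c", date(2014, 3, 1), 1280, 720),
]
print(likely_original(found).url)
```

The result is only a lead. The earliest copy you can find is not necessarily the first one posted.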

Next, try to identify the person who first posted it. Sometimes, discovering the creator of the item is easy because it was posted on a Facebook page or on YouTube, but usually it was just duplicated there and originates elsewhere. Search all text associated with the item: tags, descriptions, user names. Use everything as search terms. Search all the user names to identify the people. Use sites such as LinkedIn, Facebook, etc., to get a feel for the background of the people you may later contact.

Once you have found the likely source of the item, examine and question the source to establish his reliability. You need to engage this person to establish that he created the video or image and that it isn’t a hoax or an altered version of something he still possesses.

WebMii

I have written about pipl.com before and often find it useful when I am trying to track down people. Unfortunately, its usefulness is limited if the subject person lives outside the U.S.A.

When searching for people outside the US, I turn to WebMii. This has data sets for specific countries, which you can select, or you can search them all by selecting ‘International’ as the region.

You may also search by keywords to get a list of people associated with the keywords. However, this has never worked for anything I have searched. Searching by company or brand name often returns useful results, but selecting a region failed to change the results in any search that I have done.

New Bing Image Search

Images that appear on a web site offer many insights into the people who created the site. They tell you if they have the money to buy copyrighted content, or that they took the time to create their own imagery to get across their message. The imagery may also tell you that they don’t respect copyright law. The use of the same image on several sites may indicate a relationship between the sites that use the image.

Bing now offers an image search facility that allows you to paste the specific image URL into the search box at Bing.com/images. If you have a picture that you want to match, then you may upload it directly to Bing.com/Images and Bing will search for matches. To match an image, whether by submitting a URL or by uploading a file, just click on Image Match.

When you come across an image on a site you find in the Bing Web results, go to Bing Image search and clear the search box. That will make the Image Match link appear next to the search box. When using this, the best approach is to have Bing Web open in one tab and Bing Images in another. As you click on Web results, they will open in a new tab between Bing Web and Bing Images. To isolate the image you wish to search, in Firefox, right-click the image and click on View Image. This will take you to the image itself and its unique URL. This makes it easier for Bing to isolate the image it is trying to match.

Chrome is Listening

So you want to use Chrome as your browser. Are you aware that it has recently been reported that a Chrome Bug Allows Sites to Listen to Your Private Conversations?

The best way to avoid this threat is as follows:

  • Go to chrome://settings/content
  • Scroll down to Media
  • Select “Do not allow any sites to access my camera and microphone.”

This will disable Google’s Conversational Search, etc., but security will be increased.

I never liked the way Chrome ‘phoned home’ to Google with user tracking, bug tracking etc. I have also found extensions that had malware-filled updates. However, it is faster than Firefox, which over the course of a research project may save hours of extra time. I resisted using Chrome due to security & privacy issues.

I now use Comodo Dragon, which is based on the open-source Chrome browser but is more private and secure if used properly. I disable the camera & mic as SOP, so I haven’t investigated how Dragon responds to this exploit. The setting change that I outlined was in reference to the actual Chrome browser and this particular exploit. There may be more that I don’t know about.

I am very careful about exposing myself to the internet. My outward-facing computers don’t have cameras or mics to entirely circumvent malicious software like this and the likes of Finspy.

Exif Viewers

In a past article, I explained Exchangeable Image File or Exif data and pointed you to www.regex.info, an easy-to-use Exif viewer with a geo-locator. The regex.info Exif viewer allows you to enter the image URL or to upload an image for analysis. It doesn’t require JavaScript and it doesn’t have any widgets.

Another easy-to-use online Exif viewer may be found at www.fotoforensics.com, but you must enable JavaScript to use it. You can use the URL of the picture instead of uploading the image.

The online Exif viewer at www.gbimg.org has a lot of widgets on it.

My last discovery was the Exif site at http://www.findpicturelocation.com. Just upload the picture and it will show the location where it was taken. It only works with .jpg or .tif files. You must upload the image to the site, so who knows where it might end up. This uses the Google API for the mapping. Not all pictures have the GPS coordinates in them.
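Exif viewers usually report GPS coordinates as degrees, minutes and seconds with a hemisphere letter, while mapping sites generally expect signed decimal degrees. The arithmetic is simple enough to do yourself. The function name below is my own and the sample values are arbitrary.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus N/S/E/W into signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative in decimal notation.
    return -value if ref in ("S", "W") else value


# Arbitrary sample coordinates, not taken from any real picture.
latitude = dms_to_decimal(45, 30, 0, "N")
longitude = dms_to_decimal(73, 34, 12, "W")
print(latitude, longitude)
```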

Trolling RSS Feeds

RSS (Rich Site Summary) is a format for delivering regularly changing web content. Many news-related sites, blogs and other online publishers syndicate their content as an RSS Feed to whoever wants it.
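To see what a feed reader actually works with, here is a small sketch that pulls titles and links out of an RSS 2.0 document using Python's standard library. The sample feed is invented for illustration, and real feeds (Atom in particular) use different element names and namespaces that this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Invented sample feed; a real one would be downloaded from the publisher.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item>
      <title>First post</title>
      <link>https://example.com/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/second</link>
    </item>
  </channel>
</rss>"""


def feed_items(xml_text):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


for title, link in feed_items(SAMPLE_FEED):
    print(title, link)
```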

I have written quite a lot about RSS in the past. The following are my choices for both installation on a PC and for a web-based reader.

RSSOwl

RSSOwl is cross-platform because it is Java-based, so you must have Java installed wherever you run it. It handles the RSS, Atom and RDF feed formats. It cooperates with Firefox to add feeds from the browser: go to the feed, copy the URL, then click on add feed in RSSOwl and it knows where to find the feed. You can also drag and drop feeds from Firefox into RSSOwl. RSSOwl has an embedded web browser, so you don’t have to open a separate browser window to view links or the full version of shortened feed items. You do have to set this up under “Browser” in the Preferences menu option by choosing to default to the embedded browser. To get the RSSOwl embedded browser to work properly with OneNote so that it includes the URL in pasted items, you must enable JavaScript. I do not recommend doing this except on an isolated machine; otherwise, malicious JavaScript code could cause serious problems.

RssBandit

When I need to collect video and podcasts from RSS feeds, I turn to RssBandit. The embedded browser is MS Internet Explorer; therefore, it includes the pertinent URL when you copy to OneNote, as the embedded browser is the same.

This is my favorite RSS reader overall, though I have experienced occasional problems with exporting feeds for another implementation of the reader. This problem seems to stem from differences in the underlying OS on the importing computer. It can be an irritation when starting a project with tight deadlines.

RSSOwl has an edge for a group of researchers working in a collaborative environment, as it is easier to set up and distribute to the group.

Web-based RSS Reader

The two most popular seem to be Feedly and Inoreader, which offer similar features and options.

Inoreader offers secure HTTPS access and over 40 different customization options. If I must use a web-based reader this is the one.

I refuse to use Feedly because extensions like NoScript, Adblock, HTTPS Everywhere, etc. prevent the site from loading. I never use sites infested with stuff that my normal suite of extensions prevents from loading. You only have to encounter one ad with malicious code to cost you many hours of work to purge the problem code from your machine.

Incognito Searching

Your search and browsing behaviour allows Google to personalise your search results. To escape this filtering of your results, use a private browser window, which Chrome calls an incognito window. The private window does not send your existing tracking and search cookies, so Google stops personalising your results. To get a private or incognito window, use the following key combinations:

  • Chrome – Ctrl+Shift+N
  • Firefox – Ctrl+Shift+P
  • Internet Explorer – Ctrl+Shift+P

I have found that this approach doesn’t work with Bing.