I see many courses for Private Investigators (PIs) about using the Internet for Open Source Intelligence (OSINT). These courses are predominately about Internet sites that might yield useful information. These courses don’t teach how to process and analyse the captured data or how to properly report what was found. The OSINT concept usually misses the “intelligence” part, and it’s more about gathering raw information, not the production of intelligence.
As an example, I just captured a FB account with about 1000 posts, thousands of friends and pictures, along with about 20 videos. How would anyone search through all of this and link it to relevant people, places, things, or companies? Even if the PI can identify some useful linkages and other data, how does he report it in a timely and cost-effective manner? All these courses conveniently omit the fact that a senior decision-maker needs an accurate and concise report that illustrates the linkages between relevant data.
Unfortunately, many of the course providers don’t create investigative or intelligence product; they only teach courses about Internet sources.
According to Justin Seitz, the creator of Hunchly, a Chrome browser extension for collecting OSINT material from the Internet, “the greatest limiting factor of the OSINT concept is budgets that don’t recognise the time, resources, and training needed to complete the research, or the complexity of creating a true intelligence product. The budget provided to the PI leaves no choice but to simply provide screenshots and captured raw data to clients who don’t want to pay the premium required to deconstruct a network, or to chase-down the best breadcrumbs.” In the information industry, we call this ‘rip & ship’. Nobody expects other professionals to work like this.
In a recent discussion, Mark Northwood of Northwood & Associates, a large Canadian private investigation company, summed up the problem neatly: “If a client retains a lawyer and the lawyer researches case law in order to determine which are the best methods to advance or defend a claim, does the client simply say ‘give me the case law, I will interpret it’? No. The lawyer gives the client his opinion and supports it with the case law. Clients pay lawyers substantial fees for their analysis of case law, not the collected case law itself.” Clients need the PI doing OSINT to work in the same manner.
Northwood believes that PIs need to educate their clients to understand that someone needs to analyse the raw OSINT data, and that the only person who can do that is the PI, because he collected the raw data and has it immediately at hand. The PI is in the best position to collate, analyse, and report on the data he has collected.
Chicken or Egg
As I see it, this is a chicken or egg problem.
Without reasonable budgets on offer, clients won’t find PIs with the programming experience necessary to mine the collected data. Nor will clients find PIs experienced with the complex and expensive software to collect and report on the data in the first place.
Clients cannot find PIs to conduct OSINT and create actionable reports because there is no profit in it for the PI. No PI is going to acquire such skills if there is no profit in doing so. Without the prospect of reasonable wages, people with the above skills won’t become PIs; nor will people with the training in the logic, rhetoric, and argumentation needed to produce actionable reports. Existing PIs won’t be motivated to learn these skills without the prospect of financial benefit.
If the PI consistently has appropriate budgets to work within, then he will have or acquire proper tools and skills needed to collect, analyse, and then report on the significance of the collected data. Proper budgets also permit the PI to develop a viable reporting protocol for the type of data he collects. Proper budgets preserve the integrity of the collected data and allow for the creation of intelligence reports that include proper citations.
This chicken definitely grows from the budget egg. A large Canadian PI firm is currently advertising for someone to conduct ‘social media investigations’ at a pay rate of $15 per hour. One can only imagine the nature of the client’s expectations and the type of work produced for so little pay.
Today, any intelligence or investigative product requires a fusion of many types and sources of data. A complete report usually needs surveillance observations, content from interviews, public records, and government documents.
Again, the budget to collect and analyse public records and government documents creates the skills and knowledge needed to perform this task. This fusion of data sources allows the PI to establish relevant links between the people, places, things, and companies of interest to the client.
OSINT Tools & Skills
If the budgets come to truly represent a desire for a better product, then the following will be the tools and skills your PI should possess in the realm of OSINT. This is the rocket science behind real OSINT.
Hunchly is a Google Chrome extension that tracks and captures every page that you view during an investigation. This saves you from having to stop and take screenshots or from having to create handwritten logs of every URL that you have visited. It includes the ability to track names, phone numbers and other pieces of information. Hunchly builds a data rich case file from all of your investigative steps that helps you to preserve evidence.
Hunchly permits the use of “selectors,” such as a name, address, or phone number that save you from manually searching each page or the collected data for the terms. In my opinion, this feature alone is worth the purchase price. The other useful features include:
- the ability to add notes to what you find
- the ability to download notes as a Word document
- all collected data is stored, tracked and accessed on your local machine–no security or privacy concerns about cloud use
If your research requires graphing of the relationships between people, places, things, and companies, then CaseFile provides that at a much lower cost than other solutions, provided the dataset is small enough to be managed manually. At present, this is the case for most of the PI’s work.
Maltego is the favoured software of many intelligence analysts, researchers, and investigators for searching and linking OSINT data. While it helps search through mountains of publicly available information on the Internet and sort it in useful ways, it has many limitations.
If you need to search FB by email address, Instagram by photo GPS, search people in social media sites, or search LinkedIn by company or college, then this is the tool to use. However, some of these capabilities can cost $1000 per year on top of the Maltego yearly fees. Less costly alternatives exist.
Given its current state of development, I am not certain that Maltego warrants its cost for the PI. Most of the search capabilities of Maltego are in ‘transforms’, which are Python scripts that access a search site’s API.
The search functions of the most used ‘transforms’ can be created in Python for a lower cost. The graphing component of Maltego is available in CaseFile. Using Hunchly, CaseFile, Python scripts, Word, and PowerPoint together should produce an acceptable product if the collected data is properly summarised and then analysed.
Python is a programming language best described as a language used to create scripts that execute specific tasks, such as searching for a specific word in a sea of text.
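As a rough illustration of the kind of task described above, the following sketch searches a block of text for one word and returns each hit with some surrounding context. The function name, the sample text, and the 30-character context window are assumptions made for this example only.

```python
import re

def find_term(text, term, context=30):
    """Return (position, snippet) pairs for each whole-word match of term."""
    # \b keeps "doe" from matching inside longer words such as "doesn't".
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    hits = []
    for match in pattern.finditer(text):
        start = max(match.start() - context, 0)
        end = min(match.end() + context, len(text))
        hits.append((match.start(), text[start:end]))
    return hits

# Invented sample text, for illustration only.
sample = "Met John Doe at the marina. Later, Doe posted a photo of the boat."
for position, snippet in find_term(sample, "doe"):
    print(position, snippet)
```

The search ignores case and matches whole words only, which is usually what an investigator wants when looking for a surname in captured posts.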
Python automates time-consuming tasks. It allows you to parse raw data untouched by other tools and read information from databases. It aids in the generation of reports and moves files into folder structures based on their content type. From the PI’s perspective, Hunchly can handle these tasks.
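A minimal sketch of the file-sorting task mentioned above might look like the following. It guesses each file's type from its file extension (it does not inspect the file contents) and moves the file into a folder named after the broad type, such as image, video, or text. The function name and folder layout are assumptions for illustration, and a script like this is best run on a working copy of the captured material.

```python
import mimetypes
import shutil
from pathlib import Path

def sort_by_type(source_dir, dest_dir):
    """Move each file in source_dir into dest_dir/<broad type>/ and report what went where."""
    source, dest = Path(source_dir), Path(dest_dir)
    moved = {}
    for item in sorted(source.iterdir()):
        if not item.is_file():
            continue  # skip sub-folders
        # guess_type works from the file extension, e.g. "photo.jpg" -> "image/jpeg".
        guessed, _ = mimetypes.guess_type(item.name)
        folder = guessed.split("/")[0] if guessed else "unknown"
        target_dir = dest / folder
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(item), str(target_dir / item.name))
        moved[item.name] = folder
    return moved

# Example call (hypothetical paths):
# sort_by_type("capture_working_copy", "sorted_capture")
```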
Python scripts may also provide access to a search site’s API. A page of scripts enables searching a site for search terms in a variety of ways. In practice, this is the PI’s favored use of Python.
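To give a sense of what such a script involves, here is a minimal sketch that builds a search request and decodes the reply. The endpoint `https://api.example.com/search` and the parameter names `q` and `location` are hypothetical. A real search site publishes its own URL, parameters, and authentication requirements, and this sketch assumes the site replies in JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint, for illustration only.
API_URL = "https://api.example.com/search"

def build_query_url(term, **filters):
    """Return the request URL for one search term plus optional filters."""
    params = {"q": term}
    params.update(filters)
    return API_URL + "?" + urlencode(params)

def run_search(term, **filters):
    """Send the request and decode the JSON reply (assumes the site returns JSON)."""
    with urlopen(build_query_url(term, **filters), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))

# Only the URL is built here; no request is sent.
print(build_query_url("John Doe", location="Toronto"))
```

Keeping the URL-building step separate from the request makes it easy to log exactly what was searched, which helps later when the report has to describe sources and methods.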
The High-end Tools
When the volume of collected data increases, so does its lack of organisation for investigative purposes. This fact has spawned many products designed to search and retrieve text strings in masses of data. This is usually called “free text retrieval” (FTR) software. The following are the current leaders in utility for investigative purposes.
The dtSearch product line enables searches of terabytes of text across a desktop, network, Internet or Intranet site.
In the near future, PIs may resort to high-end tools like the Nuix suite to find connections in vast seas of data like the Panama Papers dataset. Nuix is FTR software that enables searching through huge volumes of unsorted data for people, places, things, and companies. It also allows users to display connections between all these entities, along with timelines, in a manner similar to Maltego and CaseFile.
For more than a decade, FTR software has been the province of well-funded intelligence agencies, law firms, and businesses. Journalism has discovered this due to the donation of Nuix to the Panama Papers project.
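The basic idea behind free text retrieval can be illustrated with a small inverted index, which maps every word to the documents that contain it so that searches do not have to re-read the whole collection. This is a toy sketch with invented document names and text. It is not a description of how dtSearch or Nuix work internally.

```python
import re
from collections import defaultdict

def build_index(documents):
    """Map each lower-cased word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index

def search(index, query):
    """Return the names of documents that contain every word in the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Invented sample documents, for illustration only.
docs = {
    "memo.txt": "Acme Holdings transferred funds to a numbered company.",
    "email.txt": "Meeting with Acme staff moved to Friday.",
    "invoice.txt": "Invoice issued by Northern Holdings.",
}
index = build_index(docs)
print(sorted(search(index, "acme holdings")))
```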
Social Media Monitoring
Products like XI Social Discovery, Geofeedia, Dataminr, Dunami, and SocioSpyder, to name a few, are being purchased by Fortune 500 companies and governments to manage social media research. Such products are now becoming necessary for the successful private investigator.
In broad strokes, the PI’s report creation process should look like the following:
- The PI will assemble or collate all of the collected information from all the tools used and examine links or shared information such as URLs, email addresses, etc. From this collated material, a summary begins to take shape.
- The investigator ensures that each piece of crucial information is put into its own section within the logical order of the summary; visuals (screenshots, text captures, tagged photos) are included as much as possible.
- Relationship graphs exported from CaseFile or Maltego should be included in the report if they fit the page. If not, screen clips may be used or PowerPoint slides can be imported.
- From the summary rises the true analysis of how the data relates to or affects the client’s objectives.
- The report must describe the sources and methods used and describe all investigative activities. This is crucial when little information is uncovered about a subject. This level of detail is not included in the summary.
- Evidence (captured images, videos, etc.) remains in a separate file from the report.
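The collation step in the first bullet above can be partly automated. The sketch below scans text captured from several sources and lists the email addresses and URLs that appear in more than one of them. The source names and sample text are invented, and the regular expressions are deliberately simple assumptions that will miss some valid addresses.

```python
import re
from collections import defaultdict

# Simplified patterns, for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
URL = re.compile(r"https?://[^\s\"'<>]+")

def shared_identifiers(records):
    """records maps a source name to its captured text.

    Returns the identifiers found in two or more sources, each with its sources.
    """
    seen = defaultdict(set)
    for source, text in records.items():
        for identifier in EMAIL.findall(text) + URL.findall(text):
            # Lower-casing and trimming trailing punctuation is a simplification;
            # URL paths can be case-sensitive.
            seen[identifier.lower().rstrip(".,;")].add(source)
    return {ident: sorted(sources) for ident, sources in seen.items() if len(sources) > 1}

# Invented sample captures, for illustration only.
records = {
    "facebook": "Contact: jdoe@example.com or see https://example.com/boat.",
    "linkedin": "Email JDoe@Example.com for details.",
    "forum": "Nothing here.",
}
print(shared_identifiers(records))
```

The output is only a starting point for the summary. Deciding whether a shared identifier is actually significant remains the investigator's analysis.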
In conclusion, as with all new products, the price will drop and quality will improve as PIs adopt the necessary programming skills and software in an increasingly competitive market. Of course, this will not happen if clients are not willing to provide reasonable OSINT budgets today.
An application program interface (API) specifies how software components should interact, i.e. a search interface.