Google-Free Wednesday — Similar Pages & Link Searches

Lately, Google has begun eliminating as much search functionality as it can.  One of its recent efforts is the revamped advanced search page.  If you don’t think so, then just try to find the advanced search page on your own. I dare you.  Did you find it?

Evidently Google thinks you aren’t smart enough to use such advanced stuff.  If you really want to find the advanced search page, you have to start your search first and then go all the way to the bottom of the SERP (search engine results page), where you will find a link to advanced search.

Under the guise of “people don’t use it”, the similar pages and links to a specific page (backlinks) options have been removed.  Now why would anybody want those nasty things anyway?

Similar Pages

Similar pages now have to be searched using SimilarPages.com and WhoIsLike.it.  This type of search is important to the expert searcher to develop search syntax and to find other players in a given market. The Google search syntax related:www.ConfidentialResource.com is most often a poor substitute for the above search engines.

Backlinks

To find the sites linking to a particular page you have to do it in the main search box using the Google search syntax,  link:ConfidentialResource.com.  Google’s link command isn’t very useful because Google collects so few backlinks. Bing is no help with backlinks. Yahoo closed its Site Explorer some time ago.  It might seem like searching backlinks is now limited to the scant Google results or nothing if you don’t have SEO tools on hand. Fortunately, that is not true.
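As a rough sketch, the two Google operators mentioned so far, related: and link:, can be turned into search URLs with a few lines of Python. The helper name google_operator_url and the https://www.google.com/search endpoint are assumptions made for illustration; only the operator syntax itself comes from the text above.

```python
from urllib.parse import urlencode

# Assumed search endpoint; only the operator syntax comes from the article.
GOOGLE_SEARCH = "https://www.google.com/search"


def google_operator_url(operator, target):
    """Build a search URL for an operator query such as link:example.com."""
    if operator not in ("related", "link"):
        raise ValueError("unsupported operator: " + operator)
    query = operator + ":" + target
    # urlencode percent-encodes the colon, which is fine inside a query string.
    return GOOGLE_SEARCH + "?" + urlencode({"q": query})


print(google_operator_url("related", "www.ConfidentialResource.com"))
print(google_operator_url("link", "ConfidentialResource.com"))
```

Typing the operator straight into the search box does the same job; the helper only saves retyping when you check many domains.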

Blekko Backlinks to the Rescue

Blekko is an excellent alternative for finding backlinks.

The search syntax is to use their slashtags /links or /domainlinks with a URL or domain name. The /links slashtag will find pages that link to a particular page whereas the /domainlinks slashtag finds all inbound links to a particular site.

The second route is via your search results. At the end of each search result is a downwards pointing arrow labelled SEO. Click on this and select links from the pop-up box. This creates a /links search syntax for the page given in the search result.

ReverseInternet.com Backlinks

We have also used ReverseInternet.com successfully. Search by the domain name, then select [backlinks] next to the domain name in the resulting table.  At the top right of the backlinks table, select External Only: On to get the external backlinks.

 

Stealth Search for Google-free Wednesday

Stealth Search Engine

When I first looked at this search engine on 29 Oct 11, its ‘about’ and ‘privacy policy’ pages looked suspiciously like what was on another search engine’s ‘about’ pages. Worst of all, it didn’t find any results when I searched for my name.  That was in the first days of November 2011. Today this thing is working much better and the about pages have been rewritten, though they are still confusing in places. However, I am not sure I would trust the results or the privacy features yet.

Given the scale of the improvements I have seen in less than one month, this is a search engine I will keep tabs on. For example, in their @UseStealth Twitter feed they say, “we don’t pass info through http refferer”. If this is true, then this will become one of my search tools.  The news search returned good results from an interesting assortment of sources during my tests today. The video search only seems to search Google and YouTube, and the image searches return poor results compared to other, larger search engines.

 

DuckDuckGo

Google-Free Wednesday

Our Google-Free Wednesdays create familiarity with the new, specialised, and often more relevant search engines.  It’s been a while since I have come across a new and worthy candidate for this honor. Today, the honor goes to DuckDuckGo (DDG).

DuckDuckGo

I like this search engine because it eliminates a lot of the spam sites that have twisted and manipulated the Google results lately.  I have previously written about encrypted search engines like Scroogle Scraper and the Encrypted Google search.

DDG goes further to protect your privacy. If properly set up, DDG (Redirect setting) doesn’t send your search terms in the HTTP referrer header to the sites you click on. Your search terms may reveal your interest to the sites you visit, and this may compromise an investigation.  It also uses a version of the HTTPS Everywhere Firefox add-on for its secure site connection. However, to ensure your first search is secure, you may have to first enter a “dummy” search to get to the HTTPS version.

DuckDuckGo also operates a Tor exit enclave, which means you can get end to end anonymous and encrypted searching by using Tor & DDG together. That means if you’re on Tor, and you access DDG, you’ll likely exit through the DDG relay and get service much faster. Tor can be slow, but this should speed it up a bit if you’re searching using DDG. Only DDG traffic exits from the DDG relay.

The lack of persistent settings requires the use of URL settings like this: “http://duckduckgo.com/?kh=1&kn=1&kp=-1”. Once you are at the properly set up DDG homepage, drag the URL to the bookmarks toolbar.  Use the bookmark to launch DDG with your settings. When you click on the bookmark you will find that you are at the normal HTTP homepage. Enter a dummy search to be certain all your searches are encrypted (HTTPS) and not leaking data to the sites you visit through the referrer header.
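If you keep more than one set of URL settings, a small Python helper can assemble the bookmark URL for you. This is only a sketch: the function name ddg_settings_url is made up for illustration, and the three parameters are simply the ones from the example URL above.

```python
from urllib.parse import urlencode


def ddg_settings_url(settings, base="http://duckduckgo.com/"):
    """Append URL settings (a dict of parameter names and values) to the DDG homepage URL."""
    return base + "?" + urlencode(settings)


# The same three settings used in the example URL in the article.
bookmark = ddg_settings_url({"kh": 1, "kn": 1, "kp": -1})
print(bookmark)
```

Paste the printed URL into a new bookmark and use that bookmark to launch DDG.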

Ixquick for Google-free Wednesday

The Ixquick search engine results appear normal, but underneath each link description a Proxy link appears. Clicking it gets the website through an anonymous proxy. The page will load slower when viewed through the proxy, but if privacy is important, then you probably won’t mind the wait.

The search results aren’t as good as you would get from the large search engines, but the proxy thing is quick, handy, and just simply cool. The problem I see is that it only displays an artificially small set of results for your search. For example, 64 unique results selected from at least 1,121,619,121 matching results for “intel”. You only get 64 hits — nothing more.

Finding Slides

SlideFinder.net offers a search engine powered by Slide Executive, a PowerPoint software and tools company.

Searching “McEachin” in Google I get 37 hits. Doing the same search in SlideFinder, I get one hit. In the Google results, the SlideFinder result appears third from the bottom with a different file name than found by SlideFinder.

According to the SlideFinder blog, they concentrate on indexing presentations from university websites as these “will often contain high quality content.” The blog is worth following if you regularly search for PowerPoint presentations.

This thing works very well for finding references to company names and Web sites. The person who prepared the presentation usually knows things that interest me, and it’s usually easy to find the person who made the PowerPoint file. I write out my questions, make a telephone call, get answers, write the report, and move on to the next job.

Google-Free Wednesday

FindThatFile

Previously, I wrote about file searches using OSUN.ORG.

findthatfile.com provides a file search encompassing Web, FTP, Usenet, Metalink, and P2P resources (ed2k/emule). It covers 47 file types, 554+ file extensions, and over 167 file upload services. It also offers an alert service delivered by email.

However, not every item in the search database has every property you might be searching for. You therefore have to explore the different ways to search for the file in the advanced search screen.

In my experience, this is not a good search engine to use to search by a person’s name or a company name. The files are not well indexed in this fashion.  One must also be careful to select the “All Files” button in the “Adult Filter” to be sure all the files found appear in the search results.

I usually search by a file name for other versions of a file that I already know about. In some cases, findthatfile.com will give me an understanding of how widely circulated a file may be, or turn up different versions of the same file.

Avoiding Google’s Own Censors

Better off with Bing

This excellent article by Lawrence Solomon illustrates why a researcher or investigator must use more than one search engine.

Googlegate: The search engine may be standing up to Chinese censors. What about Google’s own censors? 

Search for “Googlegate” on Google and you’ll get a paltry result (my result yesterday was 29,300). Search for “Googlegate” on Bing, Microsoft’s search engine competitor, and the result numbers an eye-popping 72.4 million. If you’re a regular Google user, as opposed to a Bing user, you might not even know that “Googlegate” has been a hot topic for years in the blogosphere — that’s the power that comes of being able to control information.

… Google began to minimize the Climategate scandal by hiding Climategate pages from its users.

Bing, in contrast, didn’t make climategate pages disappear. As you’d expect from a search engine that wasn’t manipulating data, search results on Bing climbed steadily until they peaked at around 51 million…

Document Hunting on Google-free Wednesday

Searching for specific terms in indexed documents on the Web is something many searchers fail to do. It is amazing what you can find when you go looking for it. I’ve written about searching by file type before. Now I have found a search engine for .pdf, .doc, and .ppt files.

OSUN.ORG

OSUN.ORG provides a simple interface for searching PDF documents, MSWord documents, and PowerPoint files. The large search engines allow one to search more file types, and with OSUN.ORG you must search one file type at a time, as you do in Google. I don’t know what database this search engine uses, but it doesn’t compare very well with Google. A search for my name in PDF files gives 52 results in Google and only 9 in OSUN.ORG. This is not a good performance.

Sometimes it’s really hard to find an alternative to the big three search engines.

The First Google-Free Wednesday of 2010

DevilFinder

According to the site, DevilFinder began as a project to display results from search engines like Google and Yahoo without setting cookies while presenting fewer pages of results.  It does not collect search data from users and no invasive cookies or JavaScript is used.

DevilFinder seems to rank the search results on the search term alone, rather than a combination of relevance and the popularity of the site. This is why relevant results from less popular sites may appear at the top. It might also be the reason the result set is so small. DevilFinder shows the results arranged 100 per page and I rarely get more than 2 pages.

The Image search works quite well. The images are much larger than those in other search engines. The Video search only returned hits from YouTube for any search I have done, which is not exactly useful. To be fair, the Video search seems to be a new feature. The News tab is just a crude collection of feeds that aren’t searchable.

Search Strategy

This has become a favorite choice for searching the names of people and companies. The results often provide more useful sites in the first page than Google and I don’t have to go to the last page of results to find out what wasn’t searched, as I do with Google.

For long, complex search statements, I still rely on Google, Bing, and Yahoo!, but for searching names and some other common short search statements, DevilFinder does an excellent job and sometimes a better job than the big guys.

Crisis Planning at Yahoo!

If you enclose your search terms in square brackets, then Yahoo! will only retrieve pages that have your search terms in that order. The search terms may be anywhere on the page, but the first term will appear before the second and the second before the third, etc.

An example would be [crisis planning]. It returns a document that is entitled Planning for a Crisis. It might seem backward, but at the end of the quotation near the top of the page you find “Crisis Management:Planning for the Inevitable (1986)”.
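The ordering rule described above can be modelled in a few lines of Python. This is only an illustration of the behaviour as I understand it, not how Yahoo! actually implements it; the function name terms_in_order is invented, and it matches plain substrings rather than whole words.

```python
def terms_in_order(text, terms):
    """Return True if every term occurs in the text, each one after the previous term."""
    haystack = text.lower()
    position = 0
    for term in terms:
        found = haystack.find(term.lower(), position)
        if found == -1:
            return False
        # Continue searching after the end of the term just matched.
        position = found + len(term)
    return True


title_only = "Planning for a Crisis"
page = title_only + " ... Crisis Management:Planning for the Inevitable (1986)"

print(terms_in_order(title_only, ["crisis", "planning"]))
print(terms_in_order(page, ["crisis", "planning"]))
```

The title on its own fails the test because “planning” never follows “crisis”; the full page passes because of the book title quoted further down.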

Rollyo

If you need to search up to 25 websites with a specific search statement, then Rollyo may be an alternative to Google Custom Search. Rollyo is powered by Yahoo! so it’s acceptable fare on Google-Free Wednesday.

You can create an unlimited number of search engines, called Searchrolls, on each topic. Each engine can be kept private or made public if you become a “member”. Each topic is limited to 25 URLs.

You name your Searchroll, put in the URLs, then select the named Searchroll on the search page when you enter your search statement.

Google-Free Wednesday Catches-on

Phil Bradley’s article on some of the virtues of Exalead is worth reading.

10 reasons why librarians should use Exalead

It’s a search engine for people who like to use search engines, and it’s an engine for librarians. If you’ve not used it, I’d strongly recommend giving it a whirl next ‘Google free Wednesday’.

Phonetic and Approximate Spellings

Exalead Phonetic

The phonetic search option will give results sounding like the word you have entered. This works well when you don’t know how to spell a word, or there is more than one spelling for your search word. This can be done from the advanced search page or using the operator soundslike:type_word_here or soundslike:(type_word_here).

Exalead Approximate Spelling

This also works well when you don’t know how to spell a word for which you wish to search. This can be done from the advanced search page or with the operator, spellslike:type_word_here.
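When I am unsure of a spelling, I want every variant of the query ready to paste. A minimal Python sketch, using only the operator forms given above (the function name exalead_fuzzy_queries is invented for illustration):

```python
def exalead_fuzzy_queries(word):
    """Return the phonetic and approximate-spelling query forms for one word."""
    return [
        "soundslike:" + word,
        "soundslike:(" + word + ")",
        "spellslike:" + word,
    ]


for query in exalead_fuzzy_queries("McEachin"):
    print(query)
```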

Use Google & Exalead Together

Use the Google Synonym search along with both the Exalead phonetic and approximate spelling searches to find the unknown spellings, similar spellings, incorrect spellings, and synonyms.

Remember, in normal searches, Google also identifies alternate words through stemming, abbreviations, combined and split words, and words with or without accents.

Google-Free Wednesday

Low Profile Search Engine

Ixquick.com professes to delete its users’ search data (including IP addresses) within 48 hours. Furthermore, Ixquick does not set any uniquely identifying cookies or share your privacy details with 3rd parties.

Meta-Search

Ixquick has the normal syntax options available in most of the large search engines. As a meta-search it searches:

  • All the Web
  • Ask/Teoma
  • CNN Search
  • EntireWeb
  • Exalead
  • Gigablast
  • MSN
  • NBC
  • Open Directory
  • Qkport
  • Wikipedia
  • Winzy
  • Yahoo

I didn’t get results as good for images and telephone numbers with Ixquick as I get in other search engines, but the meta-search works quite well. However, you don’t know how many results are chosen from each search engine to make up the results you see in Ixquick, though this is true of other meta-search sites too.

Forum Search

Google-Free Wednesday

Twing purportedly offers the ability to search many forums more thoroughly than traditional search engines. Forums offer a soap-box to both the worst and best denizens of the Internet.

I won’t be replacing the large search engines with Twing for searching forum posts, but Twing found many items that the large search engines missed or placed extremely low in the search results. However, it also failed to find some large forums.