Archive for the 'Search Engines' Category

Finding Inbound Links

Evaluating a web site or blog is never easy. Fact checking will weed out the crap, but who needs to start with a lot of crap? The number of links to a site will supposedly push it towards the top of the search results, but that isn’t a guarantee of accuracy if the inbound links come from sites full of crap.

When I see something worth citing, I begin the evaluation by seeing who links to the site. Perhaps the links come from other sites already proven reliable through fact checking. This may also lead you to more or better data.

GooFresh

Google offers date-based searching, but you can only reach it through the advanced search page, which limits your time options, or through the daterange: operator, which uses Julian dates and is a bit difficult to use.

GooFresh is a way to search for sites added today, yesterday, within the last seven days, or within the last 30 days.
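
If you would rather build the daterange: query by hand, the awkward part is the Julian date. Here is a minimal Python sketch of the conversion. The function names are my own, and I am assuming the operator takes two whole Julian day numbers joined by a hyphen.

```python
from datetime import date

# Offset between Python's date ordinal (0001-01-01 is day 1)
# and the Julian Day Number.
JDN_OFFSET = 1721425


def julian_day(d: date) -> int:
    """Return the Julian Day Number for a calendar date."""
    return d.toordinal() + JDN_OFFSET


def daterange_query(terms: str, start: date, end: date) -> str:
    """Append a daterange: restriction to a search statement.

    Assumption: the operator is written daterange:START-END,
    with both values as whole Julian day numbers.
    """
    if end < start:
        raise ValueError("end date must not be before start date")
    return f"{terms} daterange:{julian_day(start)}-{julian_day(end)}"


if __name__ == "__main__":
    # 1 January 2000 is Julian day 2451545.
    print(daterange_query("crisis planning", date(2000, 1, 1), date(2000, 1, 31)))
```

Paste the printed string into the search box. Whether the operator is still honoured is something to check for yourself.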

Crisis Planning at Yahoo!

If you enclose your search terms in square brackets, then Yahoo! will only retrieve pages that have your search terms in that order. The search terms may be anywhere on the page, but the first term will appear before the second, the second before the third, etc.

An example would be [crisis planning]. It returns a document entitled Planning for a Crisis. That might seem backward, but at the end of the quotation near the top of the page you find “Crisis Management: Planning for the Inevitable (1986)”.

Unsecured Web Cam Search

This Google search should find any unsecured web cams: inurl:"ViewerFrame?Mode=".

However, you can take this further by adding the site: operator. You could even set up a Custom Google Search Engine for a group of domains.
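
Short of a custom search engine, you can generate one site: query per domain. This is a rough Python sketch. The domains are placeholders, the function names are mine, and the search URL pattern is an assumption about how Google accepts a query in the q parameter.

```python
from urllib.parse import urlencode

# The search statement from the post above.
BASE_QUERY = 'inurl:"ViewerFrame?Mode="'


def site_queries(base: str, domains: list[str]) -> list[str]:
    """Return one query per domain, each restricted with site:."""
    return [f"{base} site:{domain}" for domain in domains]


def search_url(query: str) -> str:
    """Build a search URL for a query (assumed URL pattern)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Placeholder domains; substitute the group you are checking.
    for q in site_queries(BASE_QUERY, ["example.com", "example.org"]):
        print(q)
        print(search_url(q))
```

The same helper works for any search statement, not just the web cam one.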

Finding Corporate Owned Internet Domains

I was recently asked how to find the domains owned by a particular company. Here is what I unearthed on this topic.

Whois

You can still search RIPE (Regional Internet Registry for Europe), which contains registrations for most of the European countries. The US server at InterNIC no longer allows this.

Databases

The Domain Names database on Dialog includes information on registered domain names with Top Level Domains (TLDs) of COM, NET, ORG, BIZ, and INFO, as well as those that use country codes, e.g., AT. It can be searched by owner name. This database was last updated in September 2004.

DomainTools offers a rather expensive solution which is like an updated version of the Domain Names database. According to their web site:

…currently indexes all domains in the .COM, .NET, .ORG, .INFO, .BIZ, and .US TLDs. That is 103,042,578 domains as of today. In addition to indexing every active domain, it also knows about the 334,835,604 inactive domains that have been registered and deleted since the early days of the Internet. Great names are deleted daily so it is important that we keep track of them.

The partial word searching ability of Name Intelligence is unmatched by any other engine. We allow more options and faster results then anything else on the market and we continue to add new functionality monthly. In an information world a company can only focus on so many problems at one time. We dedicate our time to making domain searching faster and more efficient so our partners can dedicate their time to their own core technologies.

Every month Name Intelligence actively probes every domain name in its search engine to figure out the domain’s status. Our search results not only reflect active and deleted domains but domains with websites or not. We have taken searching for domains very seriously.

… DomainTools has leveraged the power of its Registrant Search engine to provide notifications whenever a person or company registers a new domain, has one transfered to them, or transfers a domain out of their control.

They report on two Richard McEachin names for $57. When I search on Scarborough, the city for the registrant address, it finds two records for one domain for $57. When I add Canada as a limiter, they say they have no reports.

Searching the name McEachin returned 248 records in 147 domains and a report cost of $147. When I add Canada as a limiter, they again say they have no reports. When I search on Scarborough, the city for the registrant address, it finds four records for one domain for $61.

Not exactly what I call a stellar performance.

Proxy Registrations

Many domain registrants are now concealed by registrars such as Domains by Proxy.

Security Scanner or Research Tool

Foundstone (a division of McAfee) recently released a free tool called SiteDigger. The tool uses the Google API to scan cached pages of a web site and then performs security checks on those cached pages. One of the things it will look for is open security webcams.

Rollyo

If you need to search up to 25 websites with a specific search statement, then Rollyo may be an alternative to Google Custom Search. Rollyo is powered by Yahoo! so it’s acceptable fare on Google-Free Wednesday.

You can create an unlimited number of search engines, called Searchrolls, one for each topic. Each engine can be kept private or made public if you become a “member”. Each topic is limited to 25 URLs.

You name your Searchroll, put in the URLs, then select the named Searchroll on the search page when you enter your search statement.

Custom Google Search Engine

Have you ever wondered how you could run the same search against a list of websites? Typically, the search would be something like search_statement site:your_list_of_sites.com. Google has the answer to this problem. (See our article about this search operator.)

Google Custom Search Engine may be set up to search one website or multiple websites with one search string. Of course, you need a Google Account to create the custom search.

Google-Free Wednesday Catches On

Phil Bradley’s article on some of the virtues of Exalead is worth reading.

10 reasons why librarians should use Exalead

It’s a search engine for people who like to use search engines, and it’s an engine for librarians. If you’ve not used it, I’d strongly recommend giving it a whirl next ‘Google free Wednesday’.

Phonetic and Approximate Spellings

Exalead Phonetic

The phonetic search option will give results sounding like the word you have entered. This works well when you don’t know how to spell a word, or there is more than one spelling for your search word. This can be done from the advanced search page or using the operator soundslike:type_word_here or soundslike:(type_word_here).

Exalead Approximate Spelling

This also works well when you don’t know how to spell a word for which you wish to search. This can be done from the advanced search page or with the operator, spellslike:type_word_here.

Use Google & Exalead Together

Use the Google Synonym search along with both the Exalead phonetic and approximate spelling searches to find the unknown spellings, similar spellings, incorrect spellings, and synonyms.
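
To run all three against the same word without retyping, something like the Python sketch below will do. It only assembles the operators as described in these posts. The function name and dictionary keys are my own labels, and whether each engine still supports its operator is for you to check.

```python
def spelling_queries(term: str) -> dict[str, str]:
    """Return the three variant-finding queries for a single word.

    The operators are taken from the posts: ~ for the Google synonym
    search, soundslike: and spellslike: for Exalead.
    """
    term = term.strip()
    if not term or " " in term:
        raise ValueError("expected a single word")
    return {
        "google_synonym": f"~{term}",
        "exalead_phonetic": f"soundslike:{term}",
        "exalead_approximate": f"spellslike:{term}",
    }


if __name__ == "__main__":
    for engine, query in spelling_queries("mceachin").items():
        print(f"{engine}: {query}")
```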

Remember, in normal searches, Google also identifies alternate words through stemming, abbreviations, combined and split words, and words with or without accents.

Confidential and Not for Distribution

You just had to ask how I found it. Now look, you may be the client but it’s a secret — their secret that is.

Do a Google search for “not for distribution” confidential and look at all the stuff you find. Now try “not for distribution” confidential site:microsoft.com and look at what you find. Now if you were to try this on the subject company’s web site….

Web-based documents are neither confidential nor private, but they might be secrets in plain sight.

Advanced Search Operators

Google Synonym search

If you want to search not only for your search term but also for its synonyms, place the tilde sign ("~") immediately in front of your search term. For example, ~fraud ~facts

Now you know why I find stuff you don’t.

Google Domain search

You can use Google to search only within one specific website by entering the search terms you’re looking for, followed by the word “site” and a colon followed by the domain name. For example: “service pack” site:microsoft.com

Never ask me how I found it again.

Google’s numeric range search (Numrange Search)

Numrange searches for results containing numbers in a given range. Just add two numbers, separated by two periods, with no spaces, into the search box along with your search terms. You can use Numrange to set ranges for everything from dates (Willie Mays 1950..1960) to weights (5000..10000 kg truck). But be sure to specify a unit of measurement or some other indicator of what the number range represents.
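
The two-periods, no-spaces rule is easy to get wrong when typing in a hurry. Here is a small Python sketch that builds the query for you. The function name and the position of the unit in the output are my own choices. Word order should not matter to the search.

```python
def numrange_query(terms: str, low: int, high: int, unit: str = "") -> str:
    """Build a Numrange query: terms, then low..high, then an optional unit."""
    if low > high:
        raise ValueError("low must not exceed high")
    query = f"{terms} {low}..{high}"  # two periods, no spaces
    if unit:
        query += f" {unit}"
    return query


if __name__ == "__main__":
    print(numrange_query("Willie Mays", 1950, 1960))
    print(numrange_query("truck", 5000, 10000, unit="kg"))
```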

Now you know how to identify stuff about someone’s dead father. Too bad it doesn’t work for Junior very often.

Clustering on Steroids

iSEEK is a clustering meta-search site. It covers Ask, Yahoo, Google, MSN, and its own database. I like its coverage. I’m not too keen on a particular navigation feature.

When you select one of the topics in the left-hand panel, you lose the hits in many other topics. You have to click on the topic again to regain the other hits.

Otherwise this is a very good meta-search with excellent clustering.

As it Happens on Google

As an investigator, you may have wished you had pictures of that horrible traffic accident you’re investigating. Have you ever thought to look in Google? That’s right, Google.

Ever since Google made satellite imagery available, everybody thinks it is only for finding a good restaurant. But when you think of it, the satellite takes pictures of what is happening when it flies over. For example, a truck crash (Google Earth coordinates 46.765669,-100.79274) outside of Bismarck, North Dakota.

An example of how far this can be taken is All Aircraft in Flight where more than 3300 images of planes in flight are identified. Or there is the Mirage fighter jet in a parking lot (Google Earth coordinates 48.825183,2.1985795) near Paris for those interested in the parking problems in France.
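
If you collect coordinates like these, you can save them as placemarks and open the file in Google Earth. Below is a minimal Python sketch that writes a KML document for the two locations above. The function names are mine. Note that KML lists longitude first, the reverse of the latitude,longitude order quoted in this post.

```python
from xml.sax.saxutils import escape

# (label, latitude, longitude) as quoted in the post.
PLACES = [
    ("Truck crash near Bismarck, North Dakota", 46.765669, -100.79274),
    ("Mirage fighter jet in a parking lot near Paris", 48.825183, 2.1985795),
]


def placemark(name: str, lat: float, lon: float) -> str:
    """Return one KML Placemark; coordinates are written lon,lat,altitude."""
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("latitude or longitude out of range")
    return (
        "  <Placemark>\n"
        f"    <name>{escape(name)}</name>\n"
        f"    <Point><coordinates>{lon},{lat},0</coordinates></Point>\n"
        "  </Placemark>"
    )


def kml_document(places) -> str:
    """Wrap the placemarks in a KML 2.2 document."""
    body = "\n".join(placemark(*p) for p in places)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "<Document>\n"
        f"{body}\n"
        "</Document>\n"
        "</kml>\n"
    )


if __name__ == "__main__":
    print(kml_document(PLACES))
```

Save the output with a .kml extension. The imagery you see depends on when the satellite passed over, so the truck and the jet may no longer be there.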

Check out the Google Earth Communities site for more interesting stuff about Google Earth and its imagery.

Google-Free Wednesday

Low Profile Search Engine

Ixquick.com professes to delete its users’ search data (including IP addresses) within 48 hours. Furthermore, Ixquick does not set any uniquely identifying cookies or share your personal details with third parties.

Meta-Search

Ixquick has the normal syntax options available in most of the large search engines. As a meta-search it searches:

  • All the Web
  • Ask/Teoma
  • CNN Search
  • EntireWeb
  • Exalead
  • Gigablast
  • MSN
  • NBC
  • Open Directory
  • Qkport
  • Wikipedia
  • Winzy
  • Yahoo

I didn’t get results as good as those from other search engines when I used Ixquick to search for images and telephone numbers, but the meta-search works quite well. You can’t tell how many results are drawn from each search engine to make up the results you see in Ixquick, but that is true of other meta-search sites as well.