Tag Archive for 'Search Engines'


Knowem

James Ruotolo at FraudPro found Knowem to be a good way to find what social sites have a particular user name. I’m going to add this to my list of ways for Finding Usernames.

Stealth Searching III

In a previous article on Stealth Searching I wrote:

You will not click on any links on the cached pages as these will go to live pages. You will not allow your browser to download any images on the cached pages, as they may be live images from the target domain. You will be STEALTHY. They won’t see you coming.

A reader suggested that this requires some further explanation.

Google Cache Risks

Google caches only the text of the Web page. When the Googlebot copies the first 101K of HTML to a Google server, external files such as JavaScript, Cascading Style Sheets, images, and Flash are not saved. When you view the cached copy, the HTML itself comes from Google, not from the live site. However, any images and other external files the page references load from the live site, not from the Google cache, and following any link on the cached page will connect you to the live Web site, if it still exists. Some pages in Google’s cache load the entire page from the original server thanks to a simple redirection script. If a cached page has no external files, then you will not show up in the site’s log by viewing Google’s cache; but how likely is that?
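One way to gauge how likely that is for a given page is to list the external files its HTML references before you let a browser render it. What follows is only a minimal sketch, assuming you have already obtained the cached page's HTML as a string by some safe means. The class name ExternalResourceFinder, the tag list, and the sample page are my own illustrations, not part of any tool mentioned here.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Tag/attribute pairs that make a browser fetch a file automatically.
# This list is an assumption and is not exhaustive.
AUTO_LOADED = {
    "img": "src",
    "script": "src",
    "link": "href",
    "iframe": "src",
    "embed": "src",
}


class ExternalResourceFinder(HTMLParser):
    """Collect URLs a browser would request just by rendering the page."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        wanted = AUTO_LOADED.get(tag)
        if wanted is None:
            return  # ordinary links (<a>) are only fetched if you click them
        for name, value in attrs:
            if name == wanted and value:
                self.resources.append((tag, value))


def live_hosts(html_text):
    """Return the host names that rendering the page would contact."""
    finder = ExternalResourceFinder()
    finder.feed(html_text)
    hosts = set()
    for _tag, url in finder.resources:
        host = urlparse(url).hostname
        if host:  # relative URLs carry no host and are skipped here
            hosts.add(host)
    return hosts


# Hypothetical cached page: one style sheet and one image on the target site.
sample = """
<html><head><link rel="stylesheet" href="http://target.example/site.css"></head>
<body><img src="http://target.example/logo.gif"><p>Cached text</p>
<a href="http://target.example/about.html">About</a></body></html>
"""
print(sorted(live_hosts(sample)))
```

Relative URLs are skipped in this sketch. Depending on how the cached copy is served they may still resolve to the live site, so treat the result as a lower bound rather than a guarantee.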

The Wayback Machine

The Wayback Machine changes the links of cached pages to allow navigation within the cached pages. However, there is always the chance that you will navigate yourself out to the original site. Remember, nothing is perfect and this stuff wasn’t designed with anonymity as its objective.
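Before following a link on an archived page, you can check whether it stays inside the archive by comparing its host with the archive's host. This is a rough sketch, assuming archived copies are served from web.archive.org. The function name leaves_archive and the sample URLs are hypothetical.

```python
from urllib.parse import urljoin, urlparse

ARCHIVE_HOST = "web.archive.org"  # assumption: host serving the archived copies


def leaves_archive(page_url, href):
    """Return True if following href from page_url would leave the archive."""
    target = urljoin(page_url, href)  # resolve relative links against the page
    return urlparse(target).hostname != ARCHIVE_HOST


# Hypothetical archived page and links found on it.
page = "http://web.archive.org/web/20090101000000/http://target.example/index.html"
for href in ["about.html", "http://target.example/contact.html"]:
    print(href, "leaves the archive" if leaves_archive(page, href) else "stays archived")
```

A link that the archive has rewritten keeps the archive's host, while one it missed points straight at the original site.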

The Dangers of TOR

Using TOR to explore the Google cache and The Wayback Machine seems to be the only option. However, Web history and geographic origin affect search results when you use TOR or similar methods.

TOR does require a certain level of technical knowledge and sophistication or it can backfire on you. For example, the SSLstrip attack that is now in the wild:

The attack is more than theoretical. Marlinspike tested the software on a public server he hosted for users of the Tor anonymous browsing network; he was, by his own account, able to grab passwords to 117 e-mail accounts, 16 credit cards numbers, seven Paypal logins and about 300 other logins to supposedly secure sites ranging from Gmail to Ticketmaster to Facebook.

If a TOR server is set up for the purpose of running SSLstrip, then you’re in trouble. The very nature of TOR makes it quite possible for a corrupt TOR server to reroute your data to the attacker, which is an ideal situation for the crook. To use TOR effectively, the proxy must be configured properly and the user must be very observant to prevent an attack via SSLstrip and similar threats.

Bing searching Facebook and Twitter

Microsoft to Data-Mine Facebook & Twitter


Microsoft has cut non-exclusive deals with both Facebook and Twitter for Bing to search their real-time data feeds. Google has followed suit at least with Twitter, but Facebook is the prize because it has like 40 million updates a day from its 300 million users. Not all Facebook updates will be searched by Bing, however, only the ones made available to the wider public. Facebook, where Microsoft has an equity stake, will apparently provide users with a number of new tools to do so. It is unclear how much Microsoft is paying. The Twitter integration is already in beta. The deals suggest that Twitter, which has raised $155 million in venture capital, will see its first revenue since ads will follow. Terms were not disclosed.

Microsoft’s stake in Facebook may give us some interesting tools for searching Facebook in the near future.

Internet Detective School 101

Google Alerts

We all know and love Google, but how many people use its best investigative features? Investigations aren’t done in one day, so why search Google on only one day?

The Google Alerts service is free and it allows you to create custom RSS feeds using Google search results, or you can receive the alerts by email. Thus, if you create focused searches using phrases, site qualifiers, etc. in Google, you can now have those results as an RSS feed.

Log in to your Google account, then use the advanced query options to construct your search. Select the Feed setting in the “Deliver to” column to activate your RSS feed. It’s that simple; there is no need to program a Google API. Alternatively, select email to have the results sent to you by email.

If you select email notification, your search can be set up to notify you as new data appears. You may select as-it-happens, daily, or weekly. Simply make the selection in the “How often” column. Of course, the RSS feed option doesn’t need to be told when to send you the results; it captures new data as it appears and publishes it in the feed.

To receive the feed you will have to wait until it is populated with some results. Once there are results in the feed, you may then click on the feed link for the Alert and copy the URL into your newsreader.  This takes about one day to occur in my experience.
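If you would rather process the feed yourself than read it in a newsreader, a few lines of Python will pull out each item. This is a minimal sketch, assuming the Alert feed is in Atom format. The function name read_entries and the sample feed are hypothetical; a real feed would be downloaded from the URL you copied from the Alert.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # the Atom XML namespace


def read_entries(feed_xml):
    """Return (title, link, updated) tuples for each entry in an Atom feed."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="")
        updated = entry.findtext(ATOM + "updated", default="")
        link_el = entry.find(ATOM + "link")
        link = link_el.get("href", "") if link_el is not None else ""
        entries.append((title, link, updated))
    return entries


# Hypothetical feed content standing in for a downloaded Alert feed.
sample_feed = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Alert for a subject name</title>
  <entry>
    <title>Example story</title>
    <link href="http://news.example/story1"/>
    <updated>2009-10-22T12:00:00Z</updated>
  </entry>
</feed>"""

for title, link, updated in read_entries(sample_feed):
    print(updated, title, link)
```

Saving the tuples to a file each day gives you a running log of when each result first appeared.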

Internet Detective School

Internet Tracking

Mantracker hunts people by following their spoor for a popular TV show.

On the Internet, investigators have to do the same thing. However, the digital spoor may be on a computer in Singapore while your prey is in Corner Brook, Newfoundland.

For this series of articles, the terms tracking, monitoring, and alerts all mean the same thing: methods of collecting new information as it appears in a variety of searches of many sources throughout the Internet. This is a systematic way of locating information about a subject as it becomes available. These are sources and methods that monitor news reports, social media, blogs, or other open sources of information relevant to your investigation. I will illustrate how to construct the search statement and get the results into your hands on an ongoing basis.

I will start with the large search engines and move on to the lesser-known sources and methods.

Google Search Options

Google’s “Search Options” was launched last May and provides several filters to narrow down your search results. On the results page, the “Show Options” link appears at the top of the search results. Click on it and you get a sidebar that looks like the one on Google News search.

The best option in all this is that you can now SORT RESULTS BY DATE instead of relevance. Other options that offer interesting results are the filters for formats such as video, forums, blogs, and reviews. The reviews filter is quite strange, but I have found a way to make it useful: use it when you are searching a person’s name and it will turn up results from a wide variety of publications and blogs. Searching for reviews about things seems quite useless, but using the filter when searching names, then sorting by date, makes it very useful.

Directory of Social Networks

I came across this interesting directory of social networks: http://www.social.com/Social-Networking/. This seems to have in excess of 500 listings for social network sites in something like 100 categories.

Many of the listed sites aren’t social sites like Facebook or MySpace. I wasn’t quite sure how I might use this, so I Googled it, and found an interesting use for it in ResearchBuzz.

According to ResearchBuzz, “Google Sets allows you to specify a couple of different things and get lists of additional similar things”, and I have been using it to help me build searches and find stuff for a while. Sometimes I wonder how some things get into the set list, but it is good to play around with the new toys from Google.

Tweeple at Work

These searches will help you to find people associated with a company or find a subject’s co-workers.

Start with Twitter’s Find People. Search for the company name. A long list of followers of the company Tweets might be very enlightening.

Search the Twitter profiles using Twellow by searching for the firm name, web site URL, or other relevant search terms. Sometimes former employees appear in the results and may prove to be useful interview subjects.

LinkedIn is one of the most used social networking sites. Use Google to search LinkedIn with a search term such as site:linkedin.com company name. Add twitter to the search string to find Twitter feeds. Do the same search using Bing and Yahoo. Then redo all the searches for Facebook and MySpace and any other social network site that might be useful.
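Repeating the same search across several sites and engines gets tedious, so it can help to generate the search URLs in one pass. The following is only a sketch. The search URL patterns and parameter names are assumptions you should verify before relying on them, the company name is made up, and putting the company name in quotes to search it as a phrase is my own choice.

```python
from urllib.parse import urlencode

# Assumed search URL patterns: (base URL, name of the query parameter).
ENGINES = {
    "Google": ("https://www.google.com/search", "q"),
    "Bing": ("https://www.bing.com/search", "q"),
    "Yahoo": ("https://search.yahoo.com/search", "p"),
}

SITES = ["linkedin.com", "facebook.com", "myspace.com"]


def build_query(site, company, extra="twitter"):
    """Build a query such as: site:linkedin.com "Acme Widgets" twitter"""
    return f'site:{site} "{company}" {extra}'.strip()


def search_urls(company):
    """Yield (engine, site, url) for every engine and site combination."""
    for site in SITES:
        query = build_query(site, company)
        for engine, (base, param) in ENGINES.items():
            yield engine, site, base + "?" + urlencode({param: query})


for engine, site, url in search_urls("Acme Widgets"):  # hypothetical company
    print(engine, site, url)
```

Add any other social network site to SITES and the loop covers it on all three engines.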

Use TweepSearch to search for someone’s Twitter name and then index the bios of all the users they are following or who are following them. Once you have them indexed, you can do a keyword search using relevant search terms. The results may lead you to the bios of additional members of the firm for which the subject works.

Real-time Search Engine

Collecta

Collecta claims to provide results in real-time from the Web. Your search results will appear in a constantly-reloading stream: everything from Twitter updates to news and blog articles, and even Flickr photos.

However, Twitter usually deluges the results. The “Search Options” to the left of the results allow you to select the type of updates you want to see. Leaving the Twitter updates unchecked makes it easier to see the other real-time search results.

Limitations

As with all metasearch engines, it is hard to create a search statement because you’re searching 140-character Tweets, full-text news, and blog entries all at once. I don’t use this as a starting point. However, it searches a wide variety of places, which makes it good for tracking breaking news.

Searching the Personal Ads

CraigsList Search Engine

AllofCraigs is another CraigsList search engine built on a Custom Google Search.

It also allows you to query all of Craigslist and other ad sites, with the results pulled from a custom Google search.

A Twitter stream tool allows you to see tweets that contain the word Craigslist. However, you also get Tweets that just mention the word Craigslist, not only ones with links to ads.

A search for the words toronto incall returns many, many hits in this fast-changing type of ad, while Search All Craig’s returns none. This might be useful for searching Craigslist for telephone numbers. My early searches for telephone numbers seem more successful using this than Craigslist itself.

Yauba

Yauba states:

 We do not keep any personally identifiable information.

Period.

Anonymity may be important for some people. However, for most, it’s search results that count, and this review clearly shows that this is a search engine with as yet undeveloped potential.

Chickipedia

I recently read a news article that mentioned Chickipedia.  I immediately began searching this site. I found porn stars, actresses, athletes, and many more.  If a local paper can find a drunk driver in this thing, maybe I could find the subject of an investigation. I searched using names, city names, and occupations. Every search returned valid results.  Too bad there are only 9,177 ladies profiled on the site. Too bad I didn’t find the subject of an investigation.

Microsoft Bing

What is Bing?

Bing is now the official Microsoft search engine. Don’t bother searching Google for information about this evolution of Live Search. Here is the stuff you need to understand and use Bing.

According to Microsoft CEO Steve Ballmer, Bing is a “decision engine.”

More than just a rebranding of Live Search, Microsoft is repositioning Bing as a “decision engine,” with a goal “to provide customers with intelligent search tools to help them simplify tasks and make more informed decisions,” according to a Microsoft spokesperson.

Bing’s “decision engine” will begin by focusing on four key vertical areas: making a purchase decision, planning a trip, researching a health condition or finding a local business.

Bing includes some advancements to Live Search’s core search, such as entity extraction and expansion, query intent recognition and document summarization technology. It also offers a new user experience model, which changes based on the query to offer more relevant decision-making tools.

Search Language

The search language seems to be the same as Live.com. The Bing Virtual Presskit outlines the features and search syntax quite well.

Feature Comparison

PCWorld has written a good article about the comparative merits of Google, Yahoo and Bing.

G Vs. B

If you want to compare the results you get with the same search term in Google and Bing, go to Google Versus Bing.

Reviews

The reviews seem to imply Bing is like Bullwinkle saying, “watch me pull a rabbit out of my hat,” while Rocky looks on, having seen this trick fail every time saying, “oh Bullwinkle …” But Bullwinkle insists, “this time for sure!”

Karen Blakeman and Phil Bradley both feel that Bing offers nothing innovative. However, I like xRank, which keeps track of notable people and puts them in order for you. This tends to be US-centric, but it seems to help with Canadians and people with a strong web presence. I suspect that this has improved with the launch of Bing.

The other thing I like about Bing is the video search. I like how the results are presented.

What’s Bing Good For?

As a researcher, I find Bing good for two things: searching for info about people and its video search.

I have always liked xRank and it seems to be a bit better for Canadians in Bing, or it could be the strong bias in the results depending on where you are located that makes it work better in Bing. Extras>Preferences allows you to select the location that creates this bias. (I have not been able to maintain the changes I make to the preferences from one session to another. This could be how I have the browser set up.)

When I search by a person’s name, I change the preferences to indicate the city where the person lives and I get different results than if I leave it set to where Bing thinks I am located. I also shut off the porn filter. Both of these changes will affect the results you see. In any case, this is the general purpose search engine to use for searching on a person’s name.

The video search allows you to see a preview of the videos before selecting any, which is a real time saver. This won’t replace blinkx or Samepoint, but it is quite functional.

FriendDeck

FriendFeed allows people to aggregate their activities across the social web. It is a great place to find what sorts of things people are talking about. In some ways, FriendFeed is better for “real-time” web searches than Twitter because a FriendFeed search will return not only Twitter posts but also shared RSS feeds, Facebook status updates, items posted natively in FriendFeed itself, stories being promoted on social news web sites like Digg.com, and much more. However, FriendFeed’s user population is smaller than Twitter’s and tends to consist of people who are more technology-focused, so the results will be somewhat skewed in that direction.

Although useful, searching FriendFeed today still leaves a lot to be desired. That’s where FriendDeck can help. After authenticating with your FriendFeed username and remote key, you can kick off searches from the box at the top of the FriendDeck window. Each search term will then display in its own column within FriendDeck. The end result is a web app that very much resembles TweetDeck’s desktop application, which also lets you display search terms in columns. However, unlike FriendDeck, TweetDeck additionally lets you organize your Twitter friends into groups in order to follow and track different sets of users along with your search queries.

FriendDeck is a web-based interface for searching the social web aggregation service, FriendFeed. It can also be downloaded and used as an Adobe AIR desktop application. FriendDeck isn’t based on TweetDeck. However, you can also search Twitter from inside FriendDeck. Use the command twitter: followed by the search term.

FriendDeck displays search results in columns, allowing you to track multiple search terms within the same window. As the individual items appear, you have the option of clicking “like” or commenting online on the postings.

What FriendDeck Won’t Do

Unfortunately, FriendDeck only allows monitoring of searches, not groups. Perhaps because FriendFeed already includes a “lists” feature, FriendDeck’s creator didn’t include the ability to simultaneously track different groups of people. That’s disappointing, since tracking lists (groups) on FriendFeed means having to constantly switch between them to see the latest news from each group.  I would like an application that tracks lists, rooms, and search terms.

What FriendDeck Can Do

That said, there are still a couple of tricks you can do with FriendDeck in order to see more than just traditional searches. You can also:

  • See a user’s likes – type in the query likes:{username}
  • See a user’s comments – type in the query comments:{username}
  • See a user’s friends – type in the query friends:{username}
  • A list of posts relating to a URL – type in the query url:{url.com}
  • A list of posts about a domain – type in the query domain:{domain}

Although those custom queries are certainly handy, I would like to see FriendDeck do more.

History & Geography Distort Search Engine Results

Web history and geographic origin affect search results

Google search results are based on your web history and geographic origin. If you want to see how this can distort the search results you get, then do a Google search using your normal ISP connection, then do the same search using TOR, then again with Xerobank. Each search will return different results.

Google isn’t the only search engine where this happens.
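To see the distortion in concrete terms, record the result URLs from each connection and compare the lists. This is a small sketch, assuming you have copied the result URLs by hand into lists. The function name compare_results and the sample URLs are hypothetical.

```python
def compare_results(results_by_connection):
    """Return (common, unique): URLs seen on every connection, and URLs seen on only one."""
    sets = {name: set(urls) for name, urls in results_by_connection.items()}
    common = set.intersection(*sets.values()) if sets else set()
    unique = {}
    for name, urls in sets.items():
        # Everything returned by any of the other connections.
        others = set().union(*(s for n, s in sets.items() if n != name))
        unique[name] = urls - others
    return common, unique


# Hypothetical result lists for the same query over three connections.
results = {
    "ISP": ["http://a.example", "http://b.example", "http://c.example"],
    "TOR": ["http://a.example", "http://c.example", "http://d.example"],
    "Xerobank": ["http://a.example", "http://e.example"],
}

common, unique = compare_results(results)
print("Seen everywhere:", sorted(common))
for name, urls in unique.items():
    print("Only via", name + ":", sorted(urls))
```

The URLs that appear on only one connection are the ones you would have missed by searching from a single location.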