Text Analytics - It’s Standard Practice

by Jeff Catlin, CEO, Lexalytics US
Wednesday, March 30, 2011

Text analytics is coming of age as a technology, and social networking has led to an explosion of unstructured data.

This explosion has encouraged the development of cutting-edge text analytics capabilities. Businesses are looking for faster and more efficient ways to dissect and translate conversations on a massive, global scale.

As with Full Text Search, technologies such as Entity Extraction and Document Level sentiment are becoming commodities, and customers expect to find them on any feature checklist. To maintain a technological edge, the sentiment industry is under a lot of pressure to come up with the next big thing.

Lately, more and more companies seem to be offering sentiment solutions to a variety of markets, everything from health care to customer service to financial services and reputation management. In spite of this, very few prospects seem to really understand what the technology will and won't do for them.

What does it mean to measure sentiment? How do I know if I really need to use it?

That depends entirely on the intentions of the user and the content being measured. If you're looking at customer review data (hotel reviews, for example), then you may be interested in the sentiment of each review for the hotel. Were people happy with their stay at this hotel? This would be an example of document sentiment. It tells you whether the overall review was good or bad, but offers little insight into the details of each review. In this case, processing large amounts of data about the same topic works well.
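As a rough illustration of document sentiment, the sketch below assigns a single score to a whole review by summing word weights. The lexicon, the weights, and the function names are hypothetical and chosen only for this example; a production engine would be far more sophisticated.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration): word -> polarity.
LEXICON = {
    "great": 1, "comfortable": 1, "friendly": 1, "clean": 1,
    "rude": -1, "dirty": -1, "noisy": -1, "terrible": -1,
}

def document_sentiment(text):
    """Return one score for the whole document: the sum of matched word weights."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)

def label(score):
    """Map a numeric score to a coarse good / bad / neutral label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

review = "The room was clean and the bed was comfortable, but the staff were rude."
score = document_sentiment(review)
print(score, label(score))
```

Note that the single score hides the complaint about the staff, which is exactly the limitation of document-level sentiment described above.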

If, however, you're reading a publication like Consumer Reports, then you're probably thinking more about how the different hotels stack up against one another. You'd like to do some comparison. In this case, the overall document sentiment wouldn't be of much help because the document will have some good and some bad content mixed within it. In fact, what the reader really cares about in this kind of content is the tone for each specific hotel that's being described in the document and the reasons why. Were the beds comfortable? How was the shower pressure? Were the staff friendly? In some cases the beds may have been comfortable but the staff rude, which can sway the sentiment of a review. Depending on what is important to you, you'd want to extract the sentiment of each entity. This is known as entity-level sentiment.
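One simple way to picture entity-level sentiment is a clause co-occurrence heuristic: split the text into clauses and credit each clause's tone to the entities named in it. The sketch below assumes a hypothetical lexicon and entity list and is not how any particular vendor's engine works.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicon and entity aliases (assumptions for illustration).
LEXICON = {"comfortable": 1, "friendly": 1, "great": 1,
           "rude": -1, "weak": -1, "dirty": -1}
ENTITIES = {"bed": "beds", "beds": "beds", "staff": "staff", "shower": "shower"}

def entity_sentiment(text):
    """Score each entity by the sentiment words that share a clause with it."""
    scores = defaultdict(int)
    # Split on punctuation and the contrastive "but" so that each clause
    # usually carries a single opinion.
    for clause in re.split(r"[.,;!?]|\bbut\b", text.lower()):
        words = re.findall(r"[a-z']+", clause)
        mentioned = {ENTITIES[w] for w in words if w in ENTITIES}
        polarity = sum(LEXICON.get(w, 0) for w in words)
        for entity in mentioned:
            scores[entity] += polarity
    return dict(scores)

text = ("The beds were comfortable but the staff were rude. "
        "The shower pressure was weak.")
print(entity_sentiment(text))
```

Here the beds come out positive while the staff and the shower come out negative, a distinction that a single document score would blur.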

What really matters in sentiment analysis? Is it the accuracy or the automation?

Again, it depends on your needs and goals for using sentiment analysis.

One example is financial services, where what interests users is the trend across a collection of stories. Such users care less about the accuracy of every document detail, and more about the sentiment across a corpus of data that needs to be processed quickly. The use of sentiment technology by the financial services industry is becoming more popular because the technology tends to perform better than humans in processing large collections of content.
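The corpus-level view can be sketched as a simple aggregation: individual story scores may be noisy, but the average per day shows the trend. The dates and scores below are made-up sample data, and `daily_trend` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical pre-scored stories: (publication date, sentiment in [-1, 1]).
# These values are invented sample data, not real measurements.
scored_stories = [
    ("2011-03-28", 0.5), ("2011-03-28", -0.25),
    ("2011-03-29", -0.5), ("2011-03-29", -0.25),
    ("2011-03-30", 0.75),
]

def daily_trend(stories):
    """Average the per-story scores for each day, in date order."""
    by_day = defaultdict(list)
    for day, score in stories:
        by_day[day].append(score)
    return {day: mean(scores) for day, scores in sorted(by_day.items())}

for day, average in daily_trend(scored_stories).items():
    print(day, average)
```

A single misjudged story shifts a daily average only slightly, which is why this kind of user can tolerate lower per-document accuracy in exchange for speed.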

Reputation Management is another industry where automated sentiment analysis shines bright, but where accuracy comes under more scrutiny. It could be said that automated sentiment analysis was born in this space, and was invented because of the amount of time people spent hand measuring the tone around products and brands. While Reputation Management is currently the biggest market for the technology, it's probably not the best example of accuracy. It's hard enough to get humans to agree with humans on the tone for a specific story, but to get people to agree with a computer is even harder. It's important for people to think about their specific needs and requirements before they jump into using any vendor's solution. Make sure the solution you're looking at is well-suited for the problem you're trying to solve.

So while more sentiment analysis offerings are hitting the market, it's interesting to see how sentiment appears to be becoming something of a commodity. This challenges all the providers to do a better job in all aspects of the technology. However, analysis of good, bad and neutral isn't as easy as 1, 2, 3. Ask for a proof of concept before making a decision, and make sure the solution is right for you and your business.

Twitter and social networks

With the volume of online data growing at an unbelievable rate, decreasing processing time and implementing automation become key to getting the job done. Automation delivers considerable value, such as surfacing all the concepts and themes associated with a particular topic. We have information that goes beyond the hashtag: who is talking about those topics and who else is mentioned within the conversation. The value is not always in the number of mentions, although that is helpful in some respects, but in the context surrounding the tweets and how businesses can use them.


Sentiment measurement is at the forefront of much business analysis these days, but in some ways Twitter seems as if it were designed from the ground up to defeat any automated sentiment engine. For instance, there isn't much sentence structure in tweets, and many tweets are simply hyperlinks with no usable content in the URL itself.

Given these challenges, is monitoring and measuring sentiment in Twitter a hopeless chore? Fortunately, the answer is no. Even though there are some challenges to automated scoring of Twitter content, there are also some advantages to processing tweets, in particular when measuring the tone within Twitter. Specifically, sentiment technology can recognize emoticons and acronyms to help identify the tone of content.
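A minimal sketch of how emoticons and acronyms might contribute to a tweet's tone follows. The emoticon and acronym tables and their weights are assumptions made up for this example; a real engine would use much larger, tuned resources.

```python
# Hypothetical lookup tables (assumptions for illustration): token -> polarity.
EMOTICONS = {":)": 1, ":-)": 1, ":D": 1, ":(": -1, ":-(": -1}
ACRONYMS = {"lol": 1, "smh": -1, "fml": -1}

def tweet_tone(tweet):
    """Sum the polarity of any known emoticons and acronyms in a tweet."""
    score = 0
    for token in tweet.split():
        if token in EMOTICONS:
            # Emoticons are matched exactly, since their punctuation is the signal.
            score += EMOTICONS[token]
        else:
            # Acronyms are matched case-insensitively, ignoring trailing punctuation.
            score += ACRONYMS.get(token.lower().strip(".,!?"), 0)
    return score

print(tweet_tone("Flight delayed again smh :("))
print(tweet_tone("Upgraded to a suite! LOL :D"))
print(tweet_tone("http://example.com/abc"))
```

The last call shows the hyperlink-only case: with no emoticon, acronym, or text to go on, the tweet contributes no tone at all.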

The beauty of Twitter is that there is very little grey area in tweets. You're either posting some source of information, posting an opinion you have, or replying to another informative or opinion-oriented tweet.

About the author

Jeff Catlin, CEO Lexalytics US, discusses how analytics technology has established itself as a reliable source for business decisions, and what reliable means when analyzing unstructured data.
