Accession Number:



Automated Chat Thread Analysis: Untangling the Web

Descriptive Note:

Conference paper

Corporate Author:


Report Date:


Pagination or Media Count:



As networked digital communications proliferate in military operational command and control (C2), chat messaging is emerging as a preferred method of communication for team coordination. Chat room logs provide a potentially rich source of data for after-action reviews, affording considerable insight into the decision-making processes of the training audience. The multitasking nature of these operations and the large number of chat channels and participants lead to multiple parallel threads of dialog that are tightly intertwined. These threads must be identified and separated to facilitate analysis of chat communication in support of team performance assessment. This presents a significant challenge because chat is prone to informal language, abbreviations, and typos, so conventional language analysis techniques do not transfer well. Few inroads have been made in tackling the problem of dialog analysis and topic detection from chat messages. In this paper, we discuss the application of natural language techniques to automate chat log analysis, using an AOC team training exercise as the source of data. We have found it necessary to enhance these techniques to take into account the specific characteristics of chat-based C2 communications. Additionally, our domain of interest provides data sources other than chat that can be leveraged to improve classification accuracy. We describe how such considerations have been folded into traditional data analysis techniques to address this problem and discuss their performance. In particular, we explore the problem of automatically detecting content-based coherence between messages. We present techniques to address this problem and analyze their performance in comparison with using distinguishing keywords provided by subject matter experts. We discuss the lessons learned from our results and how they impact future work.

Subject Categories:

  • Information Science
  • Linguistics
  • Non-Radio Communications

Distribution Statement: