Numerical Algorithms for the Analysis of Expert Opinions Elicited in Text Format
DEFENCE SCIENCE AND TECHNOLOGY ORGANISATION (AUSTRALIA) JOINT OPERATIONS DIVISION
Latent Dirichlet Allocation (LDA) is a scheme that may be used to estimate topics and their probabilities within a corpus of text data. The fundamental assumptions of this scheme are that text is a realisation of a stochastic generative model, and that this model is well described by a combination of multinomial and Dirichlet probability distributions. Various means can be used to solve the Bayesian estimation task arising in LDA. Our formulations of LDA are applied to subject matter expert text data elicited through carefully constructed decision support workshops. In the main, these workshops address substantial problems in Australian Defence Capability. The application of LDA here is motivated by a need to provide insights into the collected text, which is often voluminous and complex in form. Additional investigations described in this report concern identifying and quantifying differences between texts written by different stakeholder groups on a common subject. Sentiment scores and key-phrase estimators are used to indicate stakeholder differences. Some examples are provided using unclassified data.
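The generative assumption stated above can be illustrated with a minimal sketch in Python. It is not the formulation used in the report: the vocabulary, the symmetric Dirichlet parameters `alpha` and `beta`, and the function names `sample_dirichlet` and `generate_corpus` are all hypothetical and chosen only for illustration. The sketch runs the standard LDA generative process forwards (draw topic-word distributions, draw per-document topic proportions, then draw a topic and a word for each token). It does not perform the Bayesian estimation step that recovers topics from observed text.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(n_docs, doc_len, vocab, n_topics, alpha, beta, seed=0):
    """Generate a toy corpus by running the LDA generative model forwards.

    alpha and beta are symmetric Dirichlet parameters (illustrative values,
    not taken from the report).
    """
    rng = random.Random(seed)
    # Each topic is a multinomial distribution over the vocabulary.
    topic_word = [
        sample_dirichlet([beta] * len(vocab), rng) for _ in range(n_topics)
    ]
    corpus = []
    for _ in range(n_docs):
        # Each document has its own multinomial distribution over topics.
        doc_topic = sample_dirichlet([alpha] * n_topics, rng)
        words = []
        for _ in range(doc_len):
            # Choose a topic for this token, then a word from that topic.
            z = rng.choices(range(n_topics), weights=doc_topic)[0]
            w = rng.choices(vocab, weights=topic_word[z])[0]
            words.append(w)
        corpus.append(words)
    return corpus, topic_word


# Hypothetical vocabulary, used only to make the sketch runnable.
vocab = ["capability", "risk", "cost", "schedule", "training", "logistics"]
corpus, topic_word = generate_corpus(
    n_docs=3, doc_len=8, vocab=vocab, n_topics=2, alpha=0.5, beta=0.5
)
for doc in corpus:
    print(" ".join(doc))
```

Estimation in LDA works in the opposite direction: given only the documents, it infers plausible values for the topic-word distributions and the per-document topic proportions.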
- Information Science