Accession Number:



BRAT: A Random Walk through the Semantic Spaces of the Blogosphere

Descriptive Note:

Conference paper

Corporate Author:


Personal Author(s):

Report Date:


Pagination or Media Count:



Semantic spaces, such as Latent Semantic Analysis (LSA), Hyperspace Analog to Language (HAL), or Random Indexing (RI), offer convenient methods for representing semantic relations between words and concepts, abstracted from a distribution of documents. The distribution of documents determines the local co-occurrence patterns between words across the corpus and, in turn, the semantics abstracted from those patterns. Such methods are therefore sensitive to the statistical properties of the distribution of words over documents. For instance, the semantics of the word "table" abstracted from a scientific corpus may differ from those abstracted from a general corpus. In the first case, since "table" may occur in contexts such as "table of correlation" or "table of results", it would be considered associated with the word "correlation", whereas in the second case, because it may co-occur with "kitchen" or "living-room", it would rather be considered similar to "chair". Nevertheless, the formal relation between the properties of the distribution of word co-occurrences and the final semantics produced by semantic space methods has not been described until now. In the case of a mixed scientific and general corpus, what makes the semantics of "table" become more similar to "chair" than to "Spearman", and vice versa?
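The dependence of word similarity on the corpus can be illustrated with a minimal sketch. It is not the method of the paper, nor an implementation of LSA, HAL, or RI: it uses raw document-level co-occurrence counts and cosine similarity as a simplified stand-in. The two toy corpora and the helper names (cooccurrence_vectors, cosine, similarity) are assumptions made up for illustration only.

```python
from collections import Counter
from itertools import combinations
import math


def cooccurrence_vectors(documents):
    """Map each word to a Counter of the words it shares a document with."""
    vectors = {}
    for doc in documents:
        words = sorted(set(doc.lower().split()))
        for a, b in combinations(words, 2):
            vectors.setdefault(a, Counter())[b] += 1
            vectors.setdefault(b, Counter())[a] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity(documents, w1, w2):
    """Similarity of two words in the space induced by the given documents."""
    vectors = cooccurrence_vectors(documents)
    return cosine(vectors.get(w1, Counter()), vectors.get(w2, Counter()))


# Hypothetical toy corpora, invented for this illustration.
scientific = [
    "table of correlation results",
    "correlation coefficient reported in table",
    "spearman correlation table",
]
general = [
    "kitchen table and chair",
    "chair near the living-room table",
    "wooden chair and table",
]

for name, corpus in [("scientific", scientific), ("general", general)]:
    print(name,
          "table~correlation:", round(similarity(corpus, "table", "correlation"), 3),
          "table~chair:", round(similarity(corpus, "table", "chair"), 3))
```

In this toy setting "table" is closer to "correlation" in the scientific corpus and closer to "chair" in the general one. Passing scientific + general to similarity gives a mixed corpus, where the outcome depends on the relative proportions of the two kinds of documents.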

Subject Categories:

  • Linguistics

Distribution Statement: