Leipzig Corpus Miner (LCM)
The iLCM is not a stand-alone program but an infrastructure made up of several components: a document database (MariaDB), an NLP pipeline that runs the various text mining processes (written in the R statistical language), a full-text index (Solr), and a web application (R Shiny). To make the infrastructure available to other projects as a decentralized installation, it is packaged as a virtual machine ensemble (Docker) that can be set up with predefined configuration scripts. The application is therefore a fusion of R scripting capabilities, data management, and visualization with R Shiny. Because it follows an ORC approach, data processing is documented on the fly: the data, the scripts used, and their descriptions are always available in a “notebook” for later review.
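The idea of documenting processing on the fly can be illustrated with a small sketch. It is written in Python purely for illustration (the iLCM itself is built on R), and it is not the iLCM's implementation: the names `Notebook`, `NotebookEntry`, `digest`, and `lowercase_tokens` are hypothetical. Each processing step is run through a notebook object that stores the step's description, the name of the script, and fingerprints of the input and output data.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Callable, List


def digest(obj: Any) -> str:
    """Return a SHA-256 fingerprint of any JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class NotebookEntry:
    """One documented processing step (hypothetical structure)."""
    description: str    # human-readable explanation of the step
    script: str         # name of the function that was run
    input_digest: str   # fingerprint of the data going in
    output_digest: str  # fingerprint of the data coming out


@dataclass
class Notebook:
    """Collects an entry for every step executed through run()."""
    entries: List[NotebookEntry] = field(default_factory=list)

    def run(self, description: str, func: Callable[[Any], Any], data: Any) -> Any:
        # Execute the step and record it in the same call, so the
        # documentation cannot drift away from what was actually done.
        result = func(data)
        self.entries.append(
            NotebookEntry(description, func.__name__, digest(data), digest(result))
        )
        return result


def lowercase_tokens(docs: List[str]) -> List[List[str]]:
    """Toy text mining step: lowercase each document and split on whitespace."""
    return [doc.lower().split() for doc in docs]


notebook = Notebook()
docs = ["Leipzig Corpus Miner", "Text Mining in R"]
tokens = notebook.run("Lowercase and split on whitespace", lowercase_tokens, docs)

for entry in notebook.entries:
    print(entry.script, "-", entry.description)
```

Because every step passes through `run`, the notebook always holds a record of the data, the script, and its description together, which is the property the paragraph above attributes to the ORC approach.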