Speaker
Description
Over the last two to three decades, researchers in human language technologies have applied various statistical methods to understand what is encoded in large text corpora, and how -- with limited success. The past five or six years, however, have fundamentally changed both the research paradigm and the level of success in this area. Continuous vector space models, neural networks, deep learning: these are some of the main terms widely used today in most data-intensive fields, including linguistic research. With the new paradigm, however, new questions have arisen. One of them concerns the possibility of mapping the new categories (vectors, dimensions, layers, etc.) onto linguistic concepts. Besides this, the presentation deals with questions such as: how and why is it possible that rather simple word embedding models are able to capture real-world relations with no knowledge source other than raw text?
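As background to that last question, the sketch below (not part of the talk) illustrates the kind of vector arithmetic such models support: relations like "king is to man as queen is to woman" fall out of simple addition and subtraction of word vectors. The vectors here are tiny hand-crafted toy values standing in for embeddings that would in practice be trained on raw text (e.g. with word2vec or GloVe); the words, values, and helper functions are illustrative assumptions only.

```python
import numpy as np

# Toy "embeddings": dimension 0 roughly encodes gender, dimension 1 royalty.
# Real embeddings would be learned from a large corpus, not written by hand.
vectors = {
    "man":   np.array([ 1.0, 0.0]),
    "woman": np.array([-1.0, 0.0]),
    "king":  np.array([ 1.0, 1.0]),
    "queen": np.array([-1.0, 1.0]),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    target = vectors[a] - vectors[b] + vectors[c]
    scores = {w: cosine(target, v) for w, v in vectors.items() if w not in (a, b, c)}
    return max(scores, key=scores.get)

# 'king' - 'man' + 'woman' lands closest to 'queen' in this toy space.
print(analogy("king", "man", "woman"))  # -> queen
```

With embeddings trained on large corpora, the same arithmetic recovers many such relations (capitals and countries, verb tenses, comparative forms), which is precisely the phenomenon the presentation asks about.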