Mining Personality In German (MiPinG) is the result of a master's thesis by Henning Usselmann at Technische Universität Braunschweig, Germany (2020).
Abstract: People's personality influences their behaviors, attitudes, beliefs and feelings. Therefore, many scientific studies would benefit from easy ways of measuring personality. By analyzing a person's language, it is possible to derive their Big Five personality scores. One approach is to apply the Global Vectors (GloVe) word embedding to English Twitter posts. The overall goals of this work are to show that this word embedding can be applied to German Twitter posts as well, and to make results from personality mining research more accessible. To this end, a framework is built for training and applying machine learning models for personality prediction. It is tested whether a working prediction model for English Twitter users can be adapted for German users, which could reduce the effort of collecting training data. The evaluation is based on a personality survey with a small sample of German users. Adapting an existing model does not perform as well as expected, but it helps prepare the framework for higher volumes of data. In the end, the final model is based on the evaluation data, which results in acceptable performance. Via a web application (www.miping.de), anyone can easily retrieve personality scores for any public German Twitter user. Altogether, it is shown that the Global Vectors word embedding is suitable for predicting personality from German-language text. The published framework and source code allow for independent improvements to, and easy application of, the trained model. Scientific studies and other applications, e.g. chatbots, can now easily incorporate personality data.
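The general idea behind embedding-based personality prediction can be illustrated with a minimal sketch. This is not the MiPinG API or the thesis's actual model. The embedding table, its three dimensions, the weights, the bias, and all function names (`tokenize`, `mean_embedding`, `predict_trait`) are hypothetical and chosen only for illustration. The sketch assumes that a user's posts are reduced to the average of their word vectors and that a simple linear model maps this average to a score for one trait.

```python
import re

# Toy 3-dimensional "embedding" table with made-up values.
# Real GloVe vectors have tens to hundreds of dimensions and are
# loaded from a pretrained file; nothing here comes from such a file.
TOY_EMBEDDINGS = {
    "ich": [0.1, 0.3, -0.2],
    "liebe": [0.7, 0.1, 0.4],
    "musik": [0.5, -0.1, 0.6],
    "montag": [-0.4, 0.2, 0.0],
}


def tokenize(text):
    """Lowercase the text and extract runs of German letters as tokens."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def mean_embedding(texts, embeddings, dim):
    """Average the vectors of all known words across a user's posts."""
    total = [0.0] * dim
    count = 0
    for text in texts:
        for token in tokenize(text):
            vector = embeddings.get(token)
            if vector is None:
                continue  # skip words that are not in the embedding table
            total = [t + v for t, v in zip(total, vector)]
            count += 1
    if count == 0:
        return total  # no known words: fall back to the zero vector
    return [t / count for t in total]


def predict_trait(features, weights, bias):
    """Linear model: dot product of features and weights, plus a bias."""
    return sum(f * w for f, w in zip(features, weights)) + bias


posts = ["Ich liebe Musik", "Montag"]
features = mean_embedding(posts, TOY_EMBEDDINGS, dim=3)

# Hypothetical weights for a single trait; a real model would learn
# one set of parameters per Big Five trait from labeled survey data.
score = predict_trait(features, weights=[1.0, 0.5, -0.5], bias=3.0)
print(features, score)
```

In this sketch, words missing from the table are skipped, so informal spellings or hashtags simply do not contribute to the average. A trained model would replace the hand-picked weights with parameters fitted to survey-based personality scores.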
Download the GloVe database file needed for the MiPinG Python package here.