The vast majority of non-European alchemical sources remain untranslated, for a variety of reasons. I address the strategic question of where to begin the slow, labor-intensive process of translating vast multilingual corpora from the Middle East, India, and China. My short-term focus is to use natural language processing (NLP) methods of unsupervised metaphor detection to triage untranslated alchemical texts and sort them for different research needs. I propose to identify key metaphoric phrases drawn from well-known English alchemical texts as a basis for our NLP discrimination model. Recent advances in unsupervised metaphor detection, which build on the ubiquity of metaphor in all languages, may anchor NLP detection of naturally occurring meaningful clusters in virtually any language. My overarching goal is to demonstrate the potential of digital humanities tools in combination with computational chemistry resources such as the Reaxys CC database, which holds data on 200 million chemical substances and about 60 million chemical reactions from 1750 to 2015. My colleague, Prof. Restrepo at the Max Planck Institute for Mathematics in the Sciences, and the History of Science, has licensed access to Reaxys through MPI and has agreed to collaborate on extending his analysis of chemical knowledge production to pre-1750 periods.
Digitizing Chemical Humanities
Toward a Cross-cultural History of Chemistry