About the RealBible Project
Computational linguistics is a relatively young field of language research. Modern computing power, data aggregation, machine learning, and powerful programming environments have opened a kind of “black box,” allowing us to research and understand ancient texts and extinct languages in ways never thought possible. One example is a machine-learning project at MIT’s CSAIL that aims to help decipher lost languages.
Language, considered the greatest human invention, is at the heart of human consciousness and intelligence. It evolves over time, but more importantly it conglomerates into larger “child languages” as the world becomes more and more connected. English itself is a conglomeration of many parent languages. This process creates “language death” as conglomerate child languages supplant their older parent languages. It is estimated that at least 31,000 human languages have existed, of which only 6,000 remain today. Definitions of words evolve and take on different meanings and shapes throughout this process. Word meanings can change drastically even in the span of one generation.
The Ancient Hebrew texts are among the most meticulously preserved texts in human history. But while the text is little changed, the language is long dead. Modern Hebrew is a modern vernacular language, developed in the 19th and 20th centuries and based on the ancient written form, which makes it a rather bizarre mating of modern language with ancient writing. The NT Greek letters and the Septuagint provide a certain level of witness to the dead language, but the exile and the disappearance of all the tribes but two resulted in the death of the language long before the Greek witness of the Septuagint. Consequently, it remains a mystery whether ancient Hebrew was ever a spoken language, though there seems to be much evidence that it was a literary language rather than a spoken one.
When a language dies, fundamental understandings are lost. This raises two questions: how much of the ancient language is actually known, and how much has changed through being handled by many religio-cultural groups in many linguistic categories over two thousand years?
The RealBible Project is an ongoing research and translation project of the Hebrew text. It uses computer technologies such as Alteryx Machine Learning, the Text-Fabric Python module, and the BHSA data graph built by the Eep Talstra Centre for Bible and Computer (ETCBC), based in Amsterdam. The RealBible Project is not affiliated with any religious sect, denomination, or confession.
The BHSA data graph currently consists of over 1,000,000 features that can be compared, analyzed, and studied for linguistic patterns. Using machine learning and algorithms, we hope to tackle some of the most enduring questions and mysteries of the texts, such as the numerology, the math, the parallelisms, and the hundreds of unknown words.
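As a rough illustration of the kind of pattern analysis described above, the sketch below lists words that occur only once in a sequence of lexemes, since rare words are a natural place to start when looking at poorly understood vocabulary. This is an assumption for illustration, not the project's actual method. The function name hapax_legomena and the toy data are hypothetical; in practice the lexemes would be read from the BHSA data through Text-Fabric rather than typed by hand.

```python
from collections import Counter


def hapax_legomena(lexemes):
    """Return, in sorted order, the lexemes that occur exactly once.

    `lexemes` is any iterable of strings. Here it is toy data; a real run
    would pass in the lexeme of every word in the corpus (assumption).
    """
    counts = Counter(lexemes)
    return sorted(lex for lex, n in counts.items() if n == 1)


if __name__ == "__main__":
    # Hypothetical placeholder lexemes, not real BHSA values.
    sample = ["word_a", "word_b", "word_a", "word_c", "word_d", "word_d"]
    for lex in hapax_legomena(sample):
        print(lex)
```

A plain frequency count like this only flags candidates; deciding what a rare word means still depends on context, cognate languages, and the ancient translations.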
The project is spearheaded by Matthew Pennock, who began to study ancient Hebrew in the year 2000 and completed a full course in Hebrew grammar by 2002. Later, after traveling and studying seven more languages including Arabic, Mandarin, Kiswahili, and German, he enrolled in a Biblical seminary and studied the Hebrew Bible. After a couple of terms he discovered how misleading, contrived, legalistic, and even political seminaries can be, and quickly dropped out. After discovering how misleading, biased, and even dishonest English translation methodologies have been, he turned to reading exclusively in the Hebrew and Greek. In 2018 he began to re-translate portions of the text entirely. He now researches using the ETCBC BHSA database and translates in Krakow, Poland, where he is currently studying data science, Python programming, and machine learning.
Check out the BHSA Hebrew Bible + Linguistic annotations in Text-Fabric format, available on GitHub.