Computational linguistics is a relatively new field of language research. AI, machine learning, modern computing power, cloud data storage, and powerful programming environments have opened a kind of “black box,” allowing us to research and understand ancient texts and extinct languages in ways never thought possible. One example is a machine-learning project at MIT’s CSAIL that aims to help decipher lost languages.
Language, considered the greatest human invention, is at the heart of human consciousness and intelligence. It evolves over time, but more importantly it conglomerates into larger “child languages” as the world becomes more and more connected (or dissolved, depending on how you look at it). English itself is a conglomeration of many parent languages. This process creates “language death” as conglomerate child languages supplant their older parent languages. It is estimated that at least 31,000 human languages have existed, of which only about 6,000 remain today. The definitions of words evolve and take on different meanings and shapes throughout this process, and word meanings can change drastically even in the span of one generation.
The Ancient Hebrew texts are the most meticulously preserved texts in human history. But while the text is little changed, the language is long dead. Modern Hebrew is a vernacular language developed in the 19th and 20th centuries on the basis of the ancient written form, making it a rather bizarre mating of modern language with ancient writing. The NT Greek letters and the Septuagint provide a certain level of witness to the dead language, but the exile and the disappearance of all the tribes but two resulted in the death of the language long before the Greek witness of the Septuagint. Consequently, it remains a mystery whether ancient Hebrew was ever a spoken language, though there seems to be much evidence that it was a literary language rather than a spoken one.
When a language dies, fundamental understandings are lost. This raises the question: how much of the ancient language is actually known? And how much has changed through being handled by many religio-cultural groups in many linguistic categories over two thousand years?
The RealBible Project is an ongoing research and translation project of the Hebrew text utilizing AI computer technologies such as ChatGPT, Alteryx Machine Learning, the Text-Fabric Python module, and the BHSA data graph built by the Eep Talstra Centre for Bible and Computer. The RealBible Project is not affiliated with any sect, denomination, or confession, save the Apostles’ Creed and the Didache (Apostles’ Teaching to the Nations).

First Century BC Dead Sea Hebrew Scrolls
The BHSA data graph currently consists of over 1,000,000 features that can be compared, analyzed, and studied for linguistic patterns. Using machine learning and algorithms, we hope to tackle some of the most enduring questions and mysteries of the texts, such as the numerology, the math, the parallelisms, and the hundreds of unknown words.
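As one small illustration of the kind of pattern search described here, the sketch below finds words that occur only once in a corpus (hapax legomena), which is one common reason a word's meaning is uncertain. It is a minimal sketch, not the project's actual code: it uses only the Python standard library, the function name `find_hapax` is hypothetical, and the token list is hand-made toy data rather than anything read from the BHSA data graph.

```python
from collections import Counter

def find_hapax(tokens):
    """Return {lexeme: reference} for lexemes that occur exactly once.

    `tokens` is a list of (reference, lexeme) pairs. This layout is an
    assumption made for the example, not the BHSA data model.
    """
    counts = Counter(lexeme for _, lexeme in tokens)
    return {lexeme: ref for ref, lexeme in tokens if counts[lexeme] == 1}

# Toy data for illustration only: simplified transliterations typed by hand.
# The counts are true of this short list, not of the real corpus.
toy_tokens = [
    ("Gen 1:1", "bara"),
    ("Gen 1:1", "elohim"),
    ("Gen 1:2", "tohu"),
    ("Gen 1:2", "bohu"),
    ("Gen 1:2", "elohim"),
    ("Gen 1:21", "bara"),
]

hapax = find_hapax(toy_tokens)
for lexeme, ref in hapax.items():
    print(f"{lexeme} occurs once, at {ref}")
```

With real data, the same counting step would run over lexeme values exported from the data graph, and the single-occurrence words would become candidates for closer study.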
Project Research Sources
The following resources are some of the best for word research:
- Gesenius: Hebrew & Chaldee (i.e. Aramaic) Lexicon (1846)
- Gesenius’ Hebrew Grammar (1813)
- Brown-Driver-Briggs Hebrew and English Lexicon (1906). Based upon the work of Gesenius.
- A Hebrew & Chaldee lexicon to the Old Testament by Fürst, Julius (1867), student of Gesenius.
- James Strong’s Exhaustive Concordance (1890)
- Dictionary of Targumim, Talmud and Midrashic Literature by Marcus Jastrow (1926)
- Pulpit Commentary (1880)
- Cambridge Bible Commentary (1965)
- Keil and Delitzsch Biblical Commentary on the Old Testament (1864)
- Septuagint (LXX) Interlinear Greek OT (https://studybible.info/interlinear/)
- Perseus Greek Digital Library (http://www.perseus.tufts.edu/hopper/)
- An Old Testament Commentary for English Readers (1897)
- Word Biblical Commentary, Gordon Wenham
- The Book of Genesis 1-17, 18-50, Victor P. Hamilton (1990, 1995)
- Exodus: An Exegetical Commentary, Victor P. Hamilton (2011)
About Matt
The project is led by Matthew Pennock. His journey with Biblical Hebrew began in 2000 when he embarked on a comprehensive study, culminating in a full course in Hebrew grammar by 2002. From 2000 to 2016, he dedicated himself to missionary work and church leadership. Eventually, he stepped away from ministry to focus on writing, the theology of sonship, and a deep dive into Hebrew studies.
His thirst for knowledge extended to various other languages, including Arabic, Mandarin, Kiswahili, and German. He also pursued theological education at a Biblical seminary. However, the prohibitive costs and his dissatisfaction with the inefficiencies prompted him to leave the world of biblical academia.
Subsequently, Matthew recognized the limitations and biases in English translation methodologies. He resolved to delve exclusively into the study of Hebrew and Greek. By 2018, he found himself re-translating significant portions of text. In 2019, his interest in computational linguistics and the Bible in data form grew. This passion led to the inception of the RBT translation project, aiming to harness technologies like Python, Machine Learning, and AI to aid in translation and interpretation. Although the understanding of scripture is ultimately a spiritual endeavor (1 Cor. 2:14), 21st-century technology holds the promise of shedding light on hundreds of unknown words and obscure readings, making deep study and doctrine-building more efficient.
For those interested, the research on the BHSA Hebrew Bible and linguistic annotations data graph, the translation programming work, the RBT translation app, and other development projects can be explored on Matthew’s GitHub.
Contact
email: mp@realbible.tech
Support the project on Patreon
