In a recent paper for the UNESCO and The LiiV Center for Digital Anthropology Global Partnership to Advance the Field of Digital Anthropology, I wrote about my vision for creating a discipline-wide knowledge graph (KG) for the field of anthropology. Such a graph could be used to train a fine-tuned large language model (LLM) that pushes us closer to the machine interpretation of cultures.
In the paper, I argue that the emergence of the internet has created tremendous new opportunities, particularly for digital anthropology. However, relying on traditional ethnographic methods to explore the internet has limited our ability to make sense of it at scale. I explore a potential solution that combines anthropology, data science, and AI to create a transdisciplinary practice that can better support the machine interpretation of human culture at scale.
The internet is a vast space of unstructured data and information that lacks explicit associations. This scale challenges traditional research methods, which is where machine learning comes in. Two main approaches have been used in recent years: deep learning and symbolic AI. Deep learning is excellent at handling large datasets and finding patterns, while symbolic AI can be useful for more structured data. Each approach has its own strengths and weaknesses.
Ethnographic Knowledge Semantic Data Modeling (EKSDM)
One potential solution to the current limitations of the deep learning approach that powers large language models is to pair those models with knowledge graphs. I propose that we combine ethnography with semantic data modeling to model our research data, analysis, insights, final research outputs, and all of the actors involved. Ethnographic Knowledge Semantic Data Modeling (EKSDM) is therefore proposed as a technique for semantically structuring ethnographic data, merging ethnography with semantic data modeling. This process can create systems of analysis that embody greater context and explainability. EKSDM focuses on the richness of ethnographic data and on creating meaningful relationships between different pieces of data.
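To make the idea of semantically structuring ethnographic data more concrete, here is a minimal sketch in Python. It is not the schema from the paper: the `Triple` class, the `model_fieldnote` helper, the predicate names (`recordedBy`, `describes`, `memberOf`, `hasTheme`), and all identifiers are hypothetical, chosen only to show how a single fieldnote and the actors around it might be expressed as subject-predicate-object statements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One subject-predicate-object statement (illustrative, not the paper's schema)."""
    subject: str
    predicate: str
    obj: str


def model_fieldnote(note_id, researcher, participant, community, theme):
    """Express one hypothetical fieldnote and its actors as explicit relationships."""
    return [
        Triple(note_id, "recordedBy", researcher),     # who produced the data
        Triple(note_id, "describes", participant),     # who the data is about
        Triple(participant, "memberOf", community),    # where the participant sits
        Triple(note_id, "hasTheme", theme),            # analytic code applied to the note
    ]


# All identifiers below are made up for illustration.
triples = model_fieldnote(
    "fieldnote:001",
    "researcher:A",
    "participant:P1",
    "community:online-forum",
    "theme:gift-exchange",
)

for t in triples:
    print(t.subject, t.predicate, t.obj)
```

Because every relationship is stated explicitly, a later analysis can trace a theme back to the fieldnote, the researcher, and the community it came from, which is the kind of context and explainability described above.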
Anthropological Knowledge Graph (AKG)
The modeled data would be stored in an Anthropological Knowledge Graph (AKG). A knowledge graph is a type of database that uses links and semantic metadata to situate data in context. The advantage of an AKG is its ability to model human culture accurately. However, the common publicly available graphs have some limitations, and there is a need for an AKG that is discipline-wide and controlled by the anthropology community. To that end, I have proposed AnthroGraph.io.
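As a rough illustration of what "links and semantic metadata" mean in practice, the sketch below builds a tiny in-memory graph in Python and looks up the context around one node. It is an assumption-laden toy, not a description of AnthroGraph.io or any real graph database: the `TinyGraph` class, its `add` and `context` methods, and every node and relationship name are hypothetical.

```python
from collections import defaultdict


class TinyGraph:
    """A toy in-memory knowledge graph (hypothetical, for illustration only)."""

    def __init__(self):
        self._out = defaultdict(list)  # node -> [(predicate, target)]
        self._in = defaultdict(list)   # node -> [(predicate, source)]

    def add(self, subject, predicate, obj):
        """Store a labeled link and index it from both ends."""
        self._out[subject].append((predicate, obj))
        self._in[obj].append((predicate, subject))

    def context(self, node):
        """Return the links that situate a node: what it points to and what points to it."""
        return {
            "outgoing": list(self._out.get(node, [])),
            "incoming": list(self._in.get(node, [])),
        }


graph = TinyGraph()
graph.add("insight:I1", "derivedFrom", "fieldnote:001")
graph.add("fieldnote:001", "recordedBy", "researcher:A")
graph.add("report:R1", "cites", "insight:I1")

ctx = graph.context("insight:I1")
print(ctx["outgoing"])  # links from the insight to its evidence
print(ctx["incoming"])  # links from outputs that rely on the insight
```

Here the insight is never stored in isolation: the graph records both the fieldnote it was derived from and the report that cites it, so its provenance and its downstream use can be recovered with a single lookup.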
There are some hurdles to creating an AKG or using EKSDM, such as privacy, intellectual property, and the time-intensive process of modeling historical data. However, a structured model of anthropological knowledge will be essential to help us make sense of the diversity of human culture with machines.
The paper explores the potential for combining anthropology, data science, and artificial intelligence to create a transdisciplinary practice capable of modeling and interpreting human culture. It proposes an Ethnographic Knowledge Semantic Data Modeling (EKSDM) approach that combines ethnographic practices with semantic data modeling techniques for greater context and explainability. An Anthropological Knowledge Graph (AKG) is proposed as a tool for situating data in context through links and semantic metadata. An AKG maturity model provides a roadmap for progressively building a robust and contextually rich knowledge graph. Considerations such as privacy, intellectual property, and time-intensive historical data modeling are acknowledged. Despite those considerations, using EKSDM to create an AKG offers a powerful way to enhance other technologies, such as large language models (LLMs), and to move from machine learning to machine knowing: the machine interpretation of cultures, powered by the wisdom of anthropology.
Read the Paper
If you want to read the paper, please check it out on the UNESCO site.