A Modular OCR Solution for Logographic Scripts: From Labeling to Recognition and User Interface Design

If you have a question about this talk, please contact Jack Atkinson.

In recent years, optical character recognition (OCR) has become increasingly efficient at recognizing text in real-world images, particularly for phonetic writing systems such as Latin-based or other modern alphabetic scripts, where the number of character categories is manageable or labeled training data is sufficient. However, early logographic systems whose characters derive from pictographic origins, such as Chinese oracle bone inscriptions, Egyptian hieroglyphs, Mesopotamian cuneiform, and Mayan glyphs, usually have thousands of characters and even more graphic variants. As a result, OCR systems for these scripts often suffer from data inefficiency and class imbalance, which poses challenges for ResNet and other CNN-based models. To make matters worse, historians and palaeographers frequently disagree on character decipherment and classification, further complicating data labeling and dataset compilation. This talk uses Chinese oracle bone script as a case study to demonstrate how to address these challenges efficiently through four stages of work: 1) font creation for ancient characters via image vectorization; 2) text encoding and labeling using external relational tables; 3) ResNet-based model training using synthetic data augmentation; 4) product deployment using modern web frameworks such as React and Vue.js. Part of this work is available at: https://oracular.azurewebsites.net/.
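
As a rough illustration of stage 2, the sketch below shows one way labels could be kept in external relational tables so that competing scholarly classifications can coexist without relabeling the images. It is not taken from the talk: the table names (glyph, reading), the scheme names, the glyph IDs, and the class labels are all hypothetical, and Python's built-in sqlite3 module stands in for whatever database the project actually uses.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical glyphs and readings."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- One row per glyph image (hypothetical IDs and paths).
        CREATE TABLE glyph (
            glyph_id   TEXT PRIMARY KEY,
            image_path TEXT
        );
        -- One row per (glyph, classification scheme) pair, so scholars
        -- who disagree can each assign their own label to the same image.
        CREATE TABLE reading (
            glyph_id TEXT REFERENCES glyph(glyph_id),
            scheme   TEXT,
            label    TEXT,
            PRIMARY KEY (glyph_id, scheme)
        );
        """
    )
    conn.executemany(
        "INSERT INTO glyph VALUES (?, ?)",
        [
            ("G001", "img/G001.png"),
            ("G002", "img/G002.png"),
            ("G003", "img/G003.png"),
        ],
    )
    conn.executemany(
        "INSERT INTO reading VALUES (?, ?, ?)",
        [
            ("G001", "scheme_a", "class_1"),
            ("G002", "scheme_a", "class_1"),  # scheme_a treats G002 as a variant
            ("G003", "scheme_a", "class_2"),
            ("G001", "scheme_b", "class_1"),
            ("G002", "scheme_b", "class_3"),  # scheme_b treats it as distinct
            # scheme_b offers no reading for G003 (undeciphered there)
        ],
    )
    return conn


def labels_for(conn, scheme):
    """Return {glyph_id: label} for the glyphs labeled under one scheme."""
    rows = conn.execute(
        """
        SELECT g.glyph_id, r.label
        FROM glyph AS g
        JOIN reading AS r ON r.glyph_id = g.glyph_id
        WHERE r.scheme = ?
        ORDER BY g.glyph_id
        """,
        (scheme,),
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = build_demo_db()
    print(labels_for(conn, "scheme_a"))
    print(labels_for(conn, "scheme_b"))
```

Because the labels live outside the image files, switching the training target from one classification scheme to another in this sketch only requires a different query, not a new round of annotation.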

This talk is part of the RSE Seminars series.


© 2006-2025 Talks.cam, University of Cambridge.