Indexing Go

Name: Indexing Go
Author: Brigham Young University

DOCUMENTED

Published in academic literature

For: Researchers & Academics, General Public & Enthusiasts

Free

Available on the App Store

App Summary

Indexing Go is a genealogical research tool that supports automatic indexing of historical census records through handwriting recognition, improving record accuracy for family history researchers. Its scientific basis is a deep learning model that feeds a convolutional neural network (CNN) into a Long Short-Term Memory (LSTM) network, trained on 2.4 billion labeled sub-images from the 1940 U.S. Census. The associated research concludes that this technology can correct mistakes made by the original human indexers and expand the set of searchable fields, improving the quality of historical data available for research.

Detailed Description

Functionality & Mechanism

Indexing Go is a mobile data contribution tool for crowdsourced transcription of historical census records. The interface shows users digitized images of individual cells from census documents, such as a name or an occupation, and records the transcription each user enters for the handwritten text. These human-generated transcriptions serve as training and validation data for a handwriting recognition algorithm that combines a convolutional neural network (CNN) with a Long Short-Term Memory (LSTM) network to automate large-scale indexing.

Evidence & Research Context

The associated research details the underlying algorithm, which feeds a convolutional neural network into a Long Short-Term Memory (LSTM) network for handwriting recognition.
The model is trained on a dataset of 2.4 billion labeled sub-images derived from the 1940 U.S. Census.
A pilot application of the algorithm on a 1930 census dataset demonstrated a character error rate (CER) of 10.4% for names.
To enhance accuracy, the system incorporates data from the FamilySearch Family Tree to correct transcription errors and identify alternative name spellings.
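
The character error rate cited above is conventionally computed as the edit distance between predicted and reference text, divided by the number of reference characters. The sketch below illustrates that standard definition, along with one plausible way a list of known spellings could be used to correct a transcription. It is not the published implementation: the function names, the sample names, and the nearest-spelling lookup are assumptions made for illustration, and the source does not describe how the FamilySearch Family Tree data is actually applied.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(predictions, references):
    """Total edit distance divided by total reference characters."""
    total_edits = sum(edit_distance(p, r) for p, r in zip(predictions, references))
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    return total_edits / total_chars


def closest_known_name(prediction, known_names):
    """Hypothetical correction step: pick the known spelling nearest to the
    prediction by edit distance (ties go to the first name listed)."""
    return min(known_names, key=lambda name: edit_distance(prediction, name))


if __name__ == "__main__":
    # Made-up sample data, not drawn from any census record.
    predicted = ["Jonson", "Smith", "Mairy"]
    reference = ["Johnson", "Smith", "Mary"]
    print(character_error_rate(predicted, reference))
    print(closest_known_name("Jonson", ["Johnson", "Jensen", "Jones"]))
```

Under this definition, a CER of 10.4% corresponds to roughly one character edit for every ten characters of reference text, so a single name can contain an error even when most of its characters are recognized correctly.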

Intended Use & Scope

This application is designed for volunteers, genealogists, and family history researchers as a data contribution and verification platform. Its primary utility is to improve the accuracy and completeness of large-scale digital census archives. The tool does not function as a genealogical search engine; it is intended exclusively for performing transcription and indexing micro-tasks.

Studies & Publications

1 publication

Peer-reviewed research associated with this app.

Development/Design Paper

Using Hand-Writing Recognition to Auto Index the US Census Records

Clement et al. (2019) · SSHA Annual Meeting

Describes the research-driven development of this app

Recent breakthroughs in handwriting recognition have the capability to improve the quality of the 1940 Census data and expand the set of fields that are available to use for research. Our hand-writing recognition algorithm uses new data augmentation and normalization methods applied to a convolutional neural network that feeds into a Long-Short-Term-Memory (LSTM) network. We also have a unique advantage by having access to a training set that