Our past ought to be more readily knowable. We have so many historical resources and it is time to treat them like the data that they are. The time is near when this raw information will be converted into data, interpreted, understood, and cross-referenced, all by automated processes. My work is a prototype that will allow us to peek into, and shape, the future of historical studies.
The first step is to create value-added historical resources, such as I have done with the microfilmed building permits and the 1858 directory. These resources are so much more than improved references. When translated into contemporary formats and viewed in their entirety they provide rich, broad views into our city’s history. Having all of the data within a resource available at once creates context for each record within it, and therefore a much fuller understanding.
Historical research is naturally focused on historical subjects — people, events, places, and such. In order to best understand a subject, a researcher must utilize as many resources as might include information on that subject. That is to say, historical research has always been subject-oriented. I don’t believe that will change, but how it is done will change because the resources will change.
What is coming is more resource-oriented historical research, in which mere collections of information become value-added resources — readily accessible, quantifiable, and map-able data. This will happen because the process is useful and enlightening, and will become increasingly feasible through technological advances.
Resource-oriented research will not only create a whole new way to “do” history, it will also have a profound effect on the old subject-oriented model. So many resources will be readily accessible that subject-oriented research will, in many cases, become much more efficient and consume significantly less time and effort. The bases of our collective historical understanding will become broader, clearer, more certain, and more accessible.
But creating value-added resources is just the first step. The power in collecting and having data lies in relating datasets to one another. That is when information turns into knowledge and starts to form stories. Value-added historical resources will relate, one to the other, through names, dates, and location identifiers.
If we thoroughly and accurately associate the data that we collect from various resources, we will achieve a certain level of data synergy in which each resource clarifies, refines, and complements the others. Some sources will clarify locations included in others, some will include related information excluded by the others. Some sources will answer the questions raised by other sources. Misspellings or other small errors in one resource will be shouted down by the other resources.
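To make the idea of resources "shouting down" one another's errors a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the records, source names, field layout, similarity threshold, and function names are all hypothetical and invented for this example, not drawn from the actual building permits or the 1858 directory. It links records that share an address, fall within a couple of years of each other, and carry similar names, then lets the most common spelling win.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical records: (source, name, address, year). All values are invented.
RECORDS = [
    ("directory_1858", "John Schmidt", "12 Main St", 1858),
    ("building_permits", "John Schmidt", "12 Main St", 1859),
    ("tax_roll", "Jon Schmitt", "12 Main St", 1858),
    ("directory_1858", "Mary Keller", "40 Oak St", 1858),
]


def similar(a, b, threshold=0.8):
    """Return True when two names are close enough to be spelling variants.

    The 0.8 threshold is an assumption chosen for this example.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def link_records(records, max_year_gap=2):
    """Group records that appear to describe the same person.

    A record joins an existing cluster when it shares the address of the
    cluster's first record, is within max_year_gap years of it, and has a
    similar name. Otherwise it starts a new cluster.
    """
    clusters = []
    for rec in records:
        _, name, address, year = rec
        for cluster in clusters:
            _, c_name, c_address, c_year = cluster[0]
            if (
                address == c_address
                and abs(year - c_year) <= max_year_gap
                and similar(name, c_name)
            ):
                cluster.append(rec)
                break
        else:
            # No existing cluster matched, so this record starts its own.
            clusters.append([rec])
    return clusters


def consensus_name(cluster):
    """Pick the spelling that the most records in the cluster agree on."""
    counts = Counter(name for _, name, _, _ in cluster)
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    for cluster in link_records(RECORDS):
        sources = [source for source, _, _, _ in cluster]
        print(consensus_name(cluster), "<-", sources)
```

Real resources would need far more care than this (address changes, lot subdivisions, and shared names at one address would all defeat so simple a rule), but the sketch shows the basic mechanism: names, dates, and locations serve as the keys that relate one resource to another, and agreement among sources outvotes an isolated misspelling.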
Accuracy and even understanding will emerge to a degree that is not possible now, as a true knowledge emerges directly from the resources, with little to no interpretation. The basic outlines of individual histories will become readily available to us. Historical documentation – the bones of stories – of people, places, buildings, and institutions will be at hand and universally available.
At what point will machines be able to accurately read and intelligently analyze our historical resources? How soon after that will they start to fit the pieces together, to recognize variations in name spellings, to understand, for instance, the subdivision of lots and address changes, to follow an institution through a name change, to put all of the pieces together and crank out millions of reasonably robust histories and possibly even achieve a certain broader understanding of many aspects of human history?
I haven’t a clue, but my intention is to go there first because this is the future of history.