As amply documented by Avotaynu over the last three decades, members of the Jewish genealogical community have made important contributions to the field of online genealogy with innovations such as the Jewish Genealogical Family Finder, the Daitch-Mokotoff Soundex, JewishGen, and Steve Morse’s Ellis Island database search platform. These were revolutionary at the time of their introduction, and after years of use remain widely admired. From these beginnings, Jewish genealogy has benefitted from new resources created by Yad Vashem, the US Holocaust Museum, Genteam (Austria), Akevoth (Netherlands), and GenAmi (France), to name only a few. Dominating all of these have been the huge and not-specifically-Jewish databases hosted by aggregators of data such as MyHeritage, FamilySearch (LDS), and Ancestry.
But as evidenced by the new genealogical vocabulary of the keynote speakers at the RootsTech 2013 conference in Salt Lake City, some of the dominant players are finding the world shifting under their feet. The first inkling of change at RootsTech was the surprising report from Ancestry.com that in the three months leading up to the conference, more than 50% of its new subscriptions came from users of mobile devices. It is notable that these devices, with their relatively small amounts of memory and slow cellular connectivity, do not lend themselves to PC software and the labor-intensive database searches that most of us have known until now. Further evidence of the shift to mobile computing was the well-attended booth of the new BillionGraves.com initiative, in which participants (Boy Scouts, even!) walk through their local cemeteries photographing row after row of gravestones with their smartphones, then upload the photos (along with the GPS data supplied by the smartphone) to a website where they are transcribed and indexed by a second set of work-at-home volunteers.
As those of us who have been to genealogical conferences in recent years have observed, the next generation is barely represented. Although this was likewise true at the most recent RootsTech, there was a pronounced generational shift in the breakout sessions on technology. Without exception, any event that focused on mobile computing or had the terms “collaboration”, “crowdsourcing”, “matching”, “metadata”, or “apps” in its title drew an overwhelmingly youthful and tech-savvy crowd. Perhaps recognizing this generational shift, the president of FamilySearch, the genealogical arm of the Church of Jesus Christ of Latter-day Saints, announced in his keynote speech the inauguration of a fully collaborative, shared online family tree of the kind pioneered six years ago by the still-vibrant, user-driven Geni.com (now a division of MyHeritage).
As most of us remember, the PC Revolution was accompanied by a new vocabulary of bytes, floppies and RAM, followed less than a generation later by the Internet Revolution and its new terms such as the Web, URL, and Google. Given the revolution in mobile computing exemplified by the enthusiasms of the next generation at RootsTech, it is high time that we acquaint ourselves with its vocabulary:
API (Application Programming Interface). In layman’s terms, an API is a road map describing the means by which one application can communicate with a second application and make use of its resources. Google Maps is an example of a service that has used its API to encourage website developers to bring millions of users to its application. The terms of a shared API relationship depend on the agreement of the parties involved, and may vary in duration and level of compensation. Access to an API is a crucial part of the development of “Apps”, the lean and efficient mobile computing applications that connect those handy little devices to the conventional Internet databases.
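To make the idea concrete, here is a minimal sketch in Python of an API seen from both sides. Everything in it is hypothetical and invented for illustration: the endpoint name `/records/search`, the functions `handle_request` and `mobile_app_lookup`, and the three sample records. It does not describe the API of any actual genealogical website. The point is only that the “app” knows the published contract (what to ask for and what shape the answer takes) and nothing about how the database is stored.

```python
import json

# Hypothetical in-memory "database"; these records are invented for illustration.
_RECORDS = [
    {"id": 1, "surname": "Cohen", "town": "Vilna", "year": 1887},
    {"id": 2, "surname": "Levy", "town": "Lodz", "year": 1902},
    {"id": 3, "surname": "Cohen", "town": "Lodz", "year": 1910},
]


def handle_request(path, params):
    """The 'API': the only documented way an outside application reaches the data."""
    if path != "/records/search":
        return json.dumps({"error": "unknown endpoint"})
    surname = params.get("surname", "").lower()
    matches = [r for r in _RECORDS if r["surname"].lower() == surname]
    return json.dumps({"count": len(matches), "results": matches})


def mobile_app_lookup(surname):
    """A lean 'app' that relies on the API contract, not on the storage details."""
    response = json.loads(handle_request("/records/search", {"surname": surname}))
    return [f'{r["surname"]}, {r["town"]} ({r["year"]})' for r in response["results"]]


if __name__ == "__main__":
    for line in mobile_app_lookup("cohen"):
        print(line)
```

In a real deployment the call to `handle_request` would travel over the Internet rather than within one program, but the division of labor is the same: the database owner publishes the road map, and any number of independent developers can build on it.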
Collaboration. This was one of the most frequently heard words at the conference. Wikipedia, the sixth most accessed website worldwide, is the largest example of a fully collaborative website, moderated by an upper caste of volunteer “editors” to assure the integrity of its entries. In the genealogical world, the most prominent equivalent is the fully collaborative Geni.com World Family Tree of over 70 million related individual profiles added by more than 2 million users. The Geni tree is operated by MyHeritage and is overseen by a team of 125 volunteer genealogists deputized as “Curators” who roam the Geni tree like park rangers, resolving differences of opinion, ensuring the integrity and accuracy of the tree, and helping lost genealogical wayfarers. The Curators are drawn from all over the world, a substantial percentage having expertise in Jewish genealogy. One might think of a collaborative family tree as a connected series of Facebook pages, with a single page for each person in the tree, the only difference being that in the case of a fully collaborative tree, every member of the family gets to contribute photographs, memories and genealogical data for their relatives. The major announcement at RootsTech 2013 by FamilySearch (LDS) that it was launching its own collaborative (but evidently un-moderated) family tree has clearly moved the entire notion of collaborative trees into the mainstream, once and for all. Collaborative family trees are the wave of the future according to the LDS, which, as the world’s largest purveyor of genealogical data and services, will likely have a role in assuring that outcome.
Crowdsourcing. Those of us active in the Jewish genealogical world are familiar with the general concept, although not the term, from two decades of JewishGen indexing projects. It describes the development of a knowledge base through the parallel work of volunteers, who may or may not be aware of the identities of their fellow participants. Major crowdsourced projects discussed animatedly at RootsTech included BillionGraves (described above), FindAGrave, and many projects in which volunteers around the world are proofreading computer-scanned and transcribed editions of hundreds of thousands of old newspapers of genealogical interest. Without using the term, the LDS has relied on its members to crowdsource the US census indices for years.
[Update: Crowdfunding. At the recent 2015 RootsTech conference, the big story among software developers involved a variant described as Crowd-Funding, a method of start-up fundraising in which nonprofit and for-profit ventures seek contributions from the public. For a typical example of this phenomenon which has yet to be fully engaged by the genealogical world, but should be, visit www.Kickstarter.com]
Data Model. “In the beginning, there was GEDCOM.” Or so it seems to those of us who have been laboring in the genealogical vineyards the last few decades. GEDCOM, as most of you know, is the standard data transfer format developed by the LDS in 1984 to facilitate the transfer of data between otherwise incompatible genealogical software programs. What you may not know is that while the genealogical world has taken advantage of the Internet by tremendously expanding the types of information that are available to be shared, the GEDCOM standard has not been expanded or otherwise modified in over 17 years! This has led to growing incompatibility and disagreement among websites, with the major Internet providers continuing to hold fast to their own individual proprietary standards and resisting calls for commonality.
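For readers who have never looked inside a GEDCOM file, the following Python sketch shows roughly what the format looks like and how a program might read it. The sample individuals are invented, the function names `parse_gedcom_line` and `parse_individuals` are hypothetical, and the parser is deliberately simplified: it handles only individual (INDI) records down to two levels and ignores families, notes, sources and continuation lines that a real GEDCOM reader would need to support.

```python
# A tiny, invented GEDCOM fragment: each line is "level [xref] tag [value]".
SAMPLE = """\
0 HEAD
0 @I1@ INDI
1 NAME Sarah /Cohen/
1 BIRT
2 DATE 12 MAR 1887
2 PLAC Vilna
0 @I2@ INDI
1 NAME David /Levy/
0 TRLR
"""


def parse_gedcom_line(line):
    """Split one GEDCOM line into (level, xref, tag, value)."""
    parts = line.strip().split(" ", 2)
    level = int(parts[0])
    if len(parts) > 1 and parts[1].startswith("@"):
        # A record header such as "0 @I1@ INDI": the second field is a cross-reference id.
        tag = parts[2] if len(parts) > 2 else ""
        return level, parts[1], tag, ""
    tag = parts[1]
    value = parts[2] if len(parts) > 2 else ""
    return level, None, tag, value


def parse_individuals(text):
    """Collect INDI records into a dict keyed by cross-reference id (simplified)."""
    people = {}
    current = None  # the individual record currently being filled in
    context = None  # the level-1 tag that following level-2 lines belong to
    for line in text.splitlines():
        if not line.strip():
            continue
        level, xref, tag, value = parse_gedcom_line(line)
        if level == 0:
            current = people.setdefault(xref, {}) if tag == "INDI" else None
        elif current is not None and level == 1:
            context = tag
            if value:
                current[tag] = value
        elif current is not None and level == 2:
            current[f"{context}.{tag}"] = value
    return people


if __name__ == "__main__":
    for xref, person in parse_individuals(SAMPLE).items():
        print(xref, person)
```

Even this small fragment hints at the limitation discussed above: the format is built around names, dates and places arranged in a fixed hierarchy, and anything a modern website wishes to share beyond that must be handled by each site in its own way.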
Matching. The explosion of online databases and family trees, while a tremendous blessing to genealogy by giving us instantaneous access to far-flung resources, can be overwhelming to the typical Jewish genealogist with a sizable family tree that crosses many borders and has a multitude of surnames. The notion of computerized matching, epitomized in the websites hosted by MyHeritage, Geni and Ancestry, describes the process by which online genealogy websites automatically compare hosted family trees to the records in their databases and the family trees of other subscribers. The program then notifies the user automatically of a possible match that can be confirmed or rejected as desired. In the time-challenged world in which we live, with its seemingly infinite number of avenues for conducting research, matching schemes that put family trees to work gathering data around the clock are an obvious advance.
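The matching process just described can be sketched in a few lines of Python. This is an illustration only and is not the algorithm used by any of the websites named above, whose methods are proprietary. It assumes a simple rule of its own: a database record is proposed as a possible match when the names are spelled similarly (measured here with Python's standard `difflib` module) and the birth years fall within a small tolerance. The function names, thresholds and sample people are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0..1 score for how alike two names are spelled."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_possible_matches(profile, records, threshold=0.8, year_tolerance=2):
    """Propose records that may describe the same person as a tree profile.

    The threshold and tolerance are arbitrary illustrative values.
    """
    candidates = []
    for record in records:
        name_score = similarity(profile["name"], record["name"])
        year_gap = abs(profile["birth_year"] - record["birth_year"])
        if name_score >= threshold and year_gap <= year_tolerance:
            candidates.append((round(name_score, 2), record))
    # Best-scoring candidates first; the user still confirms or rejects each one.
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return candidates


# Invented sample data for illustration.
PROFILE = {"name": "Goldie Rabinowitz", "birth_year": 1901}
RECORDS = [
    {"name": "Goldie Rabinovitz", "birth_year": 1902},
    {"name": "Golda Rabinowitz", "birth_year": 1925},
    {"name": "Samuel Levy", "birth_year": 1901},
]

if __name__ == "__main__":
    for score, record in find_possible_matches(PROFILE, RECORDS):
        print(score, record)
```

Note that the program only proposes candidates. As in the commercial systems, the final decision to confirm or reject a match stays with the genealogist.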
The enormous hurdle that needs to be overcome in making our Jewish genealogical databases and trees available to aggregators with matching schemes is the fact that access to a given database is not enough on its own to produce matches. Under the current technology, at some point, entire databases need to be copied to the aggregator’s server, the data set must be massaged to be consistent with the format expected by the aggregator’s matching algorithm, and a slow crawl of the enormous data sets must take place so that prospective matches can be identified. JewishGen dealt with this challenge by moving its entire database to the servers at Ancestry. Short of JewishGen’s all-in strategy, a common scheme should be developed to provide for the negotiation, compensation and implementation of matching schemes for our Jewish genealogical databases lest they become irrelevant in the years ahead.
Metadata. Most genealogists are not aware of this, but any image that we create (whether a photograph, video or scan) includes not only the 0’s and 1’s that describe the features and colors of the image, but also the capacity to carry within it a virtually unlimited amount of data describing where and when the image was created, its contents, and just about any other information that could be of interest. This information is known as metadata. Anyone who has ever tried to fit the names of all the people in a family photo into a filename understands the difficulty of the process. It is equally exasperating to find that typical websites managing photographs abandon the assigned filename, so that the identifying data is lost. Identifying information can instead easily be embedded in a digital image file, individually or in bulk, by using consumer software such as Adobe Bridge or the more robust programs developed for intensive users such as the news media. When NBC News reporters are looking for a photograph in the network’s archives, they do not search for filenames or hunt through file folders on their servers to find a photograph of David Ben Gurion declaring Israel’s independence: they simply conduct a metadata search for “Ben Gurion”, “Independence”, “1948”, and every digital image in the database with metadata containing those three terms will instantly appear, no matter what filename was used by the original or intervening sources. This data, such as the name and email address of the author, remains with the photograph even when shared via an Internet site or an email.
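The kind of metadata search described above can be illustrated with a short Python sketch. It is a toy model under stated assumptions: the image library is represented as a plain list, each image's metadata as a small dictionary, and the field names (`caption`, `date`, `location`) and the function `search_by_metadata` are hypothetical. Real archives read metadata embedded in the image files themselves using industry-standard fields, which this sketch does not attempt to do.

```python
def search_by_metadata(images, *terms):
    """Return the filenames of images whose metadata contains every search term."""
    wanted = [term.lower() for term in terms]
    hits = []
    for image in images:
        # Combine all metadata values into one searchable string; the filename is ignored.
        haystack = " ".join(str(value) for value in image["metadata"].values()).lower()
        if all(term in haystack for term in wanted):
            hits.append(image["filename"])
    return hits


# Illustrative library: note that the filenames say nothing about the contents.
LIBRARY = [
    {
        "filename": "IMG_0042.jpg",
        "metadata": {
            "caption": "David Ben Gurion declares independence",
            "date": "1948-05-14",
            "location": "Tel Aviv",
        },
    },
    {
        "filename": "scan_17.tif",
        "metadata": {"caption": "Family picnic", "date": "1948-07-04"},
    },
]

if __name__ == "__main__":
    print(search_by_metadata(LIBRARY, "Ben Gurion", "Independence", "1948"))
```

The search succeeds even though the filename `IMG_0042.jpg` carries no clue about the subject, which is precisely the advantage of metadata over filenames.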
To facilitate the distribution of digital images, the professional photography industry and the news media worldwide have developed widely adopted standards for metadata that can be automatically accessed from most commercial photographic packages. One tantalizing prospect that has yet to be realized is the integration of face recognition software and metadata.
Present Web applications such as Picasa, which offer amazingly accurate face recognition features, record the information to the record for the photograph on the Picasa website, but do not add the information to the photograph’s own metadata. Hence when Picasa helps you identify Aunt Goldie in 30 images over the course of 50 years, there is at present no ability to automatically augment each photograph’s metadata to reflect that fact. When you copy the photograph to email to someone else, or wish to upload it to an online family tree, you must enter the information into the metadata manually. Most importantly, well-crafted metadata might facilitate matching of photographs to family tree profiles, a prospect of profound implications.
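What the missing automation might look like can be sketched as follows. This is purely a thought experiment in Python, not a description of any feature of Picasa or of any other product: it assumes the face recognition results are available as a simple catalog mapping filenames to names, and it models each photograph's embedded metadata as a dictionary with a hypothetical `people` field.

```python
def embed_face_tags(photos, face_catalog):
    """Copy names from an application-side face catalog into each photo's own metadata."""
    for photo in photos:
        names = face_catalog.get(photo["filename"], [])
        people = photo["metadata"].setdefault("people", [])
        for name in names:
            if name not in people:  # avoid duplicate tags if the step is run again
                people.append(name)
    return photos


# Invented sample data: the catalog stands in for what a face recognition tool knows.
PHOTOS = [
    {"filename": "wedding_1952.jpg", "metadata": {}},
    {"filename": "seder_1978.jpg", "metadata": {"people": ["Uncle Max"]}},
]
FACE_CATALOG = {
    "wedding_1952.jpg": ["Aunt Goldie"],
    "seder_1978.jpg": ["Aunt Goldie", "Uncle Max"],
}

if __name__ == "__main__":
    for photo in embed_face_tags(PHOTOS, FACE_CATALOG):
        print(photo["filename"], photo["metadata"]["people"])
```

Once the names travel inside the photograph itself, they would survive being emailed or uploaded, and a matching system could in principle compare them against the profiles in a family tree.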
[Update: Social Media. If you are reading this article, the odds are greater than 50% that you found it through Facebook, Twitter, or one of the other social media outlets through which Avotaynu Online promotes its material. Virtually an entire day of lectures at the recent Federation of Genealogical Societies conference was dedicated to the importance of social media to the successful management of a genealogical society or collaborative project. Rachel Akaha referred to this in her article on managing a DNA project recently posted on this site. Hers is one among many societies, projects, historical institutions, and affinity groups that have found social media to be an essential part of their communications strategy.]
What does the future hold?
My assessment of the current state of Jewish genealogy is that where once we led the genealogical world into the next generation of computer technology, today we are behind the curve. We have taken advantage of none of the technological developments that were made possible by the explosive growth of mobile computing. Our image libraries lack searchable metadata and are largely hidden from view; our websites lack published APIs and relationships with family tree aggregators that would enable the automated matching of data with online family trees; our databases are not open to the developers who, through the creation of mobile Apps, are driving millions of young users to commercial websites but not to ours; our trees are largely based on a long-obsolete GEDCOM standard that does not integrate with modern usage; and lastly, and most tragically, too many of our trees remain disconnected from one another for the simple reason that we cannot wrap our heads around the notion that our Internet-imbued children fully grasp: that genealogy in the future will be an entirely communal, collaborative and crowdsourced experience.
We are at a pivotal moment much like that when the World Wide Web suddenly became dominant during the late 1990s, leaving firms without a Web strategy seriously disadvantaged. As Ancestry revealed at RootsTech, our genealogical world is becoming centered on mobile computing far more rapidly than we could possibly have imagined, and any resource that is not designed to take advantage of these developments risks becoming increasingly irrelevant.
The next generation, reared on the instant gratification that the Internet provides, will be disinclined to spend their hours conducting endless searches on an ever-increasing number of disconnected online genealogical resources. But they are far more willing than past generations to share information and collaborate on complex projects. We expect that mobile applications will dominate the future genealogical world by aggregating genealogical data, using algorithms to match information derived from multiple platforms and APIs. These applications may be managed by today’s Internet aggregators such as Ancestry, MyHeritage, or FamilySearch, or we could wake up one morning and find the field dominated by one of the Internet giants with the cash, market reach, and business savvy to turn the entire playing field on its head.
To stave off such intervention, we believe that the genealogical aggregators of the future will need to adapt to the mobile world by offering access and matching to all four of the following platforms:
- Proprietary databases owned by the aggregators themselves, such as the databases developed and/or purchased by MyHeritage and Ancestry for the exclusive use of their own subscribers;
- Independently operated Internet resources of all shapes and sizes, from the institutional giants exemplified by Yad Vashem to the small but invaluable gems such as the Eastern European cemetery projects;
- For genealogists concerned with having complete personal control of their data, aggregators will offer “stand-alone trees” with limited collaborative functionality, automatically matched to the databases described above (the present Ancestry and MyHeritage models);
- For the remaining genealogists, for whom sharing of data with family and fellow genealogists is of paramount importance, aggregators will offer fully collaborative family trees and projects, either moderated by expert curators (the Geni model) or un-moderated (the present FamilySearch model).
To help bring this model to fruition, we urge the technology leaders of our Jewish genealogical community to begin developing the standards and infrastructure to prepare us for the future full integration of knowledge across these four platforms. To that end, we propose that we as a community immediately begin taking the following steps:
- Embrace a standardized “Data Model” for new and existing databases, developed by our technologists to replace GEDCOM across all Jewish genealogical platforms.
- Establish a standardized genealogical Metadata format, allowing for integration of digital images into matching systems, and advocate for its inclusion in standard image creation packages such as Adobe Bridge.
- Publish standardized APIs for Jewish genealogical platforms, to facilitate collaboration and encourage experimentation by database managers and mobile App developers.
- Encourage Jewish genealogists to share their trees online, and to allow integration into on-line collaborative trees.
- Publish Jewish periodicals in digital formats supported by the iPad and comparable mobile devices used by the next generation.
- Develop “best practices” for the design and commercialization of databases so that we might encourage development of new resources by “genealogical entrepreneurs” who would find markets for their resources among aggregators with matching applications.
In sum, automated, computerized “matching” of online resources and family trees, and the development of mobile apps, have transformed the genealogical experience for those who have taken advantage of them, and have provided unprecedented opportunities for growing and documenting our family trees. It is time for the Jewish genealogical community to take a leadership role in introducing the standards that will bring this technology to our field, so that our ranks may be replenished from among our future generations.