Annual Report 2006/2007

Major Research and Education Activities

We are particularly proud that the project won the Best Student Paper Award at the Joint Conference on Digital Libraries (JCDL) 2006. The paper described EcoPod, our PDA-based tool for identifying plant and animal species.

BioACT's ButterflyNet (Bnet) continued to be developed in several directions. The system comprises a number of components that we developed to bridge the gap between the physical field notebooks that biologists are used to and online tools. Physical notebooks work well in harsh outdoor conditions, don't have glare or battery problems, and are effortlessly archived. Online information, on the other hand, is searchable, can easily be shared, and is simple to reorganize. A system that spans both options is therefore attractive.

We used the commercial system Anoto as the hardware vehicle for our hybrid (http://www.anoto.com/). Anoto combines a special pen with regular paper on which a barely visible dot pattern is printed. The pen dispenses regular ink, but also recognizes its position by observing the dots with a very small built-in camera. The pen records all ink strokes and their time stamps as its user writes or draws on the paper. The strokes are communicated to a desktop computer either immediately via Bluetooth, or later through a docking procedure.

Biologists use our Bnet paper notebook as they would any other stationery. But they take their notes with an Anoto pen, and the Anoto pattern is pre-printed on each notebook sheet. Along with notes, the biologists may take photos with any digital camera. All such cameras embed a time stamp in each photo's file. Once the Anoto strokes and the photos have been uploaded to a computer, Bnet collates the writing with the photos, using time as the pivot.
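
The time-based collation can be illustrated with a minimal sketch. It is not the Bnet implementation: the `Stroke` and `Photo` records, the nearest-in-time matching rule, and the two-minute window are all assumptions made for illustration.

```python
from bisect import bisect_left
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Stroke:
    time: datetime   # time stamp recorded by the pen (hypothetical record layout)
    page: int        # notebook page the stroke was written on


@dataclass
class Photo:
    time: datetime   # time stamp embedded in the photo file
    filename: str


def collate(strokes, photos, window=timedelta(minutes=2)):
    """Pair each photo with the stroke nearest in time, if one lies within `window`.

    Returns a list of (photo, stroke_or_None) tuples ordered by photo time.
    The window size is an assumed tuning parameter.
    """
    strokes = sorted(strokes, key=lambda s: s.time)
    times = [s.time for s in strokes]
    pairs = []
    for photo in sorted(photos, key=lambda p: p.time):
        i = bisect_left(times, photo.time)
        # Only the strokes just before and just after the photo can be nearest.
        neighbours = strokes[max(i - 1, 0):i + 1]
        best = min(neighbours, key=lambda s: abs(s.time - photo.time), default=None)
        if best is not None and abs(best.time - photo.time) <= window:
            pairs.append((photo, best))
        else:
            pairs.append((photo, None))
    return pairs
```

A photo with no stroke inside the window is kept and paired with `None`, so that no uploaded material is silently dropped.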

In a parallel effort we developed EcoPod, a PDA-based species identification tool. The tool can load taxa information from an XML-encoded file. The system analyzes the characters and their states, and generates dichotomous keys from them. EcoPod then asks its user questions about an individual that the user wishes to identify. The questions are automatically sorted to minimize the number of answers the user needs to provide. The system further provides an image browser to support visual identification.

The user may mark any answer as 'tentative.' One or more nodes along the user's path through the dichotomous tree may thus be flagged for possible revisiting at a later time. In support of such revisions the user may attach evidence to any node. This evidence may be a photo that shows the specimen's state, a voice note, or other form of supporting material.
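
One way to picture the tentative flags and attached evidence is the following sketch. The class and field names are hypothetical and do not reflect EcoPod's internal data model.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One answered question along the user's path through the key."""
    question: str
    answer: str
    tentative: bool = False
    evidence: list = field(default_factory=list)  # e.g. photo or voice-note file names


@dataclass
class IdentificationPath:
    steps: list = field(default_factory=list)

    def record(self, question, answer, tentative=False):
        """Append an answered question and return it so evidence can be attached."""
        step = Step(question, answer, tentative)
        self.steps.append(step)
        return step

    def to_revisit(self):
        """Return the steps the user flagged as tentative, in path order."""
        return [s for s in self.steps if s.tentative]
```

For example, a user unsure about a bill shape could call `record(..., tentative=True)`, append a photo file name to the returned step's `evidence` list, and later retrieve that step through `to_revisit()`.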

Major Findings

Our EcoPod prototype confirmed that a species identification tool is indeed workable. Our implementation used a Windows Mobile PDA. The application is stylus-driven and mixes textual and pictorial material. One big challenge is to minimize the number of questions that users must answer before an identification is made. We therefore decided to compute the underlying dichotomous key from a species matrix that is encoded in a standard SDD XML format. The file is read into the tool ahead of time. The user interface then dynamically reorders the sequence of questions such that the most discriminating question is always at the top. Please see our JCDL publication for details on the EcoPod tool.
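
The idea of keeping the most discriminating question at the top can be sketched as follows. The scoring rule used here (the expected number of candidate species left after the answer, assuming all candidates are equally likely) is an assumption chosen for illustration, as is the toy species matrix; the JCDL publication describes the method actually used.

```python
from collections import Counter

# Toy species matrix: species -> {character: state}. Entirely made up.
MATRIX = {
    "robin":   {"size": "small", "habitat": "land",  "crest": "no"},
    "sparrow": {"size": "small", "habitat": "land",  "crest": "no"},
    "heron":   {"size": "large", "habitat": "water", "crest": "yes"},
    "hawk":    {"size": "large", "habitat": "land",  "crest": "no"},
}


def rank_characters(matrix, candidates):
    """Sort characters so the one leaving the fewest expected candidates comes first."""
    characters = {c for sp in candidates for c in matrix[sp]}
    n = len(candidates)

    def expected_remaining(character):
        # A state shared by k candidates is the answer with probability k/n
        # and leaves k candidates, so the expectation is sum(k*k)/n.
        counts = Counter(matrix[sp].get(character) for sp in candidates)
        return sum(k * k for k in counts.values()) / n

    # Ties are broken alphabetically to keep the ordering deterministic.
    return sorted(characters, key=lambda c: (expected_remaining(c), c))


def apply_answer(matrix, candidates, character, state):
    """Keep only the candidates consistent with the user's answer."""
    return [sp for sp in candidates if matrix[sp].get(character) == state]
```

Calling `rank_characters` again after each `apply_answer` re-sorts the remaining questions for the narrowed candidate set, which mirrors the dynamic reordering described above.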

We explored how the number of questions EcoPod asks can be minimized further if prior years' abundance distributions of the respective species are available for a census area. To this end we analyzed six years of historical bird count data at the Jasper Ridge Preserve. From the counts we computed the probability distribution of observing each species, and used that distribution to bias our dichotomous-key generation.
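
A minimal sketch of such biasing follows. It assumes that the counts are pooled across years and normalized into one probability per species, and that the next question is the one whose answers split the probability mass most evenly (highest entropy). Both the selection rule and the counts are illustrative assumptions; the counts are invented and are not the Jasper Ridge data. Our technical report describes the method actually used.

```python
import math
from collections import defaultdict

# Toy species matrix and yearly counts. All values are made up.
MATRIX = {
    "robin":   {"size": "small", "habitat": "land",  "bill": "other"},
    "sparrow": {"size": "small", "habitat": "land",  "bill": "other"},
    "heron":   {"size": "large", "habitat": "water", "bill": "other"},
    "hawk":    {"size": "large", "habitat": "land",  "bill": "hooked"},
}
YEARLY_COUNTS = [
    {"heron": 25, "hawk": 20, "robin": 3, "sparrow": 2},
    {"heron": 25, "hawk": 20, "robin": 2, "sparrow": 3},
]


def species_probabilities(yearly_counts):
    """Pool counts over all years and normalize to one probability per species."""
    totals = defaultdict(int)
    for counts in yearly_counts:
        for species, n in counts.items():
            totals[species] += n
    grand_total = sum(totals.values())
    return {sp: n / grand_total for sp, n in totals.items()}


def split_entropy(matrix, candidates, character, prob):
    """Entropy in bits of the answer distribution for `character`, weighted by `prob`."""
    mass = defaultdict(float)
    for sp in candidates:
        mass[matrix[sp][character]] += prob.get(sp, 0.0)
    total = sum(mass.values())
    if total == 0:
        return 0.0
    return -sum((m / total) * math.log2(m / total) for m in mass.values() if m > 0)


def best_character(matrix, candidates, prob):
    """Pick the character whose answers split the probability mass most evenly."""
    characters = sorted({c for sp in candidates for c in matrix[sp]})
    return max(characters, key=lambda c: split_entropy(matrix, candidates, c, prob))
```

With a uniform prior this toy matrix favors asking about size first, whereas the invented counts, in which herons are common, shift the first question to habitat.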

We found that, for this data set, just one year of historical data can significantly decrease the number of questions that EcoPod must ask of its users. We also found that additional years did not improve the result, and that, for this particular historical data, any one of the six years yielded the same decrease in required questions. Please see our technical report for details.

Opportunities for Training and Development

A number of students, at both the Master's and Ph.D. levels, were again involved this year. In fact, two women graduated from the project with Ph.D.s: one in Computer Science, one in Biology. Two additional students graduated with Master's degrees; one of them earned a Master's with Distinction in Research. We continue to work primarily with students.

Outreach Activities

Prof. Garcia-Molina regularly presents our research at summer-long events for under-represented high school students from across the US. These events are organized by the Stanford Engineering School's Diversity Programs. The Stanford Summer Engineering Academy in particular was established in 1998 with the goal of attracting a diverse student body to the School of Engineering.

The Jasper Ridge Preserve, one of the BioACT partners, runs the Quest Scholars Program, a five-week summer program for gifted, low-income, minority high school juniors and seniors who are interested in the environment. More broadly, this year's Community Day attracted hundreds of visitors from the surrounding area who came to see the Preserve's work, and our BioACT work in particular. We received many questions about when EcoPod would become a product.

The Preserve's strong volunteer force is mostly composed of retired persons, who thereby contribute to the scientific endeavor while enjoying intellectual stimulation for themselves. The bird count team, in particular, contributed greatly to this year's BioACT work. Their data served as the foundation for our EcoPod optimization study. The California Academy of Sciences organizes frequent special exhibits and classes for children and youth.

Contributions to Principal Discipline

Our study of how historical abundance data can be used to calibrate species identification tools is new to biodiversity science. The vision is that such tools auto-calibrate whenever they can acquire such historical records. This acquisition might occur when the PDA that runs the tool enters a field station. The field station's computers would communicate with the tool to effect the data transfer. At night, upon return from an observation trip, the PDA could in turn inform the field station of new sightings. Such interaction, to our knowledge, is novel in biodiversity practice.

Development of Human Resources

Please see our section on 'Opportunities for Training and Development.'

Other Aspects of Public Welfare

The entire project is designed to improve the public's welfare. A deeper understanding of our ecosystems will directly benefit nearly everyone's future quality of life.