EFG publications, presentations, software and technical reports

Papers and manuscripts
Machine reasoning about anomalous sensor data. Matt Calder, Robert A. Morris, Francesco Peri. Revised Sep 2009; to appear in Ecological Informatics. doi:10.1016/j.ecoinf.2009.08.007
Preprint (PDF)
Expanded version of presentation at the Sixth International Conference on Ecological Informatics (ISEI6)
Abstract. We describe a semantic data validation tool that is capable of observing incoming real-time sensor data and performing reasoning against a set of rules specific to the scientific domain to which the data belongs. Our software solution can produce a variety of different outcomes when a data anomaly or unexpected event is detected, ranging from simple flagging of data points, to data augmentation, to validation of proposed hypotheses that could explain the phenomenon. Hosted on the Jena Semantic Web Framework, the tool is completely domain-agnostic and is made domain-aware by reference to an ontology and Knowledge Base (KB) that together describe the key resources of the system being observed. The KB comprises ontologies for the sensor packages and for the domain; historical data from the network; concepts designed to guide discovery of internet resources unavailable in the local KB but relevant to reasoning about the anomaly; and a set of rules that represent domain expert knowledge of constraints on data from different kinds of instruments as well as rules that relate types of ecosystem events to properties of the ecosystem. We describe an instance of such a system that includes a sensor ontology, some rules describing coastal storm events and their consequences, and how we relate local data to external resources. We describe in some detail how a specific actual event---an unusually high chlorophyll reading---can be deduced by machine reasoning to be consistent with a hypothesis of benthic diatom re-suspension and inconsistent with a hypothesis of an algal bloom, both of which might otherwise have been potential explanations.
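The hypothesis-checking idea in this abstract can be illustrated with a minimal sketch. It is not the authors' Jena-based tool: it uses plain Python instead of an ontology and rule engine, and every field name, threshold, and condition below is a hypothetical placeholder chosen for illustration rather than a rule taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Reading = Dict[str, float]


@dataclass
class Hypothesis:
    """A candidate explanation and the conditions a reading must satisfy."""
    name: str
    conditions: List[Callable[[Reading], bool]]

    def consistent_with(self, reading: Reading) -> bool:
        # A hypothesis is consistent only if every condition holds.
        return all(cond(reading) for cond in self.conditions)


def is_anomalous(reading: Reading, limit: float = 10.0) -> bool:
    """Flag a chlorophyll value above a (made-up) expected upper limit."""
    return reading["chlorophyll"] > limit


# Hypothetical rule set; the thresholds are illustrative assumptions only.
HYPOTHESES = [
    Hypothesis("resuspension", [
        lambda r: r["wind_speed"] > 15.0,   # storm-like conditions
        lambda r: r["turbidity"] > 5.0,     # stirred-up sediment
    ]),
    Hypothesis("algal_bloom", [
        lambda r: r["wind_speed"] <= 15.0,  # calm water
        lambda r: r["water_temp"] > 18.0,   # warm water
    ]),
]


def explain(reading: Reading) -> List[str]:
    """Return the names of hypotheses consistent with an anomalous reading."""
    if not is_anomalous(reading):
        return []
    return [h.name for h in HYPOTHESES if h.consistent_with(reading)]


stormy = {"chlorophyll": 25.0, "wind_speed": 22.0,
          "turbidity": 9.0, "water_temp": 12.0}
print(explain(stormy))
```

In a design like the one the abstract describes, the conditions would presumably live in a knowledge base as rules rather than in code, so that domain experts could change them without touching the reasoner.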

Schema-Driven Security Filter Generation For Distributed Data Integration. Hui Dong, Zhimin Wang, Robert A. Morris, Douglas Sellers (NatureServe). 1st IEEE Workshop on Hot Topics in Web Systems and Technology (HotWeb 06), pp. 1-6, 2006.
Preprint PDF.

Abstract. We describe an access control system for a project directed by NatureServe for the distributed service of biodiversity data served by the 75 member partners of the NatureServe Natural Heritage Network.
Security System Configuration Manual. Manual for the above software.

Rule based Security Policy management for Web Service Integration. Hui Dong, Zhimin Wang, Robert A. Morris. PDF. Submitted.

Abstract. With the prevalence of loosely coupled web service composition in the service oriented infrastructure, distributed security management has become important to offer flexible web security. In our project affiliated with NatureServe we developed an XACML policy oriented security system to provide trust data service to 75 network partners in the Natural Heritage Program. A Horn Logic based rule inference engine extension to the XACML model is used to resolve possible policy conflicts over context and semantics in the key decision making step. Approaches that use the extension to facilitate this decision making process are illustrated.

Ontology-based Peer Exchange Network (OPEN). Hui Dong, Zhimin Wang, Robert A. Morris, and Jun Huang. International Symposium on Collaborative Technologies and Systems (CTS 2007), pp. 191-198, 2007.
Preprint PDF.

Abstract. Peer-to-peer systems represent a widely accepted approach to sharing massive data and services among large, diverse and varying sets of nodes in the network. In this paper, we introduce an overlay network which constructs the logical network topologies using a global ontology and peer ontological characteristics. Network construction and a query model are defined to allow efficient answering of complex queries based on concepts and relations. Simulation results are used to verify the predicted network properties.

Engineering considerations for biodiversity software. Robert A. Morris, Mathew Passell, Jun Wan, Robert D. Stevenson and William Haber. [PDF], in Towards a global biological information infrastructure: Challenges, opportunities, synergies, and the role of entomology, H. Saarenmaa and E. S. Nielsen, eds. European Environment Agency, Technical report 70, pp. 49-59, Copenhagen, 2001.

From the introduction. The UMASS-Boston Electronic Field Guide Project, UMB-EFG, provides a web-accessible distributed object-oriented database for the identification of biological specimens from field observations. The data, including both taxonomic and environmental or ecological data, will aid in identification by building a context for each observation. As observation data accumulates, larger-scale ecological studies can be carried out using the data. UMB-EFG is being constructed and populated under The EFG project has recently been expanded to encompass investigation of a number of issues and solutions under discussion in the eco-informatics community, including the use of XML for federation of data from disparate distributed databases, as well as for more common tasks such as data exchange and system configuration. This paper describes our engineering approach to the building of these systems and reports on their current status.
The paper is derived from a presentation at the 2000 International Congress of Entomologists (ICE2000).

Electronic field guides and user communities in the eco-informatics revolution. Conservation Ecology 7(1): 3. [online] URL: http://www.consecol.org/vol7/iss1/art3

Abstract: The recognition that taxonomy is central to the conservation of biodiversity has reestablished the critical role of taxonomy in biology. However, many of the tools taxonomists produce for the identification and characterization of species, e.g., dichotomous keys, have been difficult to use and largely ignored by the general public in favor of field guides, which are essentially browsable picture guides. We review the role of field guides in species identification and discuss the application of a host of digital technologies to produce user-friendly tools for identification that are likely to greatly enhance species identification in the field by nonspecialists. We suggest that wider adoption of the citizen science model and the use of electronic field guides will enhance public understanding and participation in biodiversity monitoring.
KEY WORDS: bioinformatics, birding, citizen science, ecoinformatics, field biology, field guides, species identification, taxonomic keys, taxonomy.
Published January 20, 2003. Stevenson, R. D., W. A. Haber, and R. A. Morris. 2003. Electronic field guides and user communities in the eco-informatics revolution. Conservation Ecology 7(1): 3. [online] URL: http://www.consecol.org/vol7/iss1/art3 Copyright © 2003 by the author(s).

Database-backed decision trees with application to biological informatics. 2005, Robert A. Morris, Jacob K. Asiedu, William Haber, Fred SaintOurs, Robert D. Stevenson and Hua Tang. To appear, J. Intelligent Information Systems. [PDF]

Abstract. We describe a mechanism for the identification of biological organisms through the use of enhanced taxonomic keys: decision trees with nodes augmented by property lists that can serve as arguments to web or local services that access databases or other resources about species, specimens, and ecosystems. Authors of these identification schemes can use simple spreadsheet tools to structure the identification abstractions, and middleware renders the resulting trees into many different forms, with the databases possibly discovered and queried at the time an identification is proposed.

 See also http://efg.cs.umb.edu/keys/
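As a rough illustration of the "enhanced key" idea in the abstract above, the following sketch models a decision tree whose nodes carry property lists, and passes the properties collected along a path to a lookup function. It is only a sketch under stated assumptions: the class and function names, the sample question, the placeholder taxa, and the in-memory record list standing in for a database service are all hypothetical and are not taken from the EFG software.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class KeyNode:
    """One node of a key: a question at inner nodes, a taxon name at leaves."""
    label: str
    properties: Dict[str, str] = field(default_factory=dict)
    children: Dict[str, "KeyNode"] = field(default_factory=dict)


def identify(root: KeyNode, answers: List[str]) -> Tuple[KeyNode, Dict[str, str]]:
    """Walk the key along the given answers, accumulating node properties."""
    node = root
    collected = dict(root.properties)
    for answer in answers:
        if answer not in node.children:
            break  # no branch for this answer: stop at the current node
        node = node.children[answer]
        collected.update(node.properties)
    return node, collected


# Stand-in for a remote or local database service (hypothetical records).
RECORDS = [
    {"taxon": "Taxon A", "habitat": "wetland", "note": "record 1"},
    {"taxon": "Taxon B", "habitat": "forest", "note": "record 2"},
]


def lookup(properties: Dict[str, str]) -> List[Dict[str, str]]:
    """Return records matching every property collected during identification."""
    return [r for r in RECORDS
            if all(r.get(k) == v for k, v in properties.items())]


key = KeyNode(
    "Leaves opposite?",
    children={
        "yes": KeyNode("Taxon A", {"taxon": "Taxon A", "habitat": "wetland"}),
        "no": KeyNode("Taxon B", {"taxon": "Taxon B"}),
    },
)

leaf, props = identify(key, ["yes"])
matches = lookup(props)
print(leaf.label, matches)
```

Keeping the properties on the nodes rather than in the traversal code means the same tree could, in principle, be rendered in other forms or pointed at a different data source without changing the key itself.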

An Architecture For Electronic Field Guides. 2005, Robert A. Morris, Robert D. Stevenson and William Haber. To appear, J. Intelligent Information Systems. [PDF]

Abstract: People who classify and identify things based on their observable or deducible properties (called “characters” by biologists) can benefit from databases and keys that assist them in naming a specimen. This paper discusses our approach to generating an identification tool based on the field guide concept. Our software accepts character lists either expressed as XML (which biologists rarely provide knowingly—although most databases can now export in XML) or via ODBC connections to the data author’s relational database. The software then produces an Electronic Field Guide (EFG) implemented as a collection of Java servlets. The resulting guide answers queries made locally to a backend, or to Internet data sources via http, and returns XML. If, however, the query client requires HTML (e.g., if the EFG is responding to a human-centric browser interface that we or the remote application provides), or if some specialized XML is required, then the EFG forwards the XML to a servlet that applies an XSLT transformation to provide the look and feel that the client application requires. We compare our approach to the architecture of other taxon identification tools. Finally, we discuss how we combine this service with other biodiversity data services on the web to make integrated applications.

Problems and solutions in distributed biodiversity data. Robert A. Morris. To appear, Proceedings of the Conference on Biodiversity Informatics, Ashoka Trust for Research in Ecology and the Environment (ATREE), Bangalore, June 2003. [Word]
Abstract. We present a survey of some of the issues in biodiversity data that arise when the data is scattered and in many forms. The focus is on XML-based standards and technologies that are being developed or applied by the biodiversity communities worldwide.

Markus Nosse. (PDF), May 2001.
Summary. This report discusses the underpinnings of the JSP- and XSLT-directed translation of XML species pages generated by the EFG to browser-independent HTML, as this generation stood in May 2001.


Semantic Processing of Invasive Species data:
   2003 summit on invasive species, Invasive Plant Atlas of New England (IPANE) poster materials.
   2003 ESA Meeting, Fort Lauderdale, November 2003, workshop on data sharing. PowerPoint
Distributed Biodiversity Data Architectures:
   2002 ACM Joint Conference on Digital Libraries. PowerPoint

Recent biology papers by members of the project

Coming soon.