Summary: | Purpose - This paper reviews key content, architecture, and metadata model decisions and strategies in the creation of a publication portal (initially on DVD), based on a series of flagship World Bank reports spanning more than 30 years. Design/methodology/approach - The paper describes and analyzes key considerations and aspects of the project, including content architecture, content analysis, DTD selection, retrospective conversion, vendor management, design of metadata architectures, use of automated profiling methods, user information behavior, and search architectures that support complex content architectures. It also covers the challenges of applying an institutionally based taxonomy, which is required to express subject-matter responsibilities and relationships within the World Bank. Findings - The team learned that metadata behavior and architecture (inheritance, relationships, variations) are more complex than simple links between parent and child objects. The project also reinforced the importance of a comprehensive and dynamic topic taxonomy for classifying content that is both historical and current. The approach to defining classes for each full report (the parent) is likely to change, given what has been learned. The team recommends that the parts be classified first and that the sum of the part classes then be assigned to the whole report. As a result of this exploratory work, the Bank's approach to classifying and indexing report series is changing from top-down to bottom-up inheritance. Originality/value - The study provides insights into both general and World Bank-specific challenges in creating a publication portal and derives some best practices for content architecture, metadata architecture, and the use of automated profiling methods.
|
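The bottom-up inheritance recommended in the Findings can be illustrated with a minimal sketch. It is not the Bank's implementation: the `Part` and `Report` classes, the `inherited_topics` method, and the sample titles and topic labels are all hypothetical, and the "sum of the part classes" is assumed here to mean a set union of the topic classes assigned to each part.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    """A chapter or section of a report (hypothetical model)."""
    title: str
    topics: set[str] = field(default_factory=set)


@dataclass
class Report:
    """A full report (the parent) made up of classified parts."""
    title: str
    parts: list[Part] = field(default_factory=list)

    def inherited_topics(self) -> set[str]:
        # Bottom-up inheritance: the report's classes are assumed to be
        # the union of the classes assigned to its parts.
        topics: set[str] = set()
        for part in self.parts:
            topics |= part.topics
        return topics


# Illustrative data only; the topic labels are not from the Bank's taxonomy.
report = Report(
    title="Hypothetical flagship report",
    parts=[
        Part("Chapter 1", {"Poverty", "Education"}),
        Part("Chapter 2", {"Education", "Health"}),
    ],
)
report_topics = report.inherited_topics()
print(sorted(report_topics))
```

Under this assumption, a topic assigned to any part appears once at the report level. That contrasts with a top-down approach, in which the report is classified first and its parts inherit those classes.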