Proposal for phase II (2015)

On this page you can find a slightly reworked version of the OER World Map proposal for phase II (2015). The version finally agreed with The Hewlett Foundation can be found here.

Proposal for Consultation

Launching the Development of an OER World Map: Phase II
Prepared for The William and Flora Hewlett Foundation


Consultants
North Rhine-Westphalian Library Service Center (hbz)
Jülicher Straße 6
55674 Köln

Felix Ostrowski (graphthinking)
Reichenberger Str.1
10999 Berlin

In association with
Dr. Robert Farrow
Institute of Educational Technology
The Open University
Walton Hall
Milton Keynes
MK7 6AA

Contact person
Jan Neumann (Project Management)
Jülicher Straße 6
55674 Köln
Telephone: +49-221-40075-239
Email: neumann@hbz-nrw.de
Fax: +49-221-40075-180



“Launching the Development of an OER World Map: Phase II” by North Rhine-Westphalian Library Service Centre is licensed under a Creative Commons Attribution 4.0 International License

1. Summary

This paper proposes an approach to build an OER World Map as requested by The William and Flora Hewlett Foundation. It provides the means to allow the OER community to collect data about itself in order to facilitate and speed up its development. The data will be visualized in different forms including a geographical map, statistics, profile pages and a calendar. Additionally, read and write access to the underlying data will be provided through the “OER Data Hub”. Using Linked Open Data technology and an API, the platform aims at maximal reuse so that third party services like OER search engines can easily be developed on top of it. It builds upon phase I, in which members of this team successfully built a prototype in a relatively short time frame. The scope is much broader though, as we believe that for open education to be successful, a vibrant and open community is at least as important as building upon open licenses, open source software and open data standards. Community building and actively involving the community in the development process are thus at the core of the suggested approach. The basic technical features of the proposed solution remain the same as in the prototype, albeit with a much greater focus on the user interface.

2. Background – Institution and Team

2.1 The hbz

The North Rhine-Westphalian Library Service Center (hbz) is a central service and development organization for university libraries in North Rhine-Westphalia. It was founded in 1973. The centrepiece of the hbz services is the provision and management of a union catalogue for university and other libraries, around which considerable expertise in data aggregation, data normalization and the provision of discovery interfaces has formed. The hbz also generates and hosts the German and the Austrian national library statistics.
The hbz has been active in the area of Open Access (OA) since 2002, initially in the hosting of institutional repositories and since 2004 in providing the OA journal platform Digital Peer Publishing (DiPP). For more than four years, the hbz has been actively promoting web standards and the open licensing of data published through libraries. The hbz has been recognized as one of the library organizations worldwide that pioneered the opening up of library data. As Peter Suber put it in his Open Access Newsletter for 2010:

“[In 2010] libraries around the world began lifting restrictions and putting their bibliographic data into the public domain, usually under CC0. The movement seems to have started in Germany, with six libraries in Cologne plus the Hochschulbibliothekszentrum des Landes Nordrhein-Westfalen…” (Suber, Peter (2011): SPARC Open Access Newsletter, issue #153: Open Access in 2010)

In connection with the publication of open data, hbz is focused on making open data easily discoverable, accessible and reusable on the web. Since 2009, it has been promoting the use of web standards in the library world and especially Linked Open Data (LOD) technologies. Together with the ZBW (German National Library of Economics / Leibniz Information Centre for Economics) hbz established the SWIB (Semantic Web in Libraries) conference which developed from a German event into “the premiere conference for semantic web technologies in libraries” worldwide (Salo, Dorothea (2013): Linked Data in the Creases). By now the hbz has several years of experience with Linked Data and with developing open source software for LOD applications, especially in the context of hbz’s Linked Open Data service lobid.
Today, Germany still lags behind regarding OER development and implementation, since a significant number of OER activities have emerged only during the last two years. hbz itself has been part of the German OER activities from an early stage. Since 2012 hbz has been exploring the possibilities of open educational resources and contributing its experience in metadata, web standards and technology. By now it has become one of the leading institutions from the German library world involved in the OER movement.
This proposal is submitted in partnership with Felix Ostrowski (graphthinking) and in association with Dr. Rob Farrow of the OER Research Hub project at The Open University. Over the last years, the consultants have gained considerable experience in collaborating on an international scale. For example, hbz members have been participating in working groups within the World Wide Web Consortium (W3C), the Open Knowledge Foundation, the Eclipse Foundation, UNESCO and of course in library-centric contexts. Generally, the consultants are experienced at working in an international, collaborative environment and will use English as the default language for communication and publication on the web (see for example the blog of the lobid team or the issues on GitHub).

2.2 The Consultants

2.2.1 Jan Neumann (Project Management)

Jan (@trugwaldsaenger on twitter) is working as head of legal affairs and organisation at the hbz. He studied law, economics and systems thinking and has more than 10 years of experience in (international) project management for different publishing houses and libraries. He attended the 2012 World OER Congress and is a member of the educational advisory board of the German UNESCO chapter concentrating on OER. In 2013 and 2014 he joined the review team of the first major German OER conferences OERde13 and OERde14. Jan blogs about OER on OERSYS.org.

2.2.2 Felix Ostrowski (Technical Lead)

Felix (@literarymachine on twitter) acts as hbz’s partner in this project. He is a web engineer, Linked Data technologist and knowledge management consultant who has been on the web since the mid-’90s. After graduating in communication studies and computer science he worked as a software developer and repository manager at hbz from 2008 to 2010. He was also a driving force behind the institution’s Linked Open Data (LOD) strategy. In 2010 he moved on to work as a research assistant in a joint project on digital long-term preservation of the Berlin School of Library and Information Science, Stanford University Libraries and the German National Library. He became self-employed under the name of graphthinking in 2013. One of the projects he has been working on since is Edoweb, a repository system for which he is building a Drupal-based frontend.

2.2.3 Adrian Pohl (Metadata & RDF Vocabularies)

Adrian (@acka47 on twitter) has been working at the hbz since 2008. He is the metadata and RDF vocabulary expert for the development of lobid and is also responsible for its project management. He has been actively promoting open knowledge, web standards and the use of open source software in the international library community, e.g. by coordinating the Open Knowledge Foundation’s Working Group on Open Bibliographic Data since June 2010 and recently by initiating the development of the Libraries Empowerment Manifesto. Adrian has been on the organizing team for the SWIB conference since 2011. He initiated the establishment of a German-speaking working group for OER metadata within the Competence Center for Interoperable Metadata (KIM) of the German Initiative for Network Information (DINI), for which he became the group moderator. The working group deals with professional communication in German-speaking countries about metadata for OER, covering metadata standards, data conversion tools, data publication on the web etc.

2.2.4 Philipp von Böselager (Back-End Developer)

to be completed

2.2.5 Johannes Schnettker (Front-End Developer)

to be completed

2.2.6 Dr. Robert Farrow (Project Communications)

Robert (@philosopher1978 on twitter) is an interdisciplinary open education researcher at the Open University, UK, who acts as one of the research leads for OER Research Hub and leads the development of OER Impact Map. He has many years of experience in curating and visualizing data about OER activity and discourse, primarily in the Hewlett-funded OLnet and OER Research Hub projects. His input to the project will focus on generating authentic use cases for the OER Map as well as acting as a conduit for outreach and awareness-building in the OER community.

2.2.7 Ben Buschfeld (Design)

to be completed

3. Theory of Action

3.1 Which problems should be solved by the OER World Map?

In 2012, an international consultation process was initiated in association with the UNESCO/COL Chair in OER at Athabasca University – “Can the global OER community design and build an ‘OER World Map’ together?”
Although – or maybe because – the OER World Map Project has been intensively discussed by the OER community since then, the goals of the project are not yet completely clear. The advice given by the UNESCO/COL Chair includes many helpful and practical ideas on how to develop and organize an OER World Map. As far as the question “What should be put on the map?” is concerned, it focused on initiatives (projects) as well as on the institutions that finance these initiatives and the people who do the concrete work. Nevertheless, it remained rather fuzzy why someone should use the map and what concrete use cases should look like. The Hewlett RFP for phase II of the project points out that the OER World Map “should give a global overview of OER projects and initiatives” in order to “give a comprehensive view to illuminate the OER world”. More precisely, it mentions that the map should “act as a resource for communicating with stakeholders” and “facilitate connection among OER providers, teachers, and learners”. It also underlines that “over time and as the project scales” the OER World Map should help teachers and learners to answer questions like “What OER are available in my country/language/grade level/subject?”. This point is especially remarkable, since it moves the focus of the project away from the meso level of the institutions to the micro level of the documents which are developed by these institutions. As we will see later, this technically requires a completely different approach from the original idea of focusing on projects and institutions.
To make it clearer what specific use cases of the map could look like, it makes sense to look at the different stakeholder groups which could have an interest in using the OER World Map. There are at least three major groups which have to be taken into account:

  1. Students and teachers: These typically seek out specific OER to use in a certain learning or teaching situation; the main demand here is a mechanism which offers easy and fast access to the micro level of documents.
  2. OER producers and advocates: This group is certainly interested in finding OER documents as well, but it is also interested in finding other institutions and experts in order to cooperate in developing new content and services and to foster the development of the OER movement.
  3. Policymakers: This group is normally less interested in the micro level of documents and more in (quantitative) evidence about the development of the OER movement in order to make decisions about future funding of OER projects. The interest of researchers is very similar: they also look for evidence in order to better understand the OER movement in research terms.

In summary, there are three main types of problems which could be addressed by the OER World Map:

  1. Finding OER in order to support concrete learning/teaching. Here it is important to notice that the proposal does not include a search engine for individual OER but a list of existing OER services, which can be used for searching OER manually or for building OER search engines on top of it.
  2. Supporting communication between stakeholders in order to facilitate the self-organizing processes of the OER movement.
  3. Collecting information about the state of the OER movement in order to provide evidence for decision making and better understanding.

3.2 How to approach the challenges provided by the OER World Map?

3.2.1 Basic development principles

In general, we believe in these five principles for providing access to data on the web:

  1. Publish data according to the open definition. (This includes providing a full dump of the data with incremental updates.)
  2. Develop services as Free/Open Source Software.
  3. Use open web standards for the publication of data on the web, especially acknowledge the best practices of the Linked Open Data community.
  4. Provide an open web API (see below) for web developers to easily interact with the data.
  5. Provide a simple and intuitive data interface for people to easily and intuitively explore and interact with the data.

The hbz’s principles for publishing data on the web resonate well with the Hewlett Foundation’s criteria for openness. To be considered worthy of funding, projects are expected by the Hewlett Foundation to meet the following criteria: “free and legal to use, revise, remix, and redistribute”, “formats that are usable, sharable, revisable, and remixable with free and open source software”, “offer accessibility to a diverse body of users”, “follow standards developed by the OER movement to enhance discoverability, interoperability, quality, and accessibility”. These criteria reflect the thinking we developed and promoted during the last six years.

3.2.2 General design characteristics


Figure 1: Basic technical architecture

The basic technical architecture developed in phase I of the project, which is based on the distinction between a front end (the OER World Map) and a back end (the OER Data Hub), will be maintained and expanded.
While the hbz prototype developed during phase I of the project mainly focused on technical aspects, the scope of phase II will be much broader. The goal of phase II is to develop a working socio-technical system. Both parts of this system, humans and technology, have to be addressed appropriately for the project to succeed. In fact, we believe that the main challenge of the next phase of the OER World Map project will be to build a community which drives the editorial processes needed to create and update the included data, not once but on a long-term basis. We therefore propose to integrate editing by volunteers (i.e. every registered user) with qualified editing by editorial teams. This requires several things. On the one hand, to motivate volunteers to contribute to the OER World Map, the editing process has to be as simple and intuitive as possible, and their contributions should be made visible immediately. On the other hand, it will be necessary to find intelligent ways to interconnect and synchronize the databases used by different OER data curation projects (DCP) in order to reuse their work for the OER World Map and thus avoid duplication of effort.
Looking at the three main challenges mentioned above, we assume that although the World Map has been the starting point of the conceptual design of the project, it will be the ideal interface neither for finding OER, nor for supporting communication, nor for providing quantitative information about the OER movement. We therefore believe that in phase II additional components such as a powerful search, a statistics module and a calendar module should be added to support and complement the information delivered by the map interface.

3.2.3 Project management

The project will be managed using an agile approach. This approach will allow the project to moderately change its goals during the course of the project. At the beginning of the project concrete user stories will be defined and prioritized. After each sprint, during which new functionality will be added to the working system, additional user stories can be added. Though some aspects of the design will be compulsory it will be possible to change the prioritization of the user stories which can result in differences concerning the implemented functionality. We propose to collect possible input of the community and give the advisory board of the project the possibility to influence the backlog of the project. In order to do so we are planning to use social media tools like Twitter and our Wiki as well as web-based management environments like the GitHub platform to keep the community up to date.

4. Outcomes

4.1 The Center of the Project: Community Building & Editing Process

4.1.1 A community of editors

As stated before, we believe that the most challenging part of the project lies in the long run: building a community which drives the editorial processes needed to create and update the data of the OER Data Hub. Since the OER community has existed for more than ten years, it will not be necessary to develop a completely new community but to activate the parts of the existing community which could actually drive the editorial processes needed for the OER World Map. As far as we can see today, there are three major groups of editors which might input data into the OER World Map:

  • OER Projects, Services and Experts: Provided that the OER World Map establishes itself as the central information source concerning the OER community, OER projects and experts that are not yet included will have a natural interest in being included, since it will help them to be more widely recognized and to disseminate their content and/or expertise more effectively.
  • Volunteers: The OER World Map will also be open for volunteers who participate in the editing process out of intrinsic motivation caused by their interest in fostering the OER movement.
  • OER data curation projects: Last but not least, there is a growing number of projects which, in one way or another, are already involved in collecting and maintaining data about the OER community. Examples include:
      • The Open Educational Consortium
      • The UNESCO WSIS Community
      • The ROER4D Project
      • The OER Data Hub based at the Open University
      • The South American Mira project
      • Creative Commons OER policy registry
      • OEREXCHANGE
      • The William and Flora Hewlett Foundation’s grantee database

One major goal of the project will be to start integrating the data delivered by these projects into the OER World Map so that their effort can be reused.
Building a community is quite a different task from building a technical system. One can take measures to foster its development, but in the end it will be the community which decides whether it wants to participate or not. As far as we can see, there are several points which have to be respected in order to support the growth of a community:

  • Create a clear and understandable product vision which can be communicated easily.
  • Make use of existing OER communities and social networks to disseminate project information by building a network of strategic partners, which operate as multipliers.
  • Listen to the community and react to community feedback. Include the community in the development process from the beginning, e.g. by collecting user stories and providing a web-based open development process.
  • Keep barriers to participate low e.g. by providing editorial processes which are easy and intuitive to use. This also means that the data model has to be kept as simple as possible.
  • In order to enable the development of the skills needed to add and edit data a well-structured documentation (text and video) must be provided.
  • Give immediate feedback which proves the impact of the contribution. If someone contributes to the OER World Map, their entries should ideally be visible instantly and, if desired, disseminated via social media channels such as Twitter and Facebook.
  • Acknowledge contribution, e.g. by displaying the number of edited data sets in the contributor's user profile.
  • Implement a community event like an OER World Map Hackathon, which aims at collectively filling the OER World Map with data.

Communication is the key to success. To support effective communication a Project Communication Plan will be developed, which should include different kinds of communication. On the one hand the project will provide comprehensive information which is open to participation in the form of a project landing page, an open project wiki and a GitHub project.
On the other hand, effective dissemination of information (e.g. calls for participation) to the OER community has to be ensured through intensive use of existing social networks. The excellent network of the OER Research Hub makes an ideal starting point. Additional channels (like the OER-Community, UNESCO's WSIS Knowledge Community or the OKFN mailing list) should be included, so that important news reaches a high percentage of the global OER community.
Additionally, it will be necessary to build up a network of strategic partners which will be addressed bilaterally as well as through virtual events like web conferences. The list of potential partners includes OER data curation projects, but also organizations which are active in the field of OER like Creative Commons, Wikimedia, the Open Knowledge Foundation as well as UNESCO, OECD and the EU. Scientific institutes could also contribute to the curation of OER data in the future. One way to extend the partner network could be to search for one “official” curation partner in every country or at least on every continent.
It is important to recognize that the map can act as a focal point for the OER community, though we do not intend to develop an “OER-Facebook”. While we do not plan to provide full social networking functions as part of the map – an idea that has been suggested by some, since social networks are designed to facilitate the formation of communities – our project can support networking and community cohesion by providing up-to-date and relevant information about the OER world and the actors within it. In particular, we plan to generate profile pages which will include widgets that make use of all major existing social media platforms.

4.1.2 Defining the right editorial process

One major work package will be the definition of an editing process which fits the needs described above. A central feature of this process will be the integration of basic and qualified editing functionalities. The basic process will be open to everyone who is willing to register and contribute. In order to guarantee the quality of the data included, we believe data collected by the basic process should be checked by qualified editing teams, which could be made up of members of the data curation projects listed above. We believe that this kind of quality control will be necessary for several reasons. On the one hand, we learned from phase I that it can sometimes be difficult to distinguish exactly between an institution, a project and a service. On the other hand, we believe it will be necessary to verify that the projects and services included conform to the OER definition, since it is still a common misapprehension that the OER concept also covers resources that are simply free to access but not openly licensed. One important design decision will be how to integrate the basic and the qualified editorial process. The “classical” approach to this problem is that data entered through the basic process has to be checked by qualified editors before its publication. Alternatively, it could also make sense to publish all data directly and to give users the possibility to restrict search results to quality-controlled entries.

4.2 Versatile use through a modular user interface: The OER World Map

The user interface will be browser-based and modular. It should consist of individual components that are aggregated into one web application. Additionally, the components (“widgets”) should be made available individually so that they can easily be embedded in external web sites. The components range from an actual geographical representation of the data to diverse statistics in the form of charts to an editing tool in the form of simple web forms. User documentation will also be made available and integrated into a contextual help system.

4.2.1 Geographical World Map

Figure 2: High level view with opacity

Figure 3: Country Overview

Even if other kinds of visualization should also be addressed, the geographical world map is of outstanding importance for phase II. Although many design parameters are determined by the mapping approach, there is still considerable creative leeway in terms of functions that can be integrated into an OER World Map. For example, instead of flooding the map with a multitude of individual markers, the initial view could be a high level one: the state of OER (number of institutions providing OER services, number of OER policies, …) for each country is expressed by opacity. It is also possible to apply filters, such as grade level, subject and language. This will result in the opacity of each country being set according to the number of services in that country that match the given filter.
As with any component developed in this project, it will be possible to link to the map with the currently applied settings, so that any interested user can generate and publish an individualized OER map containing the information relevant to them with little effort.
Clicking on a country brings up an overview of OER there, including statistics on grade level, subject and language of OER available through services that are provided by institutions in that country. These statistics are visualized as pie charts. Clicking on a section of such a chart will take the user to the general search component (see below) with the corresponding filter set. Also, a detailed map of the selected country is displayed.
The functions of the OER World Map described above represent our current ideas and will be refined according to upcoming and additional user stories. Again, an individual country overview can be embedded in other websites. In order to make the map as self-contained as possible, client-side rendering of the map will be implemented instead of using external services such as OpenStreetMap tiles.
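The opacity logic described above can be illustrated with a minimal sketch. It assumes a flat list of service records with hypothetical field names (`country`, `language`, `subject`) and a simple linear scaling relative to the country with the most matches; neither the record layout nor the scaling rule is fixed by this proposal.

```python
from collections import Counter

# Hypothetical sample records; the field names are illustrative assumptions.
SERVICES = [
    {"name": "Service A", "country": "DE", "language": "de", "subject": "mathematics"},
    {"name": "Service B", "country": "DE", "language": "en", "subject": "history"},
    {"name": "Service C", "country": "BR", "language": "pt", "subject": "mathematics"},
]


def matches(service, filters):
    """Return True if the service has the requested value for every filter."""
    return all(service.get(key) == value for key, value in filters.items())


def country_opacity(services, filters=None, min_opacity=0.1):
    """Map each country code to an opacity between min_opacity and 1.0.

    Countries without any matching service are omitted (fully transparent).
    The linear scaling is an assumption made for this sketch.
    """
    counts = Counter(s["country"] for s in services if matches(s, filters or {}))
    if not counts:
        return {}
    top = max(counts.values())
    return {
        country: round(min_opacity + (1.0 - min_opacity) * n / top, 2)
        for country, n in counts.items()
    }


print(country_opacity(SERVICES))
print(country_opacity(SERVICES, {"subject": "mathematics"}))
```

Because the filter is just a dictionary, the same settings could be serialized into a link, which is one way the shareable map views mentioned above might be realized.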

4.2.2 General Search


Figure 4: General Search Module

A map is not necessarily the best way to perform more complex queries, and for some types of entities (e.g. some services or projects) it isn’t even possible or sensible to locate them on a geographical map. Because of this, a general search component should accompany the World Map component. If a user is looking for a service providing OER on a particular topic, for example, she will not do this on a map. It is simply not important for her where the institution that provides the OER is physically located. The search component will make it possible to quickly find all entities (projects, services, people, policies and institutions) described in the OER Data Hub. Diverse facets (type, location, grade-level, subject, language, etc.) will be available along with a full text search. Given that metadata exists about OER provided by a service, it will thus be possible for example to filter services so that only those which offer contents from a particular subject or level of education will be displayed. It should be noted, however, that the goal is not to provide a search engine for OER material (which will not be included in the OER Data Hub) but rather a database that would allow others to build such search engines.
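A faceted search of this kind can be sketched in a few lines. The example below is only an illustration under stated assumptions: the entity records, their field names and the in-memory filtering are hypothetical, and a production system would more likely delegate this to a search index.

```python
from collections import Counter

# Hypothetical entities as they might be described in the OER Data Hub.
ENTITIES = [
    {"type": "service", "name": "Open Maths Repository", "subject": "mathematics", "language": "en"},
    {"type": "service", "name": "Geschichte offen", "subject": "history", "language": "de"},
    {"type": "project", "name": "Maths for All", "subject": "mathematics", "language": "en"},
]


def search(entities, text="", **facets):
    """Return entities that contain `text` in any field and match all facets."""
    text = text.lower()
    results = []
    for entity in entities:
        haystack = " ".join(str(value) for value in entity.values()).lower()
        if text and text not in haystack:
            continue
        if all(entity.get(key) == value for key, value in facets.items()):
            results.append(entity)
    return results


def facet_counts(entities, facet):
    """Count how many entities carry each value of the given facet."""
    return Counter(entity[facet] for entity in entities if facet in entity)


hits = search(ENTITIES, text="maths", type="service")
print([hit["name"] for hit in hits])
print(facet_counts(ENTITIES, "subject"))
```

Facet counts computed over the current result set are what would let the interface show how many entities remain behind each further filter.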

4.2.3 Data Export

A basic feature that should be implemented is that all data included in the OER Data Hub can be exported in various formats, including RDF, JSON and CSV. It will also be possible to download only a subset of the data, filtered by the general search module. In other words: every search result set can be downloaded as CSV, etc. This ensures that all users have fast access to the data in a flexible form and can reuse it in their spreadsheet programs, maps, etc. according to their needs.
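As a minimal sketch of such an export, the following functions serialize an arbitrary result set to CSV and JSON using only the Python standard library. The record layout is a hypothetical assumption; RDF export is left out because it would depend on the vocabulary choices of the final data model.

```python
import csv
import io
import json


def to_csv(records):
    """Serialize a list of flat dictionaries to CSV text.

    The header is the sorted union of all keys; missing values stay empty.
    """
    fieldnames = sorted({key for record in records for key in record})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_json(records):
    """Serialize the same records to JSON, keeping non-ASCII characters readable."""
    return json.dumps(records, indent=2, ensure_ascii=False)


# Hypothetical search result set.
results = [
    {"name": "Service A", "country": "DE"},
    {"name": "Service B", "country": "BR", "language": "pt"},
]
print(to_csv(results))
print(to_json(results))
```

Since both functions accept any list of records, the same code path can serve a full dump and a filtered search result alike.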

4.2.4 Editing

In order to find entities in the OER Data Hub, they must obviously have been entered in the system at some point. While some data will be seeded initially, it is necessary to keep it up to date and to enter new data through an easy to use user interface. Following the general logic of our approach, this functionality will also be supplied by a component.
The field definitions are derived from the data model so that the forms can be generated automatically. Since linking entities is of special importance in our approach, a simple yet powerful auto-completion field type will allow this to be done comfortably. A special feature of the editing component will be that its forms are expandable by users: if a user finds that a certain form lacks a field for a data element that is important to her, she can easily add a field. These custom fields can then be analyzed regularly, and data elements for frequently added fields can flow back into the data model.
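The following sketch illustrates the two ideas in this paragraph: deriving form fields from the data model, and finding user-added fields that occur often enough to be considered for the model. The `DATA_MODEL` structure, the field types and the threshold are hypothetical assumptions, not part of the proposal.

```python
from collections import Counter

# Hypothetical excerpt of a data model: entity type -> field definitions.
DATA_MODEL = {
    "Service": [
        {"name": "name", "type": "text"},
        # A "link" field would be rendered as an auto-completion widget.
        {"name": "provider", "type": "link", "target": "Organization"},
    ],
}


def build_form(entity_type, custom_fields=()):
    """Return the form fields for an entity type plus any user-added fields."""
    fields = [dict(field) for field in DATA_MODEL[entity_type]]
    fields += [{"name": name, "type": "text", "custom": True} for name in custom_fields]
    return fields


def promotion_candidates(submissions, threshold=2):
    """List custom field names used in at least `threshold` submissions."""
    counts = Counter(name for fields in submissions for name in set(fields))
    return sorted(name for name, n in counts.items() if n >= threshold)


form = build_form("Service", custom_fields=["license"])
print([field["name"] for field in form])

# Custom fields added by users in three hypothetical submissions.
print(promotion_candidates([["license"], ["license", "audience"], ["funder"]]))
```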

4.2.5 Statistics


Figure 5: Statistics Module

The data included in the OER Data Hub can be used to create a first skeletal structure of meaningful statistics for the OER movement. Wherever possible, statistics will be represented visually in the form of diagrams. Based on the initial data model, statistics could at least give basic information on the following questions:

  • How many institutions are dealing with OER?
  • How large is the increase in the number of institutions compared with the previous year?
  • How many institutions have an OER Policy?
  • How many OER services exist in which area of education?
  • How many OER projects are there in the different regions of the world?
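Some of these questions reduce to simple counts over the stored records. The sketch below assumes hypothetical institution records with an `added` year and a `has_policy` flag; how the OER Data Hub would actually record this information is not specified here.

```python
# Hypothetical institution records; field names are illustrative assumptions.
INSTITUTIONS = [
    {"name": "Institution A", "added": 2013, "has_policy": True},
    {"name": "Institution B", "added": 2014, "has_policy": False},
    {"name": "Institution C", "added": 2014, "has_policy": True},
]


def institutions_up_to(institutions, year):
    """Number of institutions recorded up to and including the given year."""
    return sum(1 for inst in institutions if inst["added"] <= year)


def yearly_increase(institutions, year):
    """Increase in the number of institutions compared with the previous year."""
    return institutions_up_to(institutions, year) - institutions_up_to(institutions, year - 1)


def institutions_with_policy(institutions):
    """Number of institutions that have an OER policy."""
    return sum(1 for inst in institutions if inst["has_policy"])


print(institutions_up_to(INSTITUTIONS, 2014))
print(yearly_increase(INSTITUTIONS, 2014))
print(institutions_with_policy(INSTITUTIONS))
```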

During the course of the project, it should be clarified with UNESCO, OECD and the EU what statistical information should be collected in order to develop further meaningful statistics. Important data that cannot be collected without additional funding relates to the micro or document level (the individual OER). However, the OER World Map could support the collection of these and other important data in different ways: firstly, it can provide an address list which could make data acquisition by surveys much easier and cheaper. A second way to generate statistics which include the micro level could be to connect the OER Data Hub directly with the APIs of repositories in the future. This possibility will be explored by looking for potential candidates, analyzing their APIs and publishing a white paper on the findings.

4.2.6 Calendar & Timeline

A common calendar where important OER-related events (conferences, project kick-offs, beginning of funding programs, beginning and end of MOOCs, etc.) are displayed could have a huge practical value for the OER movement. It would make it easy for anybody to find upcoming OER-related events in their region. Furthermore, without such a simple coordination tool, the planning of OER events can easily bring about scheduling collisions (such as in 2014, when two major OER events took place at the same time). A calendar could also be a helpful tool for learners, for example if a learner wants to participate in a MOOC during their holidays. Similar to a calendar, but with a more retrospective character, OER-related events (projects, conferences, scholarship programs, declarations) could be displayed in a timeline that visualizes the history of the movement.
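Detecting scheduling collisions of the kind mentioned above is a simple interval comparison. The sketch below uses invented event names and dates purely for illustration and treats two events as colliding when their date ranges share at least one day.

```python
from datetime import date
from itertools import combinations

# Hypothetical events; names and dates are invented for illustration.
EVENTS = [
    {"name": "Conference X", "start": date(2015, 4, 22), "end": date(2015, 4, 24)},
    {"name": "Conference Y", "start": date(2015, 4, 23), "end": date(2015, 4, 25)},
    {"name": "Workshop Z", "start": date(2015, 6, 1), "end": date(2015, 6, 2)},
]


def overlaps(a, b):
    """Two events collide if their inclusive date ranges share at least one day."""
    return a["start"] <= b["end"] and b["start"] <= a["end"]


def collisions(events):
    """Return the names of all pairs of events whose dates overlap."""
    return [(a["name"], b["name"]) for a, b in combinations(events, 2) if overlaps(a, b)]


print(collisions(EVENTS))
```

Sorting the same records by start date would also yield the chronological order needed for the retrospective timeline.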

4.3 Data Model

The data model, referred to as “application profile” in phase I, is the most important intellectual component of the OER World Map project. The data model must describe all entities and relations that are necessary to realize the identified user stories. It will be expressed in the form of an OWL ontology. This approach was used when building the prototype for the OER World Map and has proven successful, as it provides a common ground for all stakeholders. Data elements can easily be added and modified as needed, and the data can be published and interlinked between different web sites at web scale. In addition, a wide range of established RDF vocabularies exists that can be used to store and, more importantly, expose the OER World Map data in a reusable manner. The most common vocabulary for structured data on the web is currently schema.org. Wherever possible, we will therefore use terms from this vocabulary, draw on selected terms from other vocabularies where applicable, and carefully craft new ones only if no other option is available.

Figure 6: Initial Data Model

The data model developed in phase I was extended according to the initial user stories. This revised data model will serve as a starting point for phase II; for the sake of brevity, only entity types and their relations are listed here. It will be amended continuously and iteratively, regularly incorporating any necessary changes derived from additional accepted user stories. The architecture of both the front-end and the back-end will make it easy to reflect changes to the data model in the actual OER World Map and Data Hub.
While conceptually all entities described in the data model are in the domain of the OER World Map, the related data is not necessarily and primarily held in the OER Data Hub. Following the concept of Linked Data, external data sources will be connected where appropriate. These include but are not necessarily limited to LinkedGeoData for Countries, Regions and Cities and classifications for languages (e.g. ISO 639-2: Codes for the Representation of Names of Languages), grade levels (e.g. International Standard Classification of Education) and subjects (e.g. UNESCO nomenclature for fields of science and technology).
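To illustrate the combination of schema.org terms with links to external data sources, here is a minimal Python sketch that builds a JSON-LD description of an OER service. The `example.org` identifiers, the names and the choice of the schema.org types and properties (`WebSite`, `inLanguage`, `provider`) are assumptions for illustration and not the agreed data model; the language link follows the ISO 639-2 URI pattern of the LC Linked Data Service.

```python
import json

# Hypothetical service description; identifiers under example.org are placeholders.
service = {
    "@context": "http://schema.org/",
    "@id": "http://example.org/resource/service-1",
    "@type": "WebSite",
    "name": "Example OER Repository",
    "url": "http://repository.example.org/",
    # The language is not described locally but linked to an external vocabulary.
    "inLanguage": {"@id": "http://id.loc.gov/vocabulary/iso639-2/deu"},
    "provider": {
        "@id": "http://example.org/resource/institution-1",
        "@type": "Organization",
        "name": "Example University",
    },
}


def linked_ids(node):
    """Collect every @id reachable from a JSON-LD node, including its own."""
    found = []
    if isinstance(node, dict):
        if "@id" in node:
            found.append(node["@id"])
        for value in node.values():
            found.extend(linked_ids(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(linked_ids(item))
    return found


# Serialized form as it could be stored or served.
document = json.dumps(service, indent=2)
```

The helper `linked_ids` shows which resources a description points to, including those held outside the OER Data Hub.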
As can be seen in the data model, the current system boundary will be OER services (i.e. collections / repositories). We are aware that a user who is looking for OER on a particular topic must therefore

  1. first determine which OER services contain relevant content by looking at the service overview,
  2. click on the included links to each promising service and then
  3. use the search tools of the service to look for interesting resources.

For users it would obviously be more convenient if they could carry out a single search across all available resources and directly find the best results, regardless of which repository they are located in. This would require an OER search engine that indexes the content of the various sources (worldwide!) and offers a convenient search across all of it. Such a connection between the macro level (i.e. institutional repository services) and the micro level (i.e. individual OER) cannot be realized as part of phase II of the OER World Map project. We believe that the OER Data Hub, by documenting the links to the individual services, represents an important precondition for building such an OER search engine. As part of the OER World Map project, the future integration of search engines shall be considered and prepared (“search engine ready”), so that a later connection can be realized as easily as possible.

4.4 Back-end / Application Programming Interface (API): OER Data Hub

As in the prototype, we will follow a decoupled approach, separating the OER World Map (user interface) from the OER Data Hub (API). The API is a piece of software providing access to the data without exposing details of the underlying technology stack. This means that there is no long-term dependency on any specific storage technology, as the API layer abstracts from the concrete database technology used. In effect, if the storage technology is replaced in the future, API users would ideally not even notice the change. The decoupled approach also has the advantage that data from external systems can be seamlessly brought into the system, provided that a conversion utility such as OpenRefine maps the structure of the external data to the data model described above.
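The decoupling described above can be sketched as follows. This Python example is a simplified illustration and not the planned implementation: the names `Store`, `InMemoryStore` and `DataHubApi` as well as the returned status codes are hypothetical.

```python
from typing import Dict, Optional, Protocol, Tuple


class Store(Protocol):
    """Storage interface the API depends on; any back-end can implement it."""

    def get(self, identifier: str) -> Optional[dict]: ...

    def put(self, identifier: str, record: dict) -> None: ...


class InMemoryStore:
    """Trivial stand-in for a concrete storage technology."""

    def __init__(self) -> None:
        self._records: Dict[str, dict] = {}

    def get(self, identifier: str) -> Optional[dict]:
        return self._records.get(identifier)

    def put(self, identifier: str, record: dict) -> None:
        self._records[identifier] = record


class DataHubApi:
    """API layer: callers see status codes and records, never the storage details."""

    def __init__(self, store: Store) -> None:
        self._store = store

    def read(self, identifier: str) -> Tuple[int, dict]:
        record = self._store.get(identifier)
        if record is None:
            return 404, {"error": "not found"}
        return 200, record

    def write(self, identifier: str, record: dict) -> Tuple[int, dict]:
        self._store.put(identifier, record)
        return 201, record
```

Replacing `InMemoryStore` with a class backed by another technology would leave `DataHubApi` and its callers unchanged, which is the point of the decoupled design.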
Since the OER World Map as the initial target for the API functionality is a web application, it is clear that the OER Data Hub will be HTTP-based and RESTful. More specifically, it will serve the OER World Map data as 5-star Linked Open Data. Using HTTP-URIs as identifiers and linking resources together will make the OER Data Hub part of the growing web of data. This has several advantages.
Firstly, all major search engines incorporate structured data as soon as it is provided using the schema.org vocabulary. These terms will also be embedded in the HTML (using RDFa or JSON-LD) to be harvested by search engines like Bing and Google. This enables better results for users searching for OER agents, activities and services in the search engine of their choice.
Furthermore, an essential part of describing an entity with the Linked Data approach is to define the relations it has to other things, whether these things are described on the same web service or in a different context. This greatly facilitates re-use of external data such as country descriptions and classification schemes. Thus, Linked Data is the best fit for describing and interlinking the different kinds of objects/entities that are relevant to mapping the development of the OER community.
Finally, the use of Linked Open Data implies that any entity described in the system (service, project, person, institution, etc.) is identified by a stable and unique identifier. This opens up the possibility of using the OER Data Hub as an authority file. As such, it could become a point of reference for anybody in the realm of OER, especially since the API enables lookups and autosuggest to be included in third-party web applications.
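Embedding the structured data in the HTML, as mentioned above, could look roughly like the following Python sketch, which renders a page containing a JSON-LD script element. The function name `render_profile_page`, the page layout and the sample entity are hypothetical.

```python
import json
from html import escape


def render_profile_page(entity: dict) -> str:
    """Render a minimal HTML page that embeds the entity as JSON-LD."""
    # Escape "</" so that no value can close the script element prematurely;
    # "\/" is a valid JSON escape for "/", so the payload stays parseable.
    payload = json.dumps(entity, indent=2).replace("</", "<\\/")
    title = escape(entity["name"])
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        f"<title>{title}</title>\n"
        f'<script type="application/ld+json">\n{payload}\n</script>\n'
        "</head>\n<body>\n"
        f"<h1>{title}</h1>\n"
        "</body>\n</html>\n"
    )


# Hypothetical entity with a stable identifier under a placeholder domain.
entity = {
    "@context": "http://schema.org/",
    "@id": "http://example.org/resource/institution-1",
    "@type": "Organization",
    "name": "Example University",
}
page = render_profile_page(entity)
```

The same stable `@id` that identifies the entity in the API appears in the page, so crawlers and third-party applications refer to one and the same resource.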

In general, we will build upon the API developed during phase I and extend it to fulfill the needs following from the user stories identified. Features to be added include but are not necessarily limited to:

  • support for complex queries (e.g. for statistics)
  • validation of inbound data
  • provision of additional data formats (CSV, GeoJSON, …)
  • versioning / provenance

See below for information on which technologies we plan to utilize to implement these features.
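As an illustration of the additional data formats listed above, the following Python sketch converts a list of entries into GeoJSON and CSV. The entry layout (`id`, `name`, `lat`, `lon`) and the sample values are hypothetical and do not reflect the actual API output.

```python
import csv
import io

# Hypothetical entries; field names and values are assumptions for illustration.
entries = [
    {"id": "institution-1", "name": "Example University", "lat": 50.94, "lon": 6.96},
    {"id": "institution-2", "name": "Example College", "lat": 52.5, "lon": 13.4},
]


def to_geojson(entries):
    """Build a GeoJSON FeatureCollection with one Point feature per entry."""
    return {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                # GeoJSON positions are ordered longitude, latitude.
                "geometry": {"type": "Point", "coordinates": [e["lon"], e["lat"]]},
                "properties": {"id": e["id"], "name": e["name"]},
            }
            for e in entries
        ],
    }


def to_csv(entries):
    """Serialize the entries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["id", "name", "lat", "lon"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()
```

Both functions work on the same in-memory records, so further formats could be added without touching the storage layer.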

4.5 Technology Stack

4.5.1 Overview

Figure 7: Overview Technology Stack

4.5.2 OER World Map

While Drupal allowed us to rapidly build the prototype of the OER World Map in phase I, we also learned a lot about its limitations. Firstly, it was in parts unnecessarily hard to customize, especially when it came to the forms for data editing. Secondly, it adds unnecessary overhead when used in a scenario with a separate data back-end. Lastly, it is hard to build components that are to be re-used in a non-Drupal environment. In phase II we have thus decided to build a pure HTML/CSS/JavaScript front-end, i.e. a web application made up of the individual components described above. In order to do so, we plan to use the following open source technologies:

  • AngularJS is a JavaScript framework used to build modular single page web applications. An even more sophisticated approach would be Polymer, but this is currently still in developer mode.
  • Bootstrap is a CSS framework that provides responsive design templates to create web applications.
  • d3.js is a powerful JavaScript library for data visualization. It uses web standards like SVG, HTML and CSS to create dynamic, interactive graphics.

4.5.3 OER Data Hub

The technology stack we plan to use to realize the features described above consists of the following components:

  • The Play Framework is a framework for developing HTTP-centric web applications and APIs. It combines the rapid development process of technologies like PHP or Ruby on Rails with the performance and security of Java for server-side development.
  • Apache Jena is a Java framework for building Semantic Web and Linked Data applications.
  • Elasticsearch is a search engine and NoSQL data store based on Lucene. It is easy to administer and offers rich functionality, including high availability, near-linear scaling, near-real-time search, built-in geolocation support, client APIs for many languages, a REST API, many plugins and a fast-growing user base.
  • While Metafacture is a very versatile tool for extraction, transformation and loading of (meta-) data, it is also quite complex. Thus in phase II, we will evaluate the use of OpenRefine to convert external data in order to load it into the OER Data Hub.

Custom components will be developed for data validation and additional data formats.
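A custom validation component could, in its simplest form, check inbound records against a few rules before they are stored. The following Python sketch shows the principle; the allowed types and the rules are hypothetical and would in practice be derived from the data model.

```python
# Hypothetical set of entity types; the real list would come from the data model.
ALLOWED_TYPES = {"Person", "Organization", "Project", "Service"}


def validate(record: dict) -> list:
    """Return a list of human-readable errors; an empty list means the record is valid."""
    errors = []
    if record.get("@type") not in ALLOWED_TYPES:
        errors.append("@type must be one of: " + ", ".join(sorted(ALLOWED_TYPES)))
    name = record.get("name")
    if not isinstance(name, str) or not name.strip():
        errors.append("name must be a non-empty string")
    url = record.get("url")
    if url is not None and not str(url).startswith(("http://", "https://")):
        errors.append("url must be an HTTP(S) URI")
    return errors
```

Returning all errors at once, rather than stopping at the first one, lets a web form highlight every problematic field in a single round trip.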

4.5.4 Open Data Services

Additionally, these open data services will be used:

  • Nominatim is used for geocoding based on data from OpenStreetMap.
  • LinkedGeoData provides data from OpenStreetMap as Linked Open Data.
  • LC Linked Data Service of the Library of Congress provides authorities and vocabularies as Linked Data, e.g. ISO 639-2.

While building upon external services has the advantage of re-using expertise, it also introduces the risk of losing functionalities when those services are not available for some reason or another. To counter this risk, we will cache the external data so that it can be provided even if the original source is unavailable.
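The caching strategy described above can be sketched as a small wrapper that keeps the last successful response and falls back to it when the external service fails. This Python example is an illustration only; the class name `CachedSource` and its parameters are hypothetical, and the `fetch` callable stands in for a request to a service such as Nominatim or LinkedGeoData.

```python
import time


class CachedSource:
    """Wrap a fetch function and serve cached data when the source is unavailable."""

    def __init__(self, fetch, max_age_seconds=86400, clock=time.time):
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._clock = clock
        self._cache = {}  # key -> (timestamp, value)

    def get(self, key):
        cached = self._cache.get(key)
        # Serve fresh cache entries without contacting the external service.
        if cached is not None and self._clock() - cached[0] < self._max_age:
            return cached[1]
        try:
            value = self._fetch(key)
        except Exception:
            # The source is unavailable: fall back to stale data if there is any.
            if cached is not None:
                return cached[1]
            raise
        self._cache[key] = (self._clock(), value)
        return value
```

Data that has never been fetched cannot be served during an outage, so a production setup might additionally pre-load the most important external records.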

4.6 Business Model Generation

The formulation of a sustainable business model is a key requirement for the long-term success of the OER World Map. During the next year the development of a complete business model will probably not be possible. Nevertheless, it should be a goal of the project to analyze suitable business models and to document existing options in a working paper. The following models are conceivable and should be analyzed in greater detail:

  • Additional funding, for example by the Hewlett Foundation but possibly also by other funding programs like the ongoing Horizon 2020 programme of the European Union.
  • Ongoing support from governmental and supranational institutions, such as UNESCO, the OECD, the EU or individual countries that need the data contained in the OER Data Hub for their statistics and policy making.
  • Establishment of an association or a scientific society for Open Educational Resources, which collects membership fees that can be used for the operation and development of the platform.
  • Crowdfunding

Another work package in this area should be an initial legal analysis of the operation of the OER World Map to assure the compliance of the system with data protection laws.

6. Risks

The following risks have been identified:
Table 1: Initial Risk Log

7. Activities and Time Planning

Please note that we take an agile approach to this project, so the list of milestones presented here is indicative and open to future modification. Each sprint (i.e. iteration) will take four weeks and will result in a set of user stories being implemented and usable. After each sprint, we will publicly deploy the current state of the software so that any interested party can transparently monitor progress.
The project includes four major releases:

Table 2: Project Road Map
The associated work packages and their time planning are shown below:

Figure 8: Project Time Planning

No. Work Package Description
1. A product backlog that ensures we deliver a viable product, sprint planning and a timetable of regular on- and offline meetings, an updated issue log, reports to project board, a project documentation, fast escalation if needed. This includes feeding user stories contributed by the community into the backlog.
2. A well organized, modular software architecture and a sustainable development process
3. An active community of contributors and users of the OER World Map platform
4. Assurance that the acceptance criteria of the components developed for the current release are fulfilled
5. A set of prioritized user stories for the front-end components, including acceptance criteria
6. A website wireframe for individual front end components as well as one that shows the positioning and display of current and future components on the page
7. An appealing landing page that describes the project goal and allows users to leave their email address to be informed about the project progress
8. An organizational analysis of the needs and facilities of existing OER data curation projects which outlines strategies to reuse editorial work
9. A set of user roles for accessing front-end and back-end functionalities and the permissions associated with them
10. A widget that allows users to register an account, log-in to the OER World Map and create a simple initial profile page
11. A set of core data elements needed to describe the entities derived from accepted user stories. This includes controlled vocabularies for languages, countries, grade level and subject.
12. A defined editing process, which combines possibilities to crowdsource editing with possibilities to implement quality control by editing-teams.
13. A white paper outlining how the legal requirements for operating the OER World Map are met
14. A well designed and clean looking geographical map which allows filtering of the included data according to country, language, topic and grade level
15. A widget that displays an individual country and its details, including a tabular overview of people, institutions, projects and services and pins on the map for individual entries
16. A general search functionality for people, institutions, projects and services which allows faceting of search results
17. Initial entries for the OER World Map, derived from an external data source
18. A white paper describing a sustainable business model beyond the funding phase
19. Easy and intuitive to use web forms, including data validation according to the data model, which can be used for the editorial process defined in 12.
20. Statistics widgets which display basic information about the included data in form of diagrams and tables
21. A calendar which displays OER-related events like conferences, project kick-offs and the beginning of funding programs
22. A timeline which visualizes the history of OER development
23. Content for the front-end help system created in 24.
24. A context-sensitive help system displaying parts of the user documentation created in 23. in appropriate areas of the front-end
25. A short screencast describing the features of the OER World Map and explaining how to use the interface
26. A German OER editorial team and a collection of data concerning the German OER-movement
27. A converter / importer for the data of the selected OER data curation project
28. An API that is able to output data at least in RDF, HTML/RDFa, CSV and where applicable GeoJSON
29. A converter / importer for the data of the selected OER data curation project
30. A converter / importer for the data of the selected OER data curation project
31. A white paper describing how individual OER can be included in the OER world map
32. A report with 10-15 pages, describing the course of the project, major outcomes as well as lessons learned and possible future additions.

Table 3: Work Package Descriptions

8. Evaluation and Quality Assurance

In general, we believe that an iterative and agile approach is best suited to guarantee the success of the project. In each four-week iteration, we will collect user stories that are added to the project backlog. The product owner then selects the user stories to be implemented in the next iteration. These are then discussed with all team members in a brief meeting, at the beginning of which each member also summarizes the work completed and any difficulties encountered. While the development team implements the next set of features, the acceptance criteria of completed user stories are tested by the product owner in cooperation with an appropriate stakeholder. This ensures that project management will always have a good overview of the project and will be in a position to detect potential problems at an early stage and to find pragmatic solutions in cooperation with the technical lead. From a more technical point of view, all software components will be accompanied by a set of automated tests.
In order to practically test the intended editorial workflow as well as the platform as a whole, data on all German OER projects will continuously be added to the OER World Map in the course of the project. Lessons learned from this will be collected in order to refine the process and the technical platform.

9. Budget

Project Budget Details | Projected Project Budget (Year 1) | Proposed Hewlett Foundation Expenditures (Year 1)
Salaries | 128.302 | 128.302
Payroll Taxes & Employee Benefits | |
Other Professional Services | 26.776 | 26.776
Travel | 4.800 | 4.800
Conferences & Meetings | 2.500 | 2.500
Capital Expenditures over $5,000 | |
Other (e.g. postage, office supplies, etc.) | |
Overhead | 38.632 | 37.622
Total Budget | 201.011 | 200.000

Table 4: Project Budget
For complete budget planning please see the attached budget sheet.
Please note that hbz offers to develop, at its own cost, a back-end system which can be used for the OER World Map platform within its related project “Semantic OER search”. All software components will be published as open source software in accordance with the requirements of the Hewlett Foundation.
