Our last team meeting of 2016 took place in Cologne on 24 and 25 November. The holy city of Cologne is home to the North Rhine-Westphalian Library Centre (hbz), where four members of the OER World Map team are based, so this time the other members of our geographically dispersed team had to travel. The two days were filled with constructive discussions and effective teamwork.
In case you couldn’t make it, here are the slides from today’s presentation at the Open Education conference. All feedback welcome!
The following project update was sent to the Athabasca OER-community list on 4 October 2016.
Dear friends of the OER World Map,
I hope you all had a great summer break! I would like to give you an update on the development of the OER World Map project. As many of you will know, this work grew out of the initial discussion in this community in 2012. In 2013 the Hewlett Foundation decided to fund the project. After an initial development of several prototypes, the North Rhine-Westphalian Library Service Centre (hbz), based in Cologne, Germany, started developing a production system in 2015. All proposals are available online.
The OER World Map can be seen as a Social Education Management Information System which aims at accelerating the evolution of the global OER ecosystem by strengthening the ability of the OER community to organize itself. It combines elements of a social networking platform, a business information system, a geoinformation system and a library catalogue, and will contribute to overcoming the challenge of mainstreaming OER by collecting and visualizing the building blocks of the global OER ecosystem. By doing so, it connects OER actors with each other, facilitates the sharing of experiences and resources between them and fosters collective learning. At the same time it provides a sound operational information basis for developing infrastructure and policies in favor of Open Education.
During the summer break we launched version 1.1 of the OER World Map platform. Besides many improvements “under the hood”, in the backend of the system, the new version includes a completely reworked layout as well as the opportunity for users to register and create personal profiles. For technical details please have a look at our blog. Since the success of the project depends very much on the participation of the community, we warmly invite all of you to register on the site and create your personal profile on the map!
Besides improving the technical system, we have also worked hard on growing the database over the last months. Our constantly growing Country Champion Network includes more than 30 institutions and experts from all over the world. Together we have already collected more than 270 OER services, such as repositories and aggregators. Many of these entries include detailed information on the subjects covered and the licenses used.
At the beginning of the year we also published the “OER Atlas”, a printed version of the map listing OER actors and activities from Austria, Germany and Switzerland. By publishing the Atlas, which is aimed especially at OER policy makers, we proved that it is possible to collect the data for a complete country with a reasonable amount of effort. We hope to export this model to other countries in the near future.
All in all, I think it is fair to say that the platform has entered its adolescence and will provide increasing value from now on. Our roadmap includes a bunch of great new features like subcategories for our data types, the “Openness Indicator”, and the “Fields of activity radar chart”. Another goal is to continually extend the social networking functionality; for example, we recently added the ability for registered users to comment on entries.
You can contribute to the success of the project not only by creating a profile, but also by becoming an editor or even a country champion. If you want to join us in mapping the world of OER, please send us an email at email@example.com. We are looking forward to hearing from you soon!
Best open wishes
Jan Neumann & the OER World Map team
OER World Map collects a lot of data. This is essential for making data centrally available, but as more is collected, the difficulty of finding a specific item increases, regardless of license or data content. Therefore, as data in OER World Map increases, it is very important to implement efficient and targeted search and ranking algorithms.
There are search algorithms whose complexity, efficiency and confidentiality are impressive. The major search engines of the world are clear examples. Of course, a relatively small non-profit project such as the OER World Map cannot develop such a complex search algorithm from its own resources. Nor would this be desirable, because the platform is built on the principles of transparency and openness.
Why is transparency so important?
We can assume that users are able to comprehend how the search behaves and what caused a given search result to be ranked where it is. As long as they can see that no topics, authors, vendors, interests or similar parameters are preferred in determining the rankings, they can trust the result. However, once parts of the algorithm are hidden in the proverbial ‘black box’, there is at least a theoretical possibility that some searchable items might receive preferential treatment (or be discriminated against).
Like the entire code of the OER World Map, our ranking mechanism is implemented as open source. In this way, the OER World Map demonstrates that the same rules and conditions are applied to all resources (services, organizations, people etc.), and that none of them is treated differently.
Of course, every search algorithm includes factors that give individual results a higher weight; otherwise there could be no ordered ranking at all. (These factors do not depend on specific content, but on universal features such as morphological matching or the length of an entry.) In the following, the most important constituents of the search ranking are explained (as of September 2016).
The code of the OER World Map
The search for the OER World Map is based on Elasticsearch as the main container for data storage. Elasticsearch is an open source search engine based on Apache Lucene. It allows the configuration of the search mechanisms via a JSON file, called index-config.json within the OER World Map. Within this file you can define whether and how individual data should be searchable. Currently, Elasticsearch is configured as follows:
- “name” and “alternateName” are both indexed in their original spelling and in variants, so that searches containing typos can still produce the intended hits.
- All other fields are indexed in their standard format (as written in the database).
- From the data model point of view, all resources can be associated with addresses and geo-coordinates.
Within the OER World Map, a search command to Elasticsearch is triggered by the method esQuery() in the Java class ElasticsearchRepository. The following parameters can be controlled by this method:
- Field boost: the field boost determines which data fields carry more weight in the search. Typically the “name” field receives the strongest boost, while fields such as “alternateName” are boosted somewhat less. (The concrete boost factors are listed below.)
- Limitation to a partial result: to page through multiple pages of search results, it is useful to display only part of the result set, for example only hits 1 to 10 or 11 to 20.
- Sort order: in very special cases it may make sense to display search results in ascending order, meaning that the results with the lowest score are listed at the top. The OER World Map and Elasticsearch allow both ascending and descending order. The default used by the OER World Map is “descending”.
- Geo-filtering: for completeness, it should be mentioned that search results can be omitted entirely from the results list due to geo-filtering. While the source code for this feature has already been written, it is currently not activated. Once it is activated, a user will be able to limit the search to a specific geographical area (by displaying a particular map section), so that results from outside this area do not appear in the results list.
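The parameters listed above map fairly directly onto the Elasticsearch query DSL. The following is a minimal sketch in Python rather than the project's Java code: the function name `build_search_body` and its arguments are hypothetical and only illustrate how field boosts, a partial result window and the sort order could be combined into one request body. It is not a description of what esQuery() actually does.

```python
from typing import Any, Dict


def build_search_body(
    query: str,
    boosts: Dict[str, int],
    page: int = 1,
    page_size: int = 10,
    ascending: bool = False,
) -> Dict[str, Any]:
    """Build an Elasticsearch request body (hypothetical helper, for illustration only)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")

    # Elasticsearch expresses a per-field boost as "field^factor".
    fields = [f"{field}^{factor}" for field, factor in boosts.items()]

    return {
        "query": {"query_string": {"query": query, "fields": fields}},
        # "from" and "size" select the partial result, e.g. hits 11 to 20 for page 2.
        "from": (page - 1) * page_size,
        "size": page_size,
        # Sort by relevance score; descending is the default described above.
        "sort": [{"_score": {"order": "asc" if ascending else "desc"}}],
    }


body = build_search_body("open textbooks", {"name": 9, "alternateName": 6}, page=2)
```

With `page=2` and the default page size, the body asks for hits 11 to 20, ordered by descending score.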
The global preferences of the OER World Map for field boosting are located in the file search.conf. At present, boosting provides the following weighting of fields:
- “name” by a factor of 9
- “alternateName” by a factor of 6
- “provider.name” by a factor of 5
- “provider.alternateName” by a factor of 4
- “agent.name” by a factor of 4
- “agent.alternateName” by a factor of 3
- “participant.name” by a factor of 2
- “participant.alternateName” by a factor of 1
- “memberOf.name” by a factor of 1
- “memberOf.alternateName” by a factor of 1
- “member.name” by a factor of 1
- “member.alternateName” by a factor of 1
- “article body” by a factor of 1
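To give a feeling for what these factors do, here is a deliberately simplified sketch in Python. It assumes that each hit already has a raw match score per field and that the best boosted field decides the ranking. The real scoring inside Elasticsearch and Lucene is considerably more involved, and the helper names `boosted_score` and `rank` are made up for this illustration.

```python
from typing import Dict, List

# Boost factors as listed above (as of September 2016).
BOOSTS: Dict[str, int] = {
    "name": 9,
    "alternateName": 6,
    "provider.name": 5,
    "provider.alternateName": 4,
    "agent.name": 4,
    "agent.alternateName": 3,
    "participant.name": 2,
    "participant.alternateName": 1,
    "memberOf.name": 1,
    "memberOf.alternateName": 1,
    "member.name": 1,
    "member.alternateName": 1,
    "article body": 1,
}


def boosted_score(field_scores: Dict[str, float]) -> float:
    """Multiply each raw field score by its boost and keep the best one (toy model)."""
    return max(
        (score * BOOSTS.get(field, 1) for field, score in field_scores.items()),
        default=0.0,
    )


def rank(hits: Dict[str, Dict[str, float]]) -> List[str]:
    """Return hit identifiers ordered by descending boosted score."""
    return sorted(hits, key=lambda hit_id: boosted_score(hits[hit_id]), reverse=True)


# Hypothetical raw scores: B matches perfectly, but only in a weakly boosted field.
hits = {
    "A": {"name": 0.5},
    "B": {"member.name": 1.0},
    "C": {"alternateName": 0.5},
}
ranking = rank(hits)  # A (4.5) before C (3.0) before B (1.0)
```

The point of the example is that a partial match in “name” can outrank a perfect match in a field with a boost of 1, regardless of what the entry is about.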
As the OER World Map is under continuous development, details such as the boosting factors will evolve over time. New search fields might be added, or existing ones removed. An additional weighting based on “likes” (or some other voting system) is envisaged. The number of links to a resource would be a desirable weighting parameter as well. In any case, the quality and reliability of the OER World Map will always be measured by whether search remains transparent and evenhanded. OER World Map users can always check and be certain that search results are determined fairly and reasonably.
The code of the OER World Map is hosted on GitHub. For more specific questions, the OER World Map team would refer you to the source code first, but we are also very happy to answer questions!
Like many other projects, the OER World Map is driven by a virtual team that is distributed across different cities, in our case Cologne, Berlin and Milton Keynes. Though cooperation via the net works pretty well, for many reasons it does not replace meeting in person. We therefore make a point of meeting in person on a regular basis.
From the 7th to the 9th of September our English colleague Rob hosted our “physical team meeting”. We met at the Institute of Educational Technology of The Open University in Milton Keynes. Many thanks for the hospitality and for the guided tour through the research labs of the institute!
It has been a while since the last technical progress report was published. Luckily, this is not because nothing has happened since, but rather because we were busy building things. This blog post briefly summarizes the most important of those things.
The most prominent changes are naturally related to the user interface. The layout is now based on three interrelated columns: one for the map, one for search / filter result listings and one for individual entries. On top of that, additional information, such as a feed of recent additions and statistics, is available in a popup window. Less visible, but an enhancement nevertheless, is the fact that navigating the map no longer requires full page loads and is thus much smoother.
The templates that editors of the OER World Map use to input data have also been slightly reworked. In order to reduce the number of fields that are presented to the users, the inverses of fields for links that have a more or less natural direction have been hidden. In order to clarify the semantics of data elements, descriptions are available via tooltips. The most significant simplification probably is that Markdown is now supported for fields that hold running text.
Finally, first elements of an administrative interface have been implemented. Among these are the administration of roles for registered users, a UI for data migrations and a precise log of all changes that have been made to the database.
During phase II, the emphasis was to win editors for the OER World Map by individually inviting them to collaborate. An important step towards growing a bigger community of OER World Map users now is the possibility for anyone to register a user account. Once registered, it is possible to create a personal profile and thus represent oneself on the map. Also, the possibility to comment on entries is only available to registered users. You are very welcome to participate in editing data beyond that; get in touch if you are interested!
From a technical point of view, we have switched to what can be described as a perimeter security model. User authentication and authorization are now handled by an Apache reverse proxy before a request even hits the OER World Map web application. On the one hand, this separation of concerns brings a performance gain. On the other hand, it would be hard to compete with Apache’s battle-proven security anyway.
While invisible to most users, there have been very important improvements in the architecture of the back end of the system. While using an Elasticsearch index as our main data sink allowed us to quickly grow the system during phase II, some of the limitations of that approach became evident once more editorial activity was recorded.
On the one hand, data needs to be denormalized quite heavily to fully embrace the features of a document oriented system such as Elasticsearch, especially when it comes to aggregations. On the other hand, the data in the OER Data Hub is highly interlinked. This combination makes write-operations quite expensive because a single update operation often modifies multiple JSON documents in the index. A successful write operation could only be assumed once the data trickled into all places it was supposed to be, which made waiting times unacceptable.
To complement the extremely fast read operations that Elasticsearch provides with equally fast write operations, a special type of database, a triple store, was added to the technology stack. It is now our primary data store and the single source of truth in the system, and it asynchronously feeds the Elasticsearch index after write operations.
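The pattern described here, writing to a primary store first and updating the search index asynchronously, can be sketched as follows. This is a minimal illustration in Python, not the project's implementation: plain dictionaries stand in for the triple store and the Elasticsearch index, and the names `PrimaryStore`, `run_indexer` and `demo` are hypothetical.

```python
import queue
import threading
from typing import Dict, Optional, Tuple


class PrimaryStore:
    """Stand-in for the primary data store (a dict instead of a triple store)."""

    def __init__(self) -> None:
        self.records: Dict[str, dict] = {}
        self.pending: "queue.Queue[Optional[str]]" = queue.Queue()

    def write(self, resource_id: str, document: dict) -> None:
        # The write is complete as soon as the primary store has the data;
        # the search index is only notified and catches up later.
        self.records[resource_id] = document
        self.pending.put(resource_id)


def run_indexer(store: PrimaryStore, index: Dict[str, dict]) -> None:
    """Background worker that copies changed records into the search index."""
    while True:
        resource_id = store.pending.get()
        try:
            if resource_id is None:  # sentinel value: stop the worker
                return
            index[resource_id] = dict(store.records[resource_id])
        finally:
            store.pending.task_done()


def demo() -> Tuple[Dict[str, dict], Dict[str, dict]]:
    store = PrimaryStore()
    index: Dict[str, dict] = {}
    worker = threading.Thread(target=run_indexer, args=(store, index), daemon=True)
    worker.start()

    store.write("urn:uuid:123", {"name": "Felix Ostrowski"})  # returns immediately

    store.pending.join()     # wait until the index has caught up
    store.pending.put(None)  # ask the worker to stop
    worker.join()
    return store.records, index
```

The design choice this illustrates is that the caller of `write` never waits for the index; the index becomes consistent with the primary store shortly afterwards.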
Another precondition for gradually opening the platform to a bigger circle of editors has been data versioning. In order to ensure data quality, it must be possible to retrace the evolution of the dataset. In other words, it is necessary to completely understand who changed which parts of the data, and when the changes happened. Being familiar with the way source code is versioned, we adapted the structure of Git commits to the RDF data in our triple store:
    Author: felix.ostrowski@XYZ.com
    Date: 2016-07-01T15:55:51.012+02:00
    + <urn:uuid:123> <http://schema.org/name> "Felix Ostrowski" .
    - <urn:uuid:123> <http://schema.org/name> "Felix Ostrowsko" .

    Author: felix.ostrowski@XYZ.com
    Date: 2016-06-29T18:01:40.587+02:00
    + <urn:uuid:123> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Person> .
    + <urn:uuid:123> <http://schema.org/name> "Felix Ostrowsko" .
A nice side effect of all our data being a series of commits such as the ones above is that the back-up strategy is simply a matter of saving plain text commit files. These files are all that is needed to completely recover our data set after potential failures.
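Recovering the data set from such commits amounts to replaying them in chronological order. The following Python sketch illustrates the idea under simplifying assumptions: the commits are already parsed into additions and deletions of triples, and the names `Commit` and `replay` are hypothetical rather than taken from the OER World Map code base.

```python
from datetime import datetime
from typing import Iterable, List, NamedTuple, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SCHEMA_NAME = "http://schema.org/name"


class Commit(NamedTuple):
    author: str
    date: str  # ISO 8601 timestamp, as in the commit header
    additions: List[Triple]
    deletions: List[Triple]


def replay(commits: Iterable[Commit]) -> Set[Triple]:
    """Rebuild the current set of triples by applying commits oldest first."""
    state: Set[Triple] = set()
    for commit in sorted(commits, key=lambda c: datetime.fromisoformat(c.date)):
        state.difference_update(commit.deletions)  # lines starting with "-"
        state.update(commit.additions)             # lines starting with "+"
    return state


# The two example commits from above, newest first as in the log.
commits = [
    Commit(
        "felix.ostrowski@XYZ.com",
        "2016-07-01T15:55:51.012+02:00",
        additions=[("urn:uuid:123", SCHEMA_NAME, "Felix Ostrowski")],
        deletions=[("urn:uuid:123", SCHEMA_NAME, "Felix Ostrowsko")],
    ),
    Commit(
        "felix.ostrowski@XYZ.com",
        "2016-06-29T18:01:40.587+02:00",
        additions=[
            ("urn:uuid:123", RDF_TYPE, "http://schema.org/Person"),
            ("urn:uuid:123", SCHEMA_NAME, "Felix Ostrowsko"),
        ],
        deletions=[],
    ),
]
current = replay(commits)
```

After replaying both commits, the misspelled name is gone and only the corrected name and the type statement remain.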
On the data model side of things, a tag field has been added to all resources. This, along with the corresponding filter, allows editors to create arbitrary custom subsets of the data. With regards to the Service type, a license field is now exposed, along with a controlled vocabulary of licenses and the corresponding filter. Finally, the funder property is now available for Projects.
We now have our first video content available through OER World Map! You can review the entries at the following URLs:
Open Educational Resources in Africa
NOVA’s OER-Based Associate Degree Project
We hope to add more stories in this way over the coming months. If you have a video relating to OER that you would like to share with the community then get in touch.
Content published on YouTube with a licence that permits sharing can easily be repurposed for the map in this way using the embed code provided in the sharing menu.