Posts tagged open data
As a linguist, I regard the growth of online resources in recent years as one of the great developments in the way I work.
These vary from dictionaries to terminological databases and it’s one of the latter that this post is about.
Last week, IATE celebrated its tenth anniversary of being accessible to the public and proudly publicised its entry into double figures.
IATE is short for InterActive Terminology for Europe, and it’s a great resource. Searches can be conducted in language pairs.
Searches can also be refined by picking specialist subject areas, which range from accounting to the wood industry.
It’s a particularly good resource for terminology involving the European institutions, political matters and international relations in general, but is also no slouch when it comes to specific terms for, say, forestry.
History & background
IATE is the EU’s inter-institutional terminology database. IATE has been used in the EU institutions and agencies since summer 2004 for the collection, dissemination and shared management of EU-specific terminology. The project partners are:
- European Commission
- European Parliament
- Council of Ministers
- Court of Justice
- Court of Auditors
- Economic & Social Committee
- Committee of the Regions
- European Central Bank
- European Investment Bank
- Translation Centre for the Bodies of the EU
The project was launched in 1999 with the objective of providing a web-based infrastructure for all EU terminology resources, enhancing the availability and standardisation of the information.
IATE incorporates all of the existing terminology databases of the EU’s translation services into a single new, highly interactive and accessible inter-institutional database. Legacy databases from EU institutions have been imported into IATE, which now contains some 1.4 million multilingual entries.
The IATE website is administered by the Translation Centre for the Bodies of the European Union in Luxembourg on behalf of the project partners.
Download the database
Having been produced at public expense, the entire database has been opened up to the public and can be downloaded: all 1.4 million entries!
Happy birthday, IATE! Here’s to the next 10 years.
The Dutch Standardisation Board would like to see the mandatory use of Open Document Format (ODF) for the country’s public sector organisations, according to a report on Joinup giving details of a presentation made by Nico Westpalm van Hoorn to the recent ODF Plugfest held in The Hague.
Van Hoorn stated that over 450,000 documents are transferred each day between the Dutch central government and citizens or companies.
His presentation contained three main messages:
- The only way reuse of document content is achievable for open data is by using the ODF format;
- The only way to ensure sustainable access is by using the ODF format; and
- “This format cannot be opened” is not an acceptable remark from a public servant when somebody sends an ODF document.
Within the Dutch government, ODF is used as the default format for editable documents that are posted online. Documents are by default shared as HTML, PDF (for archiving) and as ODF. Furthermore, all central government workstations are capable of working with ODF, suggesting that civil servants who cannot open the format need some IT training.
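For the technically curious: an ODF document is a ZIP package whose mimetype entry declares what kind of document it is, so checking whether a file really is ODF takes only a few lines. The Python sketch below is purely illustrative and was not part of the Plugfest presentation. The function names odf_mimetype and make_sample_odt are hypothetical, and the sample package it builds is a bare-bones stand-in, not a complete, valid document.

```python
import io
import zipfile
from typing import Optional

# ODF media types share this prefix (e.g. "...opendocument.text" for .odt files).
ODF_MIME_PREFIX = "application/vnd.oasis.opendocument."


def odf_mimetype(data: bytes) -> Optional[str]:
    """Return the media type declared by an ODF package, or None if it is not ODF."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as package:
            declared = package.read("mimetype").decode("ascii").strip()
    except (zipfile.BadZipFile, KeyError, UnicodeDecodeError):
        # Not a ZIP file, no mimetype entry, or an entry that is not plain ASCII.
        return None
    return declared if declared.startswith(ODF_MIME_PREFIX) else None


def make_sample_odt() -> bytes:
    """Build a minimal stand-in package (hypothetical helper, for demonstration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        # The mimetype entry is stored uncompressed.
        package.writestr(
            "mimetype",
            "application/vnd.oasis.opendocument.text",
            compress_type=zipfile.ZIP_STORED,
        )
    return buffer.getvalue()


if __name__ == "__main__":
    print(odf_mimetype(make_sample_odt()))
    print(odf_mimetype(b"plain text, not a package"))
```

A check like this only identifies the format; actually opening and editing the document still needs an office suite that supports ODF.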
Speaking at the same event, Steven Luitjes, director of Logius, an agency assisting government organisations in building e-government services, admitted that ODF is often ignored by public sector organisations and that a failure to standardise on formats is increasing the cost of public sector IT.
If the Dutch government does adopt ODF as a standard, this would follow on from the recent announcement of the standard’s adoption by the Italian Ministry of Defence (posts passim) and the UK government’s publication of guidance for the introduction of ODF.
Openwords, the foreign language learning app that draws on open language data to serve the world’s under-served languages and was reported on some weeks ago by this blog (posts passim), recently launched a crowdfunding campaign on Kickstarter.
At the time of writing the Kickstarter campaign has 16 days to go and aims to raise $10,000 to take the project to the next stage.
Foreign language learning with open data
There are millions of people around the world who can’t learn the languages in which they’re interested.
While the learning of major languages like Chinese, Spanish and French is supported by large companies, these firms tend to ignore lesser-known languages.
Openwords is doing things differently to solve this problem. It is mining data from public domain assets such as Wiktionary to provide educational content for all languages, large and small.
So far Openwords has mined data for 1,000 languages.
The Openwords app has various learning modules for vocabulary, hearing and typing, amongst others. In addition, the Openwords developers are working on simple sentence translation problems. Furthermore, learners have control over the content they want to learn.
Finally, Openwords will be an open source project.
The aim of the Kickstarter campaign is to raise $10,000, which will be enough to develop a beta model of the Openwords app.
Sonia Montegiove writes on the Libre Umbria blog that a workshop on openness is being organised on Thursday 12th March between 3 pm and 5 pm in Hall 20 in the Faculty of Economics of the University of Perugia as part of the initiatives linked to the Umbria Digital Agenda organised by the Umbria regional government.
After the introduction by Loris Maria Nadotti of the Department of Economics of the University of Perugia, Giovanni Gentili, the regional government’s digital agenda officer, will speak about openness in the digital agenda. He will be followed by Francesca Sensini on open government, Sonia Montegiove on open source and finally Cristiano Donato and Tommaso Vicarelli on open data.
Each talk will last a maximum of fifteen minutes to allow time for a final debate with the lecturers and students attending.
Opensource.com reports that Linux purveyor Red Hat is now accepting nominations for the Women in Open Source Award. Created to highlight the achievements of women making major contributions to an open source project, to the open source community or through the use of open source methodology, this award is the first of its kind.
The award celebrates all different kinds of contributions to open source, including:
- Code and programming;
- Quality assurance, bug triage and other quality-related contributions;
- Involvement in open hardware;
- System administration and infrastructure contributions;
- Design, artwork, user experience (UX) and marketing;
- Documentation, tutorials and other forms of communication;
- Translation and other internationalisation contributions;
- Open content;
- Community advocacy and management;
- Intellectual property advocacy and legal reform;
- Open source methodology.
Nominees can qualify for one of two tracks:
- Academic award: open to women enrolled in college or university; and
- Community award: open to all other women.
The Women in Open Source Academic Award winner will receive:
- $2,500 stipend, with a suggested use of supporting an open source project or efforts; and
- A feature article on Opensource.com.
The Women in Open Source Community Award winner will receive:
- Ticket, flight and hotel accommodation for the Red Hat Summit to be held in Boston, Massachusetts on 23rd-26th June 2015;
- $2,500 stipend, with a suggested use of supporting an open source project or efforts;
- A feature article on Opensource.com; and
- Speaking opportunity at a future Red Hat Women’s Leadership Community event.
Nominations are open until 21st November. Judges from Red Hat will whittle down the nominees to a subset of finalists for both the Academic and Community awards, from whom the public will decide the winners. The winners will be announced in June during an awards ceremony at the 2015 Red Hat Summit in Boston, Massachusetts.
The Open Knowledge Foundation is doing marvellous work in the fields of open data and open content.
The Foundation has just published version 2 of its Open Definition. This definition is released under a Creative Commons Attribution licence and is reproduced verbatim below (complete with US spellings and punctuation throughout. Ed.).
The Open Definition makes precise the meaning of “open” with respect to knowledge, promoting a robust commons in which anyone may participate, and interoperability is maximized.
Summary: Knowledge is open if anyone is free to access, use, modify, and share it — subject, at most, to measures that preserve provenance and openness.
This essential meaning matches that of “open” with respect to software as in the Open Source Definition and is synonymous with “free” or “libre” as in the Definition of Free Cultural Works. The Open Definition was initially derived from the Open Source Definition, which in turn was derived from the Debian Free Software Guidelines.
The term work will be used to denote the item or piece of knowledge being transferred.
The term license refers to the legal conditions under which the work is made available. Where no license has been offered this should be interpreted as referring to default legal conditions governing use of the work (for example, copyright or public domain).
1. Open Works
An open work must satisfy the following requirements in its distribution:
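1.1 Open License
The work must be available under an open license (as defined in Section 2). Any additional terms accompanying the work (such as a terms of use, or patents held by the licensor) must not contradict the terms of the license.
1.2 Access
The work shall be available as a whole and at no more than a reasonable one-time reproduction cost, preferably downloadable via the Internet without charge. Any additional information necessary for license compliance (such as names of contributors required for compliance with attribution requirements) must also accompany the work.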
1.3 Open Format
The work must be provided in a convenient and modifiable form such that there are no unnecessary technological obstacles to the performance of the licensed rights. Specifically, data should be machine-readable, available in bulk, and provided in an open format (i.e., a format with a freely available published specification which places no restrictions, monetary or otherwise, upon its use) or, at the very least, can be processed with at least one free/libre/open-source software tool.
2. Open Licenses
A license is open if its terms satisfy the following conditions:
2.1 Required Permissions
The license must irrevocably permit (or allow) the following:
2.1.1 Use
The license must allow free use of the licensed work.
2.1.2 Redistribution
The license must allow redistribution of the licensed work, including sale, whether on its own or as part of a collection made from works from different sources.
2.1.3 Modification
The license must allow the creation of derivatives of the licensed work and allow the distribution of such derivatives under the same terms of the original licensed work.
2.1.4 Separation
The license must allow any part of the work to be freely used, distributed, or modified separately from any other part of the work or from any collection of works in which it was originally distributed. All parties who receive any distribution of any part of a work within the terms of the original license should have the same rights as those that are granted in conjunction with the original work.
2.1.5 Compilation
The license must allow the licensed work to be distributed along with other distinct works without placing restrictions on these other works.
2.1.6 Non-discrimination
The license must not discriminate against any person or group.
2.1.7 Propagation
The rights attached to the work must apply to all to whom it is redistributed without the need to agree to any additional legal terms.
2.1.8 Application to Any Purpose
The license must allow use, redistribution, modification, and compilation for any purpose. The license must not restrict anyone from making use of the work in a specific field of endeavor.
2.1.9 No Charge
The license must not impose any fee arrangement, royalty, or other compensation or monetary remuneration as part of its conditions.
2.2 Acceptable Conditions
The license shall not limit, make uncertain, or otherwise diminish the permissions required in Section 2.1 except by the following allowable conditions:
2.2.1 Attribution
The license may require distributions of the work to include attribution of contributors, rights holders, sponsors and creators as long as any such prescriptions are not onerous.
2.2.2 Integrity
The license may require that modified versions of a licensed work carry a different name or version number from the original work or otherwise indicate what changes have been made.
2.2.3 Share-alike
The license may require copies or derivatives of a licensed work to remain under a license the same as or similar to the original.
2.2.4 Notice
The license may require retention of copyright notices and identification of the license.
2.2.5 Source
The license may require modified works to be made available in a form preferred for further modification.
2.2.6 Technical Restriction Prohibition
The license may prohibit distribution of the work in a manner where technical measures impose restrictions on the exercise of otherwise allowed rights.
2.2.7 Non-aggression
The license may require modifiers to grant the public additional permissions (for example, patent licenses) as required for exercise of the rights allowed by the license. The license may also condition permissions on not aggressing against licensees with respect to exercising any allowed right (again, for example, patent litigation).
The Free and Hanseatic City of Hamburg has put a transparency portal online, heise reports. Data and documents from the city administration and publicly-owned companies are being made available in the schedule of information. The portal also comprises the data from the former Hamburg Open Data Portal. Amongst other things, the transparency portal makes available decisions by Hamburg’s Senate, minutes and resolutions, budget and management plans, policies and specialist guidelines, official statistics and progress reports, geodata, the tree protection register, environmental measurement data and commercial data.
The Free and Hanseatic City of Hamburg is therefore complying with the requirements of the Transparency Law, which became effective in the city in October 2013. According to this legislation, Hamburg must publish its reports, contracts and Senate decisions on the internet. Under the previous Information Freedom Law, it only had to provide information upon request.
Yesterday, while David Cameron was rearranging the deckchairs on his governmental re-enactment of the SS Titanic, one significant piece of news (apart from the DRIP Bill. Ed.) seems to have escaped the personality-obsessed British media.
The news was that the Department for Business, Innovation and Skills announced that Companies House is to make all of its digital data available free of charge. It has hitherto charged users for anything but the most basic company information on its website.
This will make the UK the first country to establish a truly open register of business information.
As a result, it will be easier for members of the public and businesses to research and scrutinise the activities and ownership of companies and their directors. Last year (2013/14), users searching the Companies House website spent £8.7 million accessing company information on the register.
The release of company information as open data will also provide opportunities for entrepreneurs to come up with innovative ways of using the information.
This change will come into effect from the second quarter of 2015 (April – June).
Bristol City Council is working with the Future Cities Catapult and the Connected Digital Economy Catapult on a new open data initiative that will help Bristolians improve their city with the help of local authority data.
The partners are working together to release Bristol civic data sets such as traffic management and land use databases to citizens. The collaboration will help developers use the data to create new products and services that improve how the city of Bristol works, whether by making it easier to get around, reducing waste, saving energy or improving the city’s air quality.
Once the data sets are made available online in late summer, citizens and businesses will be invited to explore around one hundred data sets, supported by a series of Catapult-run events and competitions. Bristolians will be supported in testing, prototyping and commercialising their ideas.
Following a successful initial data release, the Catapults and the Council will then create a schedule to release further useful city data sets in consultation with the developer community. The programme’s outcomes will be shared with local authorities, developers and organisations in other UK cities to spread the benefits to the citizens of other cities.
The Environment Agency has announced that it is releasing a whole raft of information as open data.
Environment Agency datasets that are already available as open data include:
- Flood Alert Areas;
- Flood Warning Areas;
- Flood Warnings (Live Feed);
- Real-time and Near Real-time River Levels (Live Feed);
- Real-time and Near Real-time Air Temperature (Live Feed);
- 3 day Flood Forecast (Live Feed);
- Water Framework Directive (WFD) River Waterbodies;
- Water Framework Directive (WFD) Groundwater Classification Status and Objectives; and
- Water Framework Directive (WFD) Measures.
The Agency is now increasing its commitment and will soon publish as much of its data as possible, including flood data, as open data. This means that over time more EA data will be made freely available to developers, technology companies and individuals.
To assist the release of open data, the Agency is setting up a user group to advise it on which data it should concentrate on making open first.
The group will be made up of external parties with an interest in EA data, its current data customers and people with an open data background; the group will also receive input from the Agency and Defra. Anyone interested in joining this group should email OpenData@environment-agency.gov.uk.