5,059 bytes added ,  18:10, 6 February 2017
The [[wp:DBpedia]] project aims to interconnect the world's open data sets. (There are other similar projects like [[wp:Freebase]].)

One main dataset they "translate" into RDF is the Wikipedia data. Although the freeform content of Wikipedia is not structured data, the extensive use of 'infoboxes' does give structure to the content (and visual styling too). This content ''can'' then be mapped into actual ontologies (if it isn't already semantically mapped in Wikipedia itself<ref>[https://www.mediawiki.org/wiki/Extension:Page_Forms Page Forms] are often used to make it easy for users to enter the data, which is mapped into a semantic template.</ref>).

One such infobox is the [https://en.wikipedia.org/wiki/Template:Infobox_software infobox for software]. It's [https://en.wikipedia.org/wiki/Special:WhatLinksHere/Template:Infobox_software?limit=500&namespace=0 used extensively] on the English Wikipedia; for instance, it's used on the [[wp:GIMP|article for GIMP]].

DBpedia has already [http://mappings.dbpedia.org/index.php/Mapping_en:Infobox_software mapped the Infobox_software template] so that all this data is contained in DBpedia.

== Goal ==

Make the Free Software Directory (FSD) part of the (semantically) linked web of open data: [[Linked Data]] for short, also called LOD or the "LOD cloud".

== InfoBox Approach ==

The FSD could incorporate the Infobox_software template so as to gain the ability to link the FSD dataset into DBpedia. In other words, the FSD form should include the Infobox_software fields as a subset of the data points that go into an FSD listing.

== Debian Packaging System ==

The [https://packages.qa.debian.org/common/index.html Debian Package Tracking System] produces RDF metadata and is already included in DBpedia. For example, https://packages.qa.debian.org/g/gimp.ttl is a 'Turtle' representation of the GIMP package.

If not all Debian packages are in the FSD, the missing ones could be added by consuming their RDF. If we incorporate their data systematically, our data can easily be updated and kept synchronized by a bot.
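
As a rough illustration of the first step in consuming that RDF, the sketch below builds the Turtle URL for a Debian source package, following the pattern of the GIMP URL above. The function name <code>pts_turtle_url</code> is hypothetical, and the <code>lib</code> prefix rule is an assumption based on how Debian usually groups library packages. Parsing the Turtle itself would need an RDF library and is not shown.

```python
PTS_BASE = "https://packages.qa.debian.org"


def pts_turtle_url(source_package: str) -> str:
    """Return the assumed Turtle URL for a Debian source package."""
    name = source_package.strip().lower()
    if not name:
        raise ValueError("empty package name")
    # Assumption: library packages are grouped under "lib" plus the next
    # character (e.g. "libp"); everything else under its first character.
    if name.startswith("lib") and len(name) > 3:
        prefix = name[:4]
    else:
        prefix = name[0]
    return f"{PTS_BASE}/{prefix}/{name}.ttl"


print(pts_turtle_url("gimp"))  # https://packages.qa.debian.org/g/gimp.ttl
```

A synchronization bot could loop over a list of package names, fetch each URL, and compare the result with the matching FSD entry.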

== Wikidata ==

Wikidata, a project of the Wikimedia Foundation, is a free and open knowledge base that can be read and edited by both humans and machines. Wikidata acts as central storage for the structured data of its Wikimedia sister projects, including Wikipedia.

The FSD should integrate with Wikidata, not just through reciprocal links but through genuinely compatible data sharing.

For example, here is the [https://www.wikidata.org/wiki/Q8038 Wikidata entry for GIMP]. Notice that this entry contains [https://www.wikidata.org/wiki/Property:P2537 a property for the corresponding link in the FSD].
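
To see which Wikidata items already carry that property, one could ask the Wikidata Query Service. The sketch below only builds the SPARQL query and the request URL; it does not send anything. The function names are hypothetical, and the variable names <code>?item</code> and <code>?fsdEntry</code> are chosen for illustration.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def fsd_linked_items_query(limit: int = 10) -> str:
    """Build a SPARQL query for items that have an FSD link (P2537)."""
    return (
        "SELECT ?item ?fsdEntry WHERE { "
        "?item wdt:P2537 ?fsdEntry . "
        f"}} LIMIT {int(limit)}"
    )


def query_url(query: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    return f"{WDQS_ENDPOINT}?{urlencode({'query': query, 'format': 'json'})}"


print(query_url(fsd_linked_items_query(5)))
```

Fetching the printed URL (for example with <code>urllib.request</code>) should return the matching items as JSON, which a bot could then compare against the FSD's own entries.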

Also note that one of the ways Wikidata is composed and curated is through bots such as FLOSSbot (see [https://www.wikidata.org/wiki/Wikidata:WikiProject_Informatics/FLOSS WikiProject Informatics/FLOSS]). The FSD should likewise make use of bots to patrol and edit the directory. A [https://phabricator.wikimedia.org/diffusion/PBFB/browse/master/FLOSSbot/fsd.py plugin to the FLOSSbot] was used to browse the FSD and add several hundred entries to Wikidata.

== Benefits ==

When the data is machine readable, you get much more varied ways of representing and consuming the data. Examples:
* http://147.228.127.146:9220/search/software?query=%22gimp%22
* http://dbpedia.org/fct/facet.vsp?cmd=text&sid=3770

The FSD also becomes part of the [https://en.wikipedia.org/wiki/File:LOD_Cloud_2014.svg LOD Cloud].

Being machine readable should make the directory more visible and useful.

It also puts emphasis on the quality of the data.

== Status ==

; What is the current status?

== Questions ==

These are areas of ambiguity, or just notes for action or follow-up.

; Do we use any semantic markup today in the forms?
: Yes (see [[:Template:Entry]]); however, it appears to be a proprietary vocabulary, meaning it's not a standard schema.
; Is this an initiative that the FSF supports?
: Unknown

=== Vocabulary / Schema ===

; Is there a currently adopted schema for describing ''software''?
: At [https://schema.org/docs/schemas.html schema.org] they show a schema for [https://schema.org/SoftwareApplication Software Application] and variants.
; Is that used in either Wikipedia's template, or in the final mapping at DBpedia?
: Unknown
; What is the schema used by Debian, and how does that map to our needs?
: Short answer: they use ADMS.SW ([https://joinup.ec.europa.eu/asset/adms_foss/description Asset Description Metadata Schema for Software]). See [http://www-public.tem-tsp.eu/~berger_o/weblog/2012/08/29/debian-package-tracking-system-now-produces-rdf-description-of-source-packages/ Olivier Berger's blog] for more detail.
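
To make the schema.org option more concrete, here is a minimal sketch of how an FSD entry might be expressed as a <code>SoftwareApplication</code> in JSON-LD. The helper name is hypothetical and the GIMP values are illustrative only; nothing here reflects what [[:Template:Entry]] emits today.

```python
import json


def software_application_jsonld(name, url, license_url, same_as=()):
    """Serialize a minimal schema.org SoftwareApplication as JSON-LD."""
    doc = {
        "@context": "https://schema.org",
        "@type": "SoftwareApplication",
        "name": name,
        "url": url,
        "license": license_url,
    }
    if same_as:
        # sameAs points at other identifiers for the same thing.
        doc["sameAs"] = list(same_as)
    return json.dumps(doc, indent=2)


# Illustrative values only, not taken from an actual FSD entry.
print(software_application_jsonld(
    name="GIMP",
    url="https://www.gimp.org/",
    license_url="https://www.gnu.org/licenses/gpl-3.0.html",
    same_as=["https://www.wikidata.org/wiki/Q8038"],
))
```

Whether schema.org, ADMS.SW, or a mix of both fits the FSD best remains one of the open questions above.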

== Requirements ==

The FSD would almost certainly have to be upgraded to take advantage of the improved semantic capabilities of MediaWiki: the currently installed version is 1.20, whereas the currently available version is 1.29.0-wmf.7 at the time of this writing. See the [https://freephile.org/wikireport/?url=https://directory.fsf.org/wiki wiki report].


{{References}}

[[Category:Semantic]]
[[Category:Data]]