Print:User manual
This "book" is the Semantic MediaWiki User's Guide at https://wiki.hackerspaces.org/Print:User_manual formatted for print.
User manual
Semantic MediaWiki (or SMW for short) is an extension to the well-known MediaWiki software. The purpose of SMW is to allow users to improve the structure and organization of the knowledge in a wiki by adding simple, machine-processable information to wiki articles. With this additional information, you can greatly improve searching, browsing, and sharing the wiki's knowledge, both within the wiki's pages and from external computer programs.
This is the user manual of SMW that focusses on information for users of an SMW-enabled wiki. If you are new to SMW, then the introduction to Semantic MediaWiki is a good starting point. Information for site administrators and for SMW developers (including information on the spinoff extensions of SMW) can be found on the main page.
The main sections of this handbook explain the most important aspects of using SMW:
These sections are relevant to different user groups, and need not be read in sequence. Detailed navigation is provided by the table of contents on the right. Other relevant parts of this handbook include:
- Getting support – who can help out with questions?
- Reporting bugs – making your requests for improvements heard
- SMW Project – the makers of Semantic MediaWiki
There are other resources that you might find useful:
- The SMW Quick Reference Guide, available in both PNG and PDF form - a one-page "cheat sheet", suitable for printing, that includes information on the syntax of SMW, plus those of the spinoff extensions Semantic Forms, Semantic Result Formats, Semantic Google Maps and Semantic Drilldown.
- The SMW Community Wiki, an SMW-based site that lists people, organizations, sites etc. that are related to Semantic MediaWiki. Includes a filterable list of all known SMW-based wikis.
- The public sandbox site - lets you try out SMW online
- Referata, a wiki hosting site that includes support for SMW and most of its extensions. It also includes the scratchpad wiki, which is very similar to the sandbox but also offers support for the other extensions.
Introduction to Semantic MediaWiki
Semantic MediaWiki (SMW) is a free extension of MediaWiki – the wiki-system powering Wikipedia – that helps to search, organise, tag, browse, evaluate, and share the wiki's content. While traditional wikis contain only texts which computers can neither understand nor evaluate, SMW adds semantic annotations that bring the power of the Semantic Web to the wiki.
Wikis have become a great tool for collecting and sharing knowledge in communities. This knowledge is mostly contained within texts and multimedia files, and is thus easily accessible for human readers. But wikis get bigger and bigger, and it can be very time-consuming to look for an answer inside a wiki. As a simple example, consider the following question a user might have:
- «What are the world's hundred largest cities with a female mayor?»
Wikipedia should be able to provide the answer: it contains all large cities, their mayors, and articles about the mayors that tell us about their gender. Yet the question is almost impossible to answer for a human, since one would have to read all articles about all large cities first! Even if the answer is found, it might not remain valid for very long. Computers can deal with large datasets much more easily, yet they are not able to support us very much when seeking answers from a wiki: even sophisticated programs cannot yet read and «understand» human-language texts unless the topic and language of the text are very restricted. The wiki's keyword search does not help either in discovering complex relationships.
Semantic MediaWiki enables wiki communities to make some of their knowledge computer-processable, e.g. to answer the above question. The hard problem for the computer is to find out what the words in a wiki page (e.g. about cities) mean. Articles contain many names, but which one is the current mayor? Humans can easily grasp the problem by looking into a language edition of Wikipedia that they do not understand (Korean is a good start unless you are fluent in it). While single tokens (names, numbers, …) might be readable, it is impossible to understand their relevance in the article. Similarly, computers need some help for making sense of wiki texts.
In Semantic MediaWiki, editors therefore add «hints» to the information in wiki pages. For example, someone can mark a name as being the name of the current mayor. This is done by editors who modify a page and put some special text-markup around the mayor's name. After this, computers can access this information (of course they still do not «understand» it, but they can search for it if we ask them to), and support users in many different ways.
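As a rough illustration of such a hint, the line below marks a name as the value of a property. The property name «Has mayor», the city, and the person are made up for this sketch; only the [[property::value]] markup itself is the annotation syntax used throughout this manual.

```wikitext
The current mayor of Examplecity is [[Has mayor::Jane Doe]].
```

To a reader the sentence usually still looks like ordinary text with a link, while the wiki additionally records which property the name belongs to.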
More information can be found in the user manual.
Where SMW can help
Semantic MediaWiki introduces some additional markup into the wiki-text which allows users to add "semantic annotations" to the wiki. While this first appears to make things more complex, it can also greatly simplify the structure of the wiki, help users to find more information in less time, and improve the overall quality and consistency of the wiki. To illustrate this, we provide some examples from the daily business of Wikipedia:
- Manually generated lists. Wikipedia is full of manually edited listings such as this one. Those lists are prone to errors, since they have to be updated manually. Furthermore, the number of potentially interesting lists is huge, and it is impossible to provide all of them in acceptable quality. In SMW, lists are generated automatically like this. They are always up-to-date and can easily be customised to obtain further information.
- Searching information. Much of Wikipedia's knowledge is hopelessly buried within millions of pages of text, and can hardly be retrieved at all. For example, at the time of this writing, there is no list of female physicists in Wikipedia. When trying to find all women of this profession that are featured in Wikipedia, one has to resort to textual search. Obviously, this attempt is doomed to fail miserably. Note that among the 20 first results, only 5 are about people at all, and that Marie Curie is not contained in the whole result set (since "female" does not appear on her page). Again, querying in SMW easily solves this problem (in this case even without further annotation, since existing categories suffice to find the results).
- Inflationary use of categories. The need for better structuring becomes apparent by the enormous use of categories in Wikipedia. While this is generally helpful, it has also led to a number of categories that would be mere query results in SMW. For some examples consider the categories Rivers in Buckinghamshire, Asteroids named for people, and 1620s deaths, all of which could easily be replaced by simple queries that use just a handful of annotations. Indeed, in this example Category:Rivers, Property:located in, Category:Asteroids, Category:People, Property:named after, and Property:date of death would suffice to create thousands of similar listings on the fly, and to remove hundreds of Wikipedia categories.
- Inter-language consistency. Most articles in Wikipedia are linked to corresponding pages in different languages, and this can be done for SMW's semantic annotation as well. With this knowledge, you can ask for the population of Beijing that is given in Chinese Wikipedia without reading a single word of this language. This can be exploited to detect possible inconsistencies that can then be resolved by editors. For example, the population of Edinburgh at the time of this writing is different in English, German, and French Wikipedia.
- External reuse. Some desktop tools today make use of Wikipedia's content, e.g. the media player Amarok displays articles about artists during playback. However, such reuse is limited to fetching some article for immediate reading. The program cannot exploit the information (e.g. to find songs of artists that have worked for the same label), but can only show the text in some other context. SMW makes a wiki's knowledge usable outside the context of its textual article. Since semantic data can be published under a free license, it could even be shipped with software to save bandwidth and download time.
Contact and user support
For contacting the SMW Project, see the contact page. For comments and questions, there is an active user mailing list that you can join. See Help:Getting support for further information about support for SMW.
Bugs and feature requests for SMW can also be filed at MediaZilla; see the documentation on reporting bugs.
Browsing interfaces
Semantic MediaWiki provides several simple interfaces for browsing the data of the wiki.
The Factbox
The Factbox is a box at the bottom of wiki pages which summarises the semantic data that was entered into the page. This also helps editors to check whether Semantic MediaWiki actually «understood» the supplied information as intended. Users can read the Factbox to get a quick overview, and use its links to reach further information. Note that the Factbox might be switched off by the site administrator, as some wikis do not wish to show the added information on each page (see Help:Configuration).
Factboxes show information in two columns: the left column displays the property that some information belongs to (e.g. population), while the right column shows the value of that property (e.g. 3,410,000). Each property name is a link to the property's article in the wiki, where one can normally find more information about a property's meaning and usage. Annotations in the Factbox usually provide links to involved wiki pages, and for properties that support units of measurement, the Factbox also shows converted values in other units.
The icon next to each property value links to a simple search at Special:SearchByProperty (see below). For example, if an article contains the annotation [[is located in::Germany]] then its Factbox links to a search listing all pages with the same annotation (i.e. everything located in Germany). Similarly, the header of the Factbox shows an icon that links to a simple semantic browser for the given page (see below).
The Factbox may also show an icon that when clicked links to external web services that provide more information about a property value. For example, a property for specifying geographic coordinates might link to online mapping services that provide aerial images and maps of the chosen location. Wiki administrators can choose the links that each property displays; see Help:Service links for details on how to do it.
Special properties, i.e. built-in properties that are relevant to SMW, are displayed in italics and show a tooltip when hovering the mouse over them (requires JavaScript). This emphasises their special meaning and helps editors to spot errors.
Finally, the factbox contains a link to retrieve the Factbox contents in the machine-readable OWL/RDF format as explained in the section on Semantic Web technologies in SMW.
Editors can put magic words on a page to hide or display (if nonempty) the Factbox on that page, independently of the global wiki configuration.
Semantic Browsing
Special:Browse offers a simple browsing interface for the computer-readable data of Semantic MediaWiki. Users start by entering the name of a page. The special page then displays all semantic properties of that page (similar to the content of the Factbox), and all semantic links that lead to that page. By clicking on the icon, the user can browse to another article. For a more detailed description and some configuration settings, see Help:Browse.
Simple search interfaces
Semantic MediaWiki provides a number of very simple search forms that allow users to find specific information. These search features are accessed in various special pages:
- Special:SearchByProperty has a simple search form for finding semantic backlinks. Users enter a property name and target value. The search returns a list of all pages that have that property with that value. If you search for a property of a numeric type, and there are only a few results, the nearest results will be shown as well. This can be switched off by setting $smwgSearchByPropertyFuzzy to false in your local settings. This service is directly accessible through the link within the Factbox or the browse page.
- Special:PageProperty displays all values some page has for some property. Users enter a page and a property name. The search displays a list of all values of the property on that page. If Factboxes are enabled in the wiki, the same information can also be read off the Factbox of the page.
Viewing all properties, types, and values
Each property has its own page in the Property namespace, similar to the category pages in MediaWiki's Category namespace. These property pages show all pages using the property together with the property's value(s) on that page, possibly with service links.
Besides the links in the Factbox and the normal MediaWiki search, there are also special ways of finding properties of a wiki:
- Special:Properties lists properties that appear in annotations ordered by the frequency of their usage. Most frequently used properties are displayed on top.
- Special:UnusedProperties lists property pages that are not used in any annotations. This may indicate that a property was abandoned and should be deleted.
- Special:WantedProperties lists properties that are used but do not have a descriptive page. This is not desirable, since users then cannot find any documentation on such properties, so that confusion may arise regarding their proper use.
- Special:Types lists the available datatypes for properties. See Help:Editing for an introduction to datatypes and their relevance in SMW.
These special pages are particularly useful for wiki gardeners and editors, since they give some overview of how properties are used in the wiki.
These approaches all search for semantic properties of pages. To simply view existing pages about properties, categories, and types, even if unused or garbled, use MediaWiki's Special:Allpages to display all pages within these namespaces.
External tool support
Semantic MediaWiki makes semantic knowledge available to external tools via its OWL/RDF export, and it therefore is possible to write external tools that implement further advanced browsing and searching functionality. Currently, many tools that work on RDF output are not very user-friendly yet, but the supplied data format RDF is easy to process and could be integrated in much more elaborate web or desktop applications. For more information, there is a list of tools that have been tested with Semantic MediaWiki so far. Feel free to add your own tool there as well.
Semantic search
Semantic MediaWiki includes an easy-to-use query language which enables users to access the wiki's knowledge. The syntax of this query language is similar to the syntax of annotations in Semantic MediaWiki. This query language can be used on the special page Special:Ask, in concepts, and in inline queries. This page provides a short introduction to semantic search in general. More detailed explanations are found on other pages of this manual:
- Help:Selecting pages: explains the basic way to describe what pages should appear in a query result. This is the core of SMW's query language.
- Help:Displaying information: introduces printout statements as a way of showing additional information in queries, such as property values or category assignments.
- Help:Concepts: shows how queries can be saved in concepts, which are a kind of «dynamic categories» offered by SMW.
- Help:Inline queries: explains ways of including query results into wiki pages, and shows how to format the query results for display. This is the purpose of the SMW parser functions #ask and #show.
- Help:Inferencing: explains how one can specify general schematic knowledge in SMW (and what this is in the first place). This feature is used by SMW to smartly deduce facts that were not directly entered into the wiki.
Naturally, answering queries requires additional resources, and the administrators of some sites can decide to switch off or restrict query features in order to ensure that even high-traffic sites can handle the additional load.
Introduction
Semantic queries specify two things:
- Which pages to select
- What information to display about those pages
All queries must state some conditions that describe what is asked for. You can select pages by name, namespace, category, and most importantly by property values. For example, the query
[[Located in::Germany]]
is a query for all pages with the "Located in" property with a value of "Germany". If you enter this in Special:Ask and click "Find results", SMW executes the query and displays results as a simple table of all matching page titles. If there are many results, they can be browsed via the navigation links at the top and bottom of the query results, for example a query for all persons on semanticweb.org.
The second point is important to display more information. In the example above, one might be interested in the population of the things located in Germany. To display that on Special:Ask, one just enters the following into the printout box on the right:
?Population
and SMW displays the same page titles and the values of the Population property on those pages, if any. Printout statements may have some additional settings to further control how the property is displayed.
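The same selection and printout can also be placed on a wiki page with the #ask parser function covered under inline queries. The following is only a minimal sketch; it assumes the wiki really uses properties named «Located in» and «Population», as in the running example.

```wikitext
{{#ask: [[Located in::Germany]]
 | ?Population
}}
```

The part before the first | states which pages to select, and each line starting with ? is a printout statement naming a property to display.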
Selecting pages
The most important part of the Semantic search features in Semantic MediaWiki is a simple format for describing which pages should be displayed as the search result. Queries select wiki pages based on the information that has been specified for them using Categories, Properties, and maybe some other MediaWiki features such as a page's namespace. The following paragraphs introduce the main query features in SMW.
Categories and property values
In the introductory example, we gave the single condition [[Located in::Germany]] to describe which pages we were interested in. The markup text is exactly what you would otherwise write to assert that some page has this property and value. Putting it in a semantic query makes SMW return all such pages. This is a general scheme: The syntax for asking for pages that satisfy some condition is exactly the syntax for explicitly asserting that this condition holds.
The following queries show what this means:
- [[Category:Actor]] gives all pages directly or indirectly (through a sub-, subsub-, etc. category) in the category.
- [[born in::Boston]] gives all pages annotated as being about someone born in Boston.
- [[height::180cm]] gives all pages annotated as being about someone having a height of 180cm.
By using other categories or properties than above, we can already ask for pages which have certain annotations. Next let us combine those requirements:
[[Category:Actor]] [[born in::Boston]] [[height::180cm]]
asks for everybody who is an actor and was born in Boston and is 180cm tall. In other words: when many conditions are written into one query, the result is narrowed down to those pages that meet all the requirements. Thus we have a logical AND. By the way: queries can also include line breaks in order to make them more readable. So we could as well write:
[[Category:Actor]]
[[born in::Boston]]
[[height::180cm]]
to get the same result as above. Note that queries only return the articles that are positively known to satisfy the required properties: if there is no property for the height of some actor, that actor will not be selected.
When specifying property values, SMW will usually ignore any initial and trailing whitespace, so the two conditions [[height::180cm]] and [[height:: 180cm ]] mean the same. Datatypes such as number may have additional features such as ignoring commas that might be used to separate the thousands. SMW will also treat synonymous page names the same, just like MediaWiki would usually consider Semantic wiki, Semantic_wiki, and semantic wiki to refer to the same page.
Property values: wildcards and comparators
In the examples above, we gave very concrete property conditions, using «Boston» and «180cm» as values for properties. In many cases, one does not look for only one particular value, but for a whole range of values, such as all actors that are taller than 180cm. In some cases one may even just look for all pages that have any value for a given property at all. For example, the deceased people could be those which have a value for the property «date of death». Such general conditions are possible with the help of comparators and wildcards.
- Wildcards are written as "+" and allow any value for a given condition. For example, [[born in::+]] returns all pages that have any value for the property «born in».
Comparators are special symbols like < or >. They are placed after :: in property conditions. SMW currently supports the following comparators:
- > and <: greater than/less than or equal
- !: unequal
- ~: «like» comparison for strings (disabled by default)
Comparators work only for property values, but not for conditions on categories. A wiki installation can limit which comparators are available, which is done by the administrator by modifying the value of $smwgQComparators as explained in the file SMW_Settings.php.
Greater than or equal, less than or equal
With numeric values, you often want to select pages with property values within a certain range. For example
[[Category:Actor]] [[height::>6 ft]] [[height::<7 ft]]
asks for all actors that are between 6 feet and 7 feet tall. Note that this takes advantage of the automatic unit conversion: even if the height of the actor was set with [[height::195cm]] it would be recognized as a correct answer (provided that the datatype for height understands both units, see Help:custom units). Note that the comparator means greater/less than or equal; the equality symbol = is not needed.
Such range conditions on property values are mostly relevant if values can be ordered in a natural way. For example, it makes sense to ask [[start date::>May 6 2006]] but it is not really helpful to say [[homepage URL::>http://www.somewhere.org]].
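Because both comparators include the boundary value, two conditions with the same bound narrow the result down to that exact value. The line below is a small sketch of this, assuming the inclusive behaviour described above and a «height» property as in the earlier examples.

```wikitext
[[Category:Actor]] [[height::>180cm]] [[height::<180cm]]
```

Under those assumptions this should select the same actors as [[height::180cm]], apart from possible rounding effects when units are converted.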
If a datatype has no natural linear ordering, Semantic MediaWiki will just apply the alphabetical order to the normalised datavalues as they are used in the RDF export. You can thus use greater than and less than to select alphabetic ranges of a string property. For example, you could ask [[surname::>Do]] [[surname::<G]] to select surnames between «Do» and up to «G». For wiki pages, the comparator refers to the name of the given page (without the namespace prefix).
Here and in all other uses of comparators, it might happen that a searched for value really starts with a symbol like <. In this case, SMW can be prevented from interpreting the symbol as a comparator if a space is inserted after ::. For example, [[property:: <br>]] really searches for pages with the value «<br>» for the given property.
Not equal
You can select pages that have a property value which is unequal to a given value. For example, [[Area code::!415]] will select pages that have an area code which is not «415». Note that this query description does not look for pages which do not have an area code 415. Rather, it looks for all pages that (also) have a code unequal to 415. In particular, pages that have no area code at all cannot be the result of the above query.
As with the (default) equality comparator, the use of custom units may require rounding in numeric conversions that can lead to unexpected results. For example, [[height::!6.00 ft]] may still select someone whose height displays as «6.00 feet» simply because the exact numeric value is not really 6. In such situations, it might be more useful to query for pages that have a property value outside a certain range, expressed by taking a disjunction (see below) of conditions with < and >.
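A query for values outside a range could look like the sketch below. The bounds 5.9 ft and 6.1 ft are arbitrary choices for illustration, and the OR operator is the one introduced in the section on disjunctions.

```wikitext
[[height::<5.9 ft]] OR [[height::>6.1 ft]]
```

Since < and > include the boundary value, pages with a height of exactly 5.9 ft or 6.1 ft would also match; the bounds should be picked with that in mind.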
String comparisons: Like
The comparator ~ only works for properties of Type:String. In a like condition one uses '*' wildcards to match any sequence of characters and '?' to match any single character. For example, one could ask [[Address::~*Park Place*]] to select addresses containing the string «Park Place», or [[Honorific::~M?.]] to select both «Mr.» and «Ms.».
Unions of query results: disjunctions
Disjunctions are OR-conditions that admit several alternative conditions on query results. SMW has two ways of writing disjunctions in queries:
- The operator OR is used for taking the union of two queries.
- The operator || is used for disjunctions in values, page, and category names.
In any case, the disjunction requires that at least one (but maybe more than one) of the possible alternatives is satisfied (logical OR). For example, the query
[[born in::Boston]] OR [[born in::New York]]
describes all pages of people born in Boston or New York. This can also be written with || as [[born in::Boston||New York]]. In the latter case, «Boston||New York» describes a value that may be either of the two alternatives. Writing queries with || is usually more concise, but not all disjunctions can be written in this way. The following is an example that cannot be expressed with ||:
[[born in::Boston]] OR [[Category:Actor]]
The || syntax can be used not only in property values, but also with categories, as in the query [[Category:Musical actor||Theatre actor]].
Describing single pages
So far, all conditions depended on some annotation given within a page. But there are also conditions that directly select specific pages, or pages from a given namespace.
Directly giving some page title (possibly including a namespace prefix), or a list of such page titles separated by ||, selects the pages with those names. An example is the query
[[Brazil||France||User:John Doe]]
which has three results (at least if the pages exist). Note that the result does not display any namespace prefixes; see the hover box or status bar of the browser, or follow the links to determine the namespace. Restricting the set based on an attribute value one could ask, e.g., «Who of Bill Murray, Dan Aykroyd, Harold Ramis and Ernie Hudson is taller than 6ft?». But direct selection of articles is most useful if further properties of those articles are asked for, e.g. to simply print the height of Bill Murray.
To select a category in this way, a : must be put before the category name. This avoids confusing [[Category:Actor]] (return all actors) and [[:Category:Actor]] (return the category «Actor»).
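The two questions mentioned above, who of the four actors is taller than 6 ft and what the height of Bill Murray is, might be written as the inline queries sketched here. This assumes that the pages exist and carry a «height» property as in the earlier examples; #ask and #show are the parser functions described under inline queries.

```wikitext
{{#ask: [[Bill Murray||Dan Aykroyd||Harold Ramis||Ernie Hudson]] [[height::>6 ft]]
 | ?height
}}

{{#show: Bill Murray | ?height }}
```

The first query restricts a fixed list of pages by a property condition, while the second simply prints one property value of a single page.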
Restricting results to a namespace
A less strict way of selecting given pages is via namespaces. The default is to return pages in every namespace. To return pages in a particular namespace, specify the namespace with a «wildcard», e.g. write [[Help:+]] to return every page in the «Help» namespace. Since the main namespace usually has no prefix, write [[:+]] to select only pages in the main namespace.
Disjunctions work again with the || syntax as above. For example, to return pages in either the main or «User» namespace, write [[:+||User:+]]. To return pages in the «Category» namespace, a : is again needed in front of the namespace label to prevent confusion.
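Namespace conditions can be combined with other conditions like any other part of a query. The lines below are a sketch only: the first applies the leading colon to the «Category» namespace as described above, and the second combines a namespace restriction with the «born in» property from earlier examples.

```wikitext
[[:Category:+]]
[[User:+]] [[born in::Boston]]
```

The second line would return only pages in the «User» namespace that are annotated as born in Boston, since all conditions in a query must hold together.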
Subqueries and property chains
Enumerating multiple pages for a property is cumbersome and hard to maintain. For instance, to select all actors that are born in an Italian city one could write:
[[Category:Actor]] [[born in::Rome||Milan||Turin||Florence||...]]
To generate a list of all these Italian cities one could run another query
[[Category:City]] [[located in::Italy]]
and copy and paste the results into the first query. What one would like to do is to use the city query as a subquery within the actor query to obtain the desired result directly. Instead of a fixed list of page names for the property's value, a new query enclosed in <q> and </q> is inserted within the property condition. In this example, one can thus write:
[[Category:Actor]] [[born in::<q>[[Category:City]] [[located in::Italy]]</q>]]
Arbitrary levels of nesting are possible, though nesting might be restricted for a particular site to ensure performance. For another example, to select all cities of the European Union you could write:
[[Category:Cities]] [[located in::<q>[[member of::European Union]]</q>]]
In the above example, we essentially have constructed a chain of properties «located in» and «member of» to find things that are located in something which is a member of the EU. Queries can be written in a shorter form for this common case:
[[Category:Cities]] [[located in.member of::European Union]]
This query has the same meaning as above, but with far fewer special symbols required. In general, chains of properties are created by listing all properties separated by dots. In the rare case that a property should contain a dot in its name, one may start the query with a space to prevent SMW from interpreting this dot in a special way.
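Chains may contain more than two properties. As a sketch that reuses the property names from the examples above, the following two queries should describe the same actors: those born in a place that is located in a member of the European Union.

```wikitext
[[Category:Actor]] [[born in.located in.member of::European Union]]

[[Category:Actor]] [[born in::<q>[[located in::<q>[[member of::European Union]]</q>]]</q>]]
```

The chained form is easier to read, while the nested form allows additional conditions (such as a category) inside each subquery.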
NOTE: It is not possible to use a subquery to obtain a list of properties that is then used in a query. See #Subqueries for properties below.
Using templates and variables
Arbitrary templates and variables can be used in a query. An example is a selection criterion that displays all future events based on the current date:
[[Category:Event]] [[end date::>{{CURRENTYEAR}}-{{CURRENTMONTH}}-{{CURRENTDAY}}]]
Another particularly useful variable for inline queries is {{FULLPAGENAME}} for the current page with namespace, which allows you to reuse a generic query on many pages. For an example of this, see Property:Population.
Read about inline queries for more information.
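A generic query of this kind might look like the sketch below, which could be placed in a template that is used on every country page. The category and property names are taken from the earlier examples and are assumptions about how a particular wiki is annotated.

```wikitext
{{#ask: [[Category:City]] [[located in::{{FULLPAGENAME}}]]
 | ?Population
}}
```

On a page called «Italy» the variable expands to «Italy», so the query lists the cities located in Italy; on another country page the same text lists that country's cities.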
Sorting results
It is often helpful to present query results in a suitable order, for example to present a list of European countries ordered by population. Special:Ask has a simple interface to add a sorting condition to a query. The name of the property to sort by is entered into a text input, and ascending or descending order can be selected. SMW will usually attempt to sort results by the natural order that the values of the selected property may have: numbers are sorted numerically, strings are sorted alphabetically, dates are sorted chronologically. The order therefore is the same as in the case of the < and > comparators in queries. If no specific sorting condition is provided, results will be ordered by their page name.
It is possible to provide more than one sorting condition. If multiple results turn out to be equal regarding the first sorting condition, the next condition is used to order them and so on. A query for actors, e.g., could be ordered by year of birth and use the last name of the actor as a second ordering condition. All actors that were born in the same year would thus be ordered alphabetically by their last name instead of appearing in random order.
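In an inline query, several sorting conditions can be given as comma-separated lists in the sort and order parameters of #ask. The sketch below follows the actor example; the property names «Year of birth» and «Last name» are hypothetical.

```wikitext
{{#ask: [[Category:Actor]]
 | ?Year of birth
 | ?Last name
 | sort=Year of birth,Last name
 | order=ascending,ascending
}}
```

Actors are ordered by year of birth first; the second property only decides the order among actors with the same year.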
Sorting a query can also influence the result of a query, because it is only possible to sort by property values that a page actually has. Therefore, if a query is ordered by a property (say «Population») then SMW will usually restrict the query results to those pages that have at least one value for this property (i.e. only pages with specified population appear). In other words, if the query does not yet require that the property is present in each query result, then SMW will silently add this condition. But SMW will always try to find the ordering property within the given query first, and it is even possible to order query results by subproperties. Some examples should illustrate this:
- [[Category:City]] [[Population::+]] ordered by «Population» will present the cities with population in ascending order. The query result is the same as without the sorting.
- [[Category:City]] ordered by «Population» will again present the cities with population in ascending order. The query result may be modified due to the sorting condition: if there are cities without a population given, then these will no longer appear in the result.
- [[Category:City]] [[has location country.population::+]] ordered by «Population» will present the cities ordered by the population of the countries they are located in. The query result is not changed, but «population» now refers to a property used in a subquery.
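The filtering effect of a sort condition can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not SMW's implementation: the `pages` dictionary, the function name `ask_sorted` and the population figures are all made up for the example.

```python
# Hypothetical in-memory stand-in for wiki pages and their property values.
# The figures are illustrative only.
pages = {
    "Berlin": {"Population": [3400000]},
    "Bonn": {"Population": [310000]},
    "Atlantis": {},  # a city page without any population value
}


def ask_sorted(pages, sort_property):
    """Return page names ordered ascending by sort_property.

    Pages without a value for sort_property are dropped, mirroring the
    condition that SMW silently adds when sorting by a property.
    """
    with_value = {
        name: props[sort_property]
        for name, props in pages.items()
        if props.get(sort_property)
    }
    # Assumption: use the smallest value as the sort key when a page has
    # several values. SMW itself leaves that position undefined.
    return sorted(with_value, key=lambda name: min(with_value[name]))


print(ask_sorted(pages, "Population"))  # ['Bonn', 'Berlin']
```

The page without a population disappears from the result, just as a city without a population would in the second example above.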
If a property that is used for sorting has more than one value for some page, then this page will still appear only once in the result list. The position that the page takes in this case is not defined by SMW and may correspond to any of the property values. In the above examples, this would occur if one city had multiple population numbers specified, or if one city were located in multiple countries, each of which has a population. It is best to avoid such situations.
Query results displayed in a result table can also be ordered dynamically by clicking on the small sort icons found in the table heading of each column. This function requires JavaScript to be enabled in the browser and will sort only the displayed results. So if, e.g., a query has retrieved the world's twenty largest cities by population, it is possible to sort these twenty cities alphabetically or in reverse order of population, but the query will not show the world's twenty smallest cities when the order of the population column is reversed. The dynamic sorting of tables attempts to use the same order as used in SMW queries, and in particular orders numbers and dates in a natural way. However, the alphabetical order of strings and page names may vary slightly from the wiki's alphabetical order, simply because there are many international alphabets that can be ordered in different ways depending on the language preference.
Linking to Semantic Search Results
Links to semantic query results on Special:Ask can be created by means of the inline query feature in SMW as explained in its documentation. It is not recommended to create links directly, since they are very lengthy and use a specific encoding. Developers who create extensions that link to Special:Ask should also use SMW's internal functions for building links. Understanding the details of SMW's encoding of queries in links is therefore not required for using SMW.
Things that are not possible
Subqueries for properties
It is not possible to use a subquery to obtain a list of properties that is then used in a query. One can, however, use a query that returns a list of properties, and copy and paste the result into another query. Alternatively, one can use the template results format to pass properties directly to another query.
Queries with special properties
SMW currently does not support queries for the values of any of SMW's built-in Special properties such as «Has type», «Allows value» or «Equivalent URI».
Displaying information
Queries in Semantic MediaWiki return a list of pages, and the default result of a query therefore simply displays the selected pages' titles. Additional information, such as a page's property values or categories, can be included in a query result by using the printout statements that are introduced here. In Special:Ask, printout statements can simply be entered into the input box on the right, with one statement per line.
There are different kinds of printout statements, but all of them can be recognised by the question mark ? that they start with. The important difference between printout statements and query descriptions is that the former do not restrict the result set in any way: even if some printout has no values for a given page, an empty field will be printed, but the page is still part of the result.
Printing property values
The most common form of printout statement is the property printout, which makes SMW display all values assigned to a certain property. These are written simply as a question mark followed by the property name, e.g.
?population
prints the values for «population» of all query results. On Special:Ask, the result of each printout is shown in a table column that is labelled by the name of the property. It is possible to change that label for a printout, and this will be very useful when using queries on wiki pages (it is not really relevant on Special:Ask of course). The equality symbol is used to change the label:
?population = Number of inhabitants
The above still prints population, but with the modified label in the table header. As mentioned above, property printouts may have an empty result for some pages, e.g. if something does not have any population. Property conditions with wildcards (see above) can be used to ensure that all elements in a query result have some value for the printed property, if this is desired.
Printing categories
There are two ways to print category information: either SMW prints all categories assigned to some page, or SMW checks for one particular category. The first case is achieved by the printout
?Category
where «Category» is the name of the Category namespace in the local language. This printout will show all categories that are directly used on a result page. The other option is to ask for one particular category, such as
?Category:Actor
The result then will contain a column «Actor» that contains X for all pages that directly belong to that category, and is empty otherwise. Again, one can change the label using equality:
?Category:Actor = A
will merely display an «A» as the header of the result column, which might be more sensible given that the entries in that column are very short. It is also possible to change the way in which this kind of category query is formatted, as described below.
The main result column
All queries by default display the main list of result pages in the first column. In some cases, it can be useful to move that to another position. This is not relevant for Special:Ask, but can be quite useful in inline queries. A special printout statement is available for this purpose:
?
This single question mark addresses the «unlabelled result column» that shows the main result list. As before, different labels can be assigned with the equality symbol, e.g.
? = Results
Display format
Many printout statements can be further customised by giving a printout format which can be given after a property name, separated by the symbol #.
For properties that support units, queries can thus determine which unit should be used for the output. To print the height in cm, e.g., one would use the following:
?height#cm
This assumes that the property height is aware of the unit «cm». Datatypes other than Type:Number may have different printout formats. See the types documentation for details.
For printouts of the form ?Category:Actor, the display format can be used to modify what SMW will display for cases where a page is (or is not) in the category. The following is an example:
?Category:Actor#an actor, not an actor
This will show the text «an actor» for all pages that are actors, and the text «not an actor» otherwise. This can, for example, also be used in combination with small images to display icons for certain categories.
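The anatomy of a printout statement (a leading question mark, an optional display format after #, and an optional label after the equality symbol) can be made concrete with a small parser. The following Python sketch is purely illustrative: `Printout` and `parse_printout` are hypothetical names, and the splitting rules are a simplification of the syntax described above rather than SMW's actual parser.

```python
from typing import NamedTuple, Optional


class Printout(NamedTuple):
    target: str           # property name, "Category", "Category:X", or "" for the main column
    fmt: Optional[str]    # text after '#', if any
    label: Optional[str]  # text after '=', if any (may be the empty string)


def parse_printout(statement: str) -> Printout:
    """Split a printout statement into target, display format and label."""
    text = statement.strip()
    if not text.startswith("?"):
        raise ValueError("printout statements start with '?'")
    body = text[1:]
    label = None
    if "=" in body:
        # Everything after the first '=' is the label.
        body, label = body.split("=", 1)
        label = label.strip()
    fmt = None
    if "#" in body:
        # Everything between '#' and the label is the display format.
        body, fmt = body.split("#", 1)
        fmt = fmt.strip()
    return Printout(body.strip(), fmt, label)


print(parse_printout("?height#cm"))
print(parse_printout("? = Results"))
```

Note that an empty label (as in `?population =`) is kept as an empty string and is therefore distinguishable from a printout that has no label at all.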
Concepts
It is possible to store queries in Semantic MediaWiki on dedicated pages, called concepts. These pages can be viewed as «dynamic categories», i.e. as collections of pages that are not created manually, but that are computed by SMW from the description given by a query. An example could be the concept of European cities. In traditional MediaWiki installations, one may have a category called European cities that holds all such cities. In SMW, one would instead define the concept «European cities» by saying that it contains all cities that are located in Europe. No city page needs to be changed, and yet one can create many concepts about cities (such as «capital», «Italian city», or «large coastal city located at a river»).
Creating a concept
A concept is a page in the Concept: namespace that is always described by a semantic query, as explained in Help:Semantic search. For example, the Concept:Semantic Web events 2008 describes certain events in 2008. Its concept page contains the following text to do that:
{{#concept: [[Category:Event]] [[start date::> Jan 1 2008]] [[start date::< Dec 31 2008]] | Events in the year 2008 that have been announced on semanticweb.org. To add more events, go to the page "Events" on semanticweb.org. }}
The parser function #concept is used to define concepts. Its first parameter is a concept description. Its second parameter is a short text that describes the concept. This text is optional and can be omitted. It is used in some places in SMW that need a concise description of the concept (e.g. as a default description in RSS feeds). The complete concept page will then show this data, and give a preview of the results.
It is possible to have other content on the concept page as well. Any normal wiki text can go before and after the use of #concept but it will not have any effect on the definition of the concept. The #concept parser function can only be used on pages in the Concept: namespace, and it can only be used once on each such page.
Using concepts
Concept pages as such can be browsed to view the contents of some concept, similar to category pages. But they can also be used in other semantic queries just like categories. For example, the following query would show all pages in the above concept of events which are also located in Germany:
[[Concept:Semantic Web events 2008]] [[located in::Germany]]
Note that this would look almost the same if we had a category called «Semantic Web events 2008». Concepts are therefore also like stored queries that can be reused in other queries if desired.
SMW's inline queries may also use concepts, and in some cases even the concept description is used to beautify an output. Concept descriptions are also included in SMW's RDF export in form of OWL class descriptions, so that other Semantic Web tools can download and reuse the concept descriptions.
Inline queries
Semantic MediaWiki includes a simple query language for semantic search, so that users can directly request certain information from the wiki. Readers who do not wish to learn the query syntax can still profit from this feature: inline queries dynamically include query results into pages. So queries created by a few editors can be consumed by many readers.
Inline queries are similar to other semantic search features, and can also be restricted on a site in order to ensure sufficient performance. Since inline queries exploit the existing caching mechanisms of MediaWiki, most requests for a page with such dynamic contents can be served without any performance impact whatsoever.
Introduction to #ask
The basic way of writing an inline query is to use the parser function #ask. The query string (see selecting pages for syntax) and any printout statements are given directly as parameters, as in the following example:
{{#ask: [[Category:City]] [[located in::Germany]] | ?population | ?area#km² = Size in km² }}
Here we query for all cities located in Germany, and two additional printout statements are used (a simple one and one with some extra settings). On a page, this displays a table that lists each matching city together with its population and its area in km².
It is common to put the query as the first parameter behind #ask:. All other parameters are separated by |, just like for other parser functions. The exact formatting of the inline query is not essential, but it is good to use line breaks to make it more readable to other editors. As with all templates, one line per parameter, starting with the | is most accepted in practice.
Note that all the arguments to the #ask: function are ignored by the page parsing, hence the above example does not add a category or a «located in» property annotation to this page. A few more things to note are:
- The pipe '|' symbol is used to separate the conditions from the property to display.
- The conditions for display are a single argument to the #ask function, so there are no '|' symbols between them.
- White space and line breaks can be used within the #ask function; SMW is fairly flexible there.
- The format of the results display changes when you request display of additional properties. SMW picks an appropriate default format for query results, but you also have detailed control of the appearance of query results.
Knowing the basics of query string and printout statements therefore is enough to write many kinds of queries. But there are many cases where the standard table output of a query may not be the best choice, or where further settings are desired (like the maximum number of results that should be displayed). For this purpose, inline queries have a number of other possible parameters that one can use to control their appearance in detail. The general syntax for #ask therefore is the following:
{{#ask: argument 1 | argument 2 | … }}
Most of this page explains the various arguments one may use in inline queries.
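For editors or tool authors who generate wiki text programmatically, the general shape of an #ask call can be assembled mechanically. The helper below is a minimal Python sketch under the assumption that conditions, printouts and further parameters are already valid SMW syntax. `build_ask` is a hypothetical name, not part of SMW, and the function performs no escaping or validation.

```python
def build_ask(conditions, printouts=(), **params):
    """Assemble an #ask call: the query string first, then '|'-separated arguments."""
    # All conditions together form the single first argument (no '|' between them).
    query = " ".join(f"[[{condition}]]" for condition in conditions)
    args = [query]
    # Each printout statement becomes its own argument, introduced by '?'.
    args += [f"?{printout}" for printout in printouts]
    # Remaining settings such as format or limit are passed as name=value.
    args += [f"{name}={value}" for name, value in params.items()]
    return "{{#ask: " + " | ".join(args) + " }}"


print(build_ask(
    ["Category:City", "located in::Germany"],
    ["population", "area#km² = Size in km²"],
))
# {{#ask: [[Category:City]] [[located in::Germany]] | ?population | ?area#km² = Size in km² }}
```

The printed text is the same as the introductory example above; adding `format="ul"` or `limit=3` as keyword arguments appends the corresponding parameters.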
Prior to Semantic MediaWiki 1.0 there was a different syntax for inline queries using an <ask> tag, which is still enabled on some wikis. Please see the old documentation page for details on this feature. It is strongly recommended to use only the new syntax now.
The #show parser function
A common usage of queries is to display only a single property value for a single page. For example, one could insert the population of Berlin into some article, and use a query instead of manual copying to achieve this. SMW has a special shortcut to make such queries simpler. For example, one can write
{{#show: Berlin | ?population}}
to display the population of Berlin. The function otherwise works like an inline query, and all parameters available for inline queries can also be used on #show if desired. The above function can also be written as an #ask query as follows:
{{#ask: [[Berlin]] | ?population = }}
Here the equality symbol assigns another label for displaying the property, and this label is empty. Without this, the result would display «Population:» before the actual number.
Standard parameters for inline queries
In general, an inline query is a request to find a number of pages that satisfy certain requirements. The query must answer three questions:
- Which pages are requested? (query description)
- What information should be displayed about those pages? (printout statements)
- How should the results be formatted within the page?
The first two points are explained in their respective manual pages. The third point is important to smoothly include query results in pages, yet is largely independent of the first two. Without further settings, queries often produce tables like above or simple lists (if no additional printouts are used). An example of another possible format is the bulleted list, which one can create with the parameter format=ul:
{{#ask: [[Category:City]] [[located in::Germany]] | ?Population | format=ul }}
This will produce the following output:
- Aalen
- Amberg
- Andernach
- Ansbach
- Aschaffenburg
- Augsburg
- Aurich
- Backnang
- Bad Oeynhausen
- Bamberg
- Bayreuth
- Berlin
- Bielefeld
- Bochum
- Bonn
- Braunschweig
- Bremen
- Burghausen
- Calw
- Chemnitz
- Coburg
- Cologne
- Cottbus
- Darmstadt
- Deggendorf
- Dinslaken
- Dortmund
- Dresden
- Duesseldorf
- Duisburg
- Ebersberg
- Einbeck
- Erfurt
- Erlangen
- Essen
- Esslingen am Neckar
- Flensburg
- Frankfurt
- Freiburg
- Fürth
- Georgsmarienhütte
- Gera
- Gerlingen
- Gerolzhofen
- Goettingen
- Gotha
- Greifswald
- Halle
- Hamburg
- Hannover
SMW implements a wide variety of output formats for inline queries, and allows you to further control the display of results using a MediaWiki template. The parameter format is one of the most important parameters for selecting the appearance of query results and will be explained in more detail below. The following table gives an overview of common parameters that can be used in basically all queries:
Parameter | Possible values | Description |
---|---|---|
format | a format name (see below) | selected output format; some formats allow further parameters (see #Result formats) |
limit | non-negative number | maximal number of pages selected (in the case of a table: rows) |
offset | number | where to start |
sort | property name or a list of property names separated by , | name of properties to use for sorting queries (see Help:Selecting pages) |
order | ascending/asc, descending/desc/reverse, or a list of those if more than one property is used for sorting | defines how results should be ordered, only applicable if sort is used, ascending is the default (see Help:Selecting pages) |
headers | show, hide | shows or hides the labels/headers used in some output formats such as «table», show is default |
mainlabel | plain text | title of the first column (the one with the page titles in it), default is no title; set to - to suppress printing the page titles |
link | none, subject, all | defines which article names in the result are hyperlinked, all normally is the default |
default | plain text | if, for any reason, the query returns no results, this will be printed instead |
intro | plain text | initial text that prepends the output, if at least some results exist |
outro | plain text | text that is appended to the output, if at least some results exist |
searchlabel | plain text | text for continuing the search (default is «… further results») |
In addition to the above, some formats have their own parameters that control special aspects of the format. These special settings are described in the documentation of each format.
Result limits and links to further results
You can set the parameter limit to restrict the maximum number of results that are returned. For example, the query
{{#ask: [[Category:City]] [[located in::Germany]] | limit=3 }}
displays at most 3 cities in Germany. Even if you do not specify a value for limit, SMW always applies some limit to the results a query returns. Depending on a site's settings, it might be possible to increase the number of displayed results by specifying a higher value for limit. However, there is usually a maximum limit that cannot be exceeded, set by the wiki administrator based on performance considerations.
Running the above query produces: Aalen, Amberg, Andernach... further results
This shows that whenever a query does not display all results due to a limit, it will normally show a link to «further results». The text of this link can be modified by setting the parameter searchlabel. If the value of searchlabel is empty, then no link to further results appears. Some output formats (see below) never display the search link, or display it only if a searchlabel was specified.
An interesting application of limit and searchlabel is to display only a link to the results of a search, without showing any result inline. You achieve this by specifying a limit of «0» or «-1». For instance, the query
{{#ask: [[Category:City]] | limit=0 | searchlabel=Click to browse a list of cities }}
displays: Click to browse a list of cities. This link will only appear if there are any results at all. In other words, SMW will still compute the query to check if there are any results. If this is not needed, or if a link should be shown in any case, one can use the limit «-1». SMW will then only print a link to further results, even if no results exist at all. This also saves some computation time on the server.
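The interplay of limit and searchlabel can be summarised as a small decision procedure. The following Python sketch mimics the behaviour described in this section for a comma-separated list. It is a simplified model, not SMW code: `render_limited` is a hypothetical name, the link is represented by its plain text, and every negative limit is treated like «-1».

```python
def render_limited(results, limit, searchlabel="… further results"):
    """Mimic the limit / searchlabel behaviour described above for a list format."""
    if limit < 0:
        # limit=-1: print only the link, without checking for results.
        return searchlabel
    if limit == 0:
        # limit=0: print only the link, and only if at least one result exists.
        return searchlabel if results else ""
    shown = ", ".join(results[:limit])
    # The link appears only if results were cut off and the label is not empty.
    if len(results) > limit and searchlabel:
        shown += " " + searchlabel
    return shown


cities = ["Aalen", "Amberg", "Andernach", "Ansbach"]
print(render_limited(cities, 3))
print(render_limited(cities, 0, "Click to browse a list of cities"))
print(render_limited([], -1, "Click to browse a list of cities"))
```

With limit 0 and an empty result list the function returns an empty string, whereas limit -1 returns the label regardless of the results.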
Introduction and default text
If no articles satisfy the conditions of a query, nothing is shown. This is sometimes a useful behaviour, but often certain texts should be shown or not shown depending on whether the query has results or not. For example, one may want the query to show an output of the following form:
Upcoming conferences: ISWC2008, IJCAI2007, …
where the list of conferences is generated by a suitable query. If the query (for whatever reason) would not return any results, the page would look as follows
Upcoming conferences:
which is not desirable. Two parameters exist to prevent this.
- default: this parameter can be set to a default text that should be returned when no results are obtained. In the above example, one would probably write something like
Upcoming conferences: {{#ask: ... | default=none}}
- so that, if no result is obtained, the article will display
Upcoming conferences: none
- intro: this parameter specifies a text that should be prepended to the output of a query, but only if one or more results exist. In the above example, one could write
{{#ask: ... | intro=Upcoming conferences:_}}
- so that, if no result is obtained, nothing will be printed at all. Note that we use «_» to encode the final space. This is needed for initial and final spaces in any parameter, since those are otherwise removed internally (this is always the case in MediaWiki and is not specific to SMW).
Both of the above solutions will show the intended output if results are found. It is also possible to combine both parameters if desired. The parameters can also include MediaWiki markup, such as links or templates, as long as this does not confuse MediaWiki in recognising the #ask function.
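The effect of the two parameters can be modelled in a few lines. This Python sketch only illustrates the rules stated above. `render_with_texts` and `decode_edge_spaces` are hypothetical names, and the handling of «_» is restricted to the first and last character, which is an assumption made for the example.

```python
def decode_edge_spaces(text):
    """Turn a leading or trailing '_' into a space (assumed encoding of edge spaces)."""
    if text.startswith("_"):
        text = " " + text[1:]
    if text.endswith("_"):
        text = text[:-1] + " "
    return text


def render_with_texts(results, intro="", default=""):
    """Return the default text for empty results, else intro plus the result list."""
    if not results:
        # No results: the intro is suppressed and only the default text is shown.
        return default
    return decode_edge_spaces(intro) + ", ".join(results)


print(render_with_texts(["ISWC2008", "IJCAI2007"], intro="Upcoming conferences:_"))
# Upcoming conferences: ISWC2008, IJCAI2007
print(repr(render_with_texts([], intro="Upcoming conferences:_")))
# ''
```

When both parameters are combined, an empty result yields only the default text, while a non-empty result yields the intro followed by the results.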
Also note that if the set of pages selected in a query is empty, no header row or blank line, not even any blank space, is produced. This can also be useful to «hide» queries that are not applicable. However, it is not recommended to insert large numbers of queries into every page on the assumption that this can do no harm since no output is generated. Answering queries requires considerable computational resources and should not be done without a purpose.
Using default texts for queries is also a good habit in general, since a query may cease to have any results at some point in the future, e.g. due to changes in the way the wiki organises its data. Queries that once worked properly may then be forgotten, so that nobody notices the query on a page labouring to display nothing.
Sorting results
It has been explained in Help:Selecting pages that query results can be ordered by one or more properties. As explained there, Special:Ask has additional input fields to specify sort properties and ordering. In inline queries, sort properties are defined with the parameter sort, and the order can be set with the parameter order. The value of order should be ascending or descending, or one of the short forms «asc» and «desc», or «reverse». An example is the following query for the three largest cities in Germany:
{{#ask: [[Category:City]] [[Located in::Germany]] | ?Population | sort=Population | order=descending | limit=3 }}
As explained in Help:Selecting pages, sorting conditions may impose restrictions on the set of query results. In the above case, only German cities that have a value for population are considered. If more than one property is used for sorting, the parameters sort and order can be set to lists of property names and orders, respectively, separated by commas. The following is an example:
{{#ask: [[Category:City]] [[Located in::Germany]] | ?State | ?Population | sort=State,Population | order=ascending,descending }}
This query would return all German cities for which a state and a population were specified. These results will be ordered by the name of the state they are located in (ordered alphabetically). Cities that are located in the same state will be ordered by their population, largest first («descending»).
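Sorting by several properties with a separate direction for each can be reproduced with a stable sort that is applied once per key, least significant key first. The Python sketch below illustrates that idea on made-up rows. `sort_results` is a hypothetical helper, the city names and numbers are placeholders, and the code is not a description of how SMW computes the order internally.

```python
def sort_results(rows, sort, order=""):
    """Order rows by comma-separated sort keys with matching comma-separated orders."""
    keys = [key.strip() for key in sort.split(",")]
    orders = [item.strip() for item in order.split(",") if item.strip()]
    # Keys without an explicit order fall back to the default, ascending.
    orders += ["ascending"] * (len(keys) - len(orders))
    result = list(rows)
    # Python's sort is stable, so sorting by the last key first and the
    # first key last yields the combined ordering.
    for key, direction in reversed(list(zip(keys, orders))):
        descending = direction in ("descending", "desc", "reverse")
        result.sort(key=lambda row: row[key], reverse=descending)
    return result


# Placeholder data, not real figures.
rows = [
    {"name": "City A", "State": "Bavaria", "Population": 300},
    {"name": "City B", "State": "Saxony", "Population": 500},
    {"name": "City C", "State": "Bavaria", "Population": 900},
    {"name": "City D", "State": "Saxony", "Population": 200},
]
ordered = sort_results(rows, "State,Population", "ascending,descending")
print([row["name"] for row in ordered])  # ['City C', 'City A', 'City B', 'City D']
```

Within each state the larger city comes first, while the states themselves appear in alphabetical order.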
Configuring labels/table headers
Queries that return more than just the selected articles (e.g. the population in the above example) will display labels that describe the various output fields. By default, the label just displays the name of the requested property, or the text «Category» if categories are displayed. Labels for properties normally display as a link to the respective pages in the Property: namespace.
In the table format, the labels appear as column headers. In other formats, the labels might appear right before the output fields. The texts used for these labels can be controlled as explained in Help:Displaying information, using the equality symbol after printouts. Example:
{{#ask: [[Category:City]] |?Population= | ?Area#km²=Size in km² | ?Category=Category memberships | format=table | default=nothing found in Category:City }}
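This query will produce a table in which the population column has no header, the area column is headed «Size in km²», and the category column is headed «Category memberships».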
It is possible to use empty printout labels to have no label for a result column at all. In tables, however, the table header will still be shown even if all printouts use empty labels. To remove the header of a table completely, the parameter headers can be used. Two values are possible:
- show: display labels (default)
- hide: hide all labels and table headers
This setting works for tables as well as for other outputs. In the latter case, the value hide will hide all printout labels, even if they have a non-empty label set in the query.
Changing the first result column
Most queries by default display the actual result pages in the first result position, e.g. as the first column in a table. The header of this column is usually blank. To change the label, or to hide the whole first column, the parameter mainlabel can be used. Normally, the text given to that parameter will simply be used as a header for the first column, for example in the query
{{#ask: [[Category:City]] [[Located in::Germany]] | mainlabel=City | ?Population=Number of inhabitants | limit=3 }}
This will produce the table:
City | Number of inhabitants |
---|---|
Aalen | |
Amberg | |
Andernach | |
... further results |
The parameter mainlabel can also be used to completely hide the first column. This happens if the value of this parameter is set to «-» (minus symbol). To insert the list of main results at another position, the printout statement «?», i.e. the question mark without any additions, can be used. For example, modifying the example above to display the city name after Population,
{{#ask: [[Category:City]] [[Located in::Germany]] | ?Population=Number of inhabitants | ?=City | mainlabel=- | limit=3 }}
This results in the table:
Number of inhabitants | City |
---|---|
 | Aalen |
 | Amberg |
 | Andernach |
... further results |
Result formats
The parameter format determines how the results of a query are displayed in the article. If it is omitted, all queries are displayed as tables (format table), unless there would be only one column, in which case the results are displayed as a comma-separated list (format list). In addition to the formats provided by SMW, extensions can provide additional formats; see Semantic Result Formats and Semantic Google Maps for two such extensions. The following formats are available in SMW by default:
Format | Description | Additional parameters (usually optional) |
---|---|---|
list | Comma-separated list, with additional outputs shown in parentheses | sep, template |
ol | Ordered list, with additional outputs shown in parentheses | sep, template |
ul | Bulleted list, with additional outputs shown in parentheses | sep, template |
table | Tabular output | |
broadtable | Tabular output, where the table is as wide as the article. | |
embedded | Embed selected articles. | embedonly (if set, don't show article link), embedformat (can be ol, ul, h1, h2 ..., h6) |
template | Print results by passing result fields as parameters to a given template. | template (mandatory) |
count | Just the number of results (a count of the number of matching pages), instead of the results themselves | |
debug | Debugging information for analysing problems in query answering. | |
rss | Print links to RSS feeds for query results. | title, description |
csv | Export result table as CSV (comma-separated values), available since SMW 1.2.1 | sep |
Exporting query results: RSS, etc.
Some of the above formats, and especially those provided by Semantic Result Formats, enable data export from the wiki. Since Semantic MediaWiki release 1.4.2, the iCalendar and vCard formats have been part of Semantic Result Formats. Two aspects of those formats are special:
- They do not show any results at the place where they are inserted.
- They use fixed standard formats for exporting (non-fixed, free-form) wiki content. Hence it must be explained which wiki properties belong to which part of the data export format.
The first point means that only a link to Special:Ask will be shown at the place where the query is inserted. This link is similar to the normal «further results» link, but will use a more adequate default text (something like «RSS»). Yet it is possible to change the link text with the parameter searchlabel as for other queries.
The second point makes it necessary to relate printout statements (properties) to the data fields available in the export format. For example, vCard is a data format that can encode many kinds of contact data about a person, but it cannot represent arbitrary properties. To specify which wiki properties belong to which of the available data fields, the label of the property printout is used. For example, vCard supports (among many others) the data fields «firstname», «lastname» and «homepage». A query could thus be
{{#ask: [[Category:Person]] | ?firstname | ?lastname | ?url = homepage | format=vcard }}
Here the wiki would have properties called «firstname» and «lastname», but the homepage of a person is stored in a property called «url». The label «homepage» is given to the latter so that vCard recognises the special meaning of this property. With this method, wikis can use arbitrary property names (in any language) and still export to standard formats. See the pages of the above formats for details on the data fields they support.
Using templates for custom formatting
Some of the above result formats support the use of a wiki template to fully control the display of an inline query. This works for the formats template, list, ol and ul. If a template is specified, all result «rows» are formatted using this template. The name of the template (without the initial «Template:») is given in the parameter template, so the query has the following general form:
{{#ask: ... | format=template/list/ol/ul | template=templatename }}
For each result in an inline query, SMW then calls the specified template, where the result and printout values are used as numbered template parameters. So a query that would display three columns when formatted as a table, will set three template parameters. These values can then be used in the template in the normal way writing {{{1}}}, {{{2}}}, etc. Each parameter refers to one "field" or column in the results that would be displayed by the inline query for each selected page. Normally the first field a query displays is the page title (see #Changing the first result column), so parameter {{{1}}} is the page title, and {{{2}}}, {{{3}}}, ... are the other properties displayed by the query. A number of examples are given below.
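How result fields map onto numbered template parameters can be simulated outside the wiki. The Python sketch below substitutes {{{1}}}, {{{2}}}, and so on in a template text with the fields of one result row. `apply_template` is a hypothetical helper, the field values are invented, and real MediaWiki template expansion handles far more than this (named parameters, defaults, nested templates).

```python
import re


def apply_template(template_text, fields):
    """Replace {{{1}}}, {{{2}}}, ... with the corresponding result fields."""
    def substitute(match):
        index = int(match.group(1))
        if 1 <= index <= len(fields):
            return str(fields[index - 1])
        # A parameter without a value is left as written.
        return match.group(0)

    return re.sub(r"\{\{\{(\d+)\}\}\}", substitute, template_text)


# Field 1 is the page title, fields 2 and 3 are the two printouts.
template = "{{{2}}} people squeeze into the {{{3}}} of {{{1}}}."
print(apply_template(template, ["Examplecity", "1,000", "10 km²"]))
# 1,000 people squeeze into the 10 km² of Examplecity.
```

The order of the fields in the output is determined entirely by the template text, which is why a template can reorder, omit or repeat columns.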
The template feature allows greater flexibility in the display of the query, including:
- Changing the order in which output is displayed, or omitting or duplicating output;
- Displaying images depending on query results;
- Creating links for property values;
- Using CSS styles to vary font-size, alignment, background-color, etc. by column in tables.
If you do use a template to adjust the appearance of links, you will probably need to set the parameter | link=none | to disable SMW's automatic linking of article names; your template will then have to add [[ ]] around anything you want to be a link.
To understand how to create a template for formatting some query, it is useful to look at the query with format=table first. For example, queries that refer to a single page only (like the ones one would use with #show) hide the page title of the result page, so that the parameter {{{1}}} refers to the first printout statement. Using the printout statement ? or specifying any value for mainlabel will change this.
The following examples all use Template:Query output demo that basically contains the following wiki text:
{{{2}}} people squeeze into the {{{3}}} of {{{1}}}.
The following queries illustrate the effect of this template:
{{#ask: [[Category:City]] [[Area::+]] [[Population::+]] | ?Population=Inhabitants | ?Area#km²=Size in km² | format=template | template=Query output demo | limit=3 }}
Result:
In the above example you can see how the template ignores any header labels specified in the query such as «Size in km²». Yet the values the template displays do use the units specified in ?Area#km²=Size in km², and will similarly respect all given display formats (see Help:Displaying information).
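The substitution of numbered parameters described above can be imitated outside the wiki. The following Python sketch is not SMW's implementation; the function name render_row and the sample row (including the placeholder area value) are assumptions made purely for illustration. It fills {{{1}}}, {{{2}}}, ... in the template text with the fields of one result row.

```python
import re

def render_row(template_text, fields):
    """Fill {{{1}}}, {{{2}}}, ... with the fields of one result row."""
    def substitute(match):
        index = int(match.group(1))
        # Parameters are 1-based; in this sketch an unset parameter
        # is left in the text exactly as written.
        if 1 <= index <= len(fields):
            return fields[index - 1]
        return match.group(0)
    return re.sub(r"\{\{\{(\d+)\}\}\}", substitute, template_text)

TEMPLATE = "{{{2}}} people squeeze into the {{{3}}} of {{{1}}}."

# Hypothetical row: page title, ?Population, ?Area#km² (area is a placeholder)
row = ["Berlin", "3,396,990", "<area> km²"]
print(render_row(TEMPLATE, row))
```

Note that the header labels of the query («Inhabitants», «Size in km²») never reach the function: only the values of each row are passed on, which mirrors how the template ignores labels.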
Below is a similar query sorted by population that uses format=ol to produce a numbered list.
{{#ask: [[Category:City]] [[Area::+]] [[Population::+]] | ?Population | ?Area#km² | format=ol | template=Query output demo | limit=3 | sort=population | order=desc }}
Result:
If we directly specify a single page, then normally the query results do not include the page, so to reuse the same template in the query below we must tell the query to display the page title as the first column by adding | ? as the first printout statement:
{{#ask: [[Berlin]] | ? | ?Population | ?Area#km² | format=template | template=Query output demo }}
Result: people squeeze into the of Berlin.
The same can be accomplished using #show even though this may not be the most typical use of this function:
{{#show: Berlin | ? | ?Population | ?Area#km² | format=template | template=Query output demo }}
Result: people squeeze into the of Berlin.
Templates may also include other parser functions such as conditionals and even queries. Examples of complex query formats can be found on the following pages (external links, may change):
- Upcoming events on semanticweb.org's Main Page: the events section on this page displays only certain major events. Each such event is formatted with a template that uses another inline query to find sub-events (co-located workshops, tutorials, etc.) for a given event.
- Publications on korrekt.org: all lists on this page are created with templated queries. Conditionals (#if and #ifeq) are used to change the format of a result depending on its publication type and provided data (major publications have bold-faced titles).
Inferencing
Semantic search can be used to find pages based on the information that users have entered about them into the wiki. This simplifies many tasks, but it still requires that semantic information is entered manually in the first place. Sometimes one would expect the wiki to be «smarter» and to deduce some facts even if they were not entered directly. In certain cases, SMW can draw such inferences automatically, as described in this article.
Subcategories
MediaWiki supports a hierarchy of categories that helps to simplify the category statements needed on each page. As an example, assume that a wiki uses categories «Person», «Woman», and «Man». It is obvious to a human that every page that belongs to the category «Woman» should also belong to the category «Person». But «Woman» is clearly more specific, and many wikis (including Wikipedia) have a policy to use only the most specific categories on some page — otherwise the page would often have to contain dozens of categories that are hard to maintain. To indicate that one category is more specific than another, MediaWiki supports a category hierarchy where users can state that one category is a subcategory of another, simply by putting a category on the subcategory's page, e.g. the page Category:Woman could contain the text
[[Category:Person]]
For details, see the MediaWiki handbook.
By default, SMW uses this subcategory information in semantic queries: when asking for all pages that belong to a category, it will also find all pages that belong to any subcategory of that category. In the above example, the query [[Category:Person]] would also return the pages in categories «Man» and «Woman». In other words, the actual query corresponds to
[[Category:Person]] OR [[Category:Woman]] OR [[Category:Man]]
If the category hierarchy is deeper, then SMW will also include further subcategories. For example, one may have a category «Mother» of all women that have children, and this again would be a subcategory of «Woman». Then the above query would retrieve all pages in category «Mother» as well.
SMW's mechanism of subcategory inferencing can be restricted or disabled by the site administrator. Normally, it supports only a certain maximal depth of category hierarchies, so it may not return all results if there are very long chains of subcategories involved. Using a manually created query with OR as above is a work-around in this case, but such a query does not, of course, follow later changes in the category hierarchy.
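The expansion of a category query into its subcategories, including the depth limit, can be pictured with a short Python sketch. This is an illustration only: the table SUBCATEGORIES, the function names and the max_depth parameter are assumptions, and SMW's actual query engine may work quite differently.

```python
from collections import deque

# Hypothetical hierarchy: category -> direct subcategories
SUBCATEGORIES = {
    "Person": ["Woman", "Man"],
    "Woman": ["Mother"],
}

def expand_category(category, max_depth):
    """Return the category plus all subcategories at most max_depth levels below it."""
    found = [category]
    seen = {category}
    queue = deque([(category, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached, do not descend further
        for child in SUBCATEGORIES.get(current, []):
            if child not in seen:
                seen.add(child)
                found.append(child)
                queue.append((child, depth + 1))
    return found

def as_query(categories):
    """Write the expanded list as a manual OR query."""
    return " OR ".join(f"[[Category:{name}]]" for name in categories)

print(as_query(expand_category("Person", 1)))
print(as_query(expand_category("Person", 2)))
```

With a depth of 1 the sketch stops at «Woman» and «Man»; only with a depth of 2 does «Mother» appear, which is the effect a depth limit has on long subcategory chains.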
In some cases, wikis use categories and category hierarchies that are not suitable for being treated in the above way. For example, Wikipedia uses a category called «Cities» not for collecting all cities but for all articles that are related in some ways to cities. Even the category «Cities in Canada» is used to collect all pages that have some relationship with that topic. This is not an actual problem of categories or category hierarchies: the semantic query [[Category:Cities]] still returns all pages related to that topic, it just does not return actual cities only. So one might argue that the name of the category is confusing in some sense, but this is merely a matter of how to organise a wiki. If a wiki has no category for actual cities as such, then no semantic query can produce all cities directly.
A more serious problem in large wikis might be what is called «semantic drift». This occurs if the exact intention of some category is not really specified, e.g. because it lacks a detailed description on its page. Different users then may have slightly different readings of the category's meaning, and this may influence how they use sub-category statements. For example, some editors may reasonably say that «Priest» is a subcategory of «Religious office» (referring to the job category), while others may deem «Female priest» to be a subcategory of «Priest» (referring to the class of people having that job) – but this would imply that all pages in «Female priest» are also implicitly categorised in «Religious office», thus confusing people and job occupations. It is therefore important to always clearly describe on a category page what should go into a category and what shouldn't, and also to point to alternative categories that may be suitable.
Subproperties
Just like categories, properties can be more specific than one another. For example, a wiki may have the property «capital of» to relate cities to countries, and a property «located in» that generally describes that some city is located in some country. Now it happens to be the case that every capital city also must be located in the country that it is capital of. In other words, «capital of» is a subproperty of «located in». Whenever a user states that a page is a capital of some country, SMW should then also conclude that the page has an (implicit) «located in» relation to that country as well. To say that in the wiki, the following can be entered on the page Property:Capital of:
[[subproperty of::Property:located in]]
Once this has been stated, a query [[located in::Germany]] will also return the capital Berlin even if no «located in» property is given on that page. Similar considerations as in the case of categories apply, and detailed descriptions on property pages are a good method for avoiding semantic drift.
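The effect of a subproperty statement on query answering can be sketched in Python as follows. The data, the page «Munich» and all function names are hypothetical, and following chains of several subproperty statements is an assumption of this sketch rather than documented behaviour.

```python
# Hypothetical declarations: property -> direct superproperty
SUBPROPERTY_OF = {"capital of": "located in"}

# Explicit annotations as (page, property, value) triples
FACTS = [
    ("Berlin", "capital of", "Germany"),
    ("Munich", "located in", "Germany"),
]

def implies(prop, target):
    """True if prop is target itself or a (possibly indirect) subproperty of it."""
    seen = set()
    while prop is not None and prop not in seen:
        if prop == target:
            return True
        seen.add(prop)
        prop = SUBPROPERTY_OF.get(prop)
    return False

def query(prop, value):
    """Pages that have the given value for prop or for one of its subproperties."""
    return [page for page, p, v in FACTS if v == value and implies(p, prop)]

print(query("located in", "Germany"))
print(query("capital of", "Germany"))
```

The inference runs in one direction only: asking for «located in» finds Berlin through «capital of», but asking for «capital of» does not return pages that merely state «located in».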
Equality of pages: redirects
It often happens that a thing can be referred to by different names, such as in the case of Semantic MediaWiki which is synonymous with SMW. In MediaWiki, this is solved by redirects that forward readers from one page to another. But synonyms may be even more important in a semantic wiki, where one wishes to organise content and make it more accessible. If different editors use different page names in annotations, then it is hard to create queries which will still display a unified view on the wiki content.
SMW therefore treats all redirects between pages as synonyms. This means that it does not matter at all whether a redirect page or the actual target page is used in a query or annotation. SMW internally uses only redirect targets to work with, and all functions will take the redirect structure into account. This mechanism works only for immediate redirects: redirects that point to other redirect pages are not supported and should be resolved (this is also the case in MediaWiki anyway).
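A minimal Python sketch of this idea is given below. The redirect table, the property «uses» and the page names are invented for the example; the point is only that both annotations and queries are mapped to the redirect target before they are compared, and that only one redirect step is followed.

```python
# Hypothetical redirect table: redirect page -> target page
REDIRECTS = {"SMW": "Semantic MediaWiki"}

def canonical(page):
    """Resolve a single redirect step (immediate redirects only)."""
    return REDIRECTS.get(page, page)

# Annotations as entered by editors, using different names for the same thing
annotations = [
    ("Page A", "uses", "SMW"),
    ("Page B", "uses", "Semantic MediaWiki"),
]

# Store only redirect targets
stored = [(page, prop, canonical(value)) for page, prop, value in annotations]

def pages_with(prop, value):
    """Answer a query after mapping the queried value to its redirect target."""
    target = canonical(value)
    return [page for page, p, v in stored if p == prop and v == target]

print(pages_with("uses", "SMW"))
print(pages_with("uses", "Semantic MediaWiki"))
```

Both queries return the same pages, regardless of which of the two names the editor or the query author used.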
Since SMW 1.2, it is also possible to use redirects on properties and categories with the same effect, so multiple synonyms for properties can be created. It is not recommended to use that feature for categories though, simply because MediaWiki's category functions will still ignore category redirects, such that some wiki features will not work as expected. Redirects between normal pages and properties, properties and categories, etc. are not supported in a special semantic way. They still create normal MediaWiki redirect pages but nothing else.
Inferencing and printout statements
Printout statements generally do not perform any inferencing, i.e. they will only return the statements that are explicitly made for some page. This is desired in some situations, and it may be a limitation in others. A work-around can be to use a template for annotation, and to give two property values explicitly in that template, essentially by writing something like
[[capital of::located in::Germany]]
which is the same as writing [[capital of::Germany]] and [[located in::Germany]], but it will show only one link to Germany.
Inferencing features that are not supported
It sometimes happens that ambitious contributors in a wiki will create properties that also suggest a specific meaning for automated deduction. It should therefore be noted that SMW does not support any of the following features:
- Transitivity
- Inverse properties
- Domain and range restrictions
- Number restrictions and functional properties
Even if properties that sound like the above are introduced, and even if these are linked to well-known properties in ontology languages such as OWL, RDFS, SKOS, etc., SMW will not use these annotations to perform smarter queries. To prevent confusion, it is suggested to not use names that resemble established notions in existing ontology languages, or at least to clearly document this limitation on the property pages.
To some extent, one may be able to craft queries to achieve a similar effect. The sample pages Germany and California show examples of queries for inverse relationships; the sample page Germany shows an example of a subquery that approximates a transitive relationship to some extent.
Editing
This section explains how to edit pages in Semantic MediaWiki. As explained in the introduction, SMW introduces special markup elements which allow editors to provide «hints» to computer programs on how to interpret some piece of information given in the wiki. Such hints are called semantic annotations and they are created with a special markup of SMW. Besides this, editing in SMW is just the same as in MediaWiki. Users who are not familiar with basic editing yet, should first read about how to edit pages in MediaWiki. Editors may or may not provide annotations on wiki pages as they like – it is an added feature that is completely voluntary.
Overview of SMW editing features
Annotations in Semantic MediaWiki can be viewed as an extension of the existing system of categories in MediaWiki. Categories are a means to classify articles according to certain criteria. For example, by adding [[Category:Cities]] to an article, the page is tagged as describing a city. MediaWiki can use this information to generate a list of all cities in a wiki, and thus help users to browse the information.
Semantic MediaWiki provides a further means of structuring the wiki. Wiki pages have links and text values in them, but only a human reader knows what the link or text represents. For example, «is the capital of Germany with a population of 3,396,990» means something very different from «plays football for Germany and earns 3,396,990 dollars a year». SMW allows you to annotate any link or text on the page to describe the meaning of the hyperlink or text. This turns links and text into explicit properties of an article. The property capital of is different from on national football team of, just as the property population is different from annual income.
This addition enables users to go beyond mere categorisation of articles. Usage and possible problems with using these features are similar to the existing category system. Since categories and properties merely emphasize a particular part of an article's content, they are often called (semantic) annotations. Information that was provided in an article anyway, e.g. that Berlin is the capital of Germany, is now provided in a formal way accessible to software tools.
The user manual explains basic annotations with properties, the creation of custom units for numerical properties, and the use of MediaWiki templates to simplify annotation.
Besides annotations, SMW also allows editors to embed semantic queries into articles. Thereby, readers of the wiki can view ready-made query results without having to learn the SMW query language. This feature is explained in the section on inline queries.
Categories
Categories are an editing feature of MediaWiki, and the main reference for their use is the MediaWiki documentation on categories. Categories are used as universal "tags" for articles, describing that the article belongs to a certain group of articles. To add an article to a category Example category, just write
[[Category:Example category]]
anywhere in the article. The name of the category (here: Example category) is arbitrary but, of course, you should try to use categories that already exist instead of creating new ones. Every category has its own article, which can be linked to by writing [[:Category:Example category]]. The category's article can be empty, but it is strongly recommended to add a description that explains which articles should go into the category.
MediaWiki's categories have many different interpretations. For example, the category City might comprise all articles about particular cities, i.e. a member of this category is a city. Or it might describe the topic area of articles, such as articles on city squares, urbanism, etc. Or both. MediaWiki encourages this practical usage of categories: a category forms a collection of articles that are considered useful or interesting for users, and categories are organized so users can browse narrower or broader groupings and find related concepts.
Ad hoc use of categories does not break Semantic MediaWiki, but may lead to unintended modelling results when interpreting the formal semantics of SMW's OWL/RDF export. The latter applies precise semantics to categories, as described in its help page, that might be unsuitable for some uses.
The advanced search functions of Semantic MediaWiki make some categories superfluous, so that an SMW-enabled wiki might achieve a high degree of organization with fewer categories. For example, the subcategory Large cities could be replaced by a query for articles with Category:city with an area larger than 10 km², or a population larger than 1,000,000.
Properties and types
Properties and types are the basic way of entering semantic data in Semantic MediaWiki. Properties can be viewed as «categories for values in wiki pages». They are used by a simple mark-up, similar to the syntax of links in MediaWiki:
- [[property name::value]]
This statement defines a value for the property of the given property name. The page where this is used will just show the text for value and not the property assignment.
Existing links can be directly augmented with such property information, while other types of data (such as numbers or calendar dates) need an additional editing step.
Turning Links into Properties
Consider the Wikipedia article on Berlin. This article contains many links to other articles, such as «Germany», «European Union», and «United States». However, the link to «Germany» has a special meaning: it was put there since Berlin is the capital of Germany. To make this knowledge available to computer programs, one would like to «tag» the link
[[Germany]]
in the article text, identifying it as a link that describes a «capital property». With Semantic MediaWiki, this is done by putting a property name and :: in front of the link inside the brackets, thus:
[[capital of::Germany]]
In the article, this text still is displayed as a simple hyperlink to «Germany». The additional text capital of is the name of the property that classifies the link to Germany. As in the case of categories, the name of the property is arbitrary, but users should try to re-use properties that already appear elsewhere.
To simplify this re-use, every property has its own article in the wiki, just as every category has an article. You can see all the properties in use in the wiki with the Special:Properties page. Just as category articles are prefixed with Category:, all property articles are prefixed with Property: to distinguish them from other articles. So you can also use MediaWiki's Special:Search page to find existing properties. As with categories, a property's article can be empty, but it is strongly recommended to add a description that explains the intent of the property and its proper usage.
There are various ways of adding properties to pages:
What it does | What you type |
---|---|
Classify a link with the property "example property." | Classify a [[example property::link]] with the property "example property." |
Make alternate text appear in place of the link. | Make [[example property::link|alternate text]] appear in place of the link. |
To hide the property from appearing at all, use a space as the alternate text. | To hide the property [[example property::link| ]] from appearing at all use a space as the alternate text. Note: The space after | is necessary. If left away, the MediaWiki pipe trick applies, but rarely with desirable effects. Even if a space is given, SMW will not print anything, which should be the desired result in most cases (to make a space appear in the text, use – as a space symbol). |
To make an ordinary link with :: without creating a property, escape the markup with a colon in front, e.g. The C++ :: operator. | The [[:C++ :: operator]]. |
To assign one value to multiple properties, add :: between each name, e.g. link. | e.g. [[property1::property2::link]]. |
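The variants in the table can be summarised by a small parser. The Python sketch below is an illustration of the markup rules listed above, not SMW's parser; the function name parse_annotation and its return format are assumptions, and edge cases such as nested brackets are deliberately ignored.

```python
def parse_annotation(markup):
    """Split [[p1::p2::value|alternate text]] into (properties, value, shown text).

    Returns None for ordinary links, including the escaped form [[:...]].
    """
    if not (markup.startswith("[[") and markup.endswith("]]")):
        return None
    inner = markup[2:-2]
    target, separator, alternate = inner.partition("|")
    if target.startswith(":") or "::" not in target:
        return None  # plain link, or :: escaped by a leading colon
    # Everything before the last :: names a property; the rest is the value.
    *properties, value = target.split("::")
    # A single space as alternate text yields an empty display text.
    shown = alternate.strip() if separator else value
    return properties, value, shown

print(parse_annotation("[[capital of::Germany]]"))
print(parse_annotation("[[property1::property2::link]]"))
print(parse_annotation("[[example property::link| ]]"))
print(parse_annotation("[[:C++ :: operator]]"))
```

The value is stored for every listed property, while the shown text is the only part that appears in the rendered article.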
Turning values in text into Properties
There is other useful information in wiki articles besides links to other articles. For example, there is a number in the Berlin article giving its population. To make this knowledge available to computer programs, one would like to "tag" the text
3,396,990
in the article, identifying it as a value for the "population property". With Semantic MediaWiki, this is done by putting the property name and :: in front of the text and surrounding it with [[ ]] brackets, thus:
[[population::3,396,990]].
This works fine. However, it creates a link to a 3,396,990 page, and having an article for every population value probably does not make sense. Furthermore, if you wanted to create a list of all German cities ordered by population, numeric order is different from the alphabetical order that you would expect for article names. For example, in alphabetical order, "1,000,000" comes before "345".
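The difference between the two orderings is easy to reproduce. The following Python lines are only a sketch: they compare plain character-by-character string sorting with sorting by numeric value, and the exact collation a wiki uses for page names may differ.

```python
populations = ["345", "1,000,000", "3,396,990"]

# Treated as page names (strings), values are compared character by character.
as_text = sorted(populations)

# Treated as numbers, the thousands separators are dropped before comparing.
as_numbers = sorted(populations, key=lambda text: int(text.replace(",", "")))

print(as_text)     # string order puts "1,000,000" before "345"
print(as_numbers)  # numeric order puts 345 first
```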
Types of Properties
We want to be able to tell Semantic MediaWiki that "population" is a number, not a link to a page in the wiki. The way to do this is to specify a type for the "population" property. Semantic MediaWiki has several built-in datatypes that we can choose for properties. For our population example, the appropriate type is called Type:Number; the prefix "Type:" is again a separate namespace that distinguishes descriptive articles about types from normal pages. We want to give property "population" a special property that specifies it has "type:number". To support this Semantic MediaWiki has a built-in special property called Property:Has type. We use the same syntax for this special property as for any other property, so in the Property:Population article, we write:
[[has type::number]]
(You don't need to specify the Type: namespace here.)
Semantic MediaWiki knows a number of special properties like Property:has type. Regardless of whether these properties have their own articles in the wiki, they have a special built-in meaning and are not evaluated like other properties.
Datatypes
Datatypes are very important for evaluating properties. Firstly, the datatype determines how tools should handle the given values, e.g. for displaying values and sorting values in search results. Secondly, the datatype is required to understand which values have the same meaning, e.g. the values "1532", "1,532", and "1.532e3" all encode the same number. Finally, some datatypes have special behavior, as will be described below. For these reasons, every property has a datatype, listed on the Special:Properties page.
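The second point, that a datatype decides which written values mean the same thing, can be illustrated in Python. The helper parse_number is hypothetical and assumes that "," is a thousands separator; how SMW reads separators may depend on the wiki's configuration.

```python
def parse_number(text):
    """Interpret a written number, treating ',' as a thousands separator."""
    return float(text.replace(",", ""))

written = ["1532", "1,532", "1.532e3"]
numbers = [parse_number(text) for text in written]

# As strings the three values are all different; as numbers they coincide.
print(len(set(written)), len(set(numbers)))
```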
The reason we didn't have to specify a datatype for the "capital of" property above is that the default datatype is Type:Page, which displays as a link. (Note that if you change the datatype of a property later on it does not affect the annotations of existing articles until they are saved again or purged.) Even though Type:Page is the default, you should explicitly specify a datatype for every property, just to prevent confusion or later redefinition with an unintended type.
The same mark-up for properties that are links to pages also works for properties of other datatypes. Here are some more examples.
What it does | What you type |
---|---|
Assign the value 1,234,567 to the property "example." | Assign the value [[example::1,234,567]] to the property "example." |
Assign a numeric value, but showing different text in the article. | Assign a value of [[example::999,331|about a million]], but showing different text in the article. |
Specifying the type in a property's article, e.g. This property is a number. | This property is a [[has type::number]]. |
Combining MediaWiki markup with property values, e.g. John's username is john Hint: Use a template for this. | John's username is [[username::john|[mailto:john@example.com john]]]. |
Datatypes and units of measurement
Using different types, properties can be used to describe very different kinds of values. A complete list of available types is available from Special:Types. Basic types include:
- Type:String (text strings)
- Type:Number (integer and decimal numbers with optional exponent)
- Type:Page (links to pages, the default)
These can be used creatively for very different purposes. For instance, properties of type string can be used for encoding phone numbers (which could contain non-numeric symbols).
Units
Type:Number allows a unit after the numeric value to distinguish values (e.g. "30.3 mpg" versus "47 km/liter"), but does not know how to convert between them. To support automatic conversion and multiple unit formats, you can define your own datatype with custom units. These automatically convert values to and from standard representations, so that users are free to use their preferred unit in each article yet still query and compare with property values in other articles.
Special datatypes
There are some special built-in datatypes which support more complicated formats and unit conversions.
- Type:Boolean restricts the value of a property to true/false (also 1/0 and yes/no).
- Type:Code (new in SMW version 1.2) is like Type:String but displays its value in an HTML pre-formatted box. The value displays as regular text everywhere else (query results, factbox, "Pages using the property", etc.).
- Type:Date specifies particular points in time. This type is still somewhat experimental, but may feature complex conversions between (historic) calendar models in the future.
- Type:Geographic coordinate describes geographic locations. It recognizes different forms of geographic coordinates. Using service links, it can dynamically provide links to online map services.
- Type:Temperature can't be user-defined since converting temperature units is more complicated than multiplying by a conversion factor.
- Type:Text is like Type:String but can have unlimited length; the tradeoff is that values of this type cannot be selection or sort criteria in queries.
For specifying URLs and emails, there are some special variations of the string datatype:
- Type:URL displays an external link to its URL object.
- Type:Annotation URI: properties of this type are interpreted as relations to external objects, denoted by the URI. They are special since they are interpreted as annotation properties on export. See the type's page for documentation.
- Type:Email displays an e-mail address as a link (with mailto:).
SMW does not have an "enumerated" datatype; instead, for any property, you can limit its possible values by using the special property Property:Allows value to enumerate its permitted values. This works for every datatype.
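The idea of restricting a property to an enumerated list can be sketched as a simple membership check. The property «Status» and its permitted values are invented for this Python example, and the sketch says nothing about how SMW itself reports a value that is not permitted.

```python
# Hypothetical declaration: values listed with "Allows value" on a property page
ALLOWED_VALUES = {
    "Status": ["draft", "reviewed", "published"],
}

def is_permitted(prop, value):
    """A value is permitted if the property has no list, or the list contains it."""
    allowed = ALLOWED_VALUES.get(prop)
    return allowed is None or value in allowed

print(is_permitted("Status", "draft"))
print(is_permitted("Status", "finished"))
print(is_permitted("Population", "3,396,990"))
```

Properties without any «Allows value» statement remain unrestricted, which is why the last check succeeds.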
Properties with multiple types
In human language it is easy to introduce multiple facts at once. For example, "John F. Kennedy was the 35th president of the U.S.A., serving from 1961 until his assassination in 1963." This is information about John F. Kennedy that belongs in his wiki page, but it shifts to information about his presidency. You could simply have a property "Presidency_details" of Type:String and put the text in it. But it will only be meaningful to humans: you can't query on it or sort it to produce a list of presidents.
You can't nest semantic annotations, so you cannot have a string property that contains additional annotations.
Often the best way to represent this is to create an article for the object of the property, so this can be annotated with the additional information. So property "Has_presidency" would be of Type:Page, and then the article "Presidency of JFK" has properties Of_country::U.S.A., Count::35, Start_date::1961-01-20, End_date::1963-11-22, etc. Wikipedia frowns on so-called "stub" articles, but in a semantic wiki they are appropriate as they provide information for semantic queries and browsing.
It is also possible to create a property in Semantic MediaWiki that takes multiple values; these are sometimes called "n-ary relations". So "Has_presidency" could have Type:Page; Type:Number; Type:Date; Type:Date, where the four values are the country, the count of the presidency, the start date, and the end date. See Help:Many-valued properties for more information.
Special properties
We mentioned the special property Property:Has type that you use to tell SMW the datatype of a property. SMW has other predefined special properties that have special meaning (even if you do not create property pages for them in your wiki). You cannot use these names for your own properties, but since SMW 1.4.0 you can use them in browsing and querying interfaces just like all other properties. For more information, see Category:Special property and the individual property pages.
"Relations" and "Attributes" in earlier versions
In earlier versions of SMW, properties with Type:Page were known as relations and only those used double colons (::) as the separator between property name and link text. All other properties (numbers, strings, etc.) were known as attributes and had to use colon equals (:=) as the separator.
SMW 1.0 unifies relations and attributes, calling them properties, and uses a single namespace "Property:". The default datatype for undeclared properties is Type:Page, but it is strongly recommended that you declare every property's type to clarify its intended use for other editors. SMW still supports := but it is recommended that you use :: for all property annotations. The reason is that the equality symbol contained in := cannot be used properly within MediaWiki template parameters, whereas :: causes no problems in most environments.
See Upgrading from SMW 0.7 to SMW 1.0 for other changes in SMW 1.0; if you're still using the older version of SMW, see ow:Help:Annotation (SMW0.7) for documentation of annotations in version 0.7.
Custom units
This page explains ways that pages can have more control over the display and conversion of units. Units can be used for properties of Type:Number, and make annotation more flexible: everybody can view and enter data in his or her preferred unit without restricting mutual understanding. For example, some people might prefer a distance given in "miles" over one given in "km". In other cases, it might not be suitable to display a distance in "km" if "microns" are more appropriate.
Custom types with unit conversions
SMW has built-in support for some types that can handle units (e.g. Type:Temperature), but many more can easily be added. Types that support units can accept, convert, and display values in several different units. You can see this in the factbox of articles like "ow:Berlin", where the area is given in multiple units.
In order to support such features, SMW needs to know how to convert values from one unit into another. This is rather easy in many cases, but can also involve more complex computation in other situations. We distinguish two cases:
- Conversion between the desired units is proportional, i.e. you just have to multiply one value with a fixed conversion factor to get the value in another unit. For example, this is the case for converting between kilometres and miles.
- Conversion between units is not proportional and more complex computations are needed. For example, this occurs for temperatures, since you need to add and multiply in order to get from °C to °F.
For all unit conversions of the first kind, you can easily create custom types that support those units. For the second situation, we discuss some possibilities below.
Creating new datatypes with proportional unit conversion
Before thinking about creating a new datatype, make sure that the type does not already exist by consulting Special:Types.
If the desired type does not exist, you can create a new one easily. First, you need to create a page in the Type namespace. For example, you might want to create Type:Power. In the new article, you should first write some sentences on the purpose and use of this new type. In our case this would say that we mean the physical quantity that is usually measured in Watt. This also helps others to find and reuse your type when searching for keywords.
To specify the supported units, you use a special property corresponds to. For example, to specify the main unit of the new datatype for power, we add
[[Corresponds to::1 W]]
The value "1 W" states two things: (1) the type understands the unit "W" and (2) the unit "W" is its main unit (that is what the "1" is for). Intuitively, the statement says "One quantity of this type corresponds to 1W." It is easy to specify further units, e.g.
[[Corresponds to::0.001 kW]] [[Corresponds to::0.0013410220 hp]].
This says that the type also understands the units "kW" and "hp", and that 1 quantity of the main unit corresponds to 0.001 kW and to 0.0013410220 hp. In this way, you can support arbitrary units, so long as their relationship to the main unit can be described in this easy manner.
In many cases, there are multiple ways of referring to one unit. E.g. we would like to allow users to write "W" as well as "Watt" and maybe even "Watts". A short way of doing this is to separate additional units with "," instead of making multiple "corresponds to" statements with the same factor. For example we would write:
[[Corresponds to::1 W, Watt, Watts]] [[Corresponds to::0.0013410220 hp,bhp,horsepower]]
The first unit listed in the statement with factor 1 ("corresponds to::1 xx") is the main unit. After saving the page for the new type, the "corresponds to" statements will appear as special properties in its factbox. The type can now be used like any other type. For example, we could make a new Property:Engine power that is of type power, by adding to its page:
[[Has type::Power]]
This property will understand all the units defined in "corresponds to" statements in its datatype, and will show conversions between them (without duplicates, i.e. SMW does not display both "W" and "Watt"). Internally, the values will all be converted to the main unit, and the RDF export will show the value in this unit as well. The display of units within the wiki is highly customizable and need not involve the main unit, see below.
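The proportional conversion described above can be illustrated with a short Python sketch. It is not SMW's implementation; the names FACTORS, to_main_unit and convert are hypothetical, and the factors are simply those from the "corresponds to" examples in this section.

```python
# Factors taken from the "corresponds to" statements above:
# one main unit (W) corresponds to this many of each listed unit.
# The table and function names are illustrative assumptions.
FACTORS = {
    "W": 1.0, "Watt": 1.0, "Watts": 1.0,
    "kW": 0.001,
    "hp": 0.0013410220, "bhp": 0.0013410220, "horsepower": 0.0013410220,
}


def to_main_unit(value, unit):
    """Normalize a value to the main unit (W), as is done for storage."""
    return value / FACTORS[unit]


def convert(value, from_unit, to_unit):
    """Convert between two supported units by going through the main unit."""
    return to_main_unit(value, from_unit) * FACTORS[to_unit]


print(to_main_unit(2, "kW"))      # value of 2 kW expressed in W
print(convert(1000, "W", "hp"))   # value of 1000 W expressed in hp
```

Because every unit is related to the main unit by a single multiplier, one table of factors is enough to convert between any pair of units.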
Unit conversions that are not proportional
Note that you can only specify a proportional conversion factor, a multiplier. So you cannot have different bases for different units, logarithmic scale conversions, etc. Thus you can't create a custom type for temperature which converts degrees Celsius to Fahrenheit. In the case of temperature, SMW already supplies a built-in Type:Temperature that handles this conversion, but this might not be true in other cases.
SMW does not allow specifying customized non-proportional units in the wiki. One workaround for this is to use Type:Number which also accepts unit strings after a given number. The type does not know how to convert between those values, but it still recognizes the unit and will be able to distinguish different units. If all users of a certain exotic type agree on using the same unit, the functionality would be similar to having real unit support. If someone still uses another unit, then the given value will at least not be confused with values in other units.
A more elaborate way of fixing the situation is to write a small script that implements the required conversion. It is not difficult to extend SMW in this fashion, and one could simply copy and adjust the code for Type:Temperature (which is below 70 lines, including comments). Upon implementing such a custom type, properties that needed to use Type:Number can be changed to the new type without any negative effects on existing articles. When confronted with unsupported units, custom types will still behave like simple number datatypes.
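To see why a single multiplier is not enough for temperatures, consider the following Python sketch of the °C/°F conversion. It only illustrates the arithmetic and is not the code of Type:Temperature; the function names are assumptions made for this example.

```python
def celsius_to_fahrenheit(celsius):
    """Multiply and then add an offset: no single factor can express this."""
    return celsius * 9.0 / 5.0 + 32.0


def fahrenheit_to_celsius(fahrenheit):
    """Inverse conversion: remove the offset first, then scale."""
    return (fahrenheit - 32.0) * 5.0 / 9.0


# A purely proportional rule would always map 0 to 0,
# but 0 degrees Celsius is 32 degrees Fahrenheit.
print(celsius_to_fahrenheit(0.0))
print(celsius_to_fahrenheit(100.0))
```

Any conversion that needs such an offset (or a logarithm) falls outside what "corresponds to" statements can describe.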
Customizing the display of units
Through the use of floating-point numbers, a single type can support a broad range of units. For example, a single length type can easily support both light years and nanometers. However, if someone uses a property "Elevation" to specify the height of a mountain, then it would hardly be useful to display this value in light years or nanometers.
SMW allows you to specify which units a property should display from all the units its type supports. This information is specific to the property: two properties can both use Type:Length while still having different appearances in the wiki. If no preferences on display are given, a property will display all of its type's units, with the main unit (the one with conversion factor 1) first.
To specify custom display units, add the special property display units to the property's page, mentioning each unit you want displayed separated by commas. For example, the article Property:Height could contain the statements:
[[display units::km,ft,miles]]
This results in the factbox displaying only those three units for values of the property Height, even though its Type:Length might support a dozen other units. Similarly, the tooltip for each such value will show those conversions. This customization works for all properties that use types with unit support, no matter whether the type was customized or built-in.
If you change the first display unit, consider displaying the type's "Standard Unit" to users as one of the other display units, since SMW still converts values to the standard unit when storing them.
Semantic templates
Semantic templates are a method of including (parts of) the additional mark-up that Semantic MediaWiki introduces into MediaWiki templates. This has several advantages:
- users specify annotations without learning any new syntax,
- annotations are used consistently, i.e. users do not have to look for the right properties or categories when editing a page,
- since templates also have other functions, such as rendering flashy infoboxes in an article, users are motivated to use them.
For these reasons, we recommend you use semantic templates when introducing semantics into a wiki.
Simple semantic templates – an example
Templates, with or without semantics, can have a very simple form. For example, when giving the value for surface area of an astronomical object in a wiki page, you might want it to display as
- 6.088 × 10¹⁸ m²
which you can achieve by writing
6.088 × 10<sup>18</sup> m²
This is cumbersome to write, so you might develop a Template:Surface_area for areas so that editors can simply write
{{surface area|6.088|18}}
and the template expands to your desired markup.
Thus MediaWiki templates have immense value for normalizing and simplifying display in any wiki, once users understand template syntax in general and the particular template in use.
With the introduction of Semantic MediaWiki, you would probably want values for surface area to become semantic annotations so that they appear in the factbox and you can query them. So you might create a semantic property named Property:Surface area that uses or reuses a custom datatype Type:Area. Obviously you would like to have both the annotation and the visual appearance. You could write the following:
[[Surface area::6.088e18 m²|6.088 × 10<sup>18</sup> m²]]
Here, the semantic annotation uses the scientific number format that Semantic MediaWiki can parse, and the alternate text after the pipe symbol '|' is the complex display you want.
But this is even less user-friendly and highly error-prone. Using semantic templates, you can write or adapt the Surface_area template to hide the complicated markup and perform the semantic annotation. Then, just as before, editors can write
{{surface area|6.088|18}}
which is much more readable. To achieve this, Template:Surface area is coded as follows:
[[Surface area::{{{1}}}e{{{2}}} m²|{{{1}}} × 10<sup>{{{2}}}</sup> m²]]
See the sample page Sol and view its source to see this semantic template in use.
Note that the "surface area::" property in the template does not annotate the template article itself; it takes effect only on inclusion. This is because the default setting when installing Semantic MediaWiki is that SMW does not parse pages in the Template: namespace for semantic annotations. If this setting has been changed (by the site admin), you should surround the template code with <includeonly> tags to prevent the template's article itself from being annotated. As with a regular MediaWiki template, you can add text within <noinclude> tags to provide some user documentation on the template page.
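The way positional parameters are filled into the template code can be mimicked in a few lines of Python. This is only a rough sketch of the substitution step: real MediaWiki template expansion also handles named parameters, defaults and nested templates, and the function name expand is an assumption made for illustration.

```python
import re

# Template code as given above for Template:Surface area.
TEMPLATE = "[[Surface area::{{{1}}}e{{{2}}} m²|{{{1}}} × 10<sup>{{{2}}}</sup> m²]]"


def expand(template, *args):
    """Replace each positional parameter {{{n}}} with the n-th argument."""
    def substitute(match):
        index = int(match.group(1))
        # Leave the placeholder untouched if no argument was supplied.
        if 1 <= index <= len(args):
            return args[index - 1]
        return match.group(0)

    return re.sub(r"\{\{\{(\d+)\}\}\}", substitute, template)


# Corresponds to writing {{surface area|6.088|18}} in an article.
print(expand(TEMPLATE, "6.088", "18"))
```

The result is the same annotation markup that an editor would otherwise have to type by hand.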
More advanced semantic templates
Many MediaWiki sites make use of more complicated templates to present standard information. For example, Wikipedia articles on cities and villages use standard templates in which editors specify common items of information, such as this (from wikipedia:San Diego, California):
{{Infobox Settlement |image_skyline = Sandiego_skyline_at_night.JPG |imagesize = |image_caption = San Diego Skyline | March 31, 2007 |official_name = City of San Diego |settlement_type = [[City]] |nickname = America's Finest City |motto = Semper Vigilans ([[Latin]]: Ever Vigilant) |image_flag = Flag of San Diego, California.svg ...
Usually the template (in this case, wikipedia:Template:Infobox Settlement) displays this information in a nicely formatted table. Such regular, templatized items of information are ideal for mapping to properties in Semantic MediaWiki, so that articles using the template acquire semantic annotations without any changes to their pages.
The sample page California shows a simple "infobox" display template adapted to make semantic annotations.
Using semantic templates correctly
While the above pattern allows you to create all kinds of semantic templates of arbitrary complexity, there are some issues to consider.
Automatic annotation requires strict formats
You can annotate template fields automatically, but in this case the supplied values must adhere to the expected format. For example, it is a good idea to annotate the population of a city with a property of type number. However, in an infobox template such as the one at wikipedia:France, the entry supplied for population is not a single number, or even a set of numbers! Instead, there are multiple numbers and textual explanations of their meaning. Such special cases must be kept in mind when designing semantic templates.
This is also a major reason why semantic templates are not a suitable replacement for annotations in Semantic MediaWiki. There are cases where existing templates can be evaluated in a quite semantic way, but often the user still has to add semantic mark-up to make the data machine-processable. E.g. in the case of France, one might decide to leave "population" a normal text-entry, and leave it to the user to specify [[population::...]] where appropriate in this text.
Optional entries and conditionals
In templates in general, it is very useful to allow optional parameters. On many articles, users might not want to provide all possible values of a given infobox, and it would be silly to show empty rows in such cases. Even worse, semantic templates would generate warning messages due to the fact that an empty value is annotated in such cases. To prevent this, it is useful to introduce conditionals into the template code, which include a row (and its annotation) only if a non-empty value was provided.
This can be achieved with the help of the ParserFunctions extension to MediaWiki. Using this extension is completely independent of Semantic MediaWiki, and you can refer to the extension's original documentation for further information. Wikipedia contains many examples of parser functions in templates, for instance in wikipedia:Template:Taxobox. Using parser functions typically results in difficult-to-read template code, but the simplification for users is substantial.
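The logic of such a conditional row can be sketched in Python as follows. In an actual template this would be written with parser functions rather than Python; the function infobox_row and the table markup it emits are assumptions chosen only to show the idea of skipping empty values.

```python
def infobox_row(label, prop, value):
    """Return a table row with an annotation, or nothing for an empty value."""
    if value is None or not value.strip():
        # No row and no annotation, so no warning about an empty value.
        return ""
    value = value.strip()
    return "|-\n| {0} || [[{1}::{2}]]\n".format(label, prop, value)


# Only the row with a non-empty value produces output.
print(infobox_row("Population", "population", "1307402"), end="")
print(infobox_row("Motto", "motto", "   "), end="")
```

The important point is that the annotation and the visible row are produced together, so leaving a parameter empty removes both.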
Annotation in a template
Support for using annotations in templates has to be enabled by the site administrator.
When an annotation is added to a template, the factbox on pages that already include the template is updated, but queries do not yet take the new annotation into account. For that, each page that includes the template has to be edited and saved (a purge is not enough). After that, the page containing the query also has to be edited and saved.
A query in a template
A query in a template is possible, and it can contain {{PAGENAME}}, see e.g. ow:Template:Ask. However, a query with a template parameter only works with substitution, and cannot be used in computations, see ow:Property:Weekday_number — this limitation may no longer be true with SMW 1.0 when using {{#ask:}} function syntax.
Service links
Semantic MediaWiki can provide links to online services when printing certain data in the factbox and in property pages. For example, when an article contains a geographic coordinate, it is useful to provide links to online mapping services so that users can retrieve a map of that location with one click. Another example is to provide a link to an online currency converter for financial amounts. This page explains how you can add such features to a semantic wiki (without writing PHP code to support a new datatype).
Service links for properties
The information for additional links to online services in the factbox is associated with the property used. For example, this wiki's Property:Coordinates will show various links to online maps when it appears in the factbox, whereas other properties that also use Type:Geographic coordinate might not show this. This is crucial in many applications, since the datatype alone usually does not say much about the type of link. For example a property "IMDb number" might be used for a movie's id number at IMDb, but not every property of Type:Number should display a link to IMDb.
To make a property display service links, add the special property provides service on its page. For example, the article Property:Coordinates might include the annotation
[[provides service::online maps]]
Here, "online maps" is the name of a set of service links provided by the wiki. The next section explains how you specify these service links.
After you specify that a property provides service links, its property values in the factbox and on the property's own page will show an icon that displays the service links when clicked.
Providing service links
In a nutshell, a wiki administrator puts the text specifying the appearance of service links in a special message article in the "MediaWiki" namespace named MediaWiki:Smw service service name, where "service name" stands for the name of the service. Continuing our example for coordinates, the text for [[provides service::online maps]] is in the message article MediaWiki:Smw service online maps.
Normally only users that have sysop (administrator) privileges in the wiki can add or edit pages in the MediaWiki namespace, hence only they can modify service links. This is a reasonable restriction for most wikis: since service links may appear in thousands of factboxes, they need to be trusted. Adding or modifying services should usually be discussed among many users before an administrator puts the change into practice.
All users, however, are free to associate properties with available services as described above.
The MediaWiki:Smw_service_service_name format
If you look at MediaWiki:Smw service online maps, though the message might be hard to read due to the long lines, its format essentially is as follows:
label|http://someurl.com
label text2|http://anotherurl.org
...
Every line contains one service link. The label is the text that users will see in the service link pop-up. After the pipe symbol '|' is the URL that the link will lead to.
In most cases, you want to provide information from the property value in the link. For example, a link to an online map service has to include the coordinates to display, and a link to a movie web site will have to include the ID of the movie. Since the exact data values are not known in advance, you use placeholders of the form $1, $2, $3, … in the URL. For example, the message text for a service link to IMDb could be:
IMDb|http://www.imdb.com/title/tt$1/
When SMW displays the service links for a property value, it substitutes the property value's information for these placeholders. In this IMDb example, a movie ID of Type:Number will replace $1 with the numeric value, and voilà, the service link for a movie links to its information on IMDb!
Since service links are typically perceived as "trusted resources," administrators must take care when formulating links, keeping in mind that users can accidentally or maliciously pass arbitrary URL-encoded strings to service link URLs in the placeholders.
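The message format and the placeholder substitution described above can be sketched in Python. This is an illustration under stated assumptions, not SMW's code: the function names are hypothetical, the movie ID is a made-up value, and URL-encoding every parameter is a deliberately conservative choice in view of the trust issue just mentioned.

```python
import re
from urllib.parse import quote


def parse_service_message(text):
    """Parse a message with one 'label|url' service link per line."""
    links = []
    for line in text.splitlines():
        if "|" not in line:
            continue  # ignore lines that are not service links
        label, url = line.split("|", 1)
        links.append((label.strip(), url.strip()))
    return links


def fill_placeholders(url, params):
    """Replace $1, $2, ... in the URL with URL-encoded parameter values."""
    def substitute(match):
        index = int(match.group(1))
        if 1 <= index <= len(params):
            return quote(str(params[index - 1]), safe="")
        return match.group(0)  # unknown placeholder: leave as written

    # \d+ matches the whole number, so $10 is not mistaken for $1 plus "0".
    return re.sub(r"\$(\d+)", substitute, url)


message = "IMDb|http://www.imdb.com/title/tt$1/"
for label, url in parse_service_message(message):
    print(label, fill_placeholders(url, ["1234567"]))
```

Encoding the parameters means that characters such as "&" or "/" in a property value cannot change the structure of the target URL.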
Information passed for each placeholder
The number and contents of the parameters that replace $1, $2, $3, … depend on the datatype of the property. For instance, a simple integer property replaces $1 with its value, whereas a geographic coordinate provides parameters for latitude, longitude, direction, and much more. In most cases, $1 is the most relevant parameter that just provides a URL-safe string version of the property value.
- Type:Page
- $1: URL-encoded article name (no namespace), with underscores replaced by spaces
- Type:Number (and types with custom units)
- $1: numerical value in English punctuation
- $2: integer version of value, in English punctuation
- $3: unit, if any (from SMW version 1.1 onwards)
- Type:String (but not Type:Text)
- $1: URL-encoded string
- Type:URL, Type:Annotation URI and Type:Email
- $1: URL-encoded value of the URL (includes mailto: for Type:Email)
- Type:Geographic coordinate
- $1: latitude integer degrees
- $2: longitude integer degrees
- $3: latitude integer minutes
- $4: longitude integer minutes
- $5: latitude integer seconds
- $6: longitude integer seconds
- $7: latitude direction string (N or S)
- $8: longitude direction string (W or E)
- $9: latitude in decimal degrees
- $10: longitude in decimal degrees
- $11: sign (- if south) for latitude
- $12: sign (- if west) for longitude
Since geographic coordinates are so complicated, SMW includes a default message for MediaWiki:Smw service online maps. Just add [[Provides service::online maps]] to any properties you have of Type:Geographic coordinate.
The other datatypes do not support service links. Many-valued properties could but do not in SMW version 1.1 (bug 14426).
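To make the twelve parameters of Type:Geographic coordinate more concrete, the following Python sketch derives them from a signed decimal latitude and longitude. It rests on several assumptions that the list above does not settle: that seconds are rounded to whole numbers, that $9 and $10 are unsigned (with the sign supplied separately by $11 and $12), and that the function name and the sample coordinates are purely illustrative.

```python
def coordinate_parameters(lat, lon):
    """Return a dict mapping placeholder numbers 1..12 to string values."""
    def split(value):
        # Work in whole seconds to avoid floating-point truncation errors.
        total_seconds = round(abs(value) * 3600)
        degrees, rest = divmod(total_seconds, 3600)
        minutes, seconds = divmod(rest, 60)
        return degrees, minutes, seconds

    lat_d, lat_m, lat_s = split(lat)
    lon_d, lon_m, lon_s = split(lon)
    values = [
        lat_d, lon_d,              # $1, $2: integer degrees
        lat_m, lon_m,              # $3, $4: integer minutes
        lat_s, lon_s,              # $5, $6: integer seconds
        "S" if lat < 0 else "N",   # $7: latitude direction
        "W" if lon < 0 else "E",   # $8: longitude direction
        abs(lat), abs(lon),        # $9, $10: decimal degrees (assumed unsigned)
        "-" if lat < 0 else "",    # $11: sign for latitude
        "-" if lon < 0 else "",    # $12: sign for longitude
    ]
    return {number: str(value) for number, value in enumerate(values, start=1)}


params = coordinate_parameters(32.715, -117.1625)
print(params[1], params[3], params[5], params[7])
print(params[2], params[4], params[6], params[8])
```

A mapping service that expects signed decimal degrees would then combine $11 with $9 and $12 with $10 in its URL.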
Display of Property:Provides_service
When adding a service link to a property with the special property "provides service", the property page's factbox should display a link to the message article for the service. However, this feature is not working in SMW 1.0 (bug 12438).
Extended example
To illustrate the whole process of creating and using a new service, we provide an extended example, also implemented on ontoworld.org. Articles about Semantic Web vocabularies such as ow:FOAF contain information about the vocabulary's "namespace", and the online service ow:Swoogle allows users to search for background information on such namespaces. Our goal thus is to add a new "Swoogle service" to the ow:Property:Namespace that is used on the vocabulary articles.
As a first step, we edit the article ow:Property:Namespace and add the line
As an additional service, this property provides a [[provides service::Swoogle lookup]] of the entered namespace.
After saving, the factbox shows a link to the (still non-existing) service Swoogle lookup. Clicking this link, an administrator gets a new edit field, into which she enters
Swoogle lookup|http://swoogle.umbc.edu/index.php?option=com_frontpage&service=digest&queryType=digest_ns&searchString=$1
The link was retrieved by using Swoogle and replacing the search string (at the end of the link) with the parameter "$1".
After those changes are saved, the new service is fully functional, and each page that uses ow:Property:Namespace will show a suitable link to Swoogle. Some articles will still show the old version, if they are retrieved from cache, but after the next edit or purge, all articles will display the links as expected.
Annotation naming guideline
Help:Annotation naming guideline
Semantic Web
Although Semantic MediaWiki is designed to be used without additional technical background knowledge, it is closely related to well-known Semantic Web technologies. These technologies enable the wiki to share its knowledge with external applications by encoding it into the standard OWL/RDF format. Below is an overview of some related resources. These should be instructive for those who are familiar with semantic technologies or who wish to implement tools that use Semantic MediaWiki's RDF export. Normal users should go to Help:Semantics to find help on using the wiki.
RDF export
The explicit semantic content of Semantic MediaWiki is formally interpreted in the ow:OWL DL ontology language, and is made available in OWL/RDF format. For further details on the exported format, see Help:RDF export.
Reusing vocabulary from external ontologies
Normally, all statements in the wiki refer to concepts and properties defined in the wiki, but it is also possible to directly use vocabulary from other sources. E.g. this wiki imports a number of ow:FOAF elements for use within the wiki. A detailed description of how to use this feature is given in the article on vocabulary import.
Importing ontologies
Users with administrator status are allowed to import data from OWL DL ontologies into the wiki. This function is suitable for bootstrapping a wiki with existing semantic knowledge, so that articles about relevant topics are already available and include some basic annotations. Naturally, the import function cannot create sophisticated human-readable texts or import complex ontological statements that are otherwise not representable within the wiki. However, one can achieve good results in populating a wiki with a great number of articles. For further details, see Help:Ontology import.
SPARQL query service
The site can be configured to provide a SPARQL endpoint that enables expressive querying against the wiki's content. The wiki is primarily used for authoring "ABox" statements (i.e. simple facts concerning given individuals in contrast to complex schema information).
External reuse
Tools that can process OWL/RDF in a meaningful way (including many RDF tools), can also be used with Semantic MediaWiki. Help:Reuse lists a number of applications that have been tested with the wiki's output on one site or the other.
RDF export
Based on the user's semantic annotations of articles, Semantic MediaWiki generates machine-readable documents in OWL/RDF format, that can be accessed via Special:ExportRDF. Moreover, there is a maintenance script for automatically generating complete exports of all semantic data. This article explains how annotations are formally interpreted in the OWL ontology language, and how a suitable RDF serialisation is generated.
Using the export functionality
Users can easily access the generated RDF via the page Special:ExportRDF by entering a list of articles into the input field. The export will contain one OWL/RDF specification with various description blocks for exported elements. In addition to the requested articles, the export will also contain basic declarations for any further elements (such as mentioned instances, properties, and classes). There are two settings that further influence the set of exported articles:
- Recursive export. Every article usually has relations to various other articles. Normally, those other articles are just declared briefly such that tools can find further RDF specifications for them if desired. By enabling recursive export, all information about the encountered objects will be exported right away. Since this process is continued for all further objects, this option can lead to large results.
- Backlinks. The RDF data model is based on directed graphs. When exporting an article, one usually exports only the statements within which the corresponding element occurs as a subject, and the exported document does not include incoming links. This restricts RDF browsers, since they cannot access all elements that have some relationship to something without retrieving the whole RDF first. For this reason, one can enable the export of backlinks. All articles that have relations to any of the exported articles then will also be exported.
The server administrator can restrict the availability of the above options, and can set default values for cases where no parameters can be given (see below). The reason is that the above options, especially in combination, can easily lead to the export of major parts of the wiki in RDF, which might overly impair the performance of large sites.
In addition to the form at Special:ExportRDF, one can also retrieve RDF by calling appropriate URLs directly. This is suitable for linking to RDF specifications directly. In its basic form, this is achieved by appending a (URL encoded version of an) article name to the URL of the export service. For instance, one can link to
http://wiki.ontoworld.org/index.php/Special:ExportRDF/ESWC2006
to get this RDF directly. Alternatively, the article name can also be specified as a GET parameter "page" within the URL, e.g.
http://wiki.ontoworld.org/index.php?title=Special:ExportRDF&page=ESWC2006
Additional GET parameters
In addition to title and page, ExportRDF has additional GET (query string) parameters.
- Backlinks can be enabled or disabled by setting "backlinks" to 1 or 0, respectively.
- Recursive export can be enabled or disabled by setting "recursive" to 1 or 0.
Both settings will be ignored if disabled by the administrator. If no settings are given, site-wide default values apply. For example, the ontoworld.org wiki always exports RDF with backlinks.
The default Content-Type of ExportRDF's output is application/xml (with charset=UTF-8). Content-Type of application/rdf+xml can be set by adding the "xmlmime=rdf" GET parameter; some processing tools require this RDF mimetype to process the output.
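A client that wants to fetch RDF could assemble such URLs programmatically. The following Python sketch only builds the URL string from the parameters named above ("title", "page", "backlinks", "recursive", "xmlmime"); the function name export_rdf_url and its keyword arguments are assumptions, and whether the options take effect still depends on the site's configuration.

```python
from urllib.parse import urlencode


def export_rdf_url(index_php, page, backlinks=None, recursive=None, rdf_mime=False):
    """Build an ExportRDF URL; options left as None fall back to site defaults."""
    params = [("title", "Special:ExportRDF"), ("page", page)]
    if backlinks is not None:
        params.append(("backlinks", "1" if backlinks else "0"))
    if recursive is not None:
        params.append(("recursive", "1" if recursive else "0"))
    if rdf_mime:
        # Request Content-Type application/rdf+xml instead of application/xml.
        params.append(("xmlmime", "rdf"))
    return index_php + "?" + urlencode(params)


print(export_rdf_url("http://wiki.ontoworld.org/index.php", "ESWC2006",
                     backlinks=True, recursive=False, rdf_mime=True))
```

Note that urlencode percent-encodes the colon in "Special:ExportRDF", which is equivalent to the unencoded form shown in the example URL above.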
Exporting all data
In addition to the wiki's Special:ExportRDF function, there is also a maintenance script that allows you to export all of the wiki's semantic data at once. The script is called SMW_dumpRDF.php and can be found in SMW's maintenance directory. This directory also contains a README file that describes how to install maintenance scripts in your local MediaWiki installation.
The script SMW_dumpRDF.php can generate full exports, or it can be restricted to certain elements of the schema, e.g. to export only the category hierarchy or only the attributes with their types. Details are described in the script itself.
The script can easily be run automatically as a cronjob to generate RDF dumps on a regular basis. For ontoworld.org, the generated dumps can be obtained from http://ontoworld.org/RDF/.
The exported data in detail
Mapping wiki-pages to ontology elements
The export distinguishes the page in the wiki and the "thing" that the page discusses. ...
It is possible to import an external vocabulary (such as foaf or Dublin Core) into Semantic MediaWiki and associate attributes and relations in SMW with the external vocabulary, so that in RDF export the SMW attributes and relations export as properties in the external vocabulary (such as foaf:knows or skos:concept).
Categories
MediaWiki category relations are exported using existing RDF/RDFS properties. In brief:
- A category assignment in a regular article is exported as rdf:type which states "is an instance of a class". So use of MediaWiki categories is a good match for "is a" in the sense of "San Diego is an instance of the class Cities".
- A category assignment in a Category article is exported as rdfs:subClassOf which states "all the instances of one class are instances of another". So use of MediaWiki categories within categories is a good match for "is a" in the sense of "all instances of Divided cities are Cities".
There are many usages of MediaWiki categories that conflict with these semantics. For example, the article Urban decay might be in category Cities, but it is not a city. And Category:City museums might be in category Cities, but city museums are not cities.
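The two export rules for categories can be summarized in a small Python sketch. It produces simplified (subject, predicate, object) triples with page titles in place of the real exported URIs; the function name and this simplification are assumptions made for illustration.

```python
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"


def category_triple(page_title, category):
    """Map one category assignment to a simplified RDF triple."""
    target = "Category:" + category
    if page_title.startswith("Category:"):
        # A category placed in a category: subclass relation.
        return (page_title, RDFS_SUBCLASS_OF, target)
    # A regular article placed in a category: instance relation.
    return (page_title, RDF_TYPE, target)


print(category_triple("San Diego", "Cities"))
print(category_triple("Category:Divided cities", "Cities"))
```

Applied to the problematic cases above, the same rules would state that Urban decay is a city and that every city museum is a city, which is why such category assignments conflict with the exported semantics.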
Restrictions for OWL DL
...TODO
Reuse
A number of Semantic Web tools can immediately be used with the RDF export of Semantic MediaWiki. This page lists various tools that have been tested with SMW and explains what features they offer.
- The Geocoding Tools for Python provide a simple toolkit for obtaining the geographic coordinates of places in the real world. For example, you can find out the location of the building you are in by providing the building's name. The name is looked up in various sources, including Semantic MediaWikis, and the location is returned. Further "located in" statements are used to infer the location of sub-locations.
- Tabulator is an online browser for RDF documents available at http://www.w3.org/2005/10/ajaw/tab.html. Using it on external resources such as ontoworld.org requires a change in security settings of the browser, since JavaScript must be enabled to access external sites. After this is done, RDF of ontoworld can be loaded conveniently by loading the URL of some RDF export, e.g. http://wiki.ontoworld.org/index.php/Special:ExportRDF/ESWC2006. The interface is still somewhat crude, but the tool shows that dynamic interfaces for browsing RDF and RDF-enabled sites are feasible. Tabulator takes advantage of Semantic MediaWiki's feature of creating browsable RDF which includes "backlinks" in RDF triples as well.
- Longwell is a Java-based faceted browser for RDF. It includes features for browsing, querying and query refinement, Timeline support and graphical presentation of RDF graphs. It works very well with the RDF generated by Semantic MediaWiki and its faceted browsing capabilities provide some real added value for the semantic data. It is available online via http://simile.mit.edu/longwell/.
- FOAF Explorer is an online viewer for RDF files that use the ow:FOAF vocabulary. It can be accessed via http://xml.mfd-consult.dk/foaf/explorer/. Like most FOAF tools, FOAF Explorer is still somewhat limited and does not deal well with arbitrary RDF that just contains some FOAF statements. It seems to work well on some pages, like http://wiki.ontoworld.org/index.php/Special:ExportRDF/User:Markus_Krötzsch.
- Piggy Bank is a browser extension for Firefox available at http://simile.mit.edu/piggy-bank/. Semantic MediaWiki refers to a page's RDF specification within its HTML header so that Piggy Bank can find it. In effect, Piggy Bank will collect RDF of all visited articles during browsing. Of course, the further use of this RDF, as in all applications of Piggy Bank, is left to the user.
- KAON2 is a reasoner and ontology management API for OWL DL and SWRL. It can be used to check the validity of the exported OWL specifications. In conjunction with ow:OWL tools, you can also transform the exported RDF into ow:LaTeX sources that show the corresponding DL statements (which tend to be more readable for humans, or at least for DL-aware humans).
- OpenLink RDF Browser is a browser-independent, JavaScript-based RDF browser that is part of the OpenLink Ajax Toolkit (OAT). The deliverables include a live demonstration that works with Semantic MediaWiki RDF export pages.
Import vocabulary
In Semantic MediaWiki it is possible to import and reuse vocabulary that belongs to existing Semantic Web documents or standards by associating the vocabulary's elements with wiki terms. An example of this functionality is the use of the FOAF vocabulary in the ontoworld.org wiki. Although the associated terms work like any other annotation in the wiki, the RDF that is exported will directly use the elements of the FOAF specification, thus allowing users to edit FOAF files through the wiki.
Example: importing foaf:knows
Normally, concepts that can be described (and be used for description) in the wiki are defined by the wiki and thus are local. So the URI that RDF export uses to denote a concept is usually from a specific namespace that should be used only by the wiki. Even if you create a property that seems to come from a well-known vocabulary because you name it foaf:knows, it would still be exported with the URI
http://''your.site''/wiki/Special:URIResolver/Property-3AFoaf-3Aknows
which is the XML-compatible URI derived from the article URL (see Help:RDF export for details). Semantic MediaWiki's mechanism for importing vocabularies solves this problem by allowing the reuse of external vocabulary in a controlled way. After an administrator makes the external vocabulary available, any user can state that a term in the wiki matches an imported element by adding the following text to its article:
[[imported from::foaf:knows]]
The special property imported from tells SMW that the wiki's property really refers to the property http://xmlns.com/foaf/0.1/knows from the FOAF vocabulary.
The wiki article doesn't have to be named "Property:foaf:knows"; the user could choose any name to represent the property within the wiki. (ontoworld.org uses "foaf:knows" here since the community in this wiki is expected to have some technical background; in another wiki, one might prefer a less technical name.)
In order to interpret the above "imported from" statement, the wiki needs to understand the meaning of "foaf:knows" as a shorthand for "http://xmlns.com/foaf/0.1/knows". This is the case for "foaf:knows" since the wiki administrator has made this property available for this purpose. The section on importing further vocabularies describes how you can find out which elements are available, how new elements can be added, and what the idea behind this "control mechanism" is.
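The difference between the wiki-local URI and the imported URI can be illustrated with a minimal sketch. The prefix table, the function names, and the encoding rule (percent-encode the title, then write "-" in place of "%") are assumptions inferred from the example URI in this section; this is not SMW's actual implementation.

```python
from urllib.parse import quote

# Hypothetical prefix table; in a real wiki the administrator defines it.
PREFIXES = {"foaf": "http://xmlns.com/foaf/0.1/"}


def local_export_uri(site, title):
    """Build the wiki-local URI for a page title (assumed encoding rule)."""
    encoded = quote(title.replace(" ", "_"), safe="")
    # Escape literal hyphens first so the "%" -> "-" swap stays unambiguous.
    encoded = encoded.replace("-", "-2D").replace("%", "-")
    return "http://" + site + "/wiki/Special:URIResolver/" + encoded


def imported_uri(qname):
    """Expand a shorthand such as "foaf:knows" to the full external URI."""
    prefix, _, local_name = qname.partition(":")
    if prefix not in PREFIXES:
        raise KeyError("unknown vocabulary prefix: " + prefix)
    return PREFIXES[prefix] + local_name
```

Under these assumptions, `local_export_uri("your.site", "Property:Foaf:knows")` yields the local URI shown earlier, while `imported_uri("foaf:knows")` yields `http://xmlns.com/foaf/0.1/knows`.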
Importing properties
In principle, all available elements can be imported as described above. When importing properties, there is a minor additional effect: the imported element also defines the datatype of the property. For example, the property foaf:name is declared as a string merely through its import statement. If you import a property, remove its "has type" statement.
Importing further vocabularies
As explained above, not every element of an external vocabulary can be imported. The easiest way to find out whether something is available for import is to try importing it with a statement like the one above. If this does not work, a message in the factbox will inform you about the problem. Possible reasons are:
- The namespace (e.g. "foaf:") is not available at all. The wiki does not know what it refers to, and it does not support any elements from this range.
- The specific element (e.g. "foaf:knows") is not available in the (known) namespace "foaf:".
- The element is known, but it cannot be imported for the specific kind of page you are trying to use it on. For example, you cannot import foaf:knows on a page in the wiki-namespace "Category:" since it should always be a "Property:".
In the last case, you can fix the problem by using an appropriate article for the import. In the other cases, only users with administrator privileges can add the unknown vocabulary elements. This can be done very quickly by modifying the wiki (see below), but there are also cases in which some vocabulary is not made available on purpose. Possible reasons for this are:
- The element you wish to use is not recommended for public use yet, or its correct use is still discussed in the community. An example of this is foaf:OnlineEcommerceAccount.
- The specification of the chosen vocabulary is too ambiguous or preliminary for current use. It might become more precise later which would affect the usage in the wiki (or render its earlier usage incorrect).
- The vocabulary is not widespread and standard, and the community of the wiki (represented by its admins) does not endorse its use at the moment. Not everyone's home-made ontology needs to be imported in a wiki.
- The specification of the vocabulary is tailored towards another ontology language than that of the wiki (OWL DL), and it is not clear how to map it correctly. Since importing some vocabulary also entails a mapping to OWL DL it must be specified how the mapping should be done. It is also questionable whether exporting such elements to OWL/RDF would be of much use.
- The imported element is part of the ontology vocabularies used for export.
  - In this case, also using the element directly could impair the sanity of the export. For example, if owl:Nothing were imported as a category into the wiki, one could easily make the whole output inconsistent by adding some article to this class, which is not desirable.
  - Alternatively, the element could already be sufficiently represented implicitly in the wiki. For example, in order to state that something is an element of a class, one does not need to import the property rdf:type; it suffices to use the available category mechanism.
A quick way to find out what elements are available for some namespace abbreviation is to go to the article MediaWiki:smw_import_foaf, with foaf replaced by the namespace identifier you are interested in. The rationale behind this system is that the community can decide on a sane use of desired vocabularies, but each user still has the power to decide in which cases the available elements are used and by what name they should be represented within the wiki.
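The three reasons why an import statement can fail, listed earlier in this section, can be sketched as a simple check. The `AVAILABLE` table and the function name are hypothetical; a real wiki derives this information from the pages maintained by its administrators.

```python
# Hypothetical view of what the administrators made available:
# prefix -> {element name -> wiki namespace the element must be used in}
AVAILABLE = {
    "foaf": {
        "name": "Property",
        "homepage": "Property",
        "knows": "Property",
        "Person": "Category",
    },
}


def check_import(qname, page_namespace):
    """Return None if the import is allowed, otherwise a short reason."""
    prefix, _, element = qname.partition(":")
    if prefix not in AVAILABLE:
        return "namespace '%s:' is not available" % prefix
    elements = AVAILABLE[prefix]
    if element not in elements:
        return "element '%s' is not available in '%s:'" % (element, prefix)
    if elements[element] != page_namespace:
        return "'%s' can only be imported on a %s page" % (qname, elements[element])
    return None
```

For instance, `check_import("foaf:knows", "Category")` reports that the element belongs on a Property page, which corresponds to the last of the three cases.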
Making vocabularies available for import
Wiki users with administrator status can make new elements available by simply editing a specific page for each vocabulary with a "magic" name. The page is in the MediaWiki namespace with the prefix smw_import_. For example, the page for the foaf vocabulary is named MediaWiki:smw_import_foaf. It contains something like
    http://xmlns.com/foaf/0.1/|[http://www.foaf-project.org/ Friend Of A Friend]
    name|Type:String
    homepage|Type:URI
    Person|Category
    knows|Type:Page
    ...
The first line tells the wiki that
- the abbreviation "foaf:" refers to "http://xmlns.com/foaf/0.1/" and that
- "Friend Of A Friend" should be given as an additional human-readable label for elements of this vocabulary (e.g. in the factbox).
After this, each line declares one vocabulary element that can be reused within the wiki. For instance, "name" (referring to "foaf:name") can only be a property of Type:String. The text after the "|" declares the (unique) context in which an element can be used. Elements that can be imported as properties are declared by specifying their type with Type:some datatype; elements that can be imported as categories are declared by specifying the "Category" namespace identifier. (Note that in SMW 1.0 the type and namespace identifiers depend on your language setting!) Other elements can be declared by writing anything other than the above after the "|". Such elements can then be used in namespaces other than "Property" and "Category", but we strongly recommend using a single meaningful string; we suggest "Main".
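The layout of such an import page can be made concrete with a small parser. This is only a sketch of the format as described in this section; the function name and the returned structure are assumptions, and SMW's own parsing may differ in details.

```python
def parse_import_page(text):
    """Parse the text of a MediaWiki:smw_import_... page (assumed layout)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # First line: namespace URI, then "|", then a human-readable label.
    uri, _, label = lines[0].partition("|")
    elements = {}
    for line in lines[1:]:
        name, _, context = line.partition("|")
        if context.startswith("Type:"):
            # Importable as a property with the given datatype.
            elements[name] = ("property", context[len("Type:"):])
        elif context == "Category":
            # Importable as a category.
            elements[name] = ("category", None)
        else:
            # Anything else, e.g. the suggested "Main".
            elements[name] = ("other", context)
    return uri, label, elements


EXAMPLE = """http://xmlns.com/foaf/0.1/|[http://www.foaf-project.org/ Friend Of A Friend]
name|Type:String
homepage|Type:URI
Person|Category
knows|Type:Page"""

uri, label, elements = parse_import_page(EXAMPLE)
```

With the example text, `elements["name"]` is `("property", "String")` and `elements["Person"]` is `("category", None)`.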
Changing import statements
It can easily happen that some existing article of the wiki should be modified to represent (another) imported concept. For example, a wiki community that already uses a category "Person" might decide to map this category to "foaf:Person" in the future. Luckily, this is not a problem. Import statements in existing articles can be changed at any point in time without requiring additional updates in the wiki. The exported RDF will immediately be modified accordingly. Of course, in the case of properties, the datatype should usually not be changed without also updating the articles that use this property. This matters for importing because an "imported from" declaration for a property also defines its datatype.
Ontology import
Semantic MediaWiki has a feature (still in beta, and disabled in version 1.0) to import ontologies into a Semantic MediaWiki installation. The ontologies have to follow certain formats, in order to be useful. Further down you will find a description of an alternative way of importing data from ontologies (or, actually, other formats as well).
Ontology format
The ontology elements -- i.e. classes, properties and individuals -- should all have labels. The labels will be used to name the relations, the categories, and the article pages, and also to create the appropriate annotations -- i.e. typed links or categorization links -- on the article pages. The mapping of the import naturally follows the mapping of the export, so it looks like this:
OWL Construct | Semantic MediaWiki |
---|---|
Class | Category |
Datatype property | Property |
Object property | Property also (??) |
Class instantiation | Page categorization (e.g. [[Category:X]]) |
Subclass of | Category subcategorization (e.g. [[Category:X]] on a category page) |
Individual | Article (in Main namespace) |
Instantiated datatype property | Attribute annotation (e.g. [[X:=Y]]) |
Instantiated object property | Typed link (e.g. [[X::Y]]) |
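The annotation rows of this table can be read as a small lookup from kind of statement to wiki text. The sketch below is illustrative only; the function name and the string labels for the statement kinds are assumptions, not part of SMW.

```python
def annotation(kind, predicate, value):
    """Return the wiki text for one imported statement, following the table."""
    if kind in ("class instantiation", "subclass of"):
        # Both become category links; the page they are placed on differs.
        return "[[Category:%s]]" % value
    if kind == "datatype property":
        return "[[%s:=%s]]" % (predicate, value)
    if kind == "object property":
        return "[[%s::%s]]" % (predicate, value)
    raise ValueError("construct not covered by the mapping: " + kind)
```

For example, `annotation("object property", "knows", "Eve")` gives `[[knows::Eve]]`.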
Note that the ontology needs to be in OWL DL in RDF serialization, not just general RDF or RDFS, and that all properties and classes have to be declared as such in order to be recognized. Only explicit statements are imported; no reasoning occurs. So even if you can infer from the ontology that Adam is a man, he will not be put into Category:Man unless such a triple is in the ontology and Man is declared as an owl:Class. If you want to import implicit statements, you have to make them explicit first. You can use any reasoner that allows you to materialize such statements.
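What "materializing" implicit statements means can be shown with one inference rule: class membership inherited along subclass links. The sketch uses plain string triples instead of an RDF library, and the class "Father" is a hypothetical addition to the Adam example; a real reasoner covers far more than this single rule.

```python
def materialize_types(triples):
    """Add the rdf:type statements implied by rdfs:subClassOf, transitively."""
    result = set(triples)
    changed = True
    while changed:  # repeat until no new statement can be derived
        changed = False
        for (s, p, o) in list(result):
            if p != "rdf:type":
                continue
            for (c, q, d) in list(result):
                if q == "rdfs:subClassOf" and c == o:
                    new = (s, "rdf:type", d)
                    if new not in result:
                        result.add(new)
                        changed = True
    return result


triples = {
    ("Adam", "rdf:type", "Father"),
    ("Father", "rdfs:subClassOf", "Man"),
    ("Man", "rdfs:subClassOf", "Person"),
}
explicit = materialize_types(triples)
```

After this step, `explicit` also contains the statements that Adam is a Man and a Person, so an import that only reads explicit triples would pick them up.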
Note also that all further constructs from OWL DL, like inverse relations, complement classes, class union, etc., will not be imported into the wiki. If you want to use more complex ontologies with Semantic MediaWiki, check out the publication ow:Reusing Ontological Background Knowledge in Semantic Wikis.
How it works
On the page Special:Import ontology you can upload the ontology file. The special page is only available to users with admin privileges. After you have chosen the ontology file, the system parses it (using RAP, thanks!) and presents you with a list of all importable statements, i.e. mainly statements that are not within the wiki already (though this display is a bit buggy, sorry for that). Here you can select each statement you want to import, and you can enter a short text to be added alongside the imported statements (for example a template that will resolve to a message telling the user that this information was imported from a particular ontology).
After you have chosen the appropriate statements and set all other options, click on the import button at the far end of the page, and wait. A few moments later, the statements should have been imported (check Recent changes).
Note that this part is still somewhat buggy. You may want to try smaller portions of the ontology first, or even single statements, to see if it works as you want it to.
Alternative import
As an alternative to this experimental feature, you can use pre-processing tools to annotate wiki pages with the wiki text for SMW properties, then import those pages using MediaWiki import tool(s).
A more robust way to import ontologies is to use a framework like the Python Wikipedia Bot. It should work with other wikis as well, not just with Wikipedia, but you will have to create a new family file in order to get access to your wiki. In this case you are not constrained to using OWL DL compliant ontologies.
For example, on the Ontoworld wiki we imported the delegates list from the ow:ESWC2006 ontology. We sketch the program in the following. It uses the rdflib library for the RDF parsing, and it uses the Wikipedia Bot framework to work with the wiki. It creates a template call out of the RDF. It could also create sentences with typed links inside (see towards the end of the code for an example), or first check the existing page text to see whether the triple to be added is already included (and thus may be skipped).
    from rdflib import Graph, Namespace
    import wikipedia, login

    family = "ontoworld"  # note that you need to set up the appropriate family file
    i = Graph()

    i.bind("foaf", "http://xmlns.com/foaf/0.1/")
    RDF = Namespace("http://www.w3.org/1999/02/22-rdf-syntax-ns#")
    RDFS = Namespace("http://www.w3.org/2000/01/rdf-schema#")
    FOAF = Namespace("http://xmlns.com/foaf/0.1/")

    i.load("eswc.rdf")

    ow = wikipedia.Site('en')
    login.LoginManager('password', False, ow)

    unchanged = list()  # collects the names that already have a page

    # iterates through everything that has the type Person
    # (note, only explicit assertions -- rdflib does not do reasoning here!)
    for s in i.subjects(RDF["type"], FOAF["Person"]):
        for n in i.objects(s, FOAF["name"]):  # reads the name
            p = wikipedia.Page(ow, n)         # gets the page with that name
            if p.exists():
                unchanged.append(n)
            else:
                # create the instantiated template
                h = '{{Person|' + '\n'
                h += ' Name=' + n
                for hp in i.objects(s, FOAF["workplaceHomepage"]):
                    h += '|' + '\n'
                    hp = hp[7:]  # strips the leading "http://"
                    h += ' Homepage=' + hp
                    if len(hp) > 23:  # if the name of the page is too long,
                        h += '|' + '\n'
                        if hp.find("/") != -1:  # make something shorter
                            hp = hp[0:hp.find("/")]
                        h += ' Homepage label= at ' + hp
                for hp in i.objects(s, RDFS["seeAlso"]):
                    h += '|' + '\n'
                    h += ' FOAF=' + hp
                h += '\n' + '}}'  # end Person template
                # write a sentence
                h += '\n' + "'''" + n + "''' attended the [[delegate at::ESWC2006]]."
                # add a category
                h += '\n' + '\n' + '[[Category:Person]]'
                print(n + ' changed')
                p.put(h, 'Added from ontology')

    wikipedia.stopme()
    print(unchanged)
As you have the full power of Python available, you can parse basically any machine-readable document and process it any way you like. As of 2006, and as long as the ontology import is still not perfect, this is the recommended way to import data into the wiki (especially since it allows you much more freedom in stating the facts and reusing templates than the ontology import ever will).
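When adapting this approach to other data sources, it can help to keep the assembly of the page text separate from the bot calls, so that it can be tried out without touching a wiki. The following is a minimal sketch of that idea in Python 3; the function name and parameters are hypothetical, and the template fields simply mirror the ones used in the script in this section.

```python
def person_page(name, homepage=None, foaf_url=None):
    """Return the wiki text for a person page (assumes a Person template)."""
    fields = [" Name=" + name]
    if homepage:
        # Drop the URL scheme so only host and path are shown.
        shown = homepage.split("://", 1)[-1]
        fields.append(" Homepage=" + shown)
        if len(shown) > 23:
            # Long addresses get a shorter label: the host name only.
            fields.append(" Homepage label= at " + shown.split("/", 1)[0])
    if foaf_url:
        fields.append(" FOAF=" + foaf_url)
    text = "{{Person|\n" + "|\n".join(fields) + "\n}}"
    # A sentence with a typed link, followed by a category.
    text += "\n'''" + name + "''' attended the [[delegate at::ESWC2006]]."
    text += "\n\n[[Category:Person]]"
    return text
```

The returned string can then be passed to whatever upload mechanism is in use, or written to a file for review first.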
There is an alternative description for importing data with a script.
Administrator manual
The following pages describe how to download, install, and configure Semantic MediaWiki (SMW) as a site administrator. If you want to learn about using SMW as a wiki user (with or without administrator rights in the wiki), then the user manual is the right place for you. Basic installation and maintenance of SMW, however, do not necessarily require familiarity with its usage.
Since SMW is an extension to MediaWiki, a current release of MediaWiki is needed first. This can be obtained in various ways as described on the MediaWiki homepage. Please install this software and make sure that it works as expected before installing SMW. As an extension to MediaWiki, SMW usually requires very few modifications of the basic system to run. Adjustments might be useful for customising its behaviour, or for tweaking its performance.
The main sections of this handbook explain the most important aspects of administering SMW:
- Downloading SMW from various sources
- Installation and upgrade
- Configuration
- Useful extensions of SMW
Detailed navigation is provided by the table of contents on the right. Other relevant parts of this handbook include:
- Getting support – who can help out with questions?
- Reporting bugs – making your requests for improvements heard
- SMW Project – the makers of Semantic MediaWiki