Help:Repairing SMW's data


All data that SMW uses is stored in wiki pages. If the data should ever get out of date or contain any errors, then it is always possible to completely rebuild the data from the wiki. No data is ever lost. Refreshing data is also needed on some software upgrades, and after the first installation (since it also gathers some existing metadata).

This page describes ways to repair or initialise basically any SMW installation. The data of a single page can be refreshed by simply editing and saving it. If there are many pages, it is more convenient to use a feature of Special:SMWAdmin that does this automatically. There is also a maintenance script for doing this from the command line: SMW_refreshData.php

To make sure that all wiki pages display the new data after the repair, you can run touch LocalSettings.php (or, if there is no command line access, edit it in some trivial way). This will invalidate any MediaWiki page caches that may otherwise make you see old versions of wiki pages.
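
As a minimal sketch of the cache invalidation step, the helper below touches LocalSettings.php only if the file exists. The function name invalidate_page_cache and the example path are hypothetical and not part of SMW or MediaWiki.

```shell
# Hypothetical helper: update the modification time of LocalSettings.php
# so that MediaWiki treats its cached pages as outdated.
invalidate_page_cache() {
    wiki_root=$1
    if [ -f "$wiki_root/LocalSettings.php" ]; then
        touch "$wiki_root/LocalSettings.php"
    else
        echo "LocalSettings.php not found in $wiki_root" >&2
        return 1
    fi
}

# Example call (the path is an assumption, adjust it to your wiki):
# invalidate_page_cache /var/www/wiki
```

The existence check avoids creating an empty LocalSettings.php by accident when the command is run from the wrong directory.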

Using Special:SMWAdmin

The administration special page Special:SMWAdmin offers a feature for repairing all data. This page is only available to wiki users with administrator status. Moreover, the update process can only be started or stopped online if the configuration option $smwgAdminRefreshStore is set to true (default).

Once initiated, the update takes time. The progress can be viewed on Special:SMWAdmin. Even if the option $smwgAdminRefreshStore is disabled after starting the update, the ongoing process will continue and can be tracked online. Stopping the process is only possible if $smwgAdminRefreshStore is enabled.

The time the update will take varies from wiki to wiki. The update progresses during each page view. If many people view your wiki, then the update progresses more quickly. If there are a large number of pages, then the update will take longer. It is normal that the update progresses faster until it reaches 50%, since only property and type pages are refreshed during that part. The actual update of all wiki pages starts at 50%. To speed up the process, you can use the MediaWiki maintenance script runJobs.php. Please consider specifying a parameter --maxjobs 1000 or similar so that each run of the script is bounded in duration. Otherwise the script tends to occupy increasing amounts of memory.
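
The bounded runs described above could be scripted roughly as follows. This is only a sketch: the function name run_jobs_bounded, the number of runs, and the DRY_RUN switch are illustrative assumptions, and the script is assumed to be started from the directory that contains runJobs.php.

```shell
# Call runJobs.php several times with a bounded number of jobs per run.
# Each run is a separate php process, so memory is released between runs.
# Set DRY_RUN=1 to print the commands instead of executing them.
run_jobs_bounded() {
    runs=$1
    maxjobs=$2
    i=1
    while [ "$i" -le "$runs" ]; do
        ${DRY_RUN:+echo} php runJobs.php --maxjobs "$maxjobs" || return 1
        i=$((i + 1))
    done
}

# Example: five runs of at most 1000 jobs each.
# run_jobs_bounded 5 1000
```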


Using the SMW maintenance script

While the above method can also be combined with a maintenance script, there is also a script SMW_refreshData.php that directly refreshes selected portions of the wiki without any prior web access. The basic operation of SMW_refreshData.php is to go through all pages of the wiki and to re-store the semantic data for each. Normally, the script can be run by changing to the directory [SMW_path]/maintenance of your SMW installation, and then executing

php SMW_refreshData.php -v

This requires the php command-line interpreter to be installed. If this does not work on your site (e.g. due to unusual directory structures), read the file [SMW_path]/maintenance/README.

The above script goes through all pages in the order they are stored in your wiki database, and refreshes their data. The parameter -v makes sure that the script's progress is printed. The script can be aborted with Ctrl-C as usual. The index numbers shown by the script refer not only to page indices as used in MediaWiki, but also to indices SMW uses in its semantic data. For this reason, the script may process indices that are higher than the maximal page index in the wiki.

If you have a large number of pages then the script may consume a lot of memory during its execution, and it is better to stop after, say, 2000 pages. This is due to a PHP memory leak. As a workaround, the script can be run for only part of the pages at a time: use the parameters -s and -e to give a first and last page id to be refreshed, e.g.

php SMW_refreshData.php -v -s 1000 -e 1999
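
To cover a whole wiki in such chunks, the id ranges can be generated with a small loop. The sketch below only prints the commands so they can be reviewed first. The function name refresh_batches, the upper id 5000, and the batch size 2000 are assumptions for illustration. Since the script also processes SMW-internal indices, the upper bound may need to be higher than the largest page id.

```shell
# Print one SMW_refreshData.php command per batch of ids.
# Arguments: highest id to process, number of ids per batch.
refresh_batches() {
    max_id=$1
    batch=$2
    start=1
    while [ "$start" -le "$max_id" ]; do
        end=$((start + batch - 1))
        if [ "$end" -gt "$max_id" ]; then
            end=$max_id
        fi
        echo "php SMW_refreshData.php -v -s $start -e $end"
        start=$((end + 1))
    done
}

refresh_batches 5000 2000
# prints:
# php SMW_refreshData.php -v -s 1 -e 2000
# php SMW_refreshData.php -v -s 2001 -e 4000
# php SMW_refreshData.php -v -s 4001 -e 5000
```

Piping the output to sh (refresh_batches 5000 2000 | sh) would run the batches one after another, each in a fresh php process.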

Multiple runs of this script might be needed, e.g. since data for properties can only be stored once the datatype of the property has been stored. You can run the script with parameters -tp to refresh only type and property pages at first, so that these are already available when doing the second refresh. Overall, more than two refreshes should not be required in normal cases.
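
A two-pass refresh along these lines might look as follows. The function name two_pass_refresh and the DRY_RUN switch are hypothetical, and the combined form -tpv is assumed to be equivalent to giving -tp and -v together.

```shell
# First pass: type and property pages only. Second pass: all pages.
# Set DRY_RUN=1 to print the commands instead of executing them.
two_pass_refresh() {
    ${DRY_RUN:+echo} php SMW_refreshData.php -tpv || return 1
    ${DRY_RUN:+echo} php SMW_refreshData.php -v
}

# Example call from [SMW_path]/maintenance:
# two_pass_refresh
```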

To make sure that all wiki pages display the new data after the refresh, you can run touch LocalSettings.php. This will invalidate any MediaWiki page caches that may otherwise make you see old versions of wiki pages.

Rebuilding everything

The above methods should be able to fix data records in SMW in most cases. However, it is conceivable that some erroneous content of the SMW storage still persists for some reason. In this case, it makes sense to completely delete and reinstall the database structures of SMW before refreshing all data.

To completely delete all SMW data, the setup script can be used with parameter --delete:

php SMW_setup.php --delete

After this, proceed as if re-installing SMW anew by first running php SMW_setup.php again, and then triggering the repair of all data using one of the above methods.
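
Put together, the delete, reinstall, and refresh steps could be chained as sketched here. The function name rebuild_smw and the DRY_RUN switch are illustrative assumptions. The sketch uses the command-line refresh for the last step, although Special:SMWAdmin would work as well.

```shell
# Delete all SMW data, recreate the database structures, then refresh.
# Stops at the first failing step so a broken setup is not refreshed.
# Set DRY_RUN=1 to print the commands instead of executing them.
rebuild_smw() {
    ${DRY_RUN:+echo} php SMW_setup.php --delete || return 1
    ${DRY_RUN:+echo} php SMW_setup.php || return 1
    ${DRY_RUN:+echo} php SMW_refreshData.php -v
}

# Example call from [SMW_path]/maintenance:
# rebuild_smw
```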

The refresh script SMW_refreshData.php can also be used with parameter -f to delete and recreate all data in one step. In this case, it is suggested to first rebuild the records for all properties and types, and to process the remaining data afterwards. So one would run:

php SMW_refreshData.php -ftpv
php SMW_refreshData.php -v

Note that of course only the first run uses -f. On large wikis, the parameters -s and -e can again be used as explained in the previous section.

Automatic repair features

Some changes on wiki pages require that the data of other pages is updated as well. For example, if a template that contains semantic annotations is changed, then the data for all pages using this template might also require update. Likewise, if the datatype of some property is changed, all pages using this property should be refreshed. SMW usually takes care of such updates automatically. As in MediaWiki, it may take some time until all required updates are completed. In this case, there is no convenient way to review the progress, but the number of jobs (see Special:Statistics) indicates the current background activity of the wiki.

This documentation page applies to all SMW versions from 1.4.0 to the most current version.
Other versions: 1 – 1.3