Migrate from MediaWiki to GitBook

Volcano 2018-06-06 15:18

Export data from MediaWiki

Using ‘Special:Export’ to export all pages at MediaWiki

  • Go to Special:Allpages and choose the desired article/file.
  • Copy the list of page names to a text editor
  • Put all page names on separate lines. I did this job with Vim.
  • Go to Special:Export and paste all your page names into the textbox, making sure there are no empty lines.
  • Click ‘Submit query’
  • Save the resulting XML to a file using your browser’s save facility.

Now you can use this XML file to next step

Convert MediaWiki documents to markdown format

We have a big XML file now. It contains all pages in MediaWiki. All the pages are in MediaWiki format.pandoctool can convert the format to markdown directly. Butpandoccan’t parse XML file.

I use toolMediaWiki to Markdownto do the job. It’s written in PHP. The conversion uses an XML export from MediaWiki and converts each wiki page to an individual markdown file.

The tool is easy to use. An example convert all pages to Github Flavoured Markdown format.

php convert.php --filename=mediawiki.xml --output=export --format=gfm 

Import data to GitBook

I got many md files in previous step.

  • Create two foldersenandzhfor multiple languages support
  • Copy all md files to the two folders. Cleanup files for each language.
  • CopyMain_Page.mdtoSUMMARY.md. It’s the index file for gitbook. I rewrited this file carefully.
  • Create languages file for gitbook
* [English](en/)
* [Chinese](zh/)
  • Create book file for gitbook, Content for book.json.
{
    "plugins": [
        "language-selected",
        "expandable-chapters-small", 
        "-search", 
        "-livereload", 
        "-fontsettings", 
        "-lunr", 
        "-sharing"
    ]
}
  • Install gitbook and preview

Rewrite rule for nginx

I wrote these rewrite rules for nginx. It make the new URLs compatible with original MediaWiki.

        rewrite ^/wiki/Main_Page$ /_book/index.html last;
        rewrite ^/index.html$ /_book/index.html last;
        rewrite ^/gitbook/(.+)$ /_book/gitbook/$1 last;
        rewrite ^/wiki/(.+).html$ /en/$1.html permanent;
        rewrite ^/wiki/(.+)$ /en/$1.html permanent;

        location /en/ {
                try_files $uri /_book$uri /_book$uri.html;
        }

        location /zh/ {
                try_files $uri /_book$uri /_book$uri.html;
        }

Miscs

I have many markdown files from original MediaWiki. It’s too sad gitbook can’t process markdown file outside the SUMMARY.md. You can find more peoples have same problem at the repo of Gitbook. I usethe workgroundto resolve the problem.

[返回] [原文链接]