Converting HTML files for a wiki? Here’s a script

I can’t be the only person who has had these thoughts:

Wow, this intranet is so 1998 I may as well be listening to rap-metal. It’s a random collection of static HTML pages…no one knows what’s out there and none of it is searchable.

My favorite solution to this quandary is a wiki. For the unfamiliar, a wiki is essentially a simple website that can be easily added to or changed by anyone who visits it. Each page usually has an Edit this page link (or somesuch) which takes you to a screen where you can type in some new info (or fix what’s there).

For added fun, most wikis support a way of adding formatting to the text you contribute. I prefer one called Textile, but Markdown is an excellent choice as well. But let’s pretend you’re having this thought:

This wiki stuff sounds super, and this Textile thingy looks neat, but we have a lot of worthwhile info in our random collection of static HTML pages…is there an easy way to get that stuff converted onto a wiki?

The answer is: “sorta.” Keep in mind that the syntax used in wikis (e.g. Textile or Markdown) is not HTML, so basic copy/paste is a no-go. However, you’d think that if there were a way to convert a given HTML page into Textile or Markdown, that would get you most of the way there.

Luckily, I found a script that does this pretty well. This script is written in Python and runs on the command line, so you’ll have to be pretty geeky to make use of it. I actually extracted this script from this Mac OS X Service.

You may notice that it converts the text to something closer to Markdown than Textile. My workaround is simply to use Instiki as my wiki software of choice, since it can be set up to support both Textile and Markdown.
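To give a feel for what this kind of conversion involves, here is a minimal sketch in Python. To be clear, this is not the script mentioned above; it is a rough illustration that uses only the standard library's html.parser module, and the names MarkdownSketch and html_to_markdown are made up for this example. It only handles headings, paragraphs, links, bold/italic/code, and simple lists, so real-world intranet pages will almost certainly need more than this.

```python
import re
import sys
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Hypothetical, deliberately tiny HTML-to-Markdown converter."""

    # Inline tags map to the Markdown marker that wraps their text.
    INLINE = {"em": "*", "i": "*", "strong": "**", "b": "**", "code": "`"}
    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.out = []    # pieces of Markdown output, joined at the end
        self.hrefs = []  # stack of link targets for open <a> tags

    def handle_starttag(self, tag, attrs):
        if tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag in self.HEADINGS:
            # <h2> becomes "## ", and so on
            self.out.append("\n\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.out.append("\n\n")
        elif tag in ("ul", "ol"):
            self.out.append("\n")
        elif tag == "li":
            # Simplification: ordered lists also come out as bullets
            self.out.append("\n- ")
        elif tag == "a":
            self.hrefs.append(dict(attrs).get("href") or "")
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag == "a" and self.hrefs:
            self.out.append("](" + self.hrefs.pop() + ")")
        elif tag in ("ul", "ol"):
            self.out.append("\n")

    def handle_data(self, data):
        # Collapse runs of whitespace, roughly as a browser would
        self.out.append(re.sub(r"\s+", " ", data))


def html_to_markdown(html):
    """Convert an HTML string to rough Markdown text."""
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    text = "".join(parser.out)
    # Trim each line, then squeeze extra blank lines
    text = "\n".join(line.strip() for line in text.split("\n"))
    return re.sub(r"\n{3,}", "\n\n", text).strip()


if __name__ == "__main__" and len(sys.argv) == 2:
    # Command-line use: pass the path of one HTML file
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        print(html_to_markdown(f.read()))
```

Saved under a name of your choosing (say, html2md_sketch.py), it could be run on the command line with the path to one HTML file, and the result pasted into a wiki page that is set to Markdown. Tags it doesn't recognize are simply dropped while their text is kept, which is usually the least-bad behavior for old static pages.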

Good luck with your wiki projects.

