# wiki.bash-hackers.org

Extraction of wiki.bash-hackers.org from the Wayback Machine

This targets pages captured by the Wayback Machine whose URLs end in `'?do=edit'`. This gives us the markdown source.

See the incomplete script "archive_crawler" to see my working.

- TODO: Markdown linting
- TODO: Markdown conversion from Dokuwiki "Markup" to GitHub "Markdown" using pandoc
- TODO: Parse the already downloaded files for any missing links
- TODO: Rinse and repeat

## Extracting the markdown

The pages that have `'?do=edit'` on the end of their URL appear to have a reliable and predictable structure:

```bash
[ LINES ABOVE REMOVED FOR BREVITY ]