Update breezewiki URL; update README

Jeffrey Serio 2023-10-02 10:19:59 -05:00
parent 6459a87d17
commit 14d0f4c725
3 changed files with 44 additions and 4 deletions

README.md (new file, +40 lines)

@@ -0,0 +1,40 @@
# archive-fandom-wiki
This program archives the content of fandom wikis. It doesn't scrape the fandom.com wiki sites directly; instead, it fetches pages through my [BreezeWiki](https://breezewiki.hyperreal.coffee) instance to avoid downloading ads, unnecessary images, and other junk.
Each resulting archive is self-contained: extract it and you can browse the wiki snapshot locally (offline). The URLs for CSS, images, and links in each page are rewritten as `file:///` URLs pointing at the corresponding files on the local filesystem.
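To picture the rewriting step, here is a minimal sketch using BeautifulSoup; the `localize_links` helper, the tag selection, and the flat one-directory layout are illustrative assumptions, not the program's actual code:

``` python
from pathlib import Path

from bs4 import BeautifulSoup


def localize_links(html: str, site_dir: Path) -> str:
    """Rewrite CSS, image, and page URLs to file:/// URLs under site_dir."""
    soup = BeautifulSoup(html, "html.parser")
    for tag, attr in (("link", "href"), ("img", "src"), ("a", "href")):
        for node in soup.find_all(tag):
            url = node.get(attr)
            if url:
                # Hypothetical mapping: keep only the final path component
                # and point it at the local archive directory.
                local = (site_dir / url.rsplit("/", 1)[-1]).resolve()
                node[attr] = local.as_uri()  # yields a file:/// URL
    return str(soup)
```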
## Installation
Make sure Python and pip are installed, then run:
``` bash
git clone https://git.sr.ht/~hyperreal/archive-fandom-wiki
cd archive-fandom-wiki
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
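The virtual environment must be re-activated (`source venv/bin/activate`) in any new shell session before running the program.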
## Usage
``` bash
archive-fandom-wiki dishonored
```
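This archives the Dishonored wiki. Judging from the source, the snapshot is written to a `dishonored.fandom.com` directory (with an `images` subdirectory) under the current working directory.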
## Podman/Docker
There is also a Containerfile (the same format as a Dockerfile). To build the image with Podman:
``` bash
git clone https://git.sr.ht/~hyperreal/archive-fandom-wiki
cd archive-fandom-wiki
podman build -t localhost/archive-fandom-wiki:latest .
```
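Docker users can build the same image with `docker build -f Containerfile -t archive-fandom-wiki:latest .` (the `-f` flag is needed because Docker looks for a file named `Dockerfile` by default).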
To run the container image:
``` bash
podman run --name archive-fandom-wiki --rm -v "${HOME}/archives:/output:Z" localhost/archive-fandom-wiki dishonored
```
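Here `--rm` removes the container when it exits, and `-v "${HOME}/archives:/output:Z"` mounts `${HOME}/archives` as `/output` inside the container, so the finished archive lands in `${HOME}/archives`. The `:Z` suffix relabels the volume for SELinux and can be dropped on systems without it.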

README.org

@@ -1,6 +1,6 @@
 #+title: archive-fandom-wiki

-This program archives the content of fandom wikis. It doesn't scrape from the fandom.com wiki sites directly; rather, it uses my [[https://wiki.hyperreal.coffee][BreezeWiki]] instance to avoid downloading unnecessary ads, images, and other junk.
+This program archives the content of fandom wikis. It doesn't scrape from the fandom.com wiki sites directly; rather, it uses my [[https://breezewiki.hyperreal.coffee][BreezeWiki]] instance to avoid downloading unnecessary ads, images, and other junk.

 Each resulting archive is self-contained, meaning one can extract the contents and browse the wiki snapshot locally (offline). The URLs for CSS, images, and links in each page are replaced by the relative ~file:///~ URLs for their corresponding pages on the local filesystem.
@@ -8,7 +8,7 @@ Each resulting archive is self-contained, meaning one can extract the contents a
 Make sure Python and Pip are installed. Then run:
 #+begin_src bash
-git clone https://git.hyperreal.coffee/hyperreal/archive-fandom-wiki.git
+git clone https://git.sr.ht/~hyperreal/archive-fandom-wiki.git
 cd archive-fandom-wiki
 python -m venv venv
 source venv/bin/activate
@@ -23,7 +23,7 @@ archive-fandom-wiki dishonored
 ** Podman/Docker
 There is also a Containerfile, also known as a Dockerfile.
 #+begin_src bash
-git clone https://git.hyperreal.coffee/hyperreal/archive-fandom-wiki
+git clone https://git.sr.ht/~hyperreal/archive-fandom-wiki
 cd archive-fandom-wiki
 podman build -t localhost/archive-fandom-wiki:latest .
 #+end_src

@@ -24,7 +24,7 @@ class FandomWiki:
     def __init__(self, name: str):
         self.name = name
         self.canonical_url = f"https://{name}.fandom.com"
-        self.breezewiki_url = f"https://wiki.hyperreal.coffee/{name}"
+        self.breezewiki_url = f"https://breezewiki.hyperreal.coffee/{name}"
         self.site_dir = Path.cwd().joinpath(f"{name}.fandom.com")
         self.images_dir = self.site_dir.joinpath("images")