archive/compression formats

Putting things places for reasons!

  • torrents
  • usenet
  • shar
    • self-extracting shell archive: a shell script that recreates the archived files when run
  • nar
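    • normalized archive used by Guix (the format comes from Nix). Comparable to tar, but records only file type and contents in a fixed order, so output is deterministic.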
  • zip
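    • readers locate the central directory at the end of the file, so non-zip data at the front is ignored (how APE, Actually Portable Executable, binaries double as zip archives)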
    • contains directory structure
    • supports compression
  • rar
  • Apache Parquet format
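    • columnar storage format for large datasets; values are stored in per-column pages, with metadata (such as min/max statistics) that lets readers skip data they do not need
    • Apache Arrow libraries can read, write, and query Parquet files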
  • web archive (warc with pywb)
    pip install --user pywb
    wb-manager init my-web-archive
    wget --recursive --warc-file=mywarcfile https://mysite   # writes mywarcfile.warc.gz
    wb-manager add my-web-archive mywarcfile.warc.gz
    wayback   # then browse http://localhost:8080/my-web-archive/<url>/
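
Back on the zip notes above: the "leading data is ignored" behaviour is easy to check with Python's standard zipfile module. This is only a rough sketch. The file names, the helper functions build_zip and read_with_prefix, and the shell-script prefix are all made up for illustration; nothing here is taken from APE itself.

```python
import io
import zipfile


def build_zip(files):
    """Return the bytes of a compressed zip archive for a name -> text mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, text in files.items():
            # Slashes in the name are how zip records directory structure.
            zf.writestr(name, text)
    return buf.getvalue()


def read_with_prefix(prefix, archive):
    """Prepend arbitrary bytes to a zip archive, then read every member back."""
    combined = io.BytesIO(prefix + archive)
    # The reader starts from the end-of-central-directory record at the tail
    # of the file, so the unrelated bytes at the front do not get in the way.
    with zipfile.ZipFile(combined) as zf:
        return {name: zf.read(name).decode() for name in zf.namelist()}


# Hypothetical contents: two files in nested directories.
archive = build_zip({
    "docs/readme.txt": "hello",
    "docs/sub/data.txt": "x" * 1000,
})

# Hypothetical leading payload standing in for an executable header.
prefix = b"#!/bin/sh\necho 'not zip data'\nexit 0\n"

contents = read_with_prefix(prefix, archive)
print(sorted(contents))
```

The same trick is what lets a single file act as both a runnable program and an archive: the program lives at the front, the zip index lives at the back.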