Alex Schroeder @kensanata (octodon.social)

Given the URL of a site, how do I download a coherent copy? web.archive.org/web/2009102700
If I add the date to my wget invocation, it only downloads the pages it got at the exact same moment, which won't do. And I can't follow the official instructions because I'm too stupid to use the Internet Archive Advanced Search.
blog.archive.org/2012/04/26/do
How can I search for all my pages from the GeoCities Special Collection 2009 and get the identifiers I need?
archive.org/web/geocities.php


@kensanata There's a BitTorrent archive available at archive.org. They also archive pages in the WARC format, and there's a proxy for viewing those in a browser.

@ckeen What do you do with the 600GB torrent: download it all? Or do you use a command-line BitTorrent client, give it a URL pattern, and only get those files?
I'm not even sure they have individual GeoCities "sites" as a collection in a single archive. If they do, where can I find a link to "the latest archive of geocities.com/kensanata/ in the Wayback Machine"?

@kensanata That's not easy, if it's possible at all. First, there's no guarantee that such a thing exists. It looks like some links on your site 404'ed when the crawler came by, for whatever reason.

I assumed that the Wayback engine offers a way to get all the data from one crawl related to that site, but it looks like I am wrong. Maybe send the folks at archive.org a message?
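
One possible approach, sketched here rather than taken from the thread: instead of insisting on a single crawl date, list every capture under the site's URL prefix and keep the newest successful capture of each page. The sketch below assumes the Wayback Machine's CDX search endpoint and its `id_` URL modifier for unmodified page content; the function names are hypothetical, and the result is only as complete as whatever the crawler actually captured.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Wayback Machine CDX search endpoint (not mentioned in the thread).
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(prefix):
    """Build a CDX query listing every HTTP 200 capture under a URL prefix."""
    params = {
        "url": prefix,
        "matchType": "prefix",            # match everything below the prefix
        "output": "json",                 # JSON rows; the first row is a header
        "fl": "urlkey,timestamp,original",
        "filter": "statuscode:200",       # skip the captures that 404'ed
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def fetch_rows(prefix):
    """Fetch the capture rows for a prefix (performs a network request)."""
    with urlopen(cdx_query_url(prefix)) as response:
        data = json.load(response)
    return data[1:]  # drop the header row; an empty result stays empty


def latest_captures(rows):
    """Map each urlkey to the (timestamp, original) of its newest capture."""
    latest = {}
    for urlkey, timestamp, original in rows:
        # Timestamps are fixed-width digit strings, so string order is time order.
        if urlkey not in latest or timestamp > latest[urlkey][0]:
            latest[urlkey] = (timestamp, original)
    return latest


def snapshot_url(timestamp, original):
    """URL of one capture; "id_" requests the page without Wayback rewriting."""
    return f"https://web.archive.org/web/{timestamp}id_/{original}"
```

Passing the result of `fetch_rows("geocities.com/kensanata/")` through `latest_captures` and then `snapshot_url` would give one download URL per page, each taken from whichever crawl saw that page last. That trades "all from one moment" for "the newest copy of everything", which may or may not count as coherent for a given site.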

@ckeen Hm. I found a pretty good copy on oocities. Let's see how well that works.
oocities.org/timessquare/6120/

@ckeen Sadly, it's missing some elisp files. 😭