I learned to program in Python from DiveIntoPython.org. Unfortunately, the author took down all of his sites (Dive Into Python, Dive Into Python 3, Dive Into HTML5, and others) in a single day and deleted their repositories. I wrote a script that scrapes the Wayback Machine (using BeautifulSoup) for these sites, and I re-uploaded them to the .net versions of their domains. I also scoured the web for the original downloads and the translations of each site. My mirrors are currently the most complete that I can find, though they aren’t finished yet. They are hosted on Amazon S3, so they don’t use any of my own server resources and are unlikely to go down. I also host them on GitHub and encourage people to fork them so that they are never lost again. Each repository contains a scrape.py script that scrapes the site and an upload.py script that uploads it to S3 through boto.
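To give a feel for one step a Wayback Machine scraper has to handle, here is a minimal sketch of the URL bookkeeping involved. It is not the actual scrape.py: the function names and the example timestamp are hypothetical, and it uses only the standard library rather than BeautifulSoup. It assumes the usual archive URL layout, `https://web.archive.org/web/<timestamp><modifier>/<original URL>`, where the `id_` modifier requests the page without the archive's injected toolbar and rewritten links.

```python
import re

# Archived links appear either as absolute URLs on web.archive.org or as
# site-relative paths starting with /web/. The timestamp is up to 14 digits
# and may be followed by a two-letter modifier such as "id_" or "im_".
WAYBACK_PREFIX = re.compile(
    r"^(?:(?:https?:)?//web\.archive\.org)?/web/\d{1,14}(?:[a-z]{2}_)?/"
)


def snapshot_url(original_url, timestamp):
    """Build the archive URL for a page (hypothetical helper).

    The "id_" modifier asks for the capture as it was originally served.
    """
    return f"https://web.archive.org/web/{timestamp}id_/{original_url}"


def strip_wayback_prefix(url):
    """Recover the original URL from an archived link.

    URLs that do not start with a Wayback prefix are returned unchanged.
    """
    return WAYBACK_PREFIX.sub("", url, count=1)


if __name__ == "__main__":
    # The timestamp below is an illustrative value, not a known capture.
    archived = snapshot_url("http://diveintopython.org/toc/index.html", "2011")
    print(archived)
    print(strip_wayback_prefix(archived))
```

Stripping the prefix from every `href` and `src` before saving a page keeps the mirror self-contained, so its links point at the mirrored files rather than back at the archive.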
Note: The sites were released under a permissive license, so rehosting them is not copyright infringement.