Crawling and Copying Web Pages. #8

September 23, 2021

There are many reasons one may want a copy of a Web page or whole site.

I tend to just use the Chrome browser's Share > Print option with the PDF printer to save web pages that I hope will be useful later on. I try to keep the saved files well sorted by naming them carefully and nesting them in folders.

But on a multi-page site this is not ideal. And while I do use Archive.org and its Wayback Machine, having an offline, browsable copy of a website is often the better option.

There seem to be two main categories of (free) web crawlers, apart from the split between online services and downloaded programs.

And I almost always prefer the downloaded ones.

Type One simply makes a working HTML copy of a site, as complete as the site allows, that you can view offline.

For this I have used www.httrack.com for many years.
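To give a rough idea of what a site copier does on each page, here is a minimal Python sketch of the link-gathering step: it reads one page's HTML and lists the links that stay on the same site, which are the ones a copier would download next. This is not how HTTrack itself is written; the names `LinkCollector` and `same_site_links` and the example.com addresses are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect every href/src target on a page as an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Turn relative links like "logo.png" into full URLs.
                self.links.append(urljoin(self.base_url, value))


def same_site_links(html, base_url):
    """Return unique links that point at the same host as base_url."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    host = urlparse(base_url).netloc
    found = []
    for link in collector.links:
        link = link.split("#", 1)[0]  # drop "#section" fragments
        if urlparse(link).netloc == host and link not in found:
            found.append(link)
    return found
```

A real copier repeats this for every page it finds and rewrites the links so they work from disk, which is the part tools like HTTrack take care of.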

I have it on Windows 10, and today I added the Android APK to my tablet.

While it would be hacking nice to have a copier that disobeyed website crawler rules, I have not had one for many years now.

Recommendations welcomed.
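For context, the crawler rules mentioned above normally live in a site's robots.txt file, and a well-behaved copier checks them before fetching a page. Below is a small sketch of that check using Python's standard `urllib.robotparser`. The helper name `is_allowed`, the "MyCopier" user agent, and the sample rules are assumptions for the example only.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Check a URL against robots.txt rules given as plain text."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: every crawler is asked to stay out of /private/.
sample_rules = "User-agent: *\nDisallow: /private/\n"
```

With these sample rules, `is_allowed(sample_rules, "MyCopier", "https://example.com/private/page.html")` comes back False, while a page outside /private/ comes back True.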

The second type of crawler seems to deal with SEO tags and other data that is useful to a site owner or to someone wanting to improve their site's findability.

I will look into these later on and update this blog.
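In the meantime, here is a rough Python sketch of the kind of thing the second type of crawler looks at: the page title and the named meta tags, such as the description. It only uses the standard library, and the names `SeoTagParser` and `seo_summary` are my own for this example, not taken from any particular SEO tool.

```python
from html.parser import HTMLParser


class SeoTagParser(HTMLParser):
    """Pick out the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            # Store names in lowercase so "Robots" and "robots" match.
            self.meta[attrs["name"].lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def seo_summary(html):
    """Return the title, description, and robots meta values of a page."""
    parser = SeoTagParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "description": parser.meta.get("description"),
        "robots": parser.meta.get("robots"),
    }
```

Missing tags simply show up as None in the result, which is itself a useful hint that a page has no description set.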

The Rainbowhat Hacker
