Capturing the web for offline viewing


sequoyah70
2006-01-18, 19:54
I got my 770 yesterday and have been playing around with ways to capture web pages on my desktop for viewing on the 770 while travelling. Finding WiFi or BT/phone connections is not always easy or cheap, and if you already know what you want to read, why not just load up the memory card? Here is my method. Does anyone have a better way of doing this?

1. Navigate to the web page on your desktop.
2. Choose a printable version of the page if possible.
3. Save the web page as "Web Page, complete". In IE this produces an HTML file and a folder with images etc.
4. Drop these on the 770's memory card.
5. Open the file in Opera.

Cool thing: If you tell Opera to show the optimized view of the page, the text will reflow on the page when you zoom in and out.
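
Steps 3 and 4 could be scripted on the desktop. This is only a minimal sketch, assuming the browser saved `page.htm` next to a `page_files` folder and that the memory card is mounted at a path you pass in. The function name `copy_saved_page` and both paths are hypothetical:

```shell
# Copy a saved web page and its companion folder to the memory card.
# Usage: copy_saved_page /path/to/page.htm /path/to/card
# Assumption: the companion folder is named "<page>_files" next to the HTML file.
copy_saved_page() {
    page=$1
    card=$2
    base=${page%.*}            # strip the .htm/.html extension
    cp "$page" "$card"/ || return 1
    # The folder holds the images, stylesheets etc.; copy it if it exists.
    if [ -d "${base}_files" ]; then
        cp -R "${base}_files" "$card"/
    fi
}
```

Keeping the HTML file and the folder side by side on the card matters, because the saved page refers to its images by relative path.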

Chainsaw76
2006-01-18, 20:16
Have you tried the "ScrapBook" plugin for Firefox? It seems to get a more complete snapshot than I've been able to get with IE.

-Jason

JMills
2006-01-18, 20:30
Have you tried the "ScrapBook" plugin for Firefox? It seems to get a more complete snapshot than I've been able to get with IE.

-Jason


If you get a copy of `wget` into your cache-building environment, then you can have it not only pull down full mirrors of content, but also _rewrite_ all of the internal references to be relative rather than absolute, so you can view it all offline.

-also Jason. ;-)
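
A rough sketch of what such a `wget` call could look like, wrapped in a shell function. The function name `mirror_for_offline` and the example URL are made up for illustration; the options themselves are standard `wget` ones:

```shell
# Mirror a site for offline reading.
# Usage: mirror_for_offline URL DEST_DIR
mirror_for_offline() {
    # --mirror           recursive download with timestamping
    # --convert-links    rewrite links in saved pages to point at the local copies
    # --page-requisites  also fetch the images and stylesheets the pages need
    # --no-parent        never climb above the starting directory
    wget --mirror --convert-links --page-requisites --no-parent \
         --directory-prefix="$2" "$1"
}
```

For example, `mirror_for_offline http://www.example.com/docs/ ./offline` would leave a browsable copy under `./offline`, which can then be copied to the memory card.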

DaveC
2006-01-18, 20:35
You can also do this with Safari on OS X: just save the web page (preferably the formatted-for-printing version, if one is available) as a Web Archive and copy the web archive to the 770. It'll show up as a directory with the HTML and images neatly packaged inside. Open up the HTML file and you'll be able to read offline, no problem. I think off-page links are intact also. Really handy for travel.

Dave

MikeB
2006-01-19, 04:41
I second the ScrapBook extension for Firefox. I have used it to download geocaching information that I was unable to get using IE. (IE didn't handle the logon credentials properly for me.) I just copied the folder to my 770 and browsed from the file system.

Smiley Dan
2006-01-19, 12:37
ScrapBook is great but... what the hell does it have to do with the 770? I guess you could have some kind of workaround using it, but for starters you might be browsing on the 770 in the first place.

Mike Cane
2006-01-19, 13:38
@ desktop:

Then WTF is the point of having a device with *WiFi*? I did this sort of thing on my WiFi-less GENIO PPC. It's just silly to go back to that with a 770.

Chainsaw76
2006-01-19, 13:49
If you get a copy of `wget` into your cache-building environment, then you can have it not only pull down full mirrors of content, but also _rewrite_ all of the internal references to be relative rather than absolute, so you can view it all offline.

-also Jason. ;-)

Has anyone gotten a copy of wget for the 770? The Maemo Wiki pages suggest getting it from a specific distro, but for the life of me I couldn't find wget.

-Jason

Chainsaw76
2006-01-19, 13:58
@ desktop:

Then WTF is the point of having a device with *WiFi*? I did this sort of thing on my WiFi-less GENIO PPC. It's just silly to go back to that with a 770.
Someone said recently:
It seems to me the entire point of an Internet Tablet is to be able to read the net, on or off line.

-Jason

Hedgecore
2006-01-19, 14:55
The point is I'm in a city of 4 million people and it's still not supersaturated with wireless. Plus I'm on the move, so when I do hop on an AP my time's limited.

Usually a quick load of engadget at the streetcar stop is enough to keep me occupied on the subway... the ensuing bus is for eBooks.

fpp
2006-01-19, 17:12
Has anyone gotten a copy of wget for the 770? The Maemo Wiki pages suggest getting it from a specific distro, but for the life of me I couldn't find wget.
I don't know Slackware but for other non-GUI command-line tools (or daemons) I've had some success with Debian ARM binaries.

First you need to find your way through the Debian binary package repository, which can be confusing. Here's a hint:
ftp://ftp.debian.org/debian/pool/main/w/wget/

Then download the binary package for the ARM platforms (of which the 770 is one) here:
ftp://ftp.debian.org/debian/pool/main/w/wget/wget_1.10.2-1_arm.deb

Although these packages end in .deb, they don't work with the maemo installer, which needs a specific repackaging. The maemo Python pages explain how you can unpack them though, *if you are "root"*:

dpkg -X some-package.deb /
(note trailing slash...)

This will put all necessary files in the right places but may not do some things that the installer would (like creating users etc.) so there may be additional steps left up to you (--> RTFM :-).
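
Since unpacking straight into `/` replaces any existing files of the same name, it may be worth looking at a package first. A cautious sketch using `dpkg -c` (list contents) and `dpkg -x` (extract) into a scratch directory; the function name `preview_deb` and the staging path are hypothetical:

```shell
# Inspect a .deb before unpacking it into /.
# Usage: preview_deb some-package.deb /tmp/staging
preview_deb() {
    deb=$1
    staging=$2
    dpkg -c "$deb" || return 1        # list the files in the package
    mkdir -p "$staging" || return 1
    dpkg -x "$deb" "$staging"         # unpack into the scratch directory only
}
```

Once the listing looks sane, the same package can be unpacked for real with the `dpkg -X` command above.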

I have done this successfully for Privoxy (also on the Maemo wiki wish list) and am in the process of documenting it in the wiki here.

Good luck,
fp

Chainsaw76
2006-01-19, 18:39
Thanks that got me closer. I got it and got it unpacked/installed.. Now I have to hunt down dependencies.. "libssl.so.0.9.8"

-Jason

nomdenok
2006-01-19, 18:57
Firmware release 3.2005-51-13 shut down browser caching. Prior to updating, I had been quickly browsing sites such as Slashdot or BBC News before heading to the bus stop, then using the back arrow during the commute. Since the update, mucking with opera.ini made it sort of work again, though I wasn't able to get the Slashdot page to cache properly.
I think the ideal would be to have caching normally off, but with a browser icon to store the current page to RS-MMC flash. There should be a configuration menu to indicate how deep into the page to cache, and also a momentary popup indicating how many of the most recently selected pages fit into the space allocated for cache.

Chainsaw76
2006-01-19, 18:58
Thanks that got me closer. I got it and got it unpacked/installed.. Now I have to hunt down dependencies.. "libssl.so.0.9.8"

In a fit of actually "thinking" I decided to grab an older version of wget and install that (under the assumption it was linked against an older version of libssl), and *yay*, it was. For anyone looking for it, it's wget_1.9.1-12_arm.deb; install it with the instructions above.

Link: ftp://ftp.debian.org/debian/pool/main/w/wget/wget_1.9.1-12_arm.deb

-Jason
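
One way to see up front which shared libraries a binary wants is to list its dynamic section with `readelf` from binutils, which can read ARM binaries on a desktop too. A rough sketch, assuming `readelf` is installed; `needed_libs` is a hypothetical helper name:

```shell
# Print the shared libraries a binary is linked against, one per line.
# Usage: needed_libs ./usr/bin/wget
needed_libs() {
    # readelf -d prints lines such as:
    #   0x00000001 (NEEDED)   Shared library: [libssl.so.0.9.8]
    # The sed expression keeps only the name inside the brackets.
    readelf -d "$1" | sed -n 's/.*(NEEDED).*\[\(.*\)]/\1/p'
}
```

Running this against the unpacked binary from each candidate package shows which libssl version it expects before anything is installed on the 770.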

Mike Cane
2006-01-19, 19:08
@ desktop:

Chainsaw: Did you have a comment or just busting chops?

Chainsaw76
2006-01-19, 19:44
Just busting chops.

-Jason

Mike Cane
2006-01-19, 20:09
@ desktop:

OK, so long as I know.

sandstorm
2006-03-01, 01:33
Scrapbook looks good. I'll have to try that out.

The save feature built into the 770 is not much use if you have many pages or an entire day of forum posts you would like to read offline.


Another app that looks great so far is http://www.httrack.com. I just downloaded it and am planning on saving some forums I frequent for offline reading (including this one).

The nice thing is you can set up saved downloads and run them on your PC when you are ready to go somewhere. It even saves dynamic pages and multiple websites in one go.

You can then copy them wirelessly or via USB cable to your 770 for offline reading later.
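
HTTrack also has a command-line form, so a saved download could be kept as a small script. A minimal sketch, assuming the `httrack` command is installed on the PC; the function name `grab_site`, the URL and the folder in the usage line are placeholders:

```shell
# Mirror one site with HTTrack into its own folder.
# Usage: grab_site URL DEST_DIR DEPTH
grab_site() {
    # -O sets the output path, -rN limits the link depth to N
    httrack "$1" -O "$2" "-r$3"
}
```

For example, `grab_site http://www.example.com/forum/ ./forum 2` would fetch the start page and the pages it links to into `./forum`, ready to be copied to the 770.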

Hopefully someone will eventually create a solution that can be run directly from the 770.

-Sandstorm.