Chrome OS Files app plugin for reading many archives/compression formats
This is a fork of the bundled Chrome OS ZIP Unpacker extension. It enables support for a wide variety of archive and compression formats. It also supports files that have only been compressed (e.g. foo.gz). All of this is thanks to the great libarchive project.
You can install it via the CWS: https://chrome.google.com/webstore/detail/mljpablpddhocfbnokacjggdbmafjnon
Note that we support archives (compressed or uncompressed), and we support single compressed files (that have no archiving, e.g. foo.gz).
Here's the list of supported archive formats:
Here's the list of supported compression/encoding formats:
Most archive formats don't include an index. This means we need to decompress the entire file just to get a directory listing. The formats allow any ordering by design. For example, it could be ./bar.txt, ./foo/blah.txt, ./asdf.txt. Or it could be ./asdf.txt, ./foo/blah.txt, and ./bar.txt. The only way we can produce a complete directory listing is by looking through the entire file. This slows things down overall (like in tarballs) and there isn't much that can be done about it.
However, there are some file formats that do have indexes, and we don't (yet) support using them. 7-zip is the most notable one here.
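To make the "no index" point concrete, here is a minimal sketch (not the extension's actual code; the real work happens inside libarchive) that lists a tiny POSIX tar archive built in memory. Each 512-byte header only describes the one entry that follows it, so the listing is complete only after the walker has visited every header in the file:

```javascript
// Minimal tar walker: a directory listing requires visiting every header
// in sequence, because tar has no central index. Sketch only; the real
// extension goes through libarchive instead.

// Build a tiny tar archive in memory with three files in arbitrary order.
function tarEntry(name, data) {
  const header = Buffer.alloc(512);
  header.write(name, 0);                                // name (100 bytes)
  header.write('0000644\0', 100);                       // mode
  header.write('0000000\0', 108);                       // uid
  header.write('0000000\0', 116);                       // gid
  header.write(data.length.toString(8).padStart(11, '0') + '\0', 124); // size
  header.write('00000000000\0', 136);                   // mtime
  header.write('        ', 148);                        // checksum: spaces while summing
  header.write('0', 156);                               // typeflag: regular file
  let sum = 0;
  for (const b of header) sum += b;
  header.write(sum.toString(8).padStart(6, '0') + '\0 ', 148);
  const body = Buffer.alloc(Math.ceil(data.length / 512) * 512);
  Buffer.from(data).copy(body);
  return Buffer.concat([header, body]);
}

const tar = Buffer.concat([
  tarEntry('./bar.txt', 'hello'),
  tarEntry('./foo/blah.txt', 'world'),
  tarEntry('./asdf.txt', '!'),
  Buffer.alloc(1024),                                   // end-of-archive marker
]);

// The listing loop must walk the entire buffer, header by header.
function listTar(buf) {
  const names = [];
  let off = 0;
  while (off + 512 <= buf.length && buf[off] !== 0) {
    names.push(buf.toString('utf8', off, off + 100).replace(/\0.*$/, ''));
    const size = parseInt(buf.toString('utf8', off + 124, off + 136), 8);
    off += 512 + Math.ceil(size / 512) * 512;           // skip header + padded data
  }
  return names;
}

console.log(listTar(tar)); // → [ './bar.txt', './foo/blah.txt', './asdf.txt' ]
```

Note the entries come back in whatever order the archive used; there is no shortcut to jump to a name without scanning everything before it.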
A similar issue comes up with single compressed files. Many formats do not record the uncompressed file size, so the only way to calculate it is by decompressing the entire file. If we were to report a fake file size (like zero bytes, or a really large one) to the Files app, it wouldn't be able to copy the result out: it would try to read the number of bytes it was told were available. For the few formats that do record the uncompressed size (like gzip, which stores it in its trailer), we can skip the decompression overhead.
Some formats can be encrypted with passwords, but we don't prompt the user, so the files aren't decrypted. Oops.
Some formats can span multiple files, but we don't yet support those.
"It's complicated."
The WGU extension doesn't support the RAR format today. Chrome OS supports it natively via cros-disks -> AVFS -> official unrar program. We can't replace that stack until we have comparable coverage.
The RAR format has gone through a number of major revisions (at least 5 so far). A smart Russian came up with it long ago and continues to develop it as a company (RARLAB). It's a proprietary format and, while some code has been released by them, they are hostile to reverse engineering. As such, only the v1, v2, and v3 formats are supported. Unfortunately, v4 and v5 formats are common and users tend to use those more.
There is an open source unrar library released by RARLAB, but the API is not documented, and its runtime model does not mesh well with libarchive's runtime model. It's possible, but it's not trivial.
Sometimes people ask: since WGU is based on the official Chrome OS ZIP Unpacker that is bundled with Chrome OS today, why not just merge the two so that Chrome OS supports everything WGU does out of the box?
"It's complicated."
From the product team's perspective, they don't want to support an extensive set of formats if there is not high user demand for them. If users run into problems (and they inevitably will), the engineering costs aren't justified.
Similarly, they don't want to say "ZIP is officially supported, but all other formats are 'best effort'". Most users don't care about those trade-offs -- they just want their system to work. All they see is that they tried to open a 7z file and it didn't work even though opening a different 7z file worked. Trying to explain these nuances doesn't really scale.
Thus the status quo is to not support the formats at all. Users can try to locate alternatives (like WGU), and in the process of doing so, understand that the resulting software might be buggy. And those bugs are not the fault of the Chrome OS product (although some will still complain that Chrome OS should have included support out of the box).
Each position is reasonable taken in isolation, but the end result is that everyone loses: offering best-effort support makes users unhappy, and offering nothing also makes them unhappy. At least this way, the blowback on the Chrome OS product is lower.
Please use the issues link here to report any issues you might run into.
This is the ZIP Unpacker extension used in Chrome OS to support reading and unpacking of zip archives.
Since the code is built with NaCl, you'll need its toolchain.
$ cd third-party
$ make nacl_sdk
We'll use libraries from webports.
$ cd third-party
$ make depot_tools
$ make webports
First install npm using your normal packaging system. On Debian, you'll want something like:
$ sudo apt-get install npm
Your distro might ship an old version of npm; if so, you'll have to install a newer one yourself.
Then install the npm modules that we require. Do this in the root of the unpacker repo.
$ npm install bower vulcanize crisper
Once done, build the libarchive-fork/ from third-party/ of the unpacker project. Note that you cannot use the libarchive or libarchive-dev packages from webports at this moment, as not all patches in the fork have been upstreamed.
$ cd third-party
$ make libarchive-fork
Polymer is used for the UI. To fetch it, type in the same directory:
$ make polymer
Build the PNaCl module.
$ cd unpacker
$ make [debug]
The package can be found in the release or debug directory. You can run it directly from there using Chrome's "Load unpacked extension" feature, or you can zip it up for posting to the Chrome Web Store.
$ zip -r release.zip release/
Once it's loaded, you should be able to open ZIP archives in the Files app.
Paths that aren't linked below are dynamically created at build time.
Some high level points to remember: the JS side reacts to user events and is the only part that has access to actual data on disk. It uses the NaCl module to do all the data parsing (e.g. gzip & tar), but it has to both send a request to the module ("parse this archive"), and respond to requests from the module when the module needs to read actual bytes on disk.
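That two-way flow can be sketched as follows. This is a simplified model with invented operation names and message shapes; the real protocol is defined in request.js (`unpacker.request`) and request.h (`request::*`):

```javascript
// Simplified model of the JS <-> NaCl message flow. Operation names and
// message shapes here are invented for illustration; the real protocol
// lives in request.js (unpacker.request) and request.h (request::*).

// Stand-in for the NaCl module: to "parse" an archive it must ask the
// JS side for raw bytes, since only JS can touch the file on disk.
const naclModule = {
  postMessage(msg, replyTo) {
    if (msg.operation === 'READ_METADATA') {
      // The module can't read the file itself; request bytes from JS.
      const bytes = replyTo({ operation: 'READ_CHUNK', offset: 0, length: 4 });
      replyTo({ operation: 'READ_METADATA_DONE', magic: bytes.toString('hex') });
    }
  },
};

// JS side: owns the file data and answers the module's read requests.
const fileData = Buffer.from([0x50, 0x4b, 0x03, 0x04, /* ... */ 0x00]);
const responses = [];
naclModule.postMessage({ operation: 'READ_METADATA', requestId: 1 }, (msg) => {
  if (msg.operation === 'READ_CHUNK') {
    return fileData.subarray(msg.offset, msg.offset + msg.length);
  }
  responses.push(msg); // e.g. READ_METADATA_DONE carrying the ZIP magic
});
```

The point to take away is the inversion: a single "parse this archive" request from JS fans out into many "give me these bytes" requests coming back from the module.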
When the extension loads, background.js registers everything and goes idle.
When the Files app wants to mount an archive, callbacks in app.js (`unpacker.app`) are called to initialize the NaCl runtime and create an `unpacker.Volume` object for each mounted archive.

Requests on the archive (directory listing, metadata lookups, reading files) are routed through app.js (`unpacker.app`) and to volume.js (`unpacker.Volume`). Then they are sent to the low level decompressor.js (`unpacker.Decompressor`), which talks to the NaCl module using the request.js (`unpacker.request`) protocol. Responses are passed back up.
When the NaCl module is loaded, module.cc (`NaclArchiveModule`) is instantiated. That instantiates `NaclArchiveInstance` for the initial JS message entry points, which in turn instantiates `JavaScriptMessageSender` for sending requests back to JS. When JS requests come in, module.cc (`NaclArchiveInstance`) will create volume.h (`Volume`) objects on the fly, and pass requests down to them (using the protocol defined in request.h (`request::*`)).

volume.h (`Volume`) objects in turn use the volume_archive.h (`VolumeArchive`) abstract interface to handle requests from the JS side (using the protocol defined in request.h (`request::*`)). This way the lower levels don't have to deal with JS directly.
volume_archive_libarchive.cc (`VolumeArchiveLibarchive`) implements the `VolumeArchive` interface and uses libarchive as its backend to do all the decompression & archive format processing.

But NaCl code doesn't have access to any files or data itself. So the volume_reader.h (`VolumeReader`) abstract interface is passed to it to provide the low level data read functions. volume_reader_javascript_stream.cc (`VolumeReaderJavaScriptStream`) implements that by passing requests back up to the JS side via the javascript_requestor_interface.h (`JavaScriptRequestorInterface`) interface (which was passed down to it).
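The layering can be mirrored in a few lines of illustrative JS. Class names below mirror the C++ ones, but this is a sketch with a hypothetical requestor object, not the real NaCl code:

```javascript
// Illustrative JS mirror of the C++ layering: the archive-processing
// layer only sees an abstract reader, so it never talks to JS directly.
// Class names mirror the C++ ones; this is a sketch, not the real code.
class VolumeReader {
  // volume_reader.h analogue: low level data reads.
  read(offset, length) { throw new Error('abstract'); }
}

class VolumeReaderJavaScriptStream extends VolumeReader {
  // volume_reader_javascript_stream.cc analogue: satisfies reads by
  // asking the JS side through a requestor interface passed down to it.
  constructor(requestor) { super(); this.requestor = requestor; }
  read(offset, length) { return this.requestor.requestChunk(offset, length); }
}

// volume_archive_libarchive.cc analogue: depends only on VolumeReader.
function readMagic(reader) {
  return reader.read(0, 4);
}

// Hypothetical requestor standing in for the JS side of the bridge.
const requestor = {
  requestChunk: (offset, length) => `bytes[${offset},${offset + length})`,
};
const magic = readMagic(new VolumeReaderJavaScriptStream(requestor));
console.log(magic); // → bytes[0,4)
```

Swapping in a different `VolumeReader` (say, one backed by a local buffer for tests) requires no changes to the archive-processing layer, which is the point of the abstraction.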
So requests (mount an archive, read a file, etc...) generally follow the path:

request::* -> NaclArchiveModule
           -> Volume
           -> VolumeArchive (VolumeArchiveLibarchive)
           -> VolumeReader (VolumeReaderJavaScriptStream)
           -> JavaScriptRequestorInterface (JavaScriptRequestor)
           -> JavaScriptMessageSenderInterface (JavaScriptMessageSender)
           -> request::*
Then once `VolumeArchive` has processed the raw data stream, it can return results to the `Volume` object, which takes care of posting JS status messages back to the Chrome side.
Here's the JavaScript code that matters. A few files have very specific purposes and can be ignored at a high level, so they're in a separate section.
- app.js (`unpacker.app`): main app logic; creates and manages the `unpacker.Volume` objects.
- volume.js (`unpacker.Volume`): each mounted archive is tracked by its own `unpacker.Volume` instance.
- decompressor.js (`unpacker.Decompressor`): services `unpacker.Volume` requests by talking to the NaCl module with the `unpacker.request` protocol.
- request.js (`unpacker.request`): the request protocol shared between the JS and NaCl sides.
- `unpacker.PassphraseManager`: passphrase handling for encrypted archives.
These are the boilerplate/simple JavaScript files you can generally ignore.
- `unpacker.types`: shared type definitions.
- `unpacker`: sets up the `unpacker.*` namespace.

Here's the NaCl layout.
- `JavaScriptMessageSenderInterface`: interface so the rest of the NaCl code can easily send messages back up to JS.
- javascript_requestor_interface.h: `JavaScriptRequestorInterface` interface for talking to the JS side.
- volume.h: `Volume` class that encompasses a high level volume; uses a `VolumeArchive`.
- `JavaScriptRequestor`: implements the `JavaScriptRequestorInterface` interface.
- volume_archive.h: `VolumeArchive` interface for handling specific archive formats.
- volume_archive_libarchive.cc: implements `VolumeArchive` using the libarchive project.
- volume_reader.h: `VolumeReader` interface for low level reading of data.
- volume_reader_javascript_stream.cc: implements `VolumeReader`; uses a `JavaScriptRequestorInterface` to get data from the JS side.

To see debug messages, open Chrome from a terminal and check the output. For output redirection see https://developer.chrome.com/native-client/devguide/devcycle/debugging.
Install Karma as the test runner, Mocha for asynchronous testing, Chai for assertions, and Sinon for spies and stubs.
$ npm install --save-dev \
karma karma-chrome-launcher karma-cli \
mocha karma-mocha karma-chai chai karma-sinon sinon
# Run tests:
$ cd unpacker-test
$ ./run_js_tests.sh # JavaScript tests.
$ ./run_cpp_tests.sh # C++ tests.
# Check JavaScript code using the Closure JS Compiler.
# See https://www.npmjs.com/package/closurecompiler
$ cd unpacker
$ npm install google-closure-compiler
$ bash check_js_for_errors.sh