Filename extension | .djvu, .djv |
---|---|
Internet media type | image/vnd.djvu, image/x-djvu |
Type code | DJVU |
Developed by | AT&T Labs - Research |
Initial release | 1998 |
Latest release | Version 26 (open), Version 27 (undocumented)[1] / July 2006 |
Type of format | Image file formats |
Open format? | GPLv2 for DjVu Reference Library and DjVuLibre-3.5; License grants under GPL for several patents that cover aspects of the library[2] |
Website | www.djvu.org |
DjVu (pronounced like déjà vu) is a computer file format designed primarily to store scanned documents, especially those containing a combination of text, line drawings, and photographs. It uses technologies such as image layer separation of text and background/images, progressive loading, arithmetic coding, and lossy compression for bitonal (monochrome) images. This allows for high-quality, readable images to be stored in a minimum of space, so that they can be made available on the web.
DjVu has been promoted as an alternative[3] to PDF, promising smaller files than PDF for most scanned documents.[4] The DjVu developers report[5] that color magazine pages compress to 40–70 kB, black and white technical papers compress to 15–40 kB, and ancient manuscripts compress to around 100 kB; a satisfactory JPEG image typically requires 500 kB. Like PDF, DjVu can contain an OCR text layer, making it easy to perform copy and paste and text search operations.
Free browser plug-ins and desktop viewers from different developers are available from the djvu.org website. DjVu is supported by a number of multi-format document viewers and e-book reader software on Linux (Okular, Evince), Android (VuDroid), Windows (SumatraPDF), and iPhone/iPad (Stanza).
History
The DjVu technology was originally developed[5] by Yann LeCun, Léon Bottou, Patrick Haffner, and Paul G. Howard at AT&T Labs from 1996 to 2001.
Because of its declared higher compression ratio (and thus smaller file size), the ease of converting large volumes of text into DjVu format, and its status as an open file format, some independent technologists (such as Brewster Kahle[6]) have historically considered it superior to PDF.
Release history
The DjVu library, distributed as part of the open source package DjVuLibre, has become the reference implementation for the DjVu format. DjVuLibre has been maintained and updated by the original developers of DjVu since 2002.
The DjVu file format specification has gone through a number of revisions:
Compression
DjVu divides a single image into many different images, then compresses them separately. To create a DjVu file, the initial image is first separated into three images: a background image, a foreground image, and a mask image. The background and foreground images are typically lower-resolution color images (e.g., 100 dpi); the mask image is a high-resolution bilevel image (e.g., 300 dpi) and is typically where the text is stored. The background and foreground images are then compressed using a wavelet-based compression algorithm named IW44.[5]
The mask image is compressed using a method called JB2 (similar to JBIG2). The JB2 encoding method identifies nearly identical shapes on the page, such as multiple occurrences of a particular character in a given font, style, and size. It compresses the bitmap of each unique shape separately, and then encodes the locations where each shape appears on the page. Thus, instead of compressing a letter "e" in a given font multiple times, it compresses the letter "e" once (as a compressed bit image) and then records every place on the page it occurs.
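The shape-dictionary idea behind the mask encoding can be illustrated with a toy sketch. This is not the JB2 algorithm or the DjVuLibre API: the function names `encode_shapes` and `decode_shapes` and the data layout are hypothetical, shapes are matched only when they are exactly identical (real JB2 also matches nearly identical shapes), and no entropy coding is applied.

```python
from typing import Dict, List, Tuple

Bitmap = Tuple[str, ...]          # rows of '0'/'1' characters (assumed layout)
Placement = Tuple[int, int, int]  # (shape_id, x, y) of the top-left corner


def encode_shapes(marks: List[Tuple[Bitmap, int, int]]):
    """Store each distinct bitmap once and record where it occurs.

    Simplification: only exactly identical bitmaps share a dictionary entry.
    """
    shape_ids: Dict[Bitmap, int] = {}
    shapes: List[Bitmap] = []
    placements: List[Placement] = []
    for bitmap, x, y in marks:
        if bitmap not in shape_ids:
            shape_ids[bitmap] = len(shapes)
            shapes.append(bitmap)
        placements.append((shape_ids[bitmap], x, y))
    return shapes, placements


def decode_shapes(shapes: List[Bitmap], placements: List[Placement],
                  width: int, height: int) -> List[List[int]]:
    """Rebuild the bilevel page by stamping each shape at its positions."""
    page = [[0] * width for _ in range(height)]
    for shape_id, x, y in placements:
        for dy, row in enumerate(shapes[shape_id]):
            for dx, bit in enumerate(row):
                if bit == "1":
                    page[y + dy][x + dx] = 1
    return page


# Usage: the glyph "E" occurs twice on the page but is stored only once.
E = ("111", "100", "111")
I = ("1", "1", "1")
shapes, placements = encode_shapes([(E, 0, 0), (I, 4, 0), (E, 6, 0)])
print(len(shapes), len(placements))  # 2 shapes, 3 placements
```

The saving grows with the page: a page of text has thousands of marks but only a few dozen distinct shapes, so most of the encoded data is positions rather than bitmaps.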
Optionally, these shapes may be mapped to ASCII codes (either by hand or potentially by a text recognition system), and stored in the DjVu file. If this mapping exists, it is possible to select and copy text.
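As a rough illustration of why such a mapping makes text selectable, the sketch below turns shape placements back into a string. It is a toy model under stated assumptions, not the way a DjVu file actually stores its text layer: the `(shape_id, x, y)` placement tuples, the `shape_to_char` mapping, the fixed `line_height`, and the function name `extract_text` are all hypothetical.

```python
from typing import Dict, List, Tuple

Placement = Tuple[int, int, int]  # (shape_id, x, y), hypothetical layout


def extract_text(placements: List[Placement],
                 shape_to_char: Dict[int, str],
                 line_height: int) -> str:
    """Recover text by grouping shapes into lines and sorting left to right.

    Assumes fixed-height text lines; shapes without a mapping become '?'.
    """
    lines: Dict[int, List[Tuple[int, str]]] = {}
    for shape_id, x, y in placements:
        char = shape_to_char.get(shape_id, "?")
        lines.setdefault(y // line_height, []).append((x, char))
    return "\n".join(
        "".join(char for _, char in sorted(chars))
        for _, chars in sorted(lines.items())
    )


# Usage: shape 0 is mapped to "e" and shape 1 to "l".
placements = [(0, 0, 0), (1, 8, 0), (0, 16, 0), (1, 0, 12)]
print(extract_text(placements, {0: "e", 1: "l"}, line_height=10))
```

Without the mapping, the same placements describe only pixels, which is why an unmapped DjVu page can be viewed but not searched or copied as text.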
Programs that manipulate DjVu files
- DjVuLibre is an open source library and a collection of standard command-line tools (decoders, encoders, utilities). It runs on Unix, Mac, and Windows.
- DjView4 is the current viewer for DjVu files (Unix, Mac, Windows). Open source under GNU license.
- djvusmooth, by Jakub Wilk, is a graphical user interface (GUI) for adding hyperlinks within a DjVu file and editing its table of contents (Unix, Mac, Windows). Open source under GNU license.
- pdf2djvu, hosted on Google Code, converts PDF files to DjVu while keeping the text layer (Unix, Mac, Windows). Open source under GNU license.
- SumatraPDF (Windows)
Format licensing
DjVu is an open file format.[4] The file format specification is published, as is the source code for the reference library.[4] The original authors distribute an open source implementation named "DjVuLibre" under the GNU General Public License. The rights to commercial development of the encoding software have been transferred to different companies over the years, including AT&T, LizardTech, Celartem and Caminova.
In 2002, the DjVu file format was chosen by the Internet Archive as the format in which its Million Book Project provides scanned public domain books online (along with TIFF and PDF).[7]
Proprietary variations
In 2007, LizardTech released a proprietary variation of the DjVu file format called "secure DjVu". It is not byte-compatible with the open DjVu format, and only a partial specification was released.[8][9] "Secure DjVu" is supported only by the company's proprietary software, which is available only for Windows and Mac OS, and is marketed as giving document publishers control over the printing and copying of content, as well as login/password protection. As a result, there are two incompatible types of DjVu files: those based on the open standard, supported by a wide variety of tools on a wide variety of operating systems, and secure DjVu files, supported on only a few operating systems by the proprietary software of one company.
References
- Notes
- ^ a b c d e f g h i j DjVu File Format Version, By Jim Rile, Posted: Fri Feb 23, 2007 1:08 am, PlanetDjVu
- ^ "DjVu Licensing". DjVu Sourceforge page. Sourceforge.net. 2011-08-17. http://djvu.sourceforge.net/licensing.html. Retrieved 2011-09-21.
- ^ "DjVu: Free alternative to PDF (and a script to convert plain text to DjVu)". 2011-01-19. http://baldwinsoftware.com/blog/2011/01/19/convert-plain-text-to-djvu/. Retrieved 2011-10-09.
- ^ a b c "What is DjVu—DjVu.org". DjVu.org. http://djvu.org/resources/whatisdjvu.php. Retrieved 2009-03-05.
- ^ a b c Léon Bottou, Patrick Haffner, Paul G. Howard, Patrice Simard, Yoshua Bengio and Yann Le Cun (1998). High Quality Document Image Compression with DjVu, 7(3):410-425. Journal of Electronic Imaging. http://leon.bottou.org/publications/pdf/jei-1998.pdf.
- ^ Brewster Kahle (December 16, 2004). "Universal Access to All Knowledge" (Audio; Speech, at approx. 1h38). Conversations Network. http://itc.conversationsnetwork.org/shows/detail400.html.
- ^ "Image file formats—OLPC". Wiki.laptop.org. http://wiki.laptop.org/go/DJVU. Retrieved 2008-09-09.
- ^ Partial specification of the proprietary "secure" variation of DjVu format released by LizardTech.
- ^ Section 6: THE "SECURE DJVU" FORMAT.
External links
- DjVu.org: Commercial website for DjVu-related software.