mkZiplib 1.0 Manual

Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted. The author makes no representations about the suitability of this software for any purpose. It is provided "as is" without express or implied warranty. By use of this software the user agrees to indemnify and hold harmless the author from any claims or liability for loss arising out of such use.
 

 CONTENTS

mkZiplib 1.0 - A package for data compression, based on Zlib 1.1.3 and Minizip 0.15.

Introduction
Commands
Notes
Examples
Installation
Changes
Author

 INTRODUCTION

mkZiplib is essentially a wrapper around the compression libraries Zlib 1.1.3 and Minizip 0.15 (see the notes section below). It provides four new commands for compressing and decompressing data and for working with .gz files (as produced by gzip) and .zip files. mkZiplib uses some of Tcl's newer API functions and therefore requires Tcl 8.2 or higher.

 COMMANDS

deflate ?-level 0..9? Data

The deflate command takes Data as a binary string, compresses it, and returns the compressed Data as a binary string. The -level option influences the compression level and the execution speed. It accepts values between 0 (no compression, fast) and 9 (maximum compression, slower). For short Data strings, the output from deflate can be longer than the original Data, which is due to internal overhead from headers and the way the compression algorithm works.
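The effect of the -level option and of the overhead on short input can be observed with a small script. This is only a sketch: the variable names are made up for illustration, and the sizes printed depend on the data and on the Zlib build, so no output is shown.

```tcl
package require mkZiplib

# Two sample inputs: a very short string and a long, repetitive one.
set short "abc"
set long  [string repeat "yellow submarine, " 100]

foreach {label data} [list short $short long $long] {
    foreach level {0 1 9} {
        # Compress the same data at different levels and compare sizes.
        set packed [deflate -level $level $data]
        puts "$label, level $level: [string length $data] -> [string length $packed] bytes"
    }
}
```

For the short string the compressed result is expected to be longer than the input at every level, while the repetitive string should shrink considerably at levels above 0.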

inflate ?-size numBytes? Data

The inflate command is the counterpart of deflate and decompresses Data. If the size of the uncompressed data is known, then it can be specified with the -size option. If not, buffer space is allocated as needed during decompression. Using the -size option avoids internal reallocation of memory and can hence increase the command's performance.
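A minimal sketch of using -size, assuming the caller keeps track of the original length alongside the compressed data (the variable names are illustrative only):

```tcl
package require mkZiplib

# Plain ASCII text, so the string length equals the number of bytes.
set original [string repeat "yellow submarine, " 200]

# Remember the uncompressed size before compressing.
set numBytes [string length $original]
set packed   [deflate -level 9 $original]

# Pass the known size so that inflate does not have to grow its buffer.
set restored [inflate -size $numBytes $packed]

if {![string equal $restored $original]} {
    error "round trip failed"
}
puts "restored $numBytes bytes from [string length $packed] compressed bytes"
```

When the compressed data is stored or transmitted, the original size has to travel with it (for example as a length prefix), since the manual does not describe a way to query it from the compressed string.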

gz Option ?args ...?

The gz command reads and writes .gz files. This format is common on Unix and supported by the command gzip (and gunzip). A gz file contains plain compressed data, but no "file directory". It therefore usually represents the compressed version of exactly one file (unlike the zip format, which can contain multiple files).

gz open ?-level 0..9? GzFileName ?AccessMode?
For AccessMode r (the default), the file specified by GzFileName is opened for subsequent reading operations by means of gz read or gz gets. For AccessMode w the file is created, or truncated if it already exists, for subsequent writing operations with gz write or gz flush. With AccessMode a, the file is opened for writing, and new data will be appended to it. The -level option has the same function as described with the deflate command. If successful, the command returns a file handle required for subsequent calls to the gz command.

gz close GzHandle
Closes the file identified by GzHandle. If the file was opened for writing, all buffered data is automatically flushed prior to closing the file.

gz write GzHandle Data
Compresses and writes Data to the file specified by GzHandle. Data can of course be a binary string. Note that the command does not append a newline character to the Data. The command returns the number of uncompressed bytes written to the file.

gz flush GzHandle
Flushes any buffered output to the file specified by GzHandle. Similar to the standard Tcl flush command.

gz gets GzHandle
Reads the next line from the file specified by GzHandle, up to but not including the end-of-line character. Similar to the standard Tcl gets command.

gz read GzHandle ?NumBytes?
Reads all or NumBytes bytes from the file specified by GzHandle. Similar to the standard Tcl read command.

gz eof GzHandle
Returns 1 if an end-of-file condition occurred during the most recent operation on the file specified by GzHandle, 0 otherwise.

gz handles
Returns a list with the handles of all currently open gz files.
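The reading commands can be combined into a loop that collects all lines of a .gz file. The helper name gzReadLines is hypothetical, and the loop makes an assumption about gz eof: the manual does not say whether the end-of-file condition is reported together with the last line or only on the following read, so the sketch handles both cases.

```tcl
package require mkZiplib

# Hypothetical helper: return all lines of a .gz file as a Tcl list.
proc gzReadLines {gzFileName} {
    set hGz [gz open $gzFileName r]
    set lines {}
    while {1} {
        set line  [gz gets $hGz]
        set atEnd [gz eof $hGz]
        # An empty result at end-of-file means there was nothing left to read.
        if {$atEnd && [string length $line] == 0} { break }
        lappend lines $line
        # Assumption: eof may already be set when the last line is returned.
        if {$atEnd} { break }
    }
    gz close $hGz
    return $lines
}
```

A call such as "gzReadLines test.gz" would then return one list element per line of the decompressed file.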

zip Option ?args ...?

The zip command reads and writes .zip files. This format is common on Windows platforms and supported by many tools (FreeZip, pkZip, Winzip, InfoZip and others). A zip archive can contain multiple files, which makes the usage of the zip command slightly more complicated than that of the gz command.

zip open ZipFileName ?AccessMode?
For AccessMode r (the default), the archive specified by ZipFileName is opened for subsequent reading operations by means of zip read. For AccessMode w the archive is created, or truncated if it already exists, for subsequent writing operations with zip write. If successful, the command returns a file handle required for subsequent calls to the zip command.

zip close ZipHandle
Closes the archive specified by ZipHandle. If the archive was opened for writing, all buffered data is automatically flushed prior to closing the file.

zip comment ZipHandle ?String?
Sets or retrieves the comment string of an archive. If String is specified and the archive was opened for writing, then String is used as the new comment for the archive (it is actually written during zip close). If String is not specified, the archive's comment string is returned, if any.

zip set ZipHandle ?FileName? ?options?
Sets or returns the name of the "current file" within the archive. All calls to zip write, zip read and zip eof refer to the current file. The FileName is actually just a string and does not have to be the name of an existing file (though it's usually the name and path of the file which is about to be zipped). If the archive was opened for reading, options are not allowed. If the archive was opened for writing, the following options are accepted:
-level 0..9 defines the compression level and works like in the deflate command.
-comment String stores a comment string along with the specified file.
-time integer stores the file with a certain timestamp. The option value must be an integer as returned by [clock seconds].
-attributes integer stores file attribute flags along with the file. The flags are platform specific (sorry, I've got no more information).

zip write ZipHandle Data
Compresses and writes Data to the archive specified by ZipHandle. Data can be a binary string and always refers to the "current file" as specified by zip set.

zip read ZipHandle ?NumBytes?
Reads all or NumBytes bytes from the "current file" of the archive specified by ZipHandle. Similar to the standard Tcl read command.

zip eof ZipHandle
Returns 1 if an end-of-file condition occurred during the most recent operation on the "current file" of the archive specified by ZipHandle, 0 otherwise.

zip files ZipHandle
Returns a list with the names of the files in the archive.

zip info ZipHandle FileName
Returns a list with information about the file specified by FileName. The list consists of five elements: The timestamp of the file (as specified during writing with zip set, or -1 if not specified), the compressed size of the file, the uncompressed size of the file, the file attributes (see zip set), and the file comment (see zip set).

zip handles
Returns a list with the handles of all currently open zip archives.
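The interplay of zip set, zip write, zip files and zip info can be sketched with two small helpers. The procedure names zipFiles and zipList are hypothetical and only serve the illustration; the commands and options used are the ones described above.

```tcl
package require mkZiplib

# Hypothetical helper: store the given files from disk in a new archive.
proc zipFiles {zipFileName fileNames} {
    set hZip [zip open $zipFileName w]
    foreach fileName $fileNames {
        # Read the file as raw bytes so that no line-ending translation occurs.
        set fd [open $fileName r]
        fconfigure $fd -translation binary
        set data [read $fd]
        close $fd

        # Select the "current file" first, then write its contents.
        zip set $hZip $fileName -level 9 -time [file mtime $fileName]
        zip write $hZip $data
    }
    zip close $hZip
}

# Hypothetical helper: print name and sizes of every file in an archive.
proc zipList {zipFileName} {
    set hZip [zip open $zipFileName r]
    foreach fileName [zip files $hZip] {
        # zip info: timestamp, compressed size, uncompressed size, attributes, comment
        foreach {time csize usize attrs comment} [zip info $hZip $fileName] break
        puts [format "%-30s %10d -> %10d bytes" $fileName $usize $csize]
    }
    zip close $hZip
}
```

The names passed to zip set are stored as given, so relative paths in fileNames end up as relative paths in the archive.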

 NOTES

Credits

This extension is based on the great work that was put into Zlib and Minizip. Zlib is a compression library compatible with the gzip format. It is very portable and works for virtually any computer hardware and operating system. Zlib is written by Jean-Loup Gailly and Mark Adler, and is freely available. Minizip, written by Gilles Vollant, is a library for working with zip files. It uses Zlib and is included in the Zlib 1.1.3 distribution.

Limitations

Unfortunately, it is not possible to append files to a zip archive, nor to delete files from a zip archive (the Minizip 0.15 library does not provide functions for that at this time).
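One possible workaround is to copy the existing entries into a new archive and add the extra file there. The following is a sketch only, not something the package provides: the procedure name zipCopyAndAdd is hypothetical, the source and destination must be different files, and the original compression level of each entry is not preserved because zip info does not report it.

```tcl
package require mkZiplib

# Hypothetical helper: copy all entries of srcZip into dstZip, then add one more.
proc zipCopyAndAdd {srcZip dstZip newName newData} {
    set hIn  [zip open $srcZip r]
    set hOut [zip open $dstZip w]

    foreach fileName [zip files $hIn] {
        # zip info: timestamp, compressed size, uncompressed size, attributes, comment
        foreach {time csize usize attrs comment} [zip info $hIn $fileName] break

        zip set $hIn $fileName
        set data [zip read $hIn]

        # Carry the metadata over; skip -time if none was recorded (-1).
        set opts [list -comment $comment -attributes $attrs]
        if {$time != -1} { lappend opts -time $time }
        eval [list zip set $hOut $fileName] $opts
        zip write $hOut $data
    }

    # Append the additional file with default options.
    zip set $hOut $newName
    zip write $hOut $newData

    # Keep the archive comment, then close both archives.
    zip comment $hOut [zip comment $hIn]
    zip close $hIn
    zip close $hOut
}
```

Deleting a file can be approximated the same way by skipping the unwanted name inside the loop.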

And...

A more elegant implementation would have been to create a new channel type for gz and zip files and do IO operations through regular puts, gets and read commands. If you are looking for something like this, check out Andreas Kupries' trf package.

 EXAMPLES

Deflate and inflate

  % set sComp [deflate -level 9 "We all live in a yellow submarine, yellow submarine, yellow submarine..."]
  (...Repetitive patterns are good for compression...)
  % inflate $sComp
  We all live in a yellow submarine, yellow submarine, yellow submarine...
  

The gz command

  % set hGz [gz open test.gz w]
  4457920
  % gz write $hGz "We all live in a yellow submarine, yellow submarine, yellow submarine...\n"
  72
  % gz close $hGz
  %
  % set hGz [gz open test.gz]
  4457920
  % gz gets $hGz
  We all live in a yellow submarine, yellow submarine, yellow submarine...
  % gz close $hGz
  

The zip command

  % set hZip [zip open test.zip w]
  10289776
  % zip set $hZip file1.txt -level 9 -comment "File 1" -time [clock seconds]
  % zip write $hZip "This is the contents of file 1"
  % zip set $hZip file2.txt -level 9 -comment "File 2" -time [clock seconds]
  % zip write $hZip "This is the contents of file 2"
  % zip comment $hZip "My first zip file"
  % zip close $hZip
  %
  % set hZip [zip open test.zip]
  10244344
  % zip comment $hZip
  My first zip file
  % zip files $hZip
  file1.txt file2.txt
  % zip info $hZip file1.txt
  988336176 30 30 0 {File 1}
  % zip info $hZip file2.txt
  988336234 30 30 0 {File 2}
  % zip set $hZip file1.txt
  % zip read $hZip
  This is the contents of file 1
  % zip set $hZip file2.txt
  % zip read $hZip
  This is the contents of file 2
  % zip close $hZip
  %
  

 INSTALLATION

 General

You will need the Zlib 1.1.3 library to run mkZiplib. For the Windows platform this library (zlib.dll) is included in the mkZiplib distribution. On Unix it is very easy to build: "./configure && make && make install" is all it normally takes. The minizip sources (zip.c, zip.h, unzip.c, unzip.h) are part of the Zlib package and are also included herein. They are not built as a library but statically linked.

mkZiplib is written in C and comes with a DLL for Windows. On Unix, the package needs to be compiled into a shared library first (see below). mkZiplib works with Tcl/Tk version 8.2 and higher and is stubs-enabled.

To install, simply place the directory "mkZiplib1.0" into one of the directories contained in the global Tcl variable "auto_path". For a standard Tcl/Tk installation, this is commonly "c:/program files/tcl/lib" (Windows) and "/usr/local/lib" (Unix).

Compiling

If you don't have Zlib 1.1.3 already in your system (libz.a), then download, compile and install it first. All it normally takes is:

  ./configure && make && make install

Next, compile mkZiplib: Provide the correct path to "tcl.h" and link against "tcl83.lib" and "zlib.lib" (Windows) or "libtcl83.a" and "libz.a" (Unix) respectively. If you use stubs, define USE_TCL_STUBS and link against "tclstub83.lib" (Windows) or "libtclstub83.a" (Unix) instead.

For Visual C++, the following command should work:

  cl /I c:/progra~1/tcl/include /D USE_TCL_STUBS /c mkZiplib10.c zip.c unzip.c
  link c:/progra~1/tcl/lib/tclstub83.lib zlib.lib /dll mkZiplib10.obj zip.obj unzip.obj

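On Linux 2.2, the following command works. The libraries are listed after the source files so that the linker can resolve the symbols referenced by them:

  gcc -shared -DUSE_TCL_STUBS -o mkZiplib10.so mkZiplib10.c zip.c unzip.c -ltclstub8.3 -lz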

Test

Test the installation by opening a tclsh or wish and entering "package require mkZiplib". The string "1.0" should appear. If it fails, "cd" into the directory "mkZiplib1.0" and load the library directly with "load ./mkZiplib10.dll" (Windows) or "load ./mkZiplib10.so" (Unix). If no error occurs, the library itself is fine and something must be wrong with the location of "mkZiplib1.0" (it is probably not in a directory listed in "auto_path").
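The same check can be scripted. This sketch assumes it is run from inside the "mkZiplib1.0" directory and uses the standard Tcl command "info sharedlibextension" to pick the file suffix; the variable names are illustrative.

```tcl
# Try the normal package mechanism first.
if {[catch {package require mkZiplib} result]} {
    puts "package require failed: $result"

    # Fall back to loading the shared library from the current directory.
    # [info sharedlibextension] yields ".dll" on Windows and ".so" on most Unix systems.
    set libFile [file join [pwd] mkZiplib10[info sharedlibextension]]
    if {[catch {load $libFile} result]} {
        puts "direct load failed as well: $result"
    } else {
        puts "direct load worked; check that mkZiplib1.0 is below one of these directories:"
        puts $auto_path
    }
} else {
    puts "mkZiplib $result loaded"
}
```

If only the direct load succeeds, moving the "mkZiplib1.0" directory into one of the printed directories should make "package require mkZiplib" work.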

 CHANGES

No changes - Initial version.

 AUTHOR

Michael Kraus
mailto:michael@kraus5.de
http://mkextensions.sourceforge.net