Ghostscript projects seeking developers

For other information, see the Ghostscript overview.


There are many projects that would improve Ghostscript and that we would like to do, but for which we don't have enough resources. If you would like to take responsibility for any of these projects, please contact us. Additional comments on implementation approaches or project goals are in italic type like this.

Windows driver using Ghostscript as a language monitor.

MS Windows has a "language monitor" capability which would allow Ghostscript to be invoked seamlessly to process input files in any language Ghostscript handles and for any printer for which Ghostscript has a driver. Doing this properly would require integrating Ghostscript with Windows' "Add Printer" dialog, using an appropriate PPD.

Russell Lang's RedMon program provides some, but not all, of this capability. See also lib/ghostpdf.ppd.

Netscape browser plug-in.

Currently, Ghostscript can work as a "helper application" for the Netscape browser, but not as a plug-in; the latter would integrate it more closely with the browser. We aren't sure what doing this would involve; we've also heard by rumor that it's already been done.

Ghostscript as an Active-X COM Object.

In order to integrate Ghostscript into XMetaL and other applications it would be convenient for Ghostscript to be distributed as a COM object along with the current gswin32.exe, gswin32c.exe and gsdll32.dll files.

Visual Trace window for X.

Currently Ghostscript implements a Visual Trace window for Windows only (see wdtrace.c). An implementation for X would be useful.


Driver Architecture

Improved multi-threaded rendering support.

Currently, drivers can be written so that converting PostScript to a list of graphical objects can run in one thread, and rasterizing the objects can run in another thread. However, drivers must be written specially if they are going to do this. We would like to change the architecture so that any driver can work this way. We would also like to support dual-threaded operation for drivers that produce high-level output, such as the PDF writer. Doing this would require separating banding from the multithreaded logic. Also, currently each thread has its own allocation pool: this is unnecessary in the normal case, since Ghostscript now supports properly locked access to the C heap, but embedded systems still need to use a fixed-size area for the rasterizing thread. With a locked, shared allocator, the rasterizing thread could use the full set of band list functions; with a fixed-size area and a separate allocator, only a subset is available, as is the case now for dual-threaded drivers.

Dynamic run-time loadable devices.

Currently, drivers must be linked into the executable. We would like to be able to load drivers dynamically. Doing this requires defining a platform-independent API (presumably extending the current gp_* APIs) that would work at least on Linux, vendor Unix, MS Windows, and Macintosh. Unix systems should include Sun, HP, AIX, IRIX, DEC; Linux ELF and a.out formats should both be supported. Consider the Netscape plug-in architecture.

Moving 'setpagedevice' into C.

The PostScript 'setpagedevice' function implements matching of media and page size requests to available media, page orientation, and paper handling (duplex, etc.). Currently it is implemented in PostScript code, which means it is not available for use with other input languages. (It is available for PDF, which Ghostscript implements on top of PostScript, but not for the not-yet-freely-available PCL interpreters that use the Ghostscript library, or for possible future SVG or similar interpreters.) We would like to move this function into C. The device driver will be required to send page parameters up to PostScript to be stored in a resource. This project also includes implementing policy handling in the device drivers. DeferredMediaSelection should also be implemented.

Adding 'tee' for output to multiple devices.

In a few cases, it would be desirable to provide a 'tee' capability for drivers: specifically, for generating small, low-resolution 'thumbnail' images concurrently with other output. Probably the simplest way to do this is to generate a band list and then process it twice. This is not completely trivial, since the band list does include device resolution information and scaling would be required for some constructs.

OutputDevice resource category

Each available output device should provide an instance of the OutputDevice resource category, which gives the available page sizes, resolutions, media classes, process color models, and other information about the device. This would replace the current non-standard use of a 4-element PageSize in the InputAttributes entry of the page device dictionary.

Removing the limit on the length of OutputFile.

Currently, the maximum length of the OutputFile parameter is a compile-time constant, gp_file_name_sizeof. This is appropriate for ordinary file names, since this constant is the platform's limit on the length of a file name. However, if OutputFile is a pipe, the length should not be limited in this way. This is probably a small project: it requires allocating the file name dynamically, and freeing it in the finalization routine that gets called when a driver instance is freed.
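The allocation and finalization steps described above might look roughly like the following sketch. The demo_device structure and both function names are hypothetical, and malloc/free stand in for Ghostscript's own allocator; none of this is the actual driver code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical driver instance; the real device structure is different. */
typedef struct demo_device_s {
    char *output_file;          /* heap-allocated, NULL when unset */
} demo_device;

/* Store a copy of `name` of any length.
   Returns 0 on success, -1 if the allocation fails (old name is kept). */
int demo_set_output_file(demo_device *dev, const char *name)
{
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);

    if (copy == NULL)
        return -1;
    memcpy(copy, name, len);
    free(dev->output_file);     /* free(NULL) is a no-op */
    dev->output_file = copy;
    return 0;
}

/* Finalization routine: release the name when the instance is freed. */
void demo_device_finalize(demo_device *dev)
{
    free(dev->output_file);
    dev->output_file = NULL;
}
```

With this shape, a long pipe command is limited only by available memory rather than by gp_file_name_sizeof.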

Specific drivers

PrintGear and PPA output drivers.

We would like to provide (Adobe) PrintGear and (H-P) PPA output drivers for Ghostscript, but the specifications for these protocols are not published. If you can provide them to us without violating any agreements, please let us know. (Some work has already been done on reverse-engineering these protocols, but we don't have references to it.)

Improve 'pswrite' up to the level of 'pdfwrite'.

We would like to improve the high-level PostScript-writing pswrite driver to bring it up to parity with the PDF-writing driver (including the many improvements in the latter being implemented in Ghostscript 7.xx). Specifically, we want it to write text as text rather than bitmaps, and to consistently write images in their original high-level form. We have already started to factor out code that should be common to these two drivers, specifically for writing embedded fonts and compressed data streams.

There is one small part of this project that would be especially valuable and could be done independently (although it might have to be partly or entirely redone later): compressing images. Currently the driver only compresses character bitmaps, and doesn't compress other images at all. It should use the CCITTFaxEncode filter for 1-bit-deep images, and plane-separated LZWEncode compression for color images. When generating LL3 PS, Flate compression will work better than miGIF. It may be worth trying several methods on each image and using the one that works best.

High level graphics and text for PCL 5 and PCL XL drivers.

Currently, the PCL 5 drivers produce only bitmaps; the PCL XL driver produces high-level graphics and sometimes high-level images, but low-level text. We would like to improve these drivers to produce higher-level, smaller output. This was a very low-priority project; it has become more important now that H-P's laser printers are shipping with less memory.

Improved high level GDI driver for Windows.

We would like a "GDI driver" for MS Windows that would implement more higher-level constructs (specifically for text). The mswin and mswinprn drivers both do some of this. Some of the 'xfont' support code for MS Windows should be useful. We were frustrated in the past because the GDI calls for getting font sizes and metrics consistently returned incorrect information and provided no way to get the correct information; perhaps this has been fixed in 32-bit Windows. We believe that H-P, Russell Lang, and perhaps others are working in this area, but we can always use more help.

PDF thumbnail generation.

The PDF writer needs to be able to generate thumbnails (small previews). We might do this through the 'tee' capability mentioned above. However, we currently prefer the idea of implementing a completely separate program to add thumbnails to an arbitrary, existing PDF file: this would allow Ghostscript to add thumbnails to PDF files generated by other programs. Much of the code needed to do this has already been written for Ghostscript's PDF linearizer: see lib/pdfwrite.ps. A user has implemented this as well, using a separate program that calls Ghostscript: see http://www.uni-giessen.de/~g029/eurotex99/oberdiek/.

Consolidate inkjet drivers into a single family.

In addition to factoring out the error diffusion code as described below, we would like to see another attempt at reducing the enormous volume of code for color inkjet drivers. There are three sets of drivers (gdevcdj.c, gdevstc.c, gdevupd.c) with much overlapping functionality. The latter two driver families make good attempts at factoring out things like head geometry and canned control strings, but we think this problem deserves another pass, especially in the hope of consolidating these drivers into a single family.

Download glyph bitmaps (with glyph decaching notification).

See below under "Notification for glyph decaching."

Preserve compression when writing PDF images.

Currently, all images are decompressed by the interpreter before being passed to the graphics library; the PDF writer may then compress them again. Ordinarily, this only slows things down a little, but in the case of DCT-encoded images that are being DCT-encoded in the output, image degradation may occur. Ideally, the implementation should be smart enough to not decode and re-encode the image. However, making this work properly is difficult. This would probably involve extending the library APIs for images so that they could pass a stream, possibly including filters, instead of the (fully decoded) data rows.

Emit warnings when producing PDF output.

Currently, the PDF writer has no way to emit warnings. Users would like to see warnings when fonts cannot be embedded (this is actually required when the value of CannotEmbedFontPolicy is set to /Warning), and for some other questionable situations like non-existent Dests (Feature request #480853). Probably the right way to handle this is with a pseudo device parameter called "Warnings" that is a list of strings: the pdfwrite driver would add strings to this list, and the ps2pdf script (lib/gs_pdfwr.ps) would read them out, print them, and reset them at the end of each page.
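As a rough model of the proposed "Warnings" list, the sketch below shows a driver-side append and a reader-side flush. In the proposal the reader would be PostScript code; here both sides are written in C purely for illustration, and every name (demo_warnings, demo_warn_add, demo_warn_flush, the fixed capacity) is an assumption.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_WARNINGS 16    /* arbitrary capacity for this sketch */

typedef struct demo_warnings_s {
    char *items[DEMO_MAX_WARNINGS];
    int count;
} demo_warnings;

/* Driver side: append a copy of `msg`.
   Returns 0 on success, -1 if the list is full or allocation fails. */
int demo_warn_add(demo_warnings *w, const char *msg)
{
    size_t len = strlen(msg) + 1;
    char *copy;

    if (w->count >= DEMO_MAX_WARNINGS)
        return -1;
    copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, msg, len);
    w->items[w->count++] = copy;
    return 0;
}

/* Reader side: print all pending warnings to `out`, then reset the list.
   Returns the number of warnings printed. */
int demo_warn_flush(demo_warnings *w, FILE *out)
{
    int i, n = w->count;

    for (i = 0; i < n; i++) {
        fprintf(out, "Warning: %s\n", w->items[i]);
        free(w->items[i]);
        w->items[i] = NULL;
    }
    w->count = 0;
    return n;
}
```

Calling demo_warn_flush at the end of each page corresponds to the "read them out, print them, and reset them" step.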


Graphics functionality

Support for 64-bit colors on 64-bit platforms.

Currently, the library supports a maximum of 32 bits of data per pixel; we would like to raise this limit to 64 bits on systems where the 'long' data type is 64 bits wide. The gx_color_index type is already defined as 'long', but there are many places where the type bits32 is used for pixel values; there is a 32-bit stored-image "device", but there is no 64-bit device; a few algorithms and tables have knowledge of the 32-bit width built into them, only because the C preprocessor doesn't have any kind of loop or repetition capability.
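To make the 64-bit case concrete, here is a minimal sketch of a pixel value holding four 16-bit components. The layout (component 0 in the most significant bits) and the names are assumptions made for illustration; the library's actual color index encoding is chosen by each device.

```c
#include <stdint.h>

/* Hypothetical 64-bit pixel value; stands in for a widened color index. */
typedef uint64_t demo_color_index64;

/* Pack four 16-bit components, first component most significant. */
demo_color_index64 demo_pack_cmyk16(uint16_t c, uint16_t m,
                                    uint16_t y, uint16_t k)
{
    return ((demo_color_index64)c << 48) |
           ((demo_color_index64)m << 32) |
           ((demo_color_index64)y << 16) |
            (demo_color_index64)k;
}

/* Extract component 0..3 (0 is the most significant). */
uint16_t demo_unpack16(demo_color_index64 color, int component)
{
    return (uint16_t)(color >> (16 * (3 - component)));
}
```

Any code that stores such a value in a 32-bit variable silently loses the two upper components, which is why the remaining uses of bits32 for pixel values matter.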

In-RIP trapping.

The PostScript specification includes an option for the interpreter to implement trapping (adjustments of object boundaries to prevent visual anomalies caused by slight misregistration of different ink layers): we would like to implement this. This is a complex and difficult area; even many Adobe RIPs don't do it.

Improve the font grid fitting and antialiasing.

Ghostscript includes a reduced TrueType bytecode interpreter branched from FreeType 1. It performs grid fitting for TrueType glyphs, except those involving instructions patented by Apple. One desired improvement is a stem recognition algorithm similar to FreeType autohinting. It would also help with poorly designed Type 1 fonts that have misplaced or missing hints.

Another useful improvement would be to support font antialiasing with TextAlphaBits values other than 1, 2, and 4.

ICC profile support for output.

Ghostscript 7.00 and later support the ICCBased color spaces of PDF using the icclib package from http://web.access.net.au/argyll/color.html, but there is no facility to use ICC output (printer) profiles that may be embedded in the PDF. It would also be useful for PostScript to be able to use a specific Intent from an ICC profile directly to convert output colors (as CRDs are now used). The primary difficulty is that the graphics library and PostScript always use CIE XYZ as the connection space, but ICC profiles may use CIELAB as the connection space, requiring conversion for use with the graphics library.
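The connection-space conversion mentioned above is the standard CIELAB transform. The sketch below shows the CIELAB-to-XYZ direction using the usual CIE formulas; the opposite direction applies the inverse of the same steps. The function names are hypothetical, the white point is passed in by the caller, and no claim is made about how icclib or the graphics library would actually do this.

```c
/* Inverse of the CIELAB companding function f(t). */
static double demo_lab_f_inverse(double t)
{
    const double delta = 6.0 / 29.0;

    if (t > delta)
        return t * t * t;
    /* Linear segment used near black. */
    return 3.0 * delta * delta * (t - 4.0 / 29.0);
}

/* Convert lab = {L*, a*, b*} to xyz = {X, Y, Z},
   relative to the reference white {Xn, Yn, Zn}. */
void demo_lab_to_xyz(const double lab[3], const double white[3],
                     double xyz[3])
{
    double fy = (lab[0] + 16.0) / 116.0;
    double fx = fy + lab[1] / 500.0;
    double fz = fy - lab[2] / 200.0;

    xyz[0] = white[0] * demo_lab_f_inverse(fx);
    xyz[1] = white[1] * demo_lab_f_inverse(fy);
    xyz[2] = white[2] * demo_lab_f_inverse(fz);
}
```

For example, L* = 100 with a* = b* = 0 maps to the reference white itself, and L* = 0 maps to zero luminance.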

Making halftones into "objects" and adding new types.

Currently, knowledge of the specific data formats and algorithms for halftoning permeates too many places in the library. We would like halftoning to be more "object oriented" (using virtual procedures) so that we could support other halftoning methods such as direct use of threshold arrays, or the double-rectangle approach added in newer PostScript versions. Threshold arrays take much less space than the current representation, generally at the expense of longer rendering time for black-and-white images; double-rectangle representation would give us a better implementation of AccurateScreens. We might want to store both threshold arrays and the current representation.
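Direct use of a threshold array can be sketched as a per-pixel comparison against a tiled table. The 4x4 ordered-dither tile and the names below are illustrative assumptions only; they are not the data or interfaces Ghostscript uses.

```c
/* Illustrative 4x4 ordered-dither tile, scaled to the 0..255 range. */
static const unsigned char demo_threshold[4][4] = {
    {   8, 136,  40, 168 },
    { 200,  72, 232, 104 },
    {  56, 184,  24, 152 },
    { 248, 120, 216,  88 }
};

/* Return 1 if the pixel at non-negative device coordinates (x, y)
   is turned on for the given gray level (0..255), else 0. */
int demo_halftone_pixel(unsigned char gray, int x, int y)
{
    /* The tile repeats every 4 pixels in both directions. */
    return gray >= demo_threshold[y & 3][x & 3];
}
```

The table costs one byte per tile cell, but every pixel needs a comparison, which matches the space-versus-time trade-off described above.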

Factor out error diffusion routines, integrate ETS.

Currently, several different inkjet drivers implement their own, very similar but slightly differing error diffusion methods. This has caused severe code bloat as well as tempting future driver writers to contribute to it further. We want to factor out error diffusion into a common set of facilities that drivers can use. We would like to design these facilities so that they can easily interface to the Even-Toned Screening algorithms, to the extent that these will be Open Source.
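A shared facility of this kind might expose a row-at-a-time routine that any driver could call. The sketch below uses Floyd-Steinberg weights as one familiar example; the choice of algorithm, the function name, and the buffer convention are all assumptions, not a description of any existing driver.

```c
/* Diffuse one row of 8-bit gray input to 1-bit output.
   err_cur holds errors (in 1/16 units) carried into this row and
   err_next receives errors for the following row. Both arrays have
   width + 2 entries: pixel x uses index x + 1, leaving one guard
   entry at each end. The caller swaps the two arrays between rows. */
void demo_diffuse_row(const unsigned char *in, unsigned char *out,
                      int width, int *err_cur, int *err_next)
{
    int x;

    for (x = 0; x < width + 2; x++)
        err_next[x] = 0;
    for (x = 0; x < width; x++) {
        int v = in[x] * 16 + err_cur[x + 1];
        int on = v >= 128 * 16;
        int e = v - (on ? 255 * 16 : 0);
        int right = e * 7 / 16;
        int below_left = e * 3 / 16;
        int below = e * 5 / 16;

        out[x] = (unsigned char)on;
        err_cur[x + 2] += right;
        err_next[x] += below_left;
        err_next[x + 1] += below;
        /* The remainder goes below-right so no error is lost. */
        err_next[x + 2] += e - right - below_left - below;
    }
}
```

A driver would keep only its device-specific parts (head geometry, control strings) and call a routine like this once per plane and row.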

Improve, or generalize, linearization for stochastic threshold data.

The Ghostscript distribution includes a stochastic threshold array. This array has some gamma correction built into it, which works well for some output devices and not for others. We would like to provide a version of this array without (or with less) gamma correction. We have original data available from which this could be done fairly easily.

Change sampled functions to use new interpreted functions.

The PostScript language defines many functions relevant to graphics rendering as being implemented by arbitrary PostScript procedures: transfer (gamma correction), black generation, undercolor removal, several stages of CIE color space and rendering, and color mapping for Separation and DeviceN spaces. Since the graphics library can't call PostScript procedures, Ghostscript currently samples these procedures at a fixed number of points and interpolates linearly between the samples. As of Ghostscript 6.20, the library can interpret a restricted subset of PostScript procedures directly (basically those that only use arithmetic and comparisons: no loops, sub-procedures, or data structures). Changing the rendering functions to use this approach when possible would greatly improve output quality when the functions are very non-linear (which we have actually seen in practice). This should only be done if the function is, in fact, severely non-linear, since interpreting the function definition will almost always be much slower than interpolating in the table.
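The quality difference between the two approaches can be seen in a small sketch. A C function pointer stands in here for the PostScript procedure, and the table size, names, and example function are assumptions chosen only to make the effect visible.

```c
#define DEMO_SAMPLES 5          /* deliberately coarse for illustration */

typedef double (*demo_fn)(double);

/* Sample f at DEMO_SAMPLES evenly spaced points over [0, 1]. */
void demo_sample(demo_fn f, double table[DEMO_SAMPLES])
{
    int i;

    for (i = 0; i < DEMO_SAMPLES; i++)
        table[i] = f((double)i / (DEMO_SAMPLES - 1));
}

/* Linear interpolation between samples, clamping x to [0, 1]. */
double demo_interpolate(const double table[DEMO_SAMPLES], double x)
{
    double pos;
    int i;

    if (x <= 0.0)
        return table[0];
    if (x >= 1.0)
        return table[DEMO_SAMPLES - 1];
    pos = x * (DEMO_SAMPLES - 1);
    i = (int)pos;
    return table[i] + (pos - i) * (table[i + 1] - table[i]);
}

/* A sharply non-linear example function. */
double demo_steep(double x)
{
    return x * x * x * x;
}
```

Midway between the first two samples the interpolated value of demo_steep is several times larger than the directly evaluated one, which is the kind of error that evaluating the function itself would remove.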

Replace PostScript procedures with Function objects.

Currently, there is a lot of tiresome code for doing callbacks with continuations for loading the caches that hold sampled values for the procedures listed under "Change sampled functions ..." above. For the Separation and DeviceN tint transform functions, and only for these, PostScript code associated with the setcolorspace operator actually converts the PostScript procedure to a Function object -- to a FunctionType 4 (PostScript subset) Function if possible, or to a FunctionType 0 (sampled) Function if not. This approach should be used for all the other sampled functions. Doing this would reduce the amount of C code significantly, while only increasing PostScript code slightly.

This change would require touching (and slightly changing) all PostScript operators that currently do such callbacks: for example, rather than a setblackgeneration operator that takes a PostScript procedure as its operand, we would have a .setblackgeneration operator that takes as operands both the PostScript procedure (so that currentblackgeneration can return it) *and* a Function derived from it (which will actually be used when loading the cache, or for sampling directly if desired).

In some cases, this approach has a non-negligible space cost. If the PostScript procedure cannot be represented as a FunctionType 4 Function, it must be sampled and represented as a FunctionType 0 Function. Then the BG / UCR / transfer / ... cache will essentially just hold a copy of the Function data. While it is likely that this situation will be rare in practice, it might be worth looking into changing the internal representation of these caches so that they were the same as the representation of a FunctionType 0 Function with a particular choice of parameters. Then the PostScript code that called .buildsampledfunction when necessary could arrange the parameters to have the same values as the internal representation of the cache, and the cache could use the Function data directly. This is probably not worth the trouble.

Add optional cubic interpolation to RenderTable and other table lookup.

Currently, if a CIE rendering dictionary uses a lookup table for the final step, Ghostscript always interpolates linearly between the entries. Cubic interpolation should be supported as an option. A cubic interpolation option is also needed for general table-lookup Functions.
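One possible form of cubic table lookup is Catmull-Rom interpolation, sketched below for a one-dimensional table. The source does not say which cubic scheme is intended, so this choice, the edge handling (repeating the end samples), and the function name are assumptions.

```c
/* Look up x in [0, 1] in a table of n >= 2 evenly spaced samples,
   using Catmull-Rom cubic interpolation through the four nearest
   entries. End samples are repeated where a neighbor is missing. */
double demo_cubic_lookup(const double *table, int n, double x)
{
    double pos, t, p0, p1, p2, p3;
    int i;

    if (x <= 0.0)
        return table[0];
    if (x >= 1.0)
        return table[n - 1];
    pos = x * (n - 1);
    i = (int)pos;
    t = pos - i;
    p1 = table[i];
    p2 = table[i + 1];
    p0 = (i > 0) ? table[i - 1] : p1;
    p3 = (i + 2 < n) ? table[i + 2] : p2;
    return 0.5 * (2.0 * p1
                  + (p2 - p0) * t
                  + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t * t
                  + (3.0 * p1 - p0 - 3.0 * p2 + p3) * t * t * t);
}
```

For a table holding 0, 1, 4, 9, 16, the lookup halfway between the second and third entries gives 2.25, where linear interpolation would give 2.5.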

Add better (SVG-like) alpha channel and compositing to library.

Ghostscript has partial support for alpha channel and for alpha and RasterOp compositing. There is some architectural support for general compositing, but it postdates the RasterOp implementation, and most of the RasterOp code doesn't use it. We expect that the more extensive compositing and alpha capabilities of SVG will find their way into PDF (and probably PostScript as well) in the course of 2000 and 2001, and we will need to implement them.


Performance

Change band list logic to defer halftoning until rendering.

Currently, when Ghostscript uses a band list, it does halftoning before banding. It should do halftoning after banding: this produces smaller band lists and shifts more work to the rasterizer (which is good because the rasterizer can be multi-threaded internally for higher performance on multiprocessors: see the next topic.)

Reduce redundant data for smoothed banded images.

When smoothed ("interpolated") images are written in the band list, extra rows must be written above and below each band in order to provide the data for interpolation. Currently, the number of such rows is computed very conservatively; instead, the final interpolation algorithm should be consulted to provide the correct value. This is a small task.

Multi-threaded rasterizing

For high-resolution devices, rasterization dominates execution time. On multiprocessor systems, Ghostscript can do tasks in parallel:

We would want these facilities implemented so that no conditional compilation was involved: on uniprocessor systems, the locking API would simply have a vacuous implementation.
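A locking API with a vacuous uniprocessor implementation could be arranged as a table of procedures, so the calling code is identical on all systems. Everything in this sketch is hypothetical, including the idea of threads claiming bands from a shared queue; it only illustrates how conditional compilation can be avoided.

```c
#include <stddef.h>

/* Hypothetical locking interface selected at run time. */
typedef struct demo_lock_procs_s {
    int (*acquire)(void *lock_data);
    int (*release)(void *lock_data);
} demo_lock_procs;

static int demo_lock_noop(void *lock_data)
{
    (void)lock_data;
    return 0;
}

/* Uniprocessor implementation: both operations do nothing. */
const demo_lock_procs demo_vacuous_lock_procs = {
    demo_lock_noop, demo_lock_noop
};

/* Shared work queue of bands waiting to be rasterized. */
typedef struct demo_band_queue_s {
    const demo_lock_procs *procs;
    void *lock_data;
    int next_band;
    int band_count;
} demo_band_queue;

/* Claim the next band; returns its index, or -1 when none remain.
   The same code runs whether or not the locks do anything. */
int demo_claim_band(demo_band_queue *q)
{
    int band = -1;

    q->procs->acquire(q->lock_data);
    if (q->next_band < q->band_count)
        band = q->next_band++;
    q->procs->release(q->lock_data);
    return band;
}
```

A multiprocessor build would supply a second procedure table backed by real mutexes and pass it in the same way.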

Notification for glyph decaching.

Currently, drivers can't do a very good job of downloading rendered character bitmaps to the device they manage, because they can't find out when a bitmap is being deleted from Ghostscript's cache and therefore will never be referenced again. Here is a sketch of how we would add this capability to the graphics library:

This facility was requested by the Display Ghostscript project, but it could also be used to improve the output of the PCL XL driver and possibly the X and PCL5 drivers.


Other functionality

OpenStep (Display PostScript + NeXT) extensions to Ghostscript.

There is a project to create a GNU implementation of the OPENStep API, which involves extending Ghostscript to provide the full functionality of Adobe's Display PostScript system with some of the NeXT extensions. For more information, please contact Net-Community <scottc@net-community.com>.

Job Server implementation.

For full Adobe PostScript compatibility, Ghostscript needs a real "job server" to encapsulate the execution of PostScript files. See the section on "Job Execution Environment" in the PostScript Language Reference Manual for details.

SVG (XML Structured Vector Graphics) interpreter.

Ghostscript could be adapted with some work to read SVG. This would be an interesting and challenging project because SVG's graphics model would require extending the library (see above). If SVG turns out to be an important standard, it is important that there be a good free implementation of it.

%font% and other IODevices.

Currently, the %font% IODevice is not implemented. We would like to see this implemented using a general framework for implementing IODevices (%xxxx%) entirely in PostScript, in an "object oriented" manner very similar to the way Resource categories are implemented. An IODevice would be implemented as a dictionary with the following keys, whose values would be procedures that implemented the corresponding operation:

/File
/DeleteFile
/RenameFile
/Status
/FileNameForAll
/GetDevParams
/PutDevParams

There would only be global IODevices, no local ones; the dictionary keeping track of them would be stored in global VM.

This is an obscure feature that matters only because some PostScript code uses filenameforall with this IODevice, rather than filenameforall with the /Font Resource category, to enumerate available fonts.

Repairing damaged or EOL-converted PDF files.

Adobe Acrobat Reader can scan a PDF file that has had its end-of-lines converted (by careless users transferring the file across operating systems as text rather than binary) and reconstruct the cross-reference table, which the PDF interpreter requires. This only works if the file has no binary data in it, which with PDF 1.3 is rarely the case. However, users occasionally receive PDF files that have been damaged in this way, and it might be useful to have a program that can repair them. We think this should probably be done as a separate program, possibly in PostScript, similar to Ghostscript's PDF linearizer.
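The core of such a repair is rescanning the file for object headers and recording their new offsets. The sketch below does this for an in-memory, text-only, NUL-terminated buffer; the function name, the fixed table size, and the simple line-based matching are assumptions, and a real tool would also need to handle generation numbers, the trailer, and files containing binary data.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_OBJECTS 64     /* arbitrary limit for this sketch */

/* Scan `buf` for lines beginning "<num> <gen> obj" and record the
   byte offset of each in offsets[num] (-1 if not found).
   Any mix of CR and LF is accepted as a line ending.
   Returns the number of object headers found. */
int demo_rebuild_xref(const char *buf, long offsets[DEMO_MAX_OBJECTS])
{
    const char *p = buf;
    int found = 0;
    int i;

    for (i = 0; i < DEMO_MAX_OBJECTS; i++)
        offsets[i] = -1;
    while (*p != '\0') {
        int num, gen, used = 0;

        /* `used` is set only if the literal "obj" was matched too. */
        if (sscanf(p, "%d %d obj%n", &num, &gen, &used) == 2 &&
            used > 0 && num >= 0 && num < DEMO_MAX_OBJECTS) {
            offsets[num] = (long)(p - buf);
            found++;
        }
        p += strcspn(p, "\r\n");    /* skip the rest of this line */
        p += strspn(p, "\r\n");     /* skip the line ending(s) */
    }
    return found;
}
```

The recorded offsets are what a repaired cross-reference table would list for each object.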

Implementation improvements

Fully re-entrant code.

Currently, neither the PostScript interpreter nor the graphics library is fully re-entrant (no writable globals). Making them fully re-entrant would make Ghostscript usable in multi-threaded environments, and more easily usable in embedded environments. Note that this is necessary, but far from sufficient, for Ghostscript to allow simultaneous execution of a single Ghostscript interpreter instance by multiple threads: that is probably permanently out of the question. Almost all drivers, including all of the drivers in devs.mak which are maintained as part of the main Ghostscript code, are already fully re-entrant; making the remaining ones re-entrant should really be up to the driver author.
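The usual way to remove a writable global is to move it into a context structure that every call receives explicitly. The sketch below shows the pattern with a made-up error counter; the structure and function names are hypothetical and do not correspond to actual Ghostscript state.

```c
/* State that would otherwise live in a writable global variable. */
typedef struct demo_context_s {
    int error_count;
} demo_context;

void demo_context_init(demo_context *ctx)
{
    ctx->error_count = 0;
}

/* Each caller passes its own context, so two instances never
   interfere. Returns the updated count for that context. */
int demo_report_error(demo_context *ctx)
{
    return ++ctx->error_count;
}
```

Two interpreter instances, each with its own context, can then run in different threads without sharing any mutable data through this code.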

Ghostscript has no %ram% device.

The %ram% device is documented in PS Supplement 3010 and 3011 dated August 30, 1999. This is probably not a major impediment to portability, but it would be handy.

On Unix, the suggested implementation would be to create a subdirectory of the temporary directory (usually /tmp), with the name chosen and the directory created in such a way as to avoid /tmp races and similar problems. Ghostscript should delete the subdirectory when it exits.
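On POSIX systems, one way to create such a subdirectory without a /tmp race is mkdtemp, which picks a unique name and creates the directory with mode 0700 in a single step. The sketch below assumes that approach; the function names and the "gsram" prefix are made up, and the cleanup shown only removes an empty directory.

```c
#define _XOPEN_SOURCE 700       /* for mkdtemp (POSIX.1-2008) */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Create a private directory under $TMPDIR (or /tmp) and write its
   name into `path`. Returns 0 on success, -1 on failure. */
int demo_ram_dir_create(char *path, size_t size)
{
    const char *tmp = getenv("TMPDIR");
    int n;

    if (tmp == NULL || *tmp == '\0')
        tmp = "/tmp";
    n = snprintf(path, size, "%s/gsram.XXXXXX", tmp);
    if (n < 0 || (size_t)n >= size)
        return -1;              /* template did not fit */
    /* mkdtemp replaces XXXXXX and creates the directory atomically. */
    return mkdtemp(path) != NULL ? 0 : -1;
}

/* Remove the directory at exit. rmdir succeeds only if it is empty,
   so a real implementation would delete the files inside first. */
int demo_ram_dir_remove(const char *path)
{
    return rmdir(path);
}
```

Registering the removal step with an exit handler would cover the requirement that the subdirectory be deleted when Ghostscript exits.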


Copyright © 2000-2006 Artifex Software, Inc. All rights reserved.

This software is provided AS-IS with no warranty, either express or implied. This software is distributed under license and may not be copied, modified or distributed except as expressly authorized under the terms of that license. Refer to licensing information at http://www.artifex.com/ or contact Artifex Software, Inc., 7 Mt. Lassen Drive - Suite A-134, San Rafael, CA 94903, U.S.A., +1(415)492-9861, for further information.

Ghostscript version 8.71, 10 February 2010