- Introduction
- Architecture
- File roadmap
- Memory management
- Portability
- Troubleshooting and debugging
- Profiling
For other information, see the Ghostscript overview.
This document provides a wealth of information about Ghostscript's internals, primarily for developers actively working on Ghostscript. It is primarily descriptive, documenting the way things are; the companion C style guide is primarily prescriptive, documenting what developers should do when writing new code.
THIS FILE IS A WORK IN PROGRESS. MANY SECTIONS ARE PLACE-HOLDERS.
Ghostscript has the following high-level design goals (not listed in order of importance):
These goals often conflict: part of Ghostscript's claim to quality is that the conflicts have been resolved well.
Part of what has kept Ghostscript healthy through many years of major code revisions and functional expansion is consistent and conscientious adherence to a set of design principles. We hope the following list captures the most important ones.
Ghostscript is designed to be used as a component. As such, it must share its environment with other components. Therefore, it must not require ownership of, or make decisions about, inherently shared resources. Specifically, it must not assume that it can "own" either the locus of control or the management of the address space.
Not owning control means that whenever Ghostscript passes control to its caller, it must do so in a way that doesn't constrain what the caller can do next. The caller must be able to call any other piece of software, wait for an external event, execute another task, etc., without having to worry about Ghostscript being in an unknown state. While this is easy to arrange in a multi-threaded environment (by running Ghostscript in a separate thread), multi-threading APIs are not well standardized at this time (December 2000), and may not be implemented efficiently, or at all, on some platforms. Therefore, Ghostscript must choose between only two options for interacting with its caller: to return, preserving its own state in data structures, or to call back through a caller-supplied procedure. Calling back constrains the client program unacceptably: the callback procedure only has the options of either returning, or aborting Ghostscript. In particular, if it wants (for whatever reason) to multi-task Ghostscript with another program, it cannot do so in general, especially if the other program also uses callback rather than suspension. Therefore, Ghostscript tries extremely hard to return, rather than calling back, for all caller interaction. In particular:
e_NeedInput code rather than using a callback. This allows the caller complete flexibility in its control structure for managing the source of input. (It might, for example, be generating the input dynamically.)
stdin and output to stdout and stderr.
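The return-rather-than-call-back style can be sketched in C as follows. This is a minimal illustration under stated assumptions, not Ghostscript's API: the names sum_state, sum_init, sum_feed, sum_finish, and SUM_NEED_INPUT are hypothetical. The consumer keeps all of its state in a caller-owned structure and returns a "need input" code whenever its buffer runs out, so the caller decides how and when to supply more data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return code, in the spirit of e_NeedInput. */
enum { SUM_NEED_INPUT = 1 };

/* All scanner state lives here, not on the C stack, so the scanner
   can return to its caller between any two input characters. */
typedef struct sum_state_s {
    long total;     /* sum of all completed numbers */
    long current;   /* value of the number being scanned */
    int in_number;  /* nonzero if a number is in progress */
} sum_state;

void sum_init(sum_state *st)
{
    st->total = 0;
    st->current = 0;
    st->in_number = 0;
}

/* Consume one chunk of input, then return SUM_NEED_INPUT.
   A number may be split across two chunks. */
int sum_feed(sum_state *st, const char *buf, size_t len)
{
    size_t i;

    assert(st != NULL && (buf != NULL || len == 0));
    for (i = 0; i < len; i++) {
        char c = buf[i];

        if (c >= '0' && c <= '9') {
            st->current = st->current * 10 + (c - '0');
            st->in_number = 1;
        } else if (st->in_number) {
            st->total += st->current;
            st->current = 0;
            st->in_number = 0;
        }
    }
    return SUM_NEED_INPUT;
}

/* Called by the caller at end of input: flush any pending number. */
long sum_finish(sum_state *st)
{
    if (st->in_number) {
        st->total += st->current;
        st->current = 0;
        st->in_number = 0;
    }
    return st->total;
}
```

Between two calls to sum_feed the caller is free to do anything at all, including running another suspended component, which is exactly what a callback-based design would prevent.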
The one area where suspension is not feasible with Ghostscript's current architecture is device output. Device drivers are called from deep within the graphics library. (If Ghostscript were being redesigned from scratch, we might try to do this with suspension as well, or at least optional suspension.)
Not owning management of the address space means that even though Ghostscript supports garbage collection for its own data, it must not do any of the things that garbage collection schemes for C often require: it must not replace 'malloc' and 'free', must not require its clients to use its own allocator, must not rely on manipulating the read/write status of memory pages, must not require special compiler or run-time support (e.g., APIs for scanning the C stack), must not depend on the availability of multi-threading, and must not take possession of one of a limited number of timer interrupts. However, in order not to constrain its own code unduly, it must also not require using special macros or calls to enter or leave procedures or assign pointers, and must not constrain the variety of C data structures any more than absolutely necessary. It achieves all of these goals, at the expense of some complexity, some performance cost (mostly for garbage collection), and some extra manual work required for each structure type allocated by its allocator. The details appear in the Memory management section below.
From many years of experience with the benefits of object-oriented design, we have learned that when the word "the" appears in a software design -- "the" process scheduler, "the" memory manager, "the" output device, "the" interpreter, "the" stack -- it often flags an area in which the software will have difficulty adapting to future needs. For this reason, Ghostscript attempts to make every internal structure capable of existing in multiple instances. For example, Ghostscript's memory manager is not a one-of-a-kind entity with global state and procedures: it is (or rather they are, since Ghostscript has multiple memory managers, some of which have multiple instances) objects with their own state and (virtual) procedures. Ghostscript's PostScript interpreter has no writable non-local data (necessary, but not sufficient, to allow multiple instances), and in the future will be extended to be completely reentrant and instantiable. The device driver API is designed to make this easy for drivers as well. The graphics library is currently not completely reentrant or instantiable: we hope this will occur in the future.
Ghostscript is designed to make configuration choices as late as possible, subject to simplicity and performance considerations. The major binding times for such choices are compilation, linking, startup, and dynamic.
In addition, a number of major implementation decisions are made dynamically depending on the availability of resources. For example, Ghostscript chooses between banded and non-banded rendering depending on memory availability.
At the largest design scale, Ghostscript consists of 4 layers. Layer N is allowed to use the facilities of all layers M <= N.
The most important interface in Ghostscript is the API between the graphics library and the device drivers: new printers (and, to a lesser extent, window systems, displays, plotters, film recorders, and graphics file formats) come on the scene frequently, and it must be possible to produce output for them with a minimum of effort and disruption. This API is the only one that is extensively documented (see Drivers.htm) and kept stringently backward-compatible through successive releases.
Ghostscript makes heavy use of object-oriented constructs, including analogues of classes, instances, subclassing, and class-associated procedures. Since Ghostscript is written in C, not C++, implementing these constructs requires following coding conventions. The "Objects" section of the C style guide explains these.
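As a rough illustration of what such conventions can look like, the sketch below shows one common way to express a class, a subclass, and class-associated (virtual) procedures in plain C. It is not taken from the C style guide: the names shape, shape_procs, rect_shape, and the functions are hypothetical, and the actual Ghostscript conventions may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

typedef struct shape_s shape;

/* The procedure table plays the role of a C++ vtable. */
typedef struct shape_procs_s {
    long (*area)(const shape *self);
} shape_procs;

/* The "superclass": every instance points to its procedures. */
struct shape_s {
    const shape_procs *procs;
};

/* A "subclass" embeds the superclass as its first member, so a pointer
   to the subclass instance converts safely to a superclass pointer. */
typedef struct rect_shape_s {
    shape base;
    long width, height;
} rect_shape;

long rect_area(const shape *self)
{
    const rect_shape *r = (const rect_shape *)self;

    return r->width * r->height;
}

const shape_procs rect_procs = { rect_area };

void rect_init(rect_shape *r, long width, long height)
{
    r->base.procs = &rect_procs;
    r->width = width;
    r->height = height;
}

/* Generic code knows only the superclass and dispatches
   through the procedure table. */
long shape_area(const shape *s)
{
    assert(s != NULL && s->procs != NULL);
    return s->procs->area(s);
}
```

Because instances carry their own procedure pointers and no global state is involved, any number of instances of any number of subclasses can coexist.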
The memory manager API provides run-time type information about each class, but this information does not include anything about subclassing. See under Structure descriptors below.
This section of the document provides a roadmap to all of the Ghostscript source files.
See below.
Ghostscript uses 4 internal representations of color. We list them here in the order in which they occur in the rendering pipeline.
gs_client_color, defined in psi/gs.color.h), consisting of one or more numeric values and/or a pointer to a Pattern instance. This corresponds directly to the values that would be passed to the PostScript setcolor operator: one or more (floating-point) numeric components and/or a Pattern. Client colors are interpreted relative to a color space (gs_color_space, defined in base/gscspace.h and base/gxcspace.h, with specific color spaces defined in other files). Client colors do not explicitly reference the color space in which they are interpreted: setcolor uses the color space in the graphics state, while images and shadings explicitly specify the color space to be used.
frac values (defined in base/gxfrac.h) rather than floats. The procedures for this step are the virtual concretize_color and concrete_space procedures in the (original) color space. This step reduces Indexed colors, CIEBased colors, and Separation and DeviceN colors that use the alternate space.
gx_device_color, defined in psi/gs.color.h and base/gxdcolor.h). For ordinary non-Pattern colors, a device color is either a pure color or a halftone. The device and color model associated with a device color are implicit. The procedure for this step is the virtual remap_concrete_color procedure in the color space.
gx_color_index, defined in base/gscindex.h). The device with which they are associated is implicit. Although the format and interpretation of a pixel value are known only to the device, the device's color model and color representation capabilities are public, defined by a gx_color_info structure stored in the device (defined in base/gxdevcli.h). Virtual procedures of the device driver map between pixel values and RGB or CMYK. (This area is untidy and will need to be cleaned up when we implement native Separation/DeviceN colors.)
Steps 2 and 3 are normally combined into a single step for efficiency, as the remap_color virtual procedure in a color space.
Using a device color to actually paint pixels requires a further step called color loading, implemented by the load virtual procedure in the device color. This does nothing for pure colors, but loads the caches for halftones and Patterns.
All of the above steps -- concretizing, mapping to a device color, and color loading -- are done as late as possible, normally not until the color is actually needed for painting.
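The chain from client color to pixel value can be sketched as below. This is a deliberately simplified model, not Ghostscript code: every type and function here is hypothetical (hence the sketch_ prefix), the fixed-point scale FRAC_ONE is an assumed value rather than the one in base/gxfrac.h, and the "device" is assumed to be a plain 24-bit RGB device that only ever produces pure colors.

```c
#include <assert.h>

#define N_COMP 3
#define FRAC_ONE 1000   /* assumed fixed-point scale, for illustration only */

typedef struct { float comp[N_COMP]; } sketch_client_color;    /* representation 1 */
typedef struct { short comp[N_COMP]; } sketch_concrete_color;  /* representation 2 */
typedef unsigned long sketch_pixel;                            /* representation 4 */
typedef struct { int is_pure; sketch_pixel pure; } sketch_device_color; /* 3 */

/* Step 1 -> 2: clamp each component to [0,1] and convert to fixed point. */
void sketch_concretize(const sketch_client_color *cc,
                       sketch_concrete_color *conc)
{
    int i;

    for (i = 0; i < N_COMP; i++) {
        float v = cc->comp[i];

        if (v < 0)
            v = 0;
        if (v > 1)
            v = 1;
        conc->comp[i] = (short)(v * FRAC_ONE + 0.5f);
    }
}

/* Step 2 -> 3: build a pure device color holding a packed RGB pixel. */
void sketch_remap_concrete(const sketch_concrete_color *conc,
                           sketch_device_color *dc)
{
    sketch_pixel pixel = 0;
    int i;

    for (i = 0; i < N_COMP; i++) {
        assert(conc->comp[i] >= 0 && conc->comp[i] <= FRAC_ONE);
        pixel = (pixel << 8) |
            (sketch_pixel)(conc->comp[i] * 255L / FRAC_ONE);
    }
    dc->is_pure = 1;
    dc->pure = pixel;
}

/* Steps 2 and 3 combined, called only when the color is needed. */
void sketch_remap_color(const sketch_client_color *cc,
                        sketch_device_color *dc)
{
    sketch_concrete_color conc;

    sketch_concretize(cc, &conc);
    sketch_remap_concrete(&conc, dc);
}
```

In this model, deferring the work simply means storing the client color when the color is set and calling sketch_remap_color only at painting time.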
All painting operations (fill, stroke, imagemask/show) eventually call a virtual procedure in the device color, either fill_rectangle or fill_mask, to actually paint pixels. For rectangle fills, pure colors call the device's fill_rectangle procedure; halftones and tiled Patterns call the device's tile_rectangle; shaded Patterns, and painting operations that involve a RasterOp, do something more complicated.
ICC profiles are in some ways a special case of color mapping, but are not standard in PostScript.
The following files provide a callback mechanism that allows a client program to specify special-case alternate tint transforms for Separation and DeviceN color spaces. Among other uses, this can provide special handling for PANTONE colors.
Ghostscript represents halftones internally by "whitening orders" -- essentially, arrays of arrays of bit coordinates within a halftone cell, specifying which bits are inverted to get from halftone level K to level K+1. The code does support all of the PostScript halftone types, but they are all ultimately reduced to whitening orders.
Threshold arrays, the more conventional representation of halftones, can be mapped to whitening orders straightforwardly; however, whitening orders can represent non-monotonic halftones (halftones where the bits turned on for level K+1 don't necessarily include all the bits turned on for level K), while threshold arrays cannot. On the other hand, threshold arrays allow rapid conversion of images (using a threshold comparison for each pixel) with no additional space, while whitening orders do not: they require storing the rendered halftone cell for each possible level as a bitmap.
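The relationship between the two representations can be made concrete with a tiny 2x2 cell. The sketch below is illustrative only and uses hypothetical names; it assumes thresholds are small integers and that bit i is on at level L exactly when its threshold is less than L. It derives a whitening order from a threshold array and then renders a level from that order.

```c
#include <assert.h>

#define CELL_BITS 4   /* a tiny 2x2 halftone cell, for illustration */

/* Build a whitening order from a threshold array: order[k] is the bit
   that turns on when going from level k to level k+1. This is an
   insertion sort by threshold; ties keep their original order. */
void order_from_thresholds(const unsigned char thresh[CELL_BITS],
                           int order[CELL_BITS])
{
    int i, j;

    for (i = 0; i < CELL_BITS; i++) {
        j = i;
        while (j > 0 && thresh[order[j - 1]] > thresh[i]) {
            order[j] = order[j - 1];
            j--;
        }
        order[j] = i;
    }
}

/* Render a level from the whitening order: the first 'level' bits
   of the order are on. The result must be stored as a bitmap. */
unsigned render_from_order(const int order[CELL_BITS], int level)
{
    unsigned bits = 0;
    int k;

    assert(level >= 0 && level <= CELL_BITS);
    for (k = 0; k < level; k++)
        bits |= 1u << order[k];
    return bits;
}

/* Render the same level directly from the thresholds:
   one comparison per bit, no stored bitmap needed. */
unsigned render_from_thresholds(const unsigned char thresh[CELL_BITS],
                                int level)
{
    unsigned bits = 0;
    int i;

    for (i = 0; i < CELL_BITS; i++)
        if (thresh[i] < level)
            bits |= 1u << i;
    return bits;
}
```

An order derived this way is always monotonic. A non-monotonic halftone would need a representation that can also turn bits off between levels, which this sketch does not attempt.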
Ghostscript uses two distinct types of rendered halftones -- that is, the bitmap(s) that represent a particular level.
The halftone level for rendering a color is computed in base/gxdevndi.c; the actual halftone mask or tile is computed either in base/gxcht.c (for multi-plane halftones), or in base/gxht.c and base/gxhtbit.c (for binary halftones).
Pattern colors (tiled patterns and shadings) each use a slightly different approach from solid colors.
The device color for a tiled (PatternType 1) pattern contains a pointer to a pattern instance, plus (for uncolored patterns) the device color to be masked. The pattern instance includes a procedure that actually paints the pattern if the pattern is not in the cache. For the PostScript interpreter, this procedure returns an e_RemapColor exception code: this eventually causes the interpreter to run the pattern's PaintProc, loading the rendering into the cache, and then re-execute the original drawing operator.
The device color for a shading (PatternType 2) pattern also contains a pointer to a pattern instance. Shadings are not cached: painting with a shading runs the shading algorithm every time.
In addition to the PostScript graphics model, Ghostscript supports RasterOp, a weak form of alpha channel, and eventually the full PDF 1.4 transparency model. The implementation of these facilities is quite slipshod and scattered: only RasterOp is really implemented fully. There is a general compositing architecture, but it is hardly used at all, and in particular is not used for RasterOp. It is used for implementation of the general support for overprint and overprint mode.
The Ghostscript graphics library implements clipping by inserting a clipping device in the device pipeline. The clipping device modifies all drawing operations to confine them to the clipping region.
The library supports three different kinds of clipping:
Note that simply scan-converting a clipping path in the usual way does not produce a succession of rectangles that can simply be stored as the list for region-based clipping: in general, the rectangles do not satisfy the constraint for rectangle lists specified in base/gxcpath.h, since they may overlap in X, Y, or both. A non-trivial "clipping list accumulator" device is needed to produce a rectangle list that does satisfy the constraint.
See doc/Drivers.htm for extensive documentation on the interface between the core code and drivers.
The driver API includes high-level (path / image / text), mid-level (polygon), and low-level (rectangle / raster) operations. Most devices implement only the low-level operations, and let generic code break down the high-level operations. However, some devices produce high-level output, and therefore must implement the high-level operations.
There are a number of "devices" that serve internal purposes. Some of these are meant to be real rendering targets; others are intended for use in device pipelines. The rendering targets are:
The forwarding devices meant for use in pipelines are:
Because PostScript and PDF have the same graphics model, lexical syntax, and stack-based execution model, the drivers that produce PostScript and PDF output share a significant amount of support code. In the future, the PostScript output driver should be replaced with a slightly modified version of the PDF driver, since the latter is far more sophisticated (in particular, it has extensive facilities for image compression and for handling text and fonts).
The PDF code for handling text and fonts is complex and fragile. A major rewrite in June 2002 was intended to make it more robust and somewhat easier to understand, but also increased its size by about 40%, contrary to the expectation that it would shrink. Currently both sets of code are in the code base, with compatible APIs, selected by a line in base/devs.mak.
Currently, the CGM driver is raster-only. If anyone cares seriously about CGM in the future, this driver should be upgraded to a higher level.
The svgwrite device produces SVG only. The cairo device uses the cairo graphics library to produce PDF, EPS, SVG or PNG output, based on the requested filename extension.
The standard Ghostscript distribution includes a collection of drivers, mostly written by Aladdin Enterprises, that are "maintained" in the same sense as the Ghostscript core code.
This list is likely to be incomplete and inaccurate: see base/contrib.mak for the real one.
The PostScript interpreter is conceptually simple: in fact, an interpreter that could execute "3 4 add =" and print "7" was running 3 weeks after the first line of Ghostscript code was written. However, a number of considerations make the code large and complex.
The interpreter is designed to run in environments with very limited memory. The main consequence of this is that it cannot allocate its stacks (dictionary, execution, operand) as ordinary arrays, since the user-specified stack size limit may be very large. Instead, it allocates them as a linked list of blocks. See below for more details.
The interpreter must never cause a C runtime error that it cannot trap. Unfortunately, C implementations almost never provide the ability to trap stack overflow. In order to put a fixed bound on the C stack size, the interpreter never implements PostScript recursion by C recursion. This means that any C code that logically needs to call the interpreter must instead push a continuation (including all necessary state information) on the PostScript execution stack, followed by the PostScript object to be executed, and then return to the interpreter. (See psi/estack.h for more details about continuations.) Unfortunately, since PostScript Level 2 introduces streams whose data source can be a PostScript procedure, any code that reads or writes stream data must be prepared to suspend itself, storing all necessary state in a continuation. There are some places where this is extremely awkward, such as the scanner/parser.
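The continuation technique can be sketched with a toy interpreter loop. Everything below is hypothetical and far simpler than what psi/estack.h describes: an operation that logically recurses instead pushes a continuation and the nested operation on an explicit execution stack, then returns to the loop. Running out of execution stack becomes an ordinary error code rather than a C stack overflow.

```c
#include <assert.h>
#include <stddef.h>

#define ESTACK_MAX 64

typedef struct interp_s interp;
typedef int (*exec_proc)(interp *ip, long arg);

typedef struct exec_entry_s {
    exec_proc proc;
    long arg;       /* the state the procedure needs when it runs */
} exec_entry;

struct interp_s {
    exec_entry estack[ESTACK_MAX];  /* explicit execution stack */
    int edepth;
    long result;
};

int push_exec(interp *ip, exec_proc proc, long arg)
{
    assert(ip != NULL && proc != NULL);
    if (ip->edepth >= ESTACK_MAX)
        return -1;      /* a trappable error, not a C stack overflow */
    ip->estack[ip->edepth].proc = proc;
    ip->estack[ip->edepth].arg = arg;
    ip->edepth++;
    return 0;
}

/* Continuation: runs after the nested work has finished. */
int add_cont(interp *ip, long n)
{
    ip->result += n;
    return 0;
}

/* sum(n) = n + sum(n-1), written without C recursion: push the
   continuation, then the nested operation, then return to the loop. */
int sum_op(interp *ip, long n)
{
    if (n == 0)
        return 0;
    if (push_exec(ip, add_cont, n) < 0)
        return -1;
    return push_exec(ip, sum_op, n - 1);
}

/* The interpreter loop is the only caller of procedures,
   so the C stack depth stays constant. */
int interp_run(interp *ip, exec_proc start, long arg)
{
    ip->edepth = 0;
    ip->result = 0;
    if (push_exec(ip, start, arg) < 0)
        return -1;
    while (ip->edepth > 0) {
        exec_entry e = ip->estack[--ip->edepth];

        if (e.proc(ip, e.arg) < 0)
            return -1;
    }
    return 0;
}
```

The cost of this style is visible even here: the state that a recursive C function would keep in local variables has to be packaged explicitly, here as the single arg value.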
The use of continuations affects many places in the interpreter, and even some places in the graphics library. For example, when processing an image, one may need to call a PostScript procedure as part of mapping a CIE color to a device color. Ghostscript uses a variety of dodges to handle this: for example, in the case of CIE color mapping, all of the PostScript procedures are pre-sampled and the results cached. The Adobe implementation limits this kind of recursion to a fixed number of levels (5?): this would be another acceptable approach, but at this point it would require far more code restructuring than it would be worth.
A significant amount of the PostScript language implementation is in fact written in PostScript. Writing in PostScript leverages the C code for multi-threading, garbage collection, error handling, continuations for streams, etc., etc.; also, we have found PostScript in general more concise and easier to debug than C, mostly because of memory management issues. So given the choice, we tended to implement a feature in PostScript if it worked primarily with PostScript data structures, wasn't heavily used (example: font loading), or if it interacted with the stream or other callback machinery (examples: ReusableFileDecode streams, resourceforall). Often we would add non-standard PostScript operators for functions that had to run faster or that did more C-like things, such as the media matching algorithm for setpagedevice.
The main program of the interpreter is normally invoked from the command line, but it has an API as well. In fact, it has two APIs: one that recognizes the existence of multiple "interpreter instances" (although it currently provides a default instance, which almost all clients use), and a completely different one designed for Windows DLLs. These should be unified as soon as possible, since there are two steadily growing incompatible bodies of client code.
The main data structures visible to the PostScript programmers are arrays, contexts, dictionaries, names, and stacks.
Arrays have no unusual properties. See under Refs below for more information about how array elements are stored.
Contexts are used to hold the interpreter state even in configurations that don't include the Display PostScript multiple context extension. Context switching is implemented by a complex cooperation of C and PostScript code.
Dictionaries have two special properties worth noting:
Names are allocated in blocks. The characters and hash chains are stored separately from the lookup cache information, so that in the future, most of the former can be compiled into the executable and shared or put in ROM. (This is not actually done yet.)
As mentioned above, each stack is allocated as a linked list of blocks. However, for reasonable performance, operators must normally be able to access their operands and produce their results using indexing rather than an access procedure. This is implemented by ensuring that all the operands of an operator are in the topmost block of the stack, using guard entries that cause an internal error if the condition isn't met. See psi/iostack.h for more details.
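A minimal sketch of a block-structured stack follows. It is not the mechanism in psi/iostack.h: instead of guard entries it uses an explicit check, stack_require, that pulls elements up from the block below until the operands an operator needs are all in the topmost block. All names are hypothetical and the block size is kept tiny so that block boundaries are easy to hit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_CAP 4

typedef struct stack_block_s {
    struct stack_block_s *below;   /* next block toward the stack bottom */
    int count;                     /* elements used in this block */
    long elts[BLOCK_CAP];
} stack_block;

typedef struct block_stack_s {
    stack_block *top;
} block_stack;

int stack_push(block_stack *st, long v)
{
    if (st->top == NULL || st->top->count == BLOCK_CAP) {
        stack_block *b = malloc(sizeof(*b));

        if (b == NULL)
            return -1;
        b->below = st->top;
        b->count = 0;
        st->top = b;
    }
    st->top->elts[st->top->count++] = v;
    return 0;
}

/* Ensure the top n elements (n <= BLOCK_CAP) are all in the top block.
   Returns a pointer to the topmost element, so that operators can use
   plain indexing (p[0], p[-1], ...), or NULL on stack underflow. */
long *stack_require(block_stack *st, int n)
{
    stack_block *t = st->top;

    assert(n >= 1 && n <= BLOCK_CAP);
    while (t != NULL && t->count < n) {
        stack_block *b = t->below;
        int need = n - t->count, take;

        if (b == NULL)
            return NULL;
        take = need < b->count ? need : b->count;
        memmove(t->elts + take, t->elts, t->count * sizeof(long));
        memcpy(t->elts, b->elts + b->count - take, take * sizeof(long));
        t->count += take;
        b->count -= take;
        if (b->count == 0) {       /* unlink the emptied block */
            t->below = b->below;
            free(b);
        }
    }
    return t == NULL ? NULL : &t->elts[t->count - 1];
}

/* An operator that accesses its operands by indexing. */
int op_add(block_stack *st)
{
    long *p = stack_require(st, 2);

    if (p == NULL)
        return -1;
    p[-1] += p[0];
    st->top->count--;
    return 0;
}
```

The operator itself never sees the block structure; only stack_require does, which mirrors the goal of keeping the common case as fast as indexing into an ordinary array.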
PostScript parsing consists essentially of token scanning, and is simple in principle. The scanner is complex because it must be able to suspend its operation at any time (i.e., between any two input characters) to allow an interpreter callout, if its input is coming from a procedure-based stream and the procedure must be called to provide more input data.
The interpreter includes many non-standard operators. Most of these provide some part of the function of a standard operator, so that the standard operator itself can be implemented in PostScript: these are not of interest to users, and their function is usually obvious from the way they are used. However, some non-standard operators provide access to additional, non-standard facilities that users might want to know about, such as transparency, RasterOp, and in-memory rendering. These are documented at Language.htm#Additional_operators.
We don't document the complete set of non-standard operators here, because the set changes frequently. However, all non-standard operators are supposed to have names that begin with '.', so you can find them all by executing the following (Unix) command:
grep '{".[.]' psi/[zi]*.c
In addition to individual non-standard operators implemented in the same files as standard ones, there are several independent optional packages of non-standard operators. As with other non-standard operators, the names of all the operators in these packages begin with '.'. We list those packages here.
Memory management (refs, GC, save/restore) -- see below.
Ghostscript's PDF interpreter is written entirely in PostScript, because its data structures are the same as PostScript's, and it is much more convenient to manipulate PostScript-like data structures in PostScript than in C. There is definitely a performance cost, but apparently not a substantial one: we considered moving the main interpreter loop (read a token using slightly different syntax than PostScript, push it on the stack if literal, look it up in a special dictionary for execution if not) into C, but we did some profiling and discovered that this wasn't accounting for enough of the time to be worthwhile.
Until recently, there was essentially no C code specifically for the purpose of supporting PDF interpretation. The one major exception is the PDF 1.4 transparency features, which we (but not Adobe) have made available to PostScript code.
In addition to patching the run operator to detect PDF files, the interpreter provides some procedures in lib/pdf_main.ps that are meant to be called from applications such as previewers.
Extraction of layer information from Illustrator CS2/CS3 PDF files is implemented for a specific commercial customer; it is not used by any of the included drivers. To activate this feature, add cslayer.dev to the list of feature devices.
A PostScript Printer Description tells a generic PostScript printer driver how to generate PostScript for a particular printer. Ghostscript includes a PPD file for generating PostScript intended to be converted to PDF. A Windows INF file for installing the PPD on Windows 2000 and XP is included.
Ghostscript's makefiles embody a number of design decisions and assumptions that may not be obvious from a casual reading of the code.
makefile, which in turn simply references the real top-level makefile in the source subdirectory.
rm *) at any time with no bad effects. The source subdirectories are defined by macros named xxxSRCDIR.
BINDIR, and those that are not needed at run time, defined by xxxGENDIR and xxxOBJDIR. (The distinction between these is historical and probably no longer relevant.)
obj and bin directories are used for normal production builds, the debugobj directory for debugging builds, and the pgobj directory for profiling builds; other platforms may use different conventions. The Unix makefiles support targets named debug and pg for debugging and profiling builds respectively; other platforms generally do not.
abc.h #includes def.h and xyz.h, the definition must have the form
    abc_h=$(xxxSRCD)abc.h $(def_h) $(xyz_h)
where xxxSRCD is the macro defining the relevant source directory (including a trailing '/'). Note that the '.' in the file name has been replaced by an underscore. Note also that the definition must follow all definitions it references, since some make programs expand macros in definitions at the time of definition rather than at the time of use.
abc.c #includes def.h and lmn.h, the rule must have approximately the form
    $(xxxOBJD)abc.$(OBJ) : $(xxxSRCD)abc.c $(def_h) $(lmn_h)
            $(xxCC) $(xxO_)abc.$(OBJ) $(C_) $(xxxSRCD)abc.c
where xxxSRCD is as before; xxxOBJD is the relevant object directory; xxCC defines the name of the C compiler plus the relevant compilation switches; and xxO_ and C_ are macros used to bridge syntactic differences between different make programs.
The requirement to keep makefiles up to date by hand has been controversial. Two alternatives are generally proposed.
makedeps, which generate build rules automatically from the #include lists in C files. We have found such programs useless: they "wire in" specific, concrete directory names, not only for our own code but even for the system header files; they have to be run manually whenever code files are added, renamed, or deleted, or whenever the list of #includes in any file changes; and they cannot deal with different source files requiring different compilation switches.
We have seriously considered writing our own build program in Tcl or Python that would eliminate these problems, or perhaps porting the tools developed by Digital's Vesta research project (if we can get access to them); however, either of these approaches would create potential portability problems of its own, not to mention difficulties in integrating with others' build systems.
For more information about makefiles:
On top of the general conventions just described, Ghostscript's makefiles add a further layer of structure in order to support an open-ended set of fine-grained, flexible configuration options. Selecting an option (usually called a "module") for inclusion in the build may affect the build in many ways:
-replace in the makefiles and in base/genconf.c.
Each module is defined in the makefiles by rules that create a file named xxx.dev. The dependencies of the rule include all the files that make up the module (compiled code files, PostScript files, etc.); the body of the rule creates the .dev file by writing the description of the module into it. A program called genconf, described in the next section, merges all the relevant .dev files together. For examples of .dev rules, see any of the Ghostscript makefiles.
Ultimately, a person must specify the root set of modules to include in a build (which of course may require other modules, recursively). Ghostscript's makefiles do this with a set of macros called FEATURE_DEVS and DEVICE_DEVSn, defined in each top-level makefile, but nothing in the module machinery depends on this.
Ghostscript's build procedure is somewhat unusual in that it compiles and then executes some support programs during the build process. These programs then generate data or source code that is used later on in the build.
The most important and complex of the generator programs is genconf. genconf merges all the .dev files that make up the build, and creates three or more output files used in later stages:
gconfig.h, consisting mainly of macro calls, one call per "resource" making up the build, other than "resources" listed in the other output files.
gconfigf.h, produced only for PostScript builds with compiled-in fonts, consisting of one macro call per font.
COMPILE_INITS=1 feature (a compressed init fileset is more efficient than the current 'gsinit.c' produced by 'geninit.c'). This IODevice is more versatile since other files can be encapsulated such as fonts, helper PostScript files and Resources. The list of files is defined in part in psi/psromfs.mak.
genconf, but was never completed.
There are a number of programs, scripts, and configuration files that exist only for the sake of the build process.
Ghostscript comes with many utilities for doing things like viewing bitmap files and converting between file formats. Some of these are written in PostScript, some as scripts, and some as scripts that invoke special PostScript code.
These are all documented in doc/Psfiles.htm, q.v.
Many of these scripts come in both Unix and MS-DOS (.bat) versions; some also have an OS/2 (.cmd) version. The choice of which versions are provided is historical and irregular. These scripts should all be documented somewhere, but currently, many of them have man pages, a few have their own documentation in the doc directory, and some aren't documented at all.
In many environments, the memory manager is a set of library facilities that implicitly manage the entire address space in a homogenous manner. Ghostscript's memory manager architecture has none of these properties:
As noted above, allocators provide two different storage genera.
Objects:
Given a pointer to an object, the allocator that allocated it must be able to return the object's size and the pointer to its structure descriptor. (It is up to the client to know what allocator allocated an object.)
Strings:
The object/string distinction reflects a space/capability tradeoff. The per-object space overhead of the standard type of allocator is typically 12 bytes; this is too much to impose on every string of a few bytes. On the other hand, restricting object pointers to reference the start of the object currently makes object garbage collection and compaction more space-efficient. If we were to redesign the standard allocator, we would probably opt for a different design in which strings were allocated within container objects of a few hundred bytes, and pointers into the middle of all objects were allowed.
Every object allocated by a Ghostscript allocator has an associated structure descriptor, which the caller provides when allocating the object. The structure descriptor serves several purposes:
Structure descriptors are read-only, and are normally defined statically using one of the large set of gs_private_st_ or gs_public_st_ macros in base/gsstruct.h.
While the structure descriptor normally specifies the size of the object, one can also allocate an array of bytes or objects, whose size is a multiple of the size in the descriptor. For this reason, every object stores its size as well as a reference to its descriptor.
Because the standard Ghostscript garbage collector is non-conservative and can move objects, every object allocated by a Ghostscript allocator must have an accurate structure descriptor. If you define a new type of object (structure) that will be allocated in storage managed by Ghostscript, you must create an accurate descriptor for it, and use that descriptor to allocate it. The process of creating accurate descriptors for all structures was long and painful, and accounted for many hard-to-diagnose bugs.
By convention, the structure descriptor for structure type xxx_t is named st_xxx (this is preferred), or occasionally st_xxx_t.
Note that a structure descriptor is only required for objects allocated by the Ghostscript allocator. A structure type xxx_t does not require a structure descriptor if instances of that type are used only in the following ways:
xxx_t xxx1, xxx2;, or on the C heap, with malloc or through the Ghostscript "wrapper" defined in base/gsmalloc.h.
In general, structures without descriptors are problem-prone, and are deprecated; in new code, they should only be used if the structure is confined to a single .c file and its instances are only allocated on the C stack.
The allocator architecture is designed to support compacting garbage collection. Every object must be able to enumerate all the pointers it contains, both for tracing and for relocation. As noted just above, the structure descriptor provides procedures that do this.
Whether or not a particular allocator type actually provides a garbage collector is up to the allocator: garbage collection is invoked through a virtual procedure. In practice, however, there are only two useful garbage collectors for Ghostscript's own allocator:
As noted above, because the architecture supports compacting garbage collection, a "real" garbage collector cannot be run at arbitrary times, because it cannot reliably find and relocate pointers that are on the C stack. In general, it is only safe to run a "real" garbage collector when control is at the top level of the program, when there are no pointers to garbage collectable objects from the stack (other than designated roots).
As just noted, objects are normally movable by the garbage collector. However, some objects must be immovable, usually because some other piece of software must retain pointers to them. The allocator API includes procedures for allocating both movable (default) and immovable objects. Note, however, that even immovable objects must be traceable (have a structure descriptor), and may be freed, by the garbage collector.
When an allocator needs to add memory to the pool that it manages, it requests the memory from its parent allocator. Every allocator has a pointer to its parent; multiple allocators may share a single parent. The ultimate ancestor of all allocators that can expand their pool dynamically is an allocator that calls malloc, described below. However, especially in embedded environments, an allocator may be limited to a fixed-size pool assigned to it when it is created.
For details, see base/gsmemory.h.
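As a rough illustration of the parent chain, the following sketch models an allocator that either forwards requests to its parent or, at the root, calls malloc. The type demo_alloc_t, its fields, and demo_acquire are all hypothetical names chosen for this example; the real interface is the one in base/gsmemory.h. A nonzero limit stands in for the fixed-size pool of an embedded configuration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator record; not the real gs_memory_t. */
typedef struct demo_alloc_s demo_alloc_t;
struct demo_alloc_s {
    demo_alloc_t *parent;  /* NULL for the root, which calls malloc */
    size_t limit;          /* maximum bytes this allocator may hold; 0 = no limit */
    size_t used;           /* bytes obtained so far */
};

/* Obtain n bytes for allocator a, asking its parent (or malloc) for them. */
static void *demo_acquire(demo_alloc_t *a, size_t n)
{
    void *p;

    if (a->limit != 0 && n > a->limit - a->used)
        return NULL;                     /* fixed-size pool exhausted */
    p = (a->parent != NULL) ? demo_acquire(a->parent, n) : malloc(n);
    if (p != NULL)
        a->used += n;
    return p;
}
```

Several child allocators may point at the same parent; each keeps its own accounting while the shared parent sees the total.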
The allocator API also includes one special hook for the PostScript interpreter: the concept of stable allocators. See the section on save and restore below for details.
Ghostscript's memory management architecture provides three different ways to free objects: explicitly, by reference counting, or by garbage collection. They provide different safety / performance / convenience tradeoffs; we believe that all three are necessary.
Objects are always freed as a whole; strings may be freed piecemeal.
An object may have an associated finalization procedure, defined in the structure descriptor. This procedure is called just before the object is freed, independent of which method is being used to free the object. A few types of objects have a virtual finalization procedure as well: the finalization procedure defined in the descriptor simply calls the one in the object.
Objects and strings may be freed explicitly, using the gs_free_ virtual procedures in the allocator API. It is up to the client to ensure that all allocated objects are freed at most once, and that there are no dangling pointers.
Explicit freeing is the fastest method, but is the least convenient and least safe. It is most appropriate when storage is freed in the same procedure where it is allocated, or for storage that is known to be referenced by only one pointer.
Objects may be managed by reference counting. When an object is allocated, its reference count may be set to 0 or 1. Subsequently, when the reference count is decremented to 0, the object is freed.
The reference counting machinery provides its own virtual finalization procedure for all reference-counted objects. The machinery calls this procedure when it is about to free the object (but not when the object is freed in any other way, which is probably a design bug). This is in addition to (and called before) any finalization procedure associated with the object type.
Reference counting is as fast as explicit freeing, but takes more space in the object. It is most appropriate for relatively large objects which are referenced only from a small set of pointers. Note that reference counting cannot free objects that are involved in a pointer cycle (e.g., A -> B -> C -> A).
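The following minimal sketch shows the general shape of reference counting with a finalization hook. It is not Ghostscript's reference counting machinery: rc_obj_t, rc_new, rc_retain, and rc_release are hypothetical names, the count always starts at 1 here, and the objects come straight from malloc rather than from a Ghostscript allocator.

```c
#include <stdlib.h>

/* Hypothetical reference-counted header; real objects would embed this first. */
typedef struct rc_obj_s rc_obj_t;
struct rc_obj_s {
    long ref_count;
    void (*finalize)(rc_obj_t *obj);  /* called just before freeing; may be NULL */
};

static rc_obj_t *rc_new(size_t size, void (*finalize)(rc_obj_t *))
{
    rc_obj_t *obj = malloc(size < sizeof(rc_obj_t) ? sizeof(rc_obj_t) : size);

    if (obj != NULL) {
        obj->ref_count = 1;           /* the creator holds the first reference */
        obj->finalize = finalize;
    }
    return obj;
}

static void rc_retain(rc_obj_t *obj)
{
    ++obj->ref_count;
}

/* Drop one reference; returns 1 if the object was freed, 0 otherwise. */
static int rc_release(rc_obj_t *obj)
{
    if (--obj->ref_count > 0)
        return 0;
    if (obj->finalize != NULL)
        obj->finalize(obj);
    free(obj);
    return 1;
}

/* Example finalizer that only counts how often it runs. */
static int demo_finalize_calls = 0;

static void demo_count_finalize(rc_obj_t *obj)
{
    (void)obj;
    ++demo_finalize_calls;
}
```

In a cycle such as A -> B -> C -> A, every object keeps at least one reference from its predecessor, so rc_release never reaches zero for any of them.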
Objects and strings may be freed automatically by a garbage collector. See below.
As mentioned above, the ultimate ancestor of all allocators with an expandable pool is one that calls malloc.
Note that the default malloc/free allocator in gsmalloc.c now uses a mutex, so that allocators built on top of it can be assured of thread-safe behavior.
In a multi-threaded environment, if an allocator must be callable from multiple threads (for example, if it is used to allocate structures in one thread that are passed to, and freed by, another thread), the allocator must provide mutex protection. Ghostscript provides this capability in the form of a wrapper allocator, that simply forwards all calls to a target allocator under protection of a mutex. Using the wrapper technique, any allocator can be made thread-safe.
In an embedded environment, job failure due to memory exhaustion is very undesirable. Ghostscript provides a wrapper allocator that, when an allocation attempt fails, calls a client-provided procedure that can attempt to free memory, then ask for the original allocation to be retried. For example, such a procedure can wait for a queue to empty, or can free memory occupied by caches.
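The retrying behavior can be sketched as a small wrapper around any allocation procedure. The names below (retry_alloc_t, retry_alloc, the demo pool, and the cache-dropping callback) are hypothetical and are not the API of Ghostscript's retrying wrapper; the bounded attempt count is also an assumption made to keep the example from looping forever.

```c
#include <stddef.h>
#include <stdlib.h>

typedef void *(*alloc_proc_t)(void *target, size_t n);
/* Client procedure: try to free memory; return nonzero to ask for a retry. */
typedef int (*recover_proc_t)(void *client_data, size_t n, int attempt);

typedef struct retry_alloc_s {
    alloc_proc_t alloc;       /* the wrapped (target) allocator */
    void *target;
    recover_proc_t recover;   /* client-provided recovery procedure */
    void *client_data;
    int max_attempts;         /* upper bound on recovery attempts */
} retry_alloc_t;

static void *retry_alloc(retry_alloc_t *r, size_t n)
{
    int attempt;

    for (attempt = 0; ; ++attempt) {
        void *p = r->alloc(r->target, n);

        if (p != NULL)
            return p;
        if (attempt >= r->max_attempts)
            return NULL;
        if (!r->recover(r->client_data, n, attempt))
            return NULL;      /* client gave up: report the failure */
    }
}

/* Demo target: a pool with a byte budget, part of it occupied by a cache. */
typedef struct demo_pool_s {
    size_t capacity, used, cache_bytes;
} demo_pool_t;

static void *demo_pool_alloc(void *target, size_t n)
{
    demo_pool_t *pool = target;

    if (n > pool->capacity - pool->used)
        return NULL;
    pool->used += n;
    return malloc(n);
}

/* Demo recovery: drop the cache once, then give up. */
static int demo_drop_cache(void *client_data, size_t n, int attempt)
{
    demo_pool_t *pool = client_data;

    (void)n;
    (void)attempt;
    if (pool->cache_bytes == 0)
        return 0;
    pool->used -= pool->cache_bytes;
    pool->cache_bytes = 0;
    return 1;
}
```

A recovery procedure that waits for a queue to drain would have the same shape: block until space is likely to be available, then return nonzero.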
When multiple threads are used and there may be frequent memory allocator requests, mutex contention is a problem and can cause severe performance degradation. The chunk memory wrapper can provide each thread with its own instance of an allocator that only makes requests on the underlying (non-GC) allocator when large blocks are needed. Small object allocations are managed within chunks.
This allocator is intended to be used on top of the basic 'gsmalloc' allocator (malloc/free) which is NOT garbage collected or relocated and which MUST be mutex protected.
The standard Ghostscript allocator gets storage from its parent (normally the malloc allocator) in large blocks called chunks, and then allocates objects up from the low end and strings down from the high end. Large objects or strings are allocated in their own chunk.
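The two-ended layout of a chunk can be sketched as follows. This is a simplified model with hypothetical names (demo_chunk_t, lo, hi, DEMO_ALIGN); it omits object headers, string marking data, and everything else the real chunk structure carries. Objects advance a low pointer upward, strings move a high pointer downward, and the chunk is full when the two would cross.

```c
#include <stddef.h>

enum { DEMO_ALIGN = 8 };  /* assumed object alignment for this sketch */

typedef struct demo_chunk_s {
    unsigned char *lo;  /* next free byte for objects (grows upward) */
    unsigned char *hi;  /* lowest byte in use by strings (grows downward) */
} demo_chunk_t;

static void demo_chunk_init(demo_chunk_t *c, unsigned char *base, size_t size)
{
    c->lo = base;
    c->hi = base + size;
}

/* Allocate an object from the low end, rounding its size up for alignment. */
static void *demo_alloc_object(demo_chunk_t *c, size_t n)
{
    size_t rounded = (n + DEMO_ALIGN - 1) & ~(size_t)(DEMO_ALIGN - 1);
    void *p;

    if (rounded < n || rounded > (size_t)(c->hi - c->lo))
        return NULL;
    p = c->lo;
    c->lo += rounded;
    return p;
}

/* Allocate a string from the high end; strings need no alignment. */
static unsigned char *demo_alloc_string(demo_chunk_t *c, size_t n)
{
    if (n > (size_t)(c->hi - c->lo))
        return NULL;
    c->hi -= n;
    return c->hi;
}
```

Keeping strings at the opposite end means they need no per-string header or alignment padding between them, which fits the space argument made for the object/string distinction.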
The standard allocator maintains a set of free-block lists for small object sizes, one list per size (rounded up to the word size), plus a free-block list for large objects (but not for objects so large that they get their own chunk: when such an object is freed, its chunk is returned to the parent). The lists are not sorted; adjacent blocks are only merged if needed.
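A minimal sketch of per-size free lists follows, assuming a hypothetical small-object limit of 64 bytes and hypothetical names throughout (demo_freelists_t, demo_free_small, demo_alloc_small). It models only the unsorted small-size lists; the large-object list and block merging are left out.

```c
#include <stddef.h>

enum {
    DEMO_WORD = sizeof(void *),                      /* rounding unit */
    DEMO_MAX_SMALL = 64,                             /* assumed small-object limit */
    DEMO_NUM_LISTS = DEMO_MAX_SMALL / DEMO_WORD + 1  /* one list per rounded size */
};

/* A freed block is reused to hold the link to the next free block. */
typedef struct demo_free_s {
    struct demo_free_s *next;
} demo_free_t;

typedef struct demo_freelists_s {
    demo_free_t *small[DEMO_NUM_LISTS];
} demo_freelists_t;

/* List index for a request of n bytes: the size in words, rounded up. */
static size_t demo_list_index(size_t n)
{
    return (n + DEMO_WORD - 1) / DEMO_WORD;
}

/* Put a block of n bytes on its list; returns 0 if it is not a small block. */
static int demo_free_small(demo_freelists_t *fl, void *block, size_t n)
{
    demo_free_t *f = block;
    size_t idx;

    if (n < sizeof(demo_free_t) || n > DEMO_MAX_SMALL)
        return 0;
    idx = demo_list_index(n);
    f->next = fl->small[idx];   /* push on the front: lists are not sorted */
    fl->small[idx] = f;
    return 1;
}

/* Take a block from the list for size n, or return NULL if that list is empty. */
static void *demo_alloc_small(demo_freelists_t *fl, size_t n)
{
    size_t idx;
    demo_free_t *f;

    if (n == 0 || n > DEMO_MAX_SMALL)
        return NULL;
    idx = demo_list_index(n);
    f = fl->small[idx];
    if (f != NULL)
        fl->small[idx] = f->next;
    return f;
}
```

When a list is empty, a real allocator would fall back to carving a fresh block out of a chunk; the sketch simply returns NULL at that point.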
While the standard allocator implements the generic allocator API, and is usable with the library alone, it includes a special hook for the PostScript interpreter to aid in the efficient allocation of PostScript composite objects (arrays and dictionaries). See the section on Refs below for details.
The PostScript interpreter uses an allocator that extends the graphic library's standard allocator to handle PostScript objects, save and restore, and real garbage collection.
Ghostscript represents what the PLRM calls PostScript "objects" using a structure called a ref, defined in psi/iref.h; packed refs, used for the elements of packed arrays, are defined in psi/ipacked.h. See those files for detailed information.
The PLRM calls for two types of "virtual memory" (VM) space: global and local. Ghostscript adds a third space, system VM, whose lifetime is an entire session -- i.e., it is effectively "permanent". All three spaces are subject to garbage collection. There is a separate allocator instance for each VM space (actually, two instances each for global and local spaces; see below). In a system with multiple contexts and multiple global or local VMs, each global or local VM has its own allocator instance(s).
Refs that represent PostScript composite objects, and therefore include pointers to stored data, include a 2-bit VM space tag to indicate in which VM the object data are stored. In addition to system, global, and local VM, there is a tag for "foreign" VM, which means that the memory is not managed by a Ghostscript allocator at all. Every store into a composite object must check for invalidaccess: the VM space tag values are chosen to help make this check efficient. See psi/ivmspace.h, psi/iref.h, and psi/store.h for details.
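One way tag values can make the store check cheap is to order them so that a single integer comparison decides it. The sketch below assumes such an ordering; the numeric values, the bit position of the tag, and all names are assumptions for illustration only, and the authoritative definitions are in psi/ivmspace.h and psi/store.h.

```c
/* Assumed ordering: a larger value means a shorter-lived space. */
typedef enum {
    DEMO_VM_FOREIGN = 0,
    DEMO_VM_SYSTEM = 1,
    DEMO_VM_GLOBAL = 2,
    DEMO_VM_LOCAL = 3
} demo_vm_space_t;

/* Assumed position of the 2-bit tag inside a ref's attribute word. */
enum { DEMO_SPACE_SHIFT = 2, DEMO_SPACE_MASK = 3 };

static demo_vm_space_t demo_ref_space(unsigned attrs)
{
    return (demo_vm_space_t)((attrs >> DEMO_SPACE_SHIFT) & DEMO_SPACE_MASK);
}

/* Nonzero if storing a value living in `value` space into a composite
 * object living in `container` space should raise invalidaccess. */
static int demo_store_is_invalid(demo_vm_space_t container, demo_vm_space_t value)
{
    return value > container;
}
```

Under this assumed ordering, storing a local object into a global container fails the check, while storing a global object into a local container passes.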
PostScript composite objects (arrays and dictionaries) are usually small. Using a separate memory manager object for each composite object would waste a lot of space for object headers. Therefore, the interpreter's memory manager packs multiple composite objects (also called "ref-containing objects") into a single memory manager object, similar to the way the memory manager packs multiple objects into a chunk (see above). See base/gxalloc.h for details. This memory manager object has a structure descriptor, like all other memory manager objects.
Note that the value.pdict, value.refs, or value.packed member of a ref must point to a PostScript composite object, and therefore can point into the middle of a memory manager object. This requires special handling by the garbage collector (q.v.).
In addition to save and restore, Ghostscript provides a .forgetsave operator that makes things as though a given save had never happened. (In data base terminology, save is "begin transaction", restore is "abort transaction", and .forgetsave is "end/commit transaction".) .forgetsave was implemented for a specific commercial customer (who may no longer even be using it): it was a pain to make work, but it's in the code now, and should be maintained. See the extensive comments in psi/isave.c for more information about how these operations work.
Even though save and restore are concepts from the PostScript interpreter, the generic allocator architecture and API include a feature to support them, called stable allocators. Every allocator has an associated stable allocator, which tags pointers with the same VM space number but which is not subject to save and restore. System VM is intrinsically stable (its associated stable allocator is the same allocator), so there are only 5 allocators in ordinary single-context usage: system VM, stable global VM, ordinary global VM, stable local VM, ordinary local VM.
The reason that we cannot simply allocate all stable objects in system VM is that their refs must still be tagged with the correct VM space number, so that the check against storing pointers from global VM to local VM can be enforced properly.
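The relationship among the five allocators can be pictured with a small data-structure sketch. The types and field names here (demo_mem_t, follows_save, stable, demo_vm_t) are hypothetical and only model the properties described above: each allocator points at its stable counterpart, the counterpart carries the same VM space number, and system VM is its own stable allocator.

```c
#include <stddef.h>

typedef enum { DEMO_SPACE_SYSTEM, DEMO_SPACE_GLOBAL, DEMO_SPACE_LOCAL } demo_space_t;

typedef struct demo_mem_s demo_mem_t;
struct demo_mem_s {
    demo_space_t space;   /* VM space number used to tag pointers */
    int follows_save;     /* nonzero if save and restore affect this allocator */
    demo_mem_t *stable;   /* associated stable allocator (possibly itself) */
};

/* The five allocators of ordinary single-context usage. */
typedef struct demo_vm_s {
    demo_mem_t system;
    demo_mem_t global_stable, global;
    demo_mem_t local_stable, local;
} demo_vm_t;

static void demo_mem_init(demo_mem_t *m, demo_space_t space,
                          int follows_save, demo_mem_t *stable)
{
    m->space = space;
    m->follows_save = follows_save;
    m->stable = stable;
}

static void demo_vm_init(demo_vm_t *vm)
{
    /* Stable allocators, including system VM, are their own stable allocator. */
    demo_mem_init(&vm->system, DEMO_SPACE_SYSTEM, 0, &vm->system);
    demo_mem_init(&vm->global_stable, DEMO_SPACE_GLOBAL, 0, &vm->global_stable);
    demo_mem_init(&vm->local_stable, DEMO_SPACE_LOCAL, 0, &vm->local_stable);
    /* Ordinary allocators share the space number of their stable partner. */
    demo_mem_init(&vm->global, DEMO_SPACE_GLOBAL, 1, &vm->global_stable);
    demo_mem_init(&vm->local, DEMO_SPACE_LOCAL, 1, &vm->local_stable);
}
```

Because the stable partner keeps the same space number, a store check based on VM space tags gives the same answer whether the data came from the ordinary or the stable allocator.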
All PostScript objects are normally allocated with the non-stable allocators. The stable allocators should be used with care, since using them can easily create dangling pointers: if storage allocated with a stable allocator contains any references to PostScript objects, the client is responsible for ensuring that the references don't outlive the referenced objects, normally by ensuring that any such referenced objects are allocated at the outermost save level.
The original reason for wanting stable allocators was the PostScript stacks, which are essentially PostScript arrays but are not subject to save and restore. Some other uses of stable allocators are:
- gstate_path_memory in base/gsstate.c.
- gs_image_row_memory in base/gsimage.c, because the data-reading procedure for an image can invoke save and restore.
For more specific examples, search the sources for references to gs_memory_stable.
The interpreter's garbage collector is a compacting, non-conservative, mark-and-sweep collector.
Because the garbage collector is non-conservative, it cannot be run if there are any pointers to movable storage from the C stack. Thus it cannot be run automatically when the allocator is unable to allocate requested space. Instead, when the allocator has allocated a given amount of storage (the vm_threshold amount, corresponding to the PostScript VMThreshold parameter), it sets a flag that the interpreter checks in the main loop. When the interpreter sees that this flag is set, it calls the garbage collector: at that point, there are no problematic pointers from the stack.
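The deferred-collection scheme can be sketched in a few lines. The structure and function names below are hypothetical, and the "collector" is reduced to a counter; the point is only the division of labor between the allocator, which merely raises a flag, and the interpreter's main loop, which acts on it at a safe point.

```c
#include <stddef.h>

typedef struct demo_gc_state_s {
    size_t allocated_since_gc;  /* bytes allocated since the last collection */
    size_t vm_threshold;        /* request a collection after this many bytes */
    int gc_requested;           /* flag set by the allocator */
    int gc_runs;                /* stands in for actually collecting */
} demo_gc_state_t;

/* Called by the allocator on every allocation: never collects directly,
 * because the C stack may hold pointers to movable objects here. */
static void demo_note_alloc(demo_gc_state_t *s, size_t n)
{
    s->allocated_since_gc += n;
    if (s->allocated_since_gc >= s->vm_threshold)
        s->gc_requested = 1;
}

/* Called from the top of the interpreter loop, where it is safe to collect. */
static void demo_interp_tick(demo_gc_state_t *s)
{
    if (s->gc_requested) {
        s->gc_runs++;
        s->allocated_since_gc = 0;
        s->gc_requested = 0;
    }
}
```

A consequence of this design is that an allocation which pushes usage past the threshold still succeeds or fails on its own; the collection it triggers only benefits later allocations.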
Roots for tracing must be registered with the allocator. Most roots are registered during initialization.
"Mark-and-sweep" is a bit of a misnomer. The garbage collector actually has 5 main phases:
There is some extra complexity to handle collecting local VM only. In this case, all pointers in global VM are treated as roots, and global VM is not compacted.
As noted above, PostScript arrays and strings can have refs that point within them (because of getinterval). Thus the garbage collector must mark each element of an array, and even each byte of a string, individually. Specifically, it marks objects, refs, and strings using 3 different mechanisms:
Similarly, it records the relocation information for objects, refs, and strings differently:
value field. Every memory manager object that stores ref-containing objects as described above has an extra, unused ref at the end for this purpose.
One of Ghostscript's most important features is its great portability across platforms (CPUs, operating systems, compilers, and build tools). The code supports portability through two mechanisms:
Ghostscript attempts to discover characteristics of the CPU and compiler automatically during the build process, by compiling and then executing a program called genarch. genarch generates a file obj/arch.h, which almost all Ghostscript files then include. This works well for things like word size, byte order, and floating point representation, but it can't determine whether or not a compiler supports a particular feature, because if a feature is absent, the compilation may fail.
Despite the supposed standardization of ANSI C, platforms vary considerably in where (and whether) they provide various standard library facilities. Currently, Ghostscript's build process doesn't attempt to sort this out automatically. Instead, for each library header file <xxx.h> there is a corresponding Ghostscript source file base/xxx_.h, containing a set of compile-time conditionals that attempt to select the correct platform header file, or in some cases substitute Ghostscript's own code for a missing facility. You may need to edit these files when moving to platforms with unusually non-standard libraries.
It has been suggested that the GNU configure scripts do the above better, for Unix systems, than Ghostscript's current methods. While this may be true, we have found configure scripts difficult to write, understand, and maintain; and the autoconf tool for generating configure scripts, which we found easy to use, doesn't cover much of the ground that Ghostscript requires.
For a few library facilities that are available on all platforms but are not well standardized, or that may need to be changed for special environments, Ghostscript defines its own APIs. It is an architectural property of Ghostscript that the implementations of these APIs are the only .c files for which the choice of platform (as opposed to choices of drivers or optional features) determines whether they are compiled and linked into an executable.
For information on the structure and conventions used within makefiles, see the Makefile structure section above.
Ghostscript's makefiles are structured very similarly to the cross-platform library files. The great majority of the makefiles are portable across all platforms and all versions of make. To achieve this, the platform-independent makefiles must obey two constraints beyond those of the POSIX make program:
- No conditionals or includes are allowed. While most make programs now provide some form of conditional execution and some form of inclusion, there is no agreement on the syntax. (Conditionals and includes are allowed in platform-dependent makefiles; in fact, an inclusion facility is required.)
- MMS and MMK programs.
The top-level makefile for each platform (where "platform" includes the OS, the compiler, and the flavor of make) contains all the build options, plus includes for the generic makefiles and any platform-dependent makefiles that are shared among multiple platforms.
While most of the top-level makefiles build a PostScript and/or PDF interpreter configuration, there are also a few makefiles that build a test program that only uses the graphics library without any language interpreter. Among other things, this can be helpful in verifying that no accidental dependencies on the interpreter have crept into the library or drivers.
For families of similar platforms, the question arises whether to use multiple top-level makefiles, or whether to use a single top-level makefile that may require minor editing for some (or all) platforms. Ghostscript currently uses the following top-level makefiles for building interpreter configurations:
The following top-level makefiles build the library test program:
The MSVC makefiles may require editing to select between different versions of MSVC, since different versions may have slightly incompatible command line switches or customary installation path names. The Unix makefiles often require editing to deal with differing library path names and/or library names. For details, see the Unix section of the documentation for building Ghostscript.
Coding for portability requires avoiding both explicit dependencies, such as platform-dependent #ifdefs, and implicit dependencies, such as dependencies on byte order or the size of the integral types.
The platform-independent .c files never, ever, use #ifdef or #if to select code for specific platforms. Instead, we always try to characterize some abstract property that is being tested. For example, rather than checking for macros that are defined on those specific platforms that have 64-bit long values, we define a macro ARCH_SIZEOF_LONG that can then be tested. Such macros are always defined in a .h file, either automatically in arch.h, or explicitly in a xxx_.h file, as described in earlier sections.
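The difference between testing a platform and testing a property can be sketched as follows. In Ghostscript, ARCH_SIZEOF_LONG comes from the generated arch.h; since that file is not available here, the sketch derives a stand-in value from <limits.h>, which is an assumption of this example and not how the build actually produces it. The function demo_long_bits is likewise hypothetical.

```c
#include <limits.h>

/* Stand-in for the definition normally supplied by the generated arch.h.
 * Assumes long is either 32 or 64 bits wide. */
#ifndef ARCH_SIZEOF_LONG
#  if ULONG_MAX == 0xffffffffUL
#    define ARCH_SIZEOF_LONG 4
#  else
#    define ARCH_SIZEOF_LONG 8
#  endif
#endif

/* Code tests the abstract property, never a platform-identifying macro
 * such as one naming a particular OS or CPU. */
static int demo_long_bits(void)
{
#if ARCH_SIZEOF_LONG == 8
    return 64;
#else
    return 32;
#endif
}
```

A new platform then needs no edits in code like demo_long_bits: only the header that defines the property has to be right.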
The most common source of byte ordering dependencies is casting between types (T1 *) and (T2 *) where T1 and T2 are numeric types that aren't merely signed/unsigned variants of each other. To avoid this, the only casts allowed in the code are between numeric types, from a pointer type to a long integral type, and between pointer types.
Ghostscript's code assumes the following about the sizes of various types:
The code does not assume that the char type is signed (or unsigned); except for places where the value is always a literal string, or for interfacing to library procedures, the code uses byte (a Ghostscript synonym for unsigned char) almost everywhere.
Pointers are signed on some platforms and unsigned on others. In the few places in the memory manager where it's necessary to reliably order-compare (as opposed to equality-compare) pointers that aren't known to point to the same allocated block of memory, the code uses the PTR_ relation macros rather than direct comparisons.
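As an illustration of the idea only, the macros below order-compare pointers by first converting them to an unsigned integer type. The DEMO_PTR_ names are hypothetical and this is not necessarily how Ghostscript's PTR_ macros are defined; the sketch also assumes that uintptr_t is available and that the platform has a flat address space, so that integer order matches address order.

```c
#include <stdint.h>

/* Compare addresses as unsigned integers, so the result does not depend
 * on whether the platform treats pointers as signed or unsigned. */
#define DEMO_PTR_LT(p1, p2) \
    ((uintptr_t)(const void *)(p1) < (uintptr_t)(const void *)(p2))
#define DEMO_PTR_GE(p1, p2) (!DEMO_PTR_LT(p1, p2))

/* True if p lies in the half-open range [lo, hi). */
#define DEMO_PTR_BETWEEN(p, lo, hi) \
    (DEMO_PTR_GE(p, lo) && DEMO_PTR_LT(p, hi))
```

A memory manager might use a range test like DEMO_PTR_BETWEEN to decide whether an arbitrary pointer falls inside a given chunk.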
See the files listed above for other situations where a macro provides platform-independence or a workaround for bugs in specific compilers or libraries (of which there are a distressing number).
There are some features that are inherently platform-specific:
The new DLL interface (new as of 7.0) is especially useful with the new display device, so it is included here. Both are due to Russell Lang.
The Ghostscript code has many tracing and debugging features that can be enabled at run time using the -Z command line switch, if the executable was compiled with DEBUG defined. One particularly useful combination is -Z@\?, which fills free memory blocks with a pattern and also turns on run-time memory consistency checking. For more information, see doc/Use.htm#Debugging; you can also search for occurrences of if_debug or gs_debug_c in the source code. Note that many of these features are in the graphics library and do not require a PostScript interpreter.
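The general pattern of character-keyed debug flags can be sketched as below. The demo_ names are hypothetical stand-ins and do not reproduce the actual definitions of if_debug or gs_debug_c; in particular, the single-argument tracing macro and the flag table layout are assumptions made for this example.

```c
#include <stdio.h>

/* One flag per 7-bit character, set from the letters given to a -Z style switch. */
static char demo_debug[128];

#define demo_debug_c(c) (demo_debug[(unsigned char)(c) & 127] != 0)

static void demo_set_debug_flags(const char *letters)
{
    for (; *letters != '\0'; ++letters)
        demo_debug[(unsigned char)*letters & 127] = 1;
}

/* Tracing compiles to nothing unless DEBUG is defined at build time. */
#ifdef DEBUG
#  define demo_if_debug1(c, fmt, a1) \
      do { if (demo_debug_c(c)) fprintf(stderr, fmt, a1); } while (0)
#else
#  define demo_if_debug1(c, fmt, a1) do { } while (0)
#endif
```

With this arrangement a release build pays nothing for tracing calls, while a DEBUG build pays one table lookup per call when the flag is off.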
The code also contains many run-time procedures whose only purpose is to be called from the debugger to print out various data structures, including all the procedures in psi/idebug.c (for the PostScript interpreter) and the debug_dump_ procedures in base/gsmisc.c.
The Microsoft profiling tool is included only in the Enterprise Edition of Microsoft Developer Studio 6. The Standard and Professional Editions do not include it.
The Microsoft profiler requires the application to be linked with a special linker option. To provide it, you need the following change to gs/base/msvccmd.mak:
*** SVN-GS\HEAD\gs\src\msvccmd.mak Tue Jan 9 21:41:07 2007
--- gs\src\msvccmd.mak Mon May 7 11:29:35 2007
***************
*** 159,163 ****
  # Note that it must be followed by a space.
  CT=/Od /Fd$(GLOBJDIR) $(NULL) $(CDCC) $(CPCH)
! LCT=/DEBUG /INCREMENTAL:YES
  COMPILE_FULL_OPTIMIZED= # no optimization when debugging
  COMPILE_WITH_FRAMES= # no optimization when debugging
--- 159,164 ----
  # Note that it must be followed by a space.
  CT=/Od /Fd$(GLOBJDIR) $(NULL) $(CDCC) $(CPCH)
! # LCT=/DEBUG /INCREMENTAL:YES
! LCT=/DEBUG /PROFILE
  COMPILE_FULL_OPTIMIZED= # no optimization when debugging
  COMPILE_WITH_FRAMES= # no optimization when debugging
***************
*** 167,175 ****
  !if $(DEBUGSYM)==0
  CT=
! LCT=
  CMT=/MT
  !else
  CT=/Zi /Fd$(GLOBJDIR) $(NULL)
! LCT=/DEBUG
  CMT=/MTd
  !endif
--- 168,178 ----
  !if $(DEBUGSYM)==0
  CT=
! # LCT=
! LCT=/PROFILE
  CMT=/MT
  !else
  CT=/Zi /Fd$(GLOBJDIR) $(NULL)
! # LCT=/DEBUG
! LCT=/DEBUG /PROFILE
  CMT=/MTd
  !endif
Note that either a debug or a release build may be profiled.
The Microsoft profiler cannot profile dynamically loaded DLLs. When building Ghostscript with makefiles, you need to specify MAKEDLL=0 on the nmake command line.
The Integrated Development Environment of Microsoft Developer Studio 6 cannot profile a makefile-based project. Therefore the profiling tool must be started from the command line.
Profiling from the command line is a 4-step procedure. The following batch file provides a sample of it:
set DEVSTUDIO=G:\Program Files\Microsoft Visual Studio
set GS_HOME=..\..\gs-hdp
set GS_COMMAND_LINE=%GS_HOME%\bin\gswin32c.exe -I%GS_HOME%\lib;f:\afpl\fonts -r144 -dBATCH -dNOPAUSE -d/DEBUG attachment.pdf
set START_FUNCTION=_main
set Path=%DEVSTUDIO%\Common\MSDev98\Bin;%DEVSTUDIO%\VC98\Bin
PREP.EXE /OM /SF %START_FUNCTION% /FT %GS_HOME%\bin\gswin32c.exe
If ERRORLEVEL 1 echo step 1 fails&exit
PROFILE /I %GS_HOME%\bin\gswin32c.pbi %GS_COMMAND_LINE%
If ERRORLEVEL 1 echo step 2 fails&exit
PREP /M %GS_HOME%\bin\gswin32c /OT xxx.pbt
If ERRORLEVEL 1 echo step 3 fails&exit
PLIST /ST xxx.pbt >profile.txt
If ERRORLEVEL 1 echo step 4 fails&exit
This batch file must be adapted to your configuration:
Copyright © 2001-2009 Artifex Software, Inc. All rights reserved.
This software is provided AS-IS with no warranty, either express or implied. This software is distributed under license and may not be copied, modified or distributed except as expressly authorized under the terms of that license. Refer to licensing information at http://www.artifex.com/ or contact Artifex Software, Inc., 7 Mt. Lassen Drive - Suite A-134, San Rafael, CA 94903, U.S.A., +1(415)492-9861, for further information.
Ghostscript version 8.71, 10 February 2010