Shared Libraries and naming conventions

Dale Scheetz dwarf@polaris.net
Sat, 29 May 1999 15:23:27 -0400 (EDT)


I must first apologize. I promised to get this posting out at least a
month ago. My only excuse is the avalanche of work that has fallen upon me
in the last several weeks. I about have it all shoveled off, so I'm taking
this time to make good on my promise ;-)

This is a relatively long posting, so I ask your patience, and please read
on...

What follows is in three parts. The first, and possibly longest, part is
mostly an introduction and overview of the problem being addressed. The
actual naming convention is then proposed, followed by the section
containing library names. It is not clear how much of the first part will
actually make it into the specifications, although I would like to take
this time to encourage us to provide "justification" for the positions we
take, and a "philosophy" or "principle" that expresses the context and
usefulness of the areas we are going to standardize.

Everything I am presenting here should be considered an experimental
starting point. The list of libraries is a simple extraction of the
libraries on my system, and is only a starting point for this discussion,
and not even an expression of the scope or limits of this set of
requirements.

INTRODUCTION:

One of the major requirements of any third-party software is that it find
the shared libraries that it needs in the places it expects, with a
stable, defined functionality. This discussion is going to leave the
functionality specifications of the libraries until after the list of
libraries is well defined. This section will only address how libraries
are named and which libraries are to be a part of the standard requiring a
functional specification. The functional specifications of these libraries
will be found in a separate section of the spec. (or possibly added in
this section?)

<aside>
As the Debian maintainer of the gmp library packages (this is the GNU
Multi-Precision math library), I have been getting several other Debian
developers to "educate" me on the aspects of shared library naming using
this package as the "example" reference for the discussion. While it isn't
clear that a specific example will be needed in the spec, it was easier
for the discussion to be about a specific library. This introduction will
make use of that example.

The gmp library provides an example of several of the needs of shared
libraries. There are two incompatible releases of the library: the 1.x
series, and the 2.x series. There are users who wish to do development on
both the old and the new versions, so a development environment is
necessary that results in the correct runtime library for each of the two
development headers. This is the principal difficulty with shared
libraries, and one for which a convention on names could go a long way
towards providing a solution.

The run-time side of things is actually pretty simple. The principal
requirement is that the so link properly indicate the major release number
so that the linker can obtain the correct library. For libraries like gmp,
the solution is simply to include the major number in the so link name, so
the two links for gmp are:

     libgmp.so.2 -> libgmp.so.2.0.2
and
     libgmp.so.1 -> libgmp.so.1.3.2
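
As a rough illustration of how the run-time link name follows from the
library file name, here is a small Python sketch. The function name
runtime_link_name and the regular expression are assumptions made for this
example only. It manipulates names and nothing else; it does not read the
soname recorded inside the library itself.

```python
import re

# Hypothetical pattern for this sketch: lib<name>.so.X.Y.Z
_SHARED_LIB = re.compile(r"^(lib.+\.so)\.(\d+)\.(\d+)\.(\d+)$")


def runtime_link_name(library_file: str) -> str:
    """Return the run-time link name (lib<name>.so.X) for a library file."""
    match = _SHARED_LIB.match(library_file)
    if match is None:
        raise ValueError(f"not of the form lib<name>.so.X.Y.Z: {library_file!r}")
    stem, major = match.group(1), match.group(2)
    # Only the major release number is kept in the run-time link.
    return f"{stem}.{major}"


for real_file in ("libgmp.so.2.0.2", "libgmp.so.1.3.2"):
    print(f"{runtime_link_name(real_file)} -> {real_file}")
```

Because only the major number survives in the link name, a point release
of the same series can replace the library file without touching the
programs linked against it.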

The development links become a bit more complex. The temptation is to
place both development sets in their own subdirectories, providing:

     /usr/include/gmp1
and
     /usr/include/gmp2

and the development symlinks:

     libgmp1.so
and
     libgmp2.so

This causes problems for packages which expect to build against gmp.
Having to edit the makefile to reflect the correct include and link names
is not a clean solution.

Thus, for gmp-2.0.2, the includes are provided in /usr/include, and the
development link is libgmp.so. Only gmp1 need be placed in the special
locations above.

When gmp is released in the 3.x series, the gmp2 package need only be
released with the development link changed and the include files buried
away in their own location before the new gmp3 package is released. The
new package then only need conflict (in the development package) with the
older versions of the previous library, and correct development becomes
possible for each of the three versions of the library. The resulting
packages can depend upon the correct version of the runtime, and all three
libraries can be in use at the same time on the same system.
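
To make the transition concrete, here is a minimal Python sketch of the
layout rule described above: the newest series keeps the default
locations, and every older series is moved aside under a versioned name.
The function development_layout and its parameters are inventions for
illustration, not part of any existing packaging tool.

```python
def development_layout(name, major, newest_major):
    """Return (include_dir, development_link) for one major version."""
    if major == newest_major:
        # The newest series gets the unversioned, default locations.
        return "/usr/include", f"lib{name}.so"
    # Older series are moved aside under versioned names.
    return f"/usr/include/{name}{major}", f"lib{name}{major}.so"


# Before gmp 3.x exists, 2 is the newest major version.
print(development_layout("gmp", 2, newest_major=2))
print(development_layout("gmp", 1, newest_major=2))

# Once gmp 3.x is released, gmp2 moves to the versioned locations.
print(development_layout("gmp", 2, newest_major=3))
print(development_layout("gmp", 3, newest_major=3))
```

The run-time links are deliberately left out of this sketch, since they
do not change when a new major version appears.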

There are libraries, like tk/tcl, for which even the point releases are
incompatible. For these upstream packages their maintainers have chosen to
include the important terms of the release number into the package name
itself, resulting in unique package names for each of the incompatible
releases. These libraries also incorporate that part of the name into the
library names, so that they are unique and can be found correctly by the
loader. This is a simple layer on top of the described convention,
yielding additional isolation of library versions, as needed by the
particular library. Once the name is standardized the convention will use
it properly.


THE LIBRARY NAMING CONVENTION:

There are three references to every shared library: the shared library
itself, the runtime link, and the development (compile-time) link. They
take the following form:

The shared library:          lib<name>.so.X.Y.Z

The runtime link:            lib<name>.so.X ->   lib<name>.so.X.Y.Z

The development link:        lib<name>.so   ->   lib<name>.so.X.Y.Z

If the shared library is an "older" version of the library and is to
provide a development link, that link cannot be of the above form. Instead
it takes the form:

Second development link:     lib<name>X.so  ->   lib<name>.so.X.Y.Z

In this way, with some simple changes to the link statements in the
makefile, a program can be built and linked against an older (different)
version of the library on the same system where other programs are built
and linked against the newest version. This provides the ability to
maintain programs that depend on features only available in a certain
release of the given library.
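
The convention can be tried out in a scratch directory. The following
Python sketch creates the library file and both links for a newest and an
older version, using the gmp version numbers from the introduction. The
helper install_links and its newest flag are hypothetical names chosen for
this example, and an empty file stands in for the real shared object.

```python
import os
import tempfile


def install_links(directory, name, version, newest=True):
    """Create the library file and its two links; return the link names."""
    major = version.split(".")[0]
    real_file = f"lib{name}.so.{version}"
    runtime_link = f"lib{name}.so.{major}"
    # The newest version owns lib<name>.so; older ones use lib<name>X.so.
    dev_link = f"lib{name}.so" if newest else f"lib{name}{major}.so"
    # An empty placeholder stands in for the real shared object.
    open(os.path.join(directory, real_file), "w").close()
    for link in (runtime_link, dev_link):
        os.symlink(real_file, os.path.join(directory, link))
    return runtime_link, dev_link


with tempfile.TemporaryDirectory() as libdir:
    install_links(libdir, "gmp", "2.0.2", newest=True)
    install_links(libdir, "gmp", "1.3.2", newest=False)
    for entry in sorted(os.listdir(libdir)):
        path = os.path.join(libdir, entry)
        if os.path.islink(path):
            print(f"{entry} -> {os.readlink(path)}")
```

With such a layout, the change to the link statement would amount to
asking the linker for gmp1 instead of gmp, since the linker searches for
a file named lib<name>.so for the name it is given.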

<name> should, as often as possible, be chosen as the upstream name of the
library. Many tk/tcl versions are so independent of other versions that
their names include the pertinent portion of the version number. For
uniqueness' sake these "versioned" names should be used anywhere <name> is
inserted.

Exceptions:

The glibc C library takes the following form for the shared library name:

     lib<name>-X.Y.Z.so
     
The run-time and compile-time links are constructed the same as
in the other cases, so there is little impact on the standard.
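
A tool that checks names against this convention would need to accept
both forms of the library file name. Here is a minimal Python sketch of
such a check. The two patterns and the function parse_library_file are
assumptions for illustration, and the file name libc-2.1.1.so used below
is only an example of the form, not a statement about any particular
release.

```python
import re

# lib<name>.so.X.Y.Z (the general convention)
STANDARD_FORM = re.compile(r"^lib(?P<name>.+)\.so\.(?P<version>\d+\.\d+\.\d+)$")
# lib<name>-X.Y.Z.so (the glibc exception)
GLIBC_FORM = re.compile(r"^lib(?P<name>.+)-(?P<version>\d+\.\d+\.\d+)\.so$")


def parse_library_file(filename):
    """Return (name, version) for either form of shared library file name."""
    for pattern in (STANDARD_FORM, GLIBC_FORM):
        match = pattern.match(filename)
        if match:
            return match.group("name"), match.group("version")
    raise ValueError(f"unrecognized shared library name: {filename!r}")


print(parse_library_file("libgmp.so.2.0.2"))
print(parse_library_file("libc-2.1.1.so"))
```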

LIBRARY NAMES:

The following is a list of library names covered under this convention.
Names preceded by an asterisk (*) are only found in /usr/lib. The others
are found in /lib, with symlinks provided in /usr/lib:

*       bfd
*       bsd-compat
        c
        db
        dl
*       efence
        ext2fs
*       form
*       g++
*       gdbm
*       gdk
*       gif
*       glib
*       gmp
*       gnuma
*       gpm
*       grove
*       gtk
*       history
*       jpeg
*       lockfile
        m
*       menu
*       mh
        ncurses
*       ndbm
        nsl
        nss-compat
        nss-db
        nss-dns
        nss-files
*       opcodes
*       pam
*       panel
*       paper
*       pcap
*       png
        proc
        pthread
*       pwdb
        readline
        resolve
*       rpm
        slang
*       sp
*       spgrove
        ss
*       ssl
*       stdc++
*       style
*       tcl
        termcap
*       tiff
*       tk
*       ungif
        util
        uuid
*       vga
*       vgagl
*       wrap
*       z
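
The asterisk marking in the list above can be read mechanically. As a
small sketch, assuming one list entry per line in the format used here,
the following Python function returns the library name and the
directories in which it is expected. The function name directories_for is
hypothetical.

```python
def directories_for(list_line):
    """Return (name, directories) for one line of the library list."""
    stripped = list_line.strip()
    usr_only = stripped.startswith("*")
    # Drop the leading marker, if any, and the whitespace around the name.
    name = stripped[1:].strip() if usr_only else stripped
    directories = ["/usr/lib"] if usr_only else ["/lib", "/usr/lib"]
    return name, directories


for line in ("*       gmp", "        ncurses", "*       stdc++"):
    print(directories_for(line))
```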

Waiting is,

Dwarf
--
_-_-_-_-_-   Author of "The Debian Linux User's Guide"  _-_-_-_-_-_-

aka   Dale Scheetz                   Phone:   1 (850) 656-9769
      Flexible Software              11000 McCrackin Road
      e-mail:  dwarf@polaris.net     Tallahassee, FL  32308

_-_-_-_-_-_- See www.linuxpress.com for more details  _-_-_-_-_-_-_-