Skip to content

The heirloom toolchest ported to WebAssembly using WASIX

Notifications You must be signed in to change notification settings

wasmtube/heirloom-wasix

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

README file for the Heirloom Toolchest
======================================

For a description what the Heirloom Toolchest is, see the file
intro.txt, the online introduction, or, if the tools are already
installed, the corresponding manual page intro(1). Note that the
path names mentioned in these documents may differ from those you
select when building the tools as described below. It is assumed
that you know the contents of the introduction as you read on.

For licensing terms, see the files in the LICENSE subdirectory.

How to install
==============

System environments that are known to support the Heirloom Toolchest are:

Linux           Kernel 2.4 and above; glibc 2.2 and above, diet libc, uClibc;
                gcc, Intel C
Solaris         8 and above; Sun C, gcc
Open UNIX       8.0.0; native (cc), LKP (gcc)
HP-UX           B.11.11, B.11.22, B.11.23; HP-UX ANSI C, gcc
AIX             5.1; VisualAge C, gcc
FreeBSD         4.9, 5.2.1, 6.1
NetBSD          1.6, 2.0, 3.0
OpenBSD         3.4, 3.8
DragonFlyBSD    1.3.7-DEVELOPMENT
Mac OS X        10.4.3 and above

The following prerequisites are necessary:

yacc and lex    Several of the included yacc and lex programs need to
                access internals of the yacc and lex implementations
                beyond the respective POSIX specifications. This has
                been made to work with Unix yacc and lex as well as
                with many versions of Berkeley yacc, bison, and flex.
                However, recent changes in these programs may result
                in incompatibilities. If this happens, install the
                "Heirloom Development Tools" package first and use
                the yacc and lex versions provided with it. You could
                also fix the problem on your own, but now that portable
                Open Source versions of Unix yacc and lex are available,
                such changes can no longer be considered important, and
                will not be made as part of the regular maintenance of
                this project.

The following prerequisites are optional:

Bourne shell    It is recommended that the Heirloom Bourne shell from
                <http://heirloom.sourceforge.net/sh.html> is installed
                first. The tools will work with any shell, but use of the
                Bourne shell ensures maximum compatibility with traditional
                Unix behavior.
gzip            <http://www.gzip.org/>. Invoked by cpio/pax, file, and
                tar when a gzip compressed file is encountered.
bzip2           <http://sources.redhat.com/bzip2>. Invoked by cpio/pax,
                file, and tar when a bzip2 compressed file is encountered.
libz            <http://www.gzip.org/zlib>. It is needed if cpio/pax
                should be able to create deflate compressed entries
                in zip files; reading of such entries works without it.
libbz2          <http://sources.redhat.com/bzip2>. It is needed if
                cpio/pax should be able to read or create bzip2
                compressed entries in zip files.

Unless you are a developer, it is strongly recommended that you use the
RPM packaging system to build and install the Heirloom Toolchest on Linux.
This requires the "heirloom-devtools" package to be installed; its source
code is available from <http://heirloom.sourceforge.net/devtools.html>.
Then simply execute

        rpmbuild -tb heirloom-<version>.tar.bz2

wait a few minutes and install the resulting rpm that is found in
/usr/src/redhat/RPMS/i386/heirloom-<version>-1.i386.rpm (or wherever
your vendor's setup puts such files). Then access the tools by prefixing
your PATH with /usr/5bin and skip the rest of this section. (With older
versions of rpm you need to use 'rpm' instead of 'rpmbuild'.) The tools
in the package will automatically use the Heirloom Bourne shell if it is
installed at compilation time; the Heirloom Toolchest package then
requires the Heirloom Bourne shell package to be also installed.

If your system lacks rpm or if you want to make changes to the Heirloom
Toolchest, you have to build it manually. The first thing to understand
is the build system. This is actually quite simple: Every directory
contains a file named Makefile.mk that includes the directory-specific
make instructions. To generate the real Makefile, configuration
settings are prepended to the directory-specific file. You have to edit
these configuration settings before you start compiling; they are
located in the file build/mk.config which is also in make syntax.
Follow the descriptions in this file and select appropriate values for
your system.

Be very careful at this step! If you overwrite your system utilities
or put the Heirloom Toolchest at an early place in your system's PATH,
some of your shell scripts or of those of your vendor may cease to
work. This is because utilities from the Heirloom Toolchest may behave
differently than the ones supplied by your vendor. You may wish to
correct this later, but the first build is definitively not the right
time. - If you want to stay with the default /usr/5bin path on Solaris,
be sure to remove the existing symbolic link to /usr/bin.

If you don't know what path names to select, get a competent person to
assist you. I won't assist at this point and do, as described in the
license terms, not warrant for any problems caused by anyone. You have
been warned.

After finishing the configuration, type 'make' and 'make install',
then use the tools found at the location you selected before.

The Heirloom Toolchest can also be compiled using dietlibc
<http://www.fefe.de/dietlibc/>. This will result in relatively small
statically linked executables; the tools will not have multibyte
functionality, though. Note that you have to compile ncurses, optionally
also zlib and libbz2, and to install the libraries in an appropriate
location for 'diet' before you can start to build the Heirloom tools.

After installation
==================

Some vendors or system administrators predefine an alias or a shell
function for 'ls' in the shell startup files. If you type 'ls' and
only get something like 'ls: illegal option -- -' from the Heirloom
'ls' command, try 'unalias ls' or 'unset ls'.

The Heirloom 'man' command will display the Heirloom manual pages
first, as configured in /etc/default/man. If there is a 'MANPATH'
environment variable present however, it will override this setting
and you get whatever MANPATH points to. In this situation, either
unset MANPATH or adjust it to include the Heirloom manual path first
(normally /usr/share/man/5man, see build/mk.config.)

Porting the Heirloom Toolchest
==============================

If you want to port the Heirloom Toolchest to a new environment,
check that the following conditions are satisfied:

- The environment is a Unix-like system, e.g. the file system is
  case-sensitive, there is a unique dev_t and ino_t combination for
  each file i-node, there are at least POSIX.1 calls etc. The code
  heavily relies on this, and I'm not interested in changes for
  non-Unix environments for a variety of reasons.

- The compiler supports 'long long' and distinguished 8-bit, 16-bit,
  32-bit, and 64-bit data types. The system stdio library supports
  printf("%lld").

- The compiler does not insert padding between two structure fields of
  type 'char' or array of 'char'.

- You know how to read directory contents without opendir() and friends,
  using a file descriptor obtained from open(). On most systems, this
  involves either the System V function getdents() or the BSD function
  getdirentries(). It may also be possible to use read(), although this
  should only be the last resort. Such functionality may or may not be
  documented; search for it in /usr/include if in doubt. If getdents()
  or getdirentries() exist, you can probably adjust the existing code
  in 'libcommon/getdir.c'.

- You know how to obtain a process listing in C. This may be as easy as
  reading /proc (you can probably re-use existing code then), but might
  also require that you know how to read structures from /dev/kmem in
  the worst case. Documentation for this is usually sparse; look around
  in /usr/include or search the web for existing open source code that
  performs such lookups (e.g. 'lsof', 'libgtop').

- You know how to code tape device control (<sys/mtio.h> or the like).
  This is usually documented, so if in doubt, just read the manual
  pages and header files.

If all of these are true, you can expect the work to be done within
a day or so.

The Heirloom Toolchest is coded with The Open Group's 'Single Unix
Specification', Version 3, 2001, Base Definitions and Headers, and
System Interfaces, in mind. If the system is Unix-like, missing
functionality can usually be emulated. For a number of functions,
such as those for wide-character support, utmpx access, simplified
signal management (sigset() etc.), emulation code is already present
in the libcommon and libwchar directories; enable it if necessary.

Hacking the Heirloom Toolchest
==============================

One of the considerations the Heirloom Toolchest is designed along
is accessibility for other developers. Each utility was made largely
independent from other ones, to the extent of duplicate functionality
at some places. There are just three libraries you have to know about:

1. The 'common' library, found in libcommon, provides the following
   functionality:

   - Traditional ('simple') regular expressions, available as full
     source to #include in the old-style ed-hack regexp.h and as normal
     library functions in regexpr.h. You should definitively use the
     latter if you want to use simple REs in new applications.

   - Traditional file name matching; manually include the prototype
     'extern int gmatch(const char *, const char *);' if you want
     to use it.

   - SVID4-style error reporting; #include <pfmt.h>.

   - Locale-independent character class functions in <asciitype.h>.
     These are necessary e.g. to recognize programming language syntax
     white space if LC_CTYPE is used otherwise (if you do not want to
     handle no-break-space characters etc. as word delimiters).

   - Functions for reading directory entries with a file descriptor
     obtained from open(), in <getdir.h>. These are only useful if
     you descend a directory hierarchy with fchdir() and have to care
     about the limit of open file descriptors, as the 'file' and 'rm'
     tools; otherwise, stay with the standard readdir().

   - A getopt() that has the traditional functionality, and just the
     traditional functionality. This is here mainly because GNU getopt()
     is so annoyingly non-standard.

   - Functions for buffered input and output in <iblok.h> and <oblok.h>,
     respectively. They can be used as a replacement for stdio in simple
     cases; the main reason for their existence is the desire to read
     lines of arbitrary length without the slow-down caused by getc().

2. A modified version of Caldera's 'UNIX(R) Regular Expression Library'
   in the libuxre directory. This is used for (mostly) POSIX.2-compatible
   regular expression matching. It provides a syntax without too much
   extension fuss, and good performance, since it includes both a DFA
   and a NFA style matcher and uses either of them depending on the
   expression and the need for subexpression locations.

3. The widechar portability library in the libwchar directory. This is
   only used for system environments that do not provide appropriate
   wide character functionality in their own libc. It currently just
   does simple copying between bytes and wide characters and thus does
   not handle anything than plain ASCII well. If other wide character
   functions than those currently present are used in new code, it will
   be necessary to add those function to libwchar as well.

I make no premises about keeping the interfaces of these libraries
compatible in upcoming releases, so if you want to use them in other
projects, just fork off your own version and don't rely on mine.

To go on, you now only need to know about the sources in the directory
of the utility you want to hack. This should make it not too difficult,
even though I've generally stayed with good old Unix programming style
(few comments, short variable names - but not too much to read either).

Extending the Heirloom Toolchest
================================

If you want to add an extension to a utility, just go on and do it - as
you've read in the section above, accessibility for hacking is intended
to be easy. For my version, though, the concept is to have not too many
extensions. I sometimes include extensions because the functionality is
really expected these days (such as diff -u), adds a real advantage for
interactive use (such as colored output for ls), or is widely used for
convenience and does not add much overhead (such as grep -r). I am not
much interested in extensions that change the behavior of tools for use
in scripts, though. Such extensions lead to subtly incompatible shell
scripts, and these should be avoided by all means. It is much better to
make a fresh start by creating a new utility with a different name for
such cases. Such utilities are then beyond the scope of this collection.
In general, chances are rather low that I will accept your extension for
my version. But that doesn't mean that you should not do it for yourself.

Reporting bugs
==============

Before you report behavior that you suspect to be a bug, check at least
the following to be sure not to report a feature:

- Does it leave a core dump? If yes, it's certainly a bug (unless you
  sent a signal, of course).

- Read the manual page and the source for the utility carefully. Did
  you use the utility properly?

- If the utility is used within a shell script or with another user's
  environment, try how it behaves if used standalone.

- If the behavior disturbs you just because your vendor's utilities do
  something different, that alone is no reason to report it. The Heirloom
  Toolchest is oriented on System V behavior. Verify that real System V
  derivatives behave differently before reporting.

- If you are using one of the POSIX personalities and the specification
  says something different, it is likely to be a bug. But be careful to
  read the respective specification itself; some incompatibilities to
  the wording of the specification may even exist in certified systems,
  so you cannot rely on a vendor's implementation or documentation here.

If you went through these statements and you're still sure that it is
a bug, report it.

Gunnar Ritter
Freiburg i. Br.
Germany
<http://heirloom.sourceforge.net>
<[email protected]>                                                2/24/07

About

The heirloom toolchest ported to WebAssembly using WASIX

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published