-
Notifications
You must be signed in to change notification settings - Fork 0
/
TODO
535 lines (425 loc) · 22.2 KB
/
TODO
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
Things to be done in the near future:
* Add I10N support
* Clean/rewrite the build system
* Extract common code into shared files
* Review the Info manual
* Improve output of commands (i.e. the default output of 'ac')
* Simplify the option handling code (getopt)
--------------------------------------------------------------
OLD TODO: (the more long-term items)
Next major release: utmpx support.
-"Bruno Lopes F. Cabral" <[email protected]> would like at
least a copy of fwtmp so he can fix his wtmp files
-write some library routines that will read utmp and utmpx records
into the same internal data structure -- that way, we don't have to
handle the two record types separately
Need a few utilities that convert wtmp and acct entries to text and
vice versa. That would be a great way to write a test suite for the
programs, too -- write the input files in text, convert them to the
appropriate binary format, and run the programs on those binary files.
Possibly add `--wide' flag code to ac.
make compare targets
-- detect GNU versions of utils (or else we'll get differences between
"ac -p" and "ac -p --compat")
-- skip the compare if a system util doesn't exist
handle uucpxxxx records as well as ftpxxxx ones.
Don't complain about missing login entries? I wrote the following to
Dirk & Karl: "I should disable the "missing login record" warning,
because it really doesn't mean much. Oftentimes getty or another
system utility will write a `logout' record for a given tty just for
good measure -- it really doesn't mean anything is wrong. I'll put
that in the todo list."
Delete version.h.in and files.h.in from the disty and generate them
entirely using autoconf? Currently this seems broken to me. I need a
generic way to detect already installed pacct/acct/wtmp/utmp files...
Write a simple set of functions that manage getopt options, usage
strings, and the like. It's stupid to have to update them in several
places -- that's prone to error.
Write up a few sentences on column labeling.
Make sure the --seperate-forks flag works.
DNS lookups for ut_addr field -- should we lookups at all? Should we
cache lookups?
Grow number precision in ac so we always get decent column alignment?
Add appropriate INDEX entries for the texi file.
Use stdarg instead of this alloca business for error messages?
Forrest Aldrich <[email protected]> requests: Write some shell
scripts to manage log rotation and all. Perhaps along the lines of
the "handleacct.sh" script by Juha. Ask him.
Linux patches: check to see if disk space is available before writing.
Ugly -- try to get character string usernames from inside the kernel.
UIDs are no problem (given), but the kernel doesn't know about
GETPWUID. Also look at NetBSD sources and see how they calculate
AC_MEM.
Some System V boxes (kropotkin) are smart enough to include the
USERNAME of the person when issuing a DEAD_PROCESS record! Cool.
Change the code to use it.
----------------------------------------------------------------------
The following are some e-mail messages between myself and various
other folks. They may or may not be useful, but are included
(primarily) to make me keep some of these concepts in mind.
For the concerned reader, I've removed some of the discussion on
logfile rotation specifics (the junk that isn't within the scope of
this package). You might want to check out the `logrotate' package
available from Red Hat Software (http://www.redhat.com).
----------------------------------------------------------------------
----------
From rms Nov 2 1992 to noel
Knowing the shoddy practices of Unix, I have a suspicion that there
are arbitrary limits in the format of the accounting files on Unix.
Are there?
For example, is there a restriction in the format on the length of a
user name?
The GNU system is supposed to abolish such limits. So if there are any,
could you send me a description of where they are? Also, could you
add #ifdefs to support an improved file format that will not have
any limits? We could use the improved format in GNU.
Another issue occurs to me. It seems that on Unix systems the
accounting files fill up the disk and one must manually take care of
moving them, purging old info, etc. Can you implement something that
automatically solves this problem?
[snip]
----------
From noel Nov 2 1992
to rms
>Knowing the shoddy practices of Unix, I have a suspicion that there
>are arbitrary limits in the format of the accounting files on Unix.
>Are there?
Yes, you guessed it. There is a limit on the size of the username,
command name, tty, and node name (just to name a few)...
>The GNU system is supposed to abolish such limits. So if there are any,
>could you send me a description of where they are? Also, could you
>add #ifdefs to support an improved file format that will not have
>any limits? We could use the improved format in GNU.
I agree that the limits should be disposed of. In fact, I think (in
this case, at least) that it would save on the file sizes -- to
automatically save 8 characters for a username is wasteful, especially
if many usernames are 4 or 5 characters (tty name are even a better
example)! Yes, I think I can implement a better file format pretty
easily -- it will make the job of writing records to the file easier
(by login and init, for example) but involve a little craftier reading
on the part of my programs. Oh, well. Who cares, if we save a lot of
space in the process AND make the resultant data much more readable.
Even better -- I could restructure the file so that it doesn't contain
ambiguities of the current system. What I mean: the wtmp file has a
record for each login and logout and leaves ac to try and figure out
which records belong together. Sometimes you just can't figure it
out -- records are missing. What if the wtmp file had records like
the acct file, where you recorded the username, tty, etc. along with
BOTH the login AND logout time? There would be no problem with
reboots, though: i.e., somebody logs in and the machine reboots -- no
record would be written to the file. Is that important? I guess so.
I'm just musing about this...
[snip]
----------
From rms Nov 2 1992
to noel
[snip]
Make the inquiry tools (such as lastcomm) to look at *all* the
existing wtmp files, both the current one and the renamed ones,
in the proper order.
[snip]
----------
From noel Nov 2 1992
to noah
I've been working on re-writes of the unix accounting utils (ac, last,
lastcomm, sa, etc.). In the process, rms has asked me to change the
format of the files that are kept around -- he wants variable-length
records (unlimited name lengths). I was thinking that it might be
more useful to re-evaluate how the login/logout records are stored.
Could you give me your thoughts on this?
- - - - - - - - - - - - - - - - - - - -
When a person logs in, LOGIN writes a record to both the wtmp and utmp
files. When the login process dies, INIT searches through the utmp
file and removes the appropriate record and then adds a record to the
wtmp file.
Wouldn't it be great if the wtmp file had records that contained both
the login AND logout time (instead of two separate ones)? Since INIT
is cleaning up after LOGIN anyways (it has to find the login record in
the utmp file), why not give the responsibility to INIT completely?
The process as I envision it is:
* INIT starts up and writes a reboot record (as usual). It then
checks the utmp file. If there are entries, we know that the
system crashed while people were logged in. Read those records
and write them to the wtmp file as people who were interrupted
because the machine rebooted.
* Somebody logs in and LOGIN writes a record to utmp file (just in
case system goes down while somebody is logged in -- see above)
* when login process dies, INIT looks through the utmp file and gets
the info that LOGIN wrote there. After removing the record from
the utmp file, INIT writes a complete record to the wtmp file.
* when init is told to quit, it writes a shutdown record (as usual)
and writes records for all entries in the utmp file (if
necessary), marking them as people who got logged out because of
shutdown.
* DATE will both write a record of time change to the wtmp file AND
update all of the records in the utmp file, so INIT and LOGIN
don't have to worry about it.
INIT, LOGIN, and DATE are the producers of the utmp and wtmp files. If
the format of the files were to change, these would be the main
concerns. The other stuff just looks at these files -- the accounting
stuff like AC and LAST; the login monitoring stuff like W and WHO.
I'm doing the accounting package, so that's not a problem. The W and
WHO programs will just have to be modified to support rms'
arbitrary-length names.
- - - - - - - - - - - - - - - - - - - -
That's the deal for the login accounting. I don't think there's much
help for process accouting, unless it be in saving an actual username
(userids can change over time)... Let me know.
-N
----------
From noah Nov 2 1992
to noel, rms, mib
(CC recipients: Noel Cragg asked me what kinds of changes might be made to
various accounting systems for the hurd. Here are some thoughts off the
top of my head. Perhaps someone else has better ideas).
Assuming we don't have to be backward-contemptible with the currently
broken unix accounting mechanisms, I'd envision implementing some of the
following changes.
Wtmp/utmp files would be of the form
long length_header; /* total size of record */
struct {
char *ut_line; /* null-terminated tty name */
char *ut_name; /* null-terminated user login name */
char *ut_host; /* null-terminated FQDN of remote host, or IP
* address in string form if no FQDN can be
* found.
*/
time_t ut_time; /* login time */
time_t ut_ltime; /* logout time */
long ut_ltime_offset; /* Offset of this record in wtmp file */
... /* whatever else we find useful */
};
long length_trailer; /* total size of record */
ut_{line,name,host} should not have any static size limitations---hence the
need for recording the size of the record (and also requiring string
records to be null-terminated). Reading the whole record into a buffer and
setting the appropriate pointers into it is straightforward.
The reason for recording the length of the record both before and after it
is that most (but not all) programs read accounting files backwards for
convenience to the user (i.e. show most recent records first). I don't
think I've ever seen a program which needs random-access to the utmp/wtmp
or pacct files---they read through them in a linear fashion---which is why
a dynamic format is possible to begin with. Having the length headers also
allows for potential expansion later without having to stick reserved
fixed-size fields in the structure. You just have to guarantee that new
fields are always added to the end of the structure (rather than being
inserted in the middle) and old programs will continue to process the old
information without any lossage.
The ut_ltime_offset record is for whatever cleans up the utmp file when a
user logs out (it probably should not be init---the hurd can include some
sort of server to do this). When a user logs in a record should be created
in utmp and at the tail end of wtmp. The utmp record should have the
offset of the ut_ltime field of the relevant wtmp record stored in
ut_ltime_offset. Then whatever removes the utmp entry later can lseek
directly to the correct position in wtmp and write the logout time (getting
this info from the utmp record before removing it, of course). This wins
as long as the wtmp file isn't rotated in the meantime, but in that case
the "wtmp server" can notice by doing a stat of /var/adm/wtmp and seeing
when the inode changes, or keeping an open file descripter on wtmp and only
doing a close/reopen on the right path when it gets a HUP signal. Who
cares for now---it's trivial to make the server do the right thing.
(For the record: in present unix systems init, rlogind, and telnetd clean
the utmp file. Init does it when a login spawned by getty (which init is
in turn responsible for spawning) exits, and rlogind and telnetd hang
around implementing the Internet-domain socket communication as well as to
clean up after the user logs out).
Process accounting will almost certainly have to be done by the exec
server(s).
For just summarizing CPU time the current accounting mechanism is more or
less adequate, but almost useless for audit trails. It would be nice if
there were a way for process accounting to store both the pathname of the
program run and the arguments with which it was invoked, along with all the
other usual things. Pacct records should have a more flexible format
too---no limit on the length of strings. If possible the accounting data
should be stored by user name, not uid. That way user accounts with
identical uids can be summarized separately, for example.
All those pathname and argument strings and so forth will be very expensive
in terms of disk usage. That's why the amount of stuff actually recorded
by the exec server should be settable via the command line or a
configuration file. I'm mainly interested in having a design which is
flexible enough to do all this even if most sites never use it (for
instance, on the GNU machines we don't charge people for CPU time so the
only good the pacct records are for is tracking down attempted breakins to
foreign sites. Right now when we need to do that the available information
is usually next to useless).
Note that storing the major/minor device number will probably not be the
preferred way to record the controlling terminal on which a process was
run, but I don't know what the preferred method will be yet.
If they don't exist already, perhaps there could be library interface
routines for getting records from these files so that the details of the
searching mechanism can be minimized. If this is then kept in a shared
library, sites can customize their accounting mechanisms without having to
recompile any programs (at least in some cases).
I don't have time right now to think about this issue in any more detail.
Ask me again in a couple of weeks if you still want more thoughts. In any
case, this is (in rough detail) the functionality I would like as both a
sysadmin and a casual user of the accounting system.
----------
From mib Dec 9 1992
to noel, rms, noah
That's not good enough for utmp. When a record is cleared from utmp,
it needs to be deleted. On Unix, this can be done without
interlocking with other users, because of the fixed-length records.
Your scheme doesn't explain how the utmp file ever gets shorter:
nothing about it seems to be different from wtmp.
I have some ideas about how to make it work more easily; among other
things, the information could be stored in separate files; one per
user. This makes it easier for users to control their own
information, as well as making it less complicated to access and
modify. It could be managed by a translator; thus saving disk space;
the info would actually all be in core, and automagically cleared on
reboot.
Your scheme works well for wtmp, however. It might be better if these
were ascii formats instead of binary formats, however.
-mib
----------
From rms Dec 9 1992
to mib, noel, noah
If utmp is for users currently logged in, then I think it would
be ok to keep a separate file for each such user. After all,
the number of them is not going to be terribly large. Then you
can freely choose the format of the data in these files.
----------
From noel <when?>
to noah, rms, mib
Noah Friedman writes:
> For just summarizing CPU time the current accounting mechanism is
> more or less adequate, but almost useless for audit trails. It
> would be nice if there were a way for process accounting to store
> both the pathname of the program run and the arguments with which it
> was invoked, along with all the other usual things.
> Pacct records should have a more flexible format too---no limit on
> the length of strings. If possible the accounting data should be
> stored by user name, not uid. That way user accounts with identical
> uids can be summarized separately, for example.
> All those pathname and argument strings and so forth will be very
> expensive in terms of disk usage. That's why the amount of stuff
> actually recorded by the exec server should be settable via the
> command line or a configuration file. I'm mainly interested in
> having a design which is flexible enough to do all this even if most
> sites never use it (for instance, on the GNU machines we don't
> charge people for CPU time so the only good the pacct records are
> for is tracking down attempted breakins to foreign sites. Right now
> when we need to do that the available information is usually next to
> useless).
> Note that storing the major/minor device number will probably not be
> the preferred way to record the controlling terminal on which a
> process was run, but I don't know what the preferred method will be
> yet.
> If they don't exist already, perhaps there could be library
> interface routines for getting records from these files so that the
> details of the searching mechanism can be minimized. If this is
> then kept in a shared library, sites can customize their accounting
> mechanisms without having to recompile any programs (at least in
> some cases).
Ding! These are good things to worry about. The acct record should
definitely be of variable length. It could be implemented in a
similar way to the utmp/wtmp records. To the user, however, the
structure will be as follows:
- - - - - - - - - - - - - - - - - - - -
struct acct
{
long ac_items; /* what items in this structure are
valid -- see the description of the
file format below */
char *ac_comm; /* null-terminated command name */
char *ac_path; /* path of the executable */
char *ac_args; /* arguments to the command */
char *ac_uname; /* null-terminated user name */
unsigned short ac_uid; /* user id */
char *ac_gname; /* null-terminated group name */
unsigned short ac_gid; /* group id */
char *ac_tty; /* null-terminated tty name */
dev_t ac_ttyno; /* tty number */
double ac_utime; /* user time */
double ac_stime; /* system time */
double ac_etime; /* elapsed time */
char ac_flags; /* setuid, exec, fork, etc. */
time_t ac_begin; /* beginning time */
};
#define AFORK 0001 /* has executed fork, but no exec */
#define ASU 0002 /* used super-user privileges */
#define ACOMPAT 0004 /* used compatibility mode */
#define ACORE 0010 /* dumped core */
#define AXSIG 0020 /* killed by a signal */
#define AVP 0040 /* this is (was) a vector process */
#define AHZ 64 /* the accuracy of data is 1/AHZ */
- - - - - - - - - - - - - - - - - - - -
But the records will not be stored quite this way on disk. Records
will not only be variable-length to allow for strings, however. They
can even store a variable amount of information.
- - - - - - - - - - - - - - - - - - - -
long length_header; /* total size of the record */
long ac_items; /* bits set to 1 if the particular
piece of information is used */
/* the information, provided it its used -- NOTE that strings have two
pieces of information: the length and the null-terminated string
itself */
#define AC_COMM 0x00000001
short ac_comm_len;
char *ac_comm; /* null-terminated command name */
#define AC_PATH 0x00000002
short ac_path_len;
char *ac_path; /* path of the executable */
#define AC_ARGS 0x00000004
short ac_args_len;
char *ac_args; /* arguments to the command */
#define AC_UNAME 0x00000008
short ac_uname_len;
char *ac_uname; /* null-terminated user name */
#define AC_UID 0x00000010
unsigned short ac_uid; /* user id */
#define AC_GNAME 0x00000020
short ac_gname_len;
char *ac_gname; /* null-terminated group name */
#define AC_GID 0x00000040
unsigned short ac_gid; /* group id */
#define AC_TTY 0x00000080
short ac_tty_len;
char *ac_tty; /* null-terminated tty name */
#define AC_TTYNO 0x00000100
dev_t ac_ttyno; /* tty number */
#define AC_UTIME 0x00000200
double ac_utime; /* user time */
#define AC_STIME 0x00000400
double ac_stime; /* system time */
#define AC_ETIME 0x00000800
double ac_etime; /* elapsed time */
#define AC_FLAGS 0x00001000
char ac_flags; /* setuid, exec, fork, etc. */
#define AC_BEGIN 0x00002000
time_t ac_begin; /* beginning time */
long length_trailer; /* total record length */
- - - - - - - - - - - - - - - - - - - -
Library routines will be implemented so that records can be written to
disk correctly, though the interface to those library routines will
use the struct acct defined above.
It will also be possible to select the information you want when
accton is run. It is easy enough to maintain a /usr/adm/acct.config
file or something similar which would store the ac_items variable.
Since accton is a system call, we can have some kernel global to
signal that a new set of items should be recorded.
The structure also has room to grow -- another 18 bits in ac_items
have yet to be used. Plus, if we hold to the convention of adding to
the end of the structure rather than the beginning, and we pass the
highest bit mask we know about, old programs can still be used without
recompiling, even if the reoutines in the shared libraries change.
That is, the library routine will be smart enough not to give the
calling program more information than it knows about -- overfilling a
structure.
It is even possible for one /usr/adm/acct file to contain records with
varying types -- sa and last will have to be a little smarter, but
that's no big deal. Most machines have the CPU to handle a little
extra work.
----------
From: Jim Kingdon <[email protected]>
Subject: GNU Accounting utilities
Date: Fri, 12 Sep 1997 10:31:51 -0400
[snip]
All the discussion of "but what if the log is rotated?" and such make
me think this is one of those "given sufficient thrust, pigs fly just
fine" solutions. A better solution would be to still write login and
logout records, but to add a login time field to the logout record,
add some kind of session ID, or something of the sort.
[snip]