.pn 0
.ls1
.EQ
delim $$
.EN
.ev1
.ps-2
.vs-2
.ev
\&
.sp 10
.ps+4
.ce
COMPUTER (IN)SECURITY \(em
.sp
.ce
INFILTRATING OPEN SYSTEMS
.ps-4
.sp4
.ce
Ian H. Witten
.sp2
.ce4
Department of Computer Science
The University of Calgary
2500 University Drive NW
Calgary, Canada T2N 1N4
.sp2
.ce2
November 1986
Revised March 1987
.bp 1
.ls 2
.pp
Shared computer systems today are astonishingly insecure.
And users, on the whole, are blithely unaware of the weaknesses of the
systems in which they place \(em or rather, misplace \(em their trust.
Taken literally, of course, it is meaningless to ``trust'' a computer system
as such, for machines are neither trustworthy nor untrustworthy;
these are human qualities.
In trusting a system one is effectively trusting all those who create and
alter it, in other words, all who have access (whether licit or
illicit).
Security is a fundamentally \fIhuman\fP issue.
.pp
This article aims not to solve security problems but to raise reader
consciousness
of the multifarious cunning ways that systems can be infiltrated, and the
subtle but devastating damage that an unscrupulous infiltrator can wreak.
It is comforting, but highly misleading, to imagine that technical means of
enforcing security have guaranteed that the systems we use are safe.
It is true that in recent years some ingenious procedures have been invented
to preserve security.
For example, the advent of ``one-way functions'' (explained below) has
allowed the password file, once a computer system's central stronghold, to be
safely exposed to casual inspection by all and sundry.
But despite these innovations, astonishing loopholes exist in practice.
.pp
There are manifest advantages in ensuring security by technical means rather
than by keeping things secret.
Not only do secrets leak, but as individuals change projects,
join or leave the organization, become promoted and so on, they need to learn
new secrets and forget old ones.
With physical locks one can issue and withdraw keys to reflect changing
security needs.
But in computer systems, the keys constitute information which can be given
out but not taken back, because no-one can force people to forget.
In practice, such secrets require considerable administration to maintain
properly.
And in systems where security is maintained by tight control of information,
.ul
quis custodiet ipsos custodes
\(em who will guard the guards themselves?
.pp
There is a wide range of simple insecurities that many
systems suffer.
These are, in the main, exacerbated in open systems where information and
programs are shared among users \(em just those features that characterize
pleasant and productive working environments.
The saboteur's basic tool is the Trojan horse,
a widely trusted program which has been surreptitiously modified to do
bad things in secret.
``Bad things'' range from minor but rankling irritations through theft of
information to holding users to ransom.
The inevitable fragilities of operating systems can
be exploited by constructing programs which behave in some ways like primitive
living organisms.
Programs can be written which spread bugs like an epidemic.
They hide in binary code, effectively undetectable (because nobody ever
examines binaries).
They can remain dormant for months or years, perhaps quietly and imperceptibly
infiltrating their way into the very depths of a system, then suddenly pounce,
causing irreversible catastrophe.
A clever and subtle bug\(dg can survive
recompilation despite the fact that there is no record of it in the source
program.
.FN
\(dg Throughout this article the word ``bug'' is meant to bring to mind a
concealed snooping device as in espionage, or a micro-organism carrying
disease as in biology, rather than an inadvertent programming error.
.EF
This is the ultimate parasite.
It cannot be detected because it lives only in binary code.
And yet it cannot be wiped out by recompiling the source program!
We might wonder whether these techniques, which this article develops
and explains in the context of multi-user timesharing operating systems,
pose any threats to computer networks or even stand-alone micros.
.pp
Although the potential has existed for decades, the possibility of the kind of
``deviant'' software described here has been recognized only recently.
Or has it?
Probably some in the world of computer wizards and sorcerers have known for
years how systems can be silently, subtly infiltrated \(em and concealed
the information for fear that it might be misused (or for other reasons).
But knowledge of the techniques is spreading nevertheless, and I believe it
behooves us all \(em professionals and amateurs alike \(em to understand just
how our continued successful use of computer systems hangs upon a thread of
trust.
Those who are ignorant of the possibilities of sabotage can easily be
unknowingly duped by an unscrupulous infiltrator.
.pp
The moral is simple.
Computer security is a human business.
One way of maintaining security is to keep things secret, trusting people
(the very people who can do you most harm) not to tell.
The alternative is to open up the system and rely on technical means
of ensuring security.
But a system which is really ``open'' is also open to abuse.
The more sharing and productive the environment, the more potential exists for
damage.
You have to trust your fellow users, and educate yourself.
If mutual trust is the cornerstone of computer security, we'd better know it!
.sh "The trend towards openness"
.pp
Many people believe that computer systems can maintain security not
by keeping secrets but by clever technical mechanisms.
Such devices include electronic locks and keys, and schemes for maintaining
different sets of ``permissions'' or ``privileges'' for each user.
The epitome of this trend towards open systems is the well-known \s-2UNIX\s+2
operating system, whose developers, Dennis Ritchie and Ken Thompson, strove
to design a clean, elegant piece of software that could be understood,
maintained, and modified by users.
(In 1983 they received the prestigious ACM Turing Award for their work.) \c
Ken Thompson has been one of the prime contributors to our knowledge of
computer (in)security, and was responsible for much of the work described in
this article.
.pp
The most obvious sense in which the \s-2UNIX\s+2 system
is ``open'' is illustrated by looking at its password file.
Yes, there is nothing to stop you from looking at this file!
Each registered user has a line in it, and Figure\ 1 shows mine.
It won't help you to impersonate me, however, because what it shows in the
password field is not my password but a scrambled version of it.
There is a program which computes encrypted passwords from plain ones, and
that is how the system checks my identity when I log in.
But the program doesn't work in reverse \(em it's what is called a ``one-way
function'' (see Panel\ 1).
It is effectively impossible to find the plain version from the encrypted one,
even if you know exactly what the encryption procedure does and try to work
carefully backward through it.
\fINobody\fR can recover my plain password from the information stored in the
computer.
If I forget it, not even the system manager can find out what it is.
The best that can be done is to reset my password to some standard one, so
that I can log in and change it to a new secret password.
(Needless to say this creates a window of opportunity for an imposter.) \c
The system keeps no secrets.
Only I do.
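.pp
In outline, the scheme works like the following sketch.
The one-way function here is a stand-in (SHA-256; the real system used a
DES-based routine), and the user name and password are purely illustrative.

```python
import hashlib

def one_way(plain):
    # A stand-in one-way function: easy to compute forward,
    # infeasible to run backward.  (Illustrative only; the actual
    # UNIX system used a DES-based crypt routine.)
    return hashlib.sha256(plain.encode()).hexdigest()

# The password file stores only the scrambled form -- it can be
# exposed to casual inspection without revealing any password.
password_file = {"ian": one_way("w#xs27")}

def login(user, offered):
    # The system never recovers the plain password; it scrambles
    # the offered one and compares the results.
    return password_file.get(user) == one_way(offered)

print(login("ian", "w#xs27"))   # True
print(login("ian", "guess"))    # False
```

Note that nothing in the stored file lets even the system manager work
back to the plain password; forgetting it really does mean resetting it.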
.pp
Before people knew about one-way functions, computer systems maintained a
password file which gave everyone's plain password for the login procedure to
consult.
This was the prime target for anyone who tried to
break security, and the bane of system managers because of the
completely catastrophic nature of a leak.
Systems which keep no secrets avoid an unnecessary Achilles heel.
.pp
Another sense in which \s-2UNIX\s+2 is ``open'' is the accessibility of its
source code.
The software, written in the language ``C'', has been distributed
(to universities) in source form so that maintenance can be done locally.
The computer science research community has enjoyed numerous benefits from
this enlightened policy (one is that we can actually look at some of the
security problems discussed in this article).
Of course, in any other system there will inevitably be a large number of
people who have or have had access to the source code \(em even though it may
not be publicly accessible.
Operating systems are highly complex pieces of technology, created by large
teams of people.
A determined infiltrator may well be able to gain illicit access to source
code.
Making it widely available has the very positive effect of bringing the
problems out into the open and offering them up for public scrutiny.
.pp
Were it attainable, perfect secrecy would offer a high degree of security.
Many people feel that technical innovations like one-way functions and
open password files provide comparable protection.
The aim of this article is to show that this is a dangerous misconception.
In practice, security is often severely compromised by people who have
intimate knowledge of the inner workings of the system \(em precisely the
people you rely on to \fIprovide\fR the security.
This does not cause problems in research laboratories because they are
founded on mutual trust and support.
But in commercial environments, it is vital to be aware of any limitations on
security.
We must face the fact that
in a hostile and complex world, computer security is best preserved by
maintaining secrecy.
.sh "A pot-pourri of security problems"
.pp
Here are a few simple ways that security might be compromised.
.rh "Guessing a particular user's password."
Whether your password is stored in a secret file or encrypted by a one-way
function first, it offers no protection if it can easily be guessed.
This will be hard if it is chosen at random from a large enough set.
But for a short sequence of characters from a restricted alphabet
(like the lower-case letters), an imposter could easily try all possibilities.
And in an open system which gives access to the password file and one-way
function, this can be done mechanically, by a program!
.pp
In Figure\ 2, the number of different passwords is plotted against the length
of the password, for several different sets of characters.
For example, there are about ten million ($10 sup 7$) possibilities for a
5-character password chosen from the lower-case letters.
This may seem a lot, but if it takes 1\ msec to try each one, they can all be
searched in about 3\ hours.
If 5-character passwords are selected from the 62 alphanumerics, there
are nearly 80 times as many and the search would take over 10\ days.
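.pp
These estimates are easy to check.
The sketch below redoes the arithmetic at the 1\ msec-per-trial rate
assumed above.

```python
# Exhaustive-search times for 5-character passwords,
# assuming 1 msec per trial as in the text.
lower = 26 ** 5          # lower-case letters only
alnum = 62 ** 5          # upper- and lower-case letters plus digits

ms_per_trial = 1
hours = lower * ms_per_trial / 1000 / 3600   # full lower-case search
days = alnum * ms_per_trial / 1000 / 86400   # full alphanumeric search

print(lower)             # 11881376 -- about ten million
print(round(hours, 1))   # 3.3 hours
print(round(days, 1))    # 10.6 days
```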
.pp
To make matters worse, people have a strong propensity to choose as
passwords such things as
.LB
.NP
English words
.NP
English words spelled backwards
.NP
first names, last names, street names, city names
.NP
the above with initial upper-case letters
.NP
valid car license numbers
.NP
room numbers, social security numbers, telephone numbers, etc.
.LE
Of course, this isn't particularly surprising since passwords have to be
mnemonic in order to be remembered!
But it makes it easy for an enterprising imposter to gather a substantial
collection of candidates (from dictionaries, mailing lists, etc) and search
them for your password.
At 1\ msec per possibility, it takes only 4\ minutes to search a 250,000-word
commercial dictionary.
.pp
A study some years ago of a collection of actual passwords that people used to
protect their accounts revealed the amazing breakdown reproduced in Figure\ 3.
Most fell into one of the categories discussed, leaving less
than 15% of passwords which were hard to guess.
Where does your own password stand in the pie diagram?
.rh "Finding any valid password."
There is a big difference between finding a particular person's password and
finding a valid password for any user.
You could start searching through the candidates noted above until you found
one which, when encrypted, matched one of the entries in the password file.
That way you find the most vulnerable user, and there are almost certain to be
some lazy and crazy enough to use easily-guessable passwords, four-letter
words, or whatever.
Hashing techniques make it almost as quick to check a candidate against a
group of encrypted passwords as against a single one.
.pp
A technique called ``salting'' protects against this kind of attack.
Whenever a user's password is initialized or changed, a small random number
called the ``salt'' is generated (perhaps from the time of day).
Not only is this combined with the password when it is encrypted, but as
Figure\ 1 shows it is also stored in the password file for everyone to see.
Every time someone claiming to be that user logs in, the salt is combined with
the password offered before being encrypted and compared
with whatever is stored in the password file.
For example, say my password was ``w#xs27'' (it isn't!).
If the salt is ``U6'' (as in Figure\ 1), the system will apply its one-way
function to ``w#xs27U6'' to get the encrypted password.
.pp
Since all can see the salt, it is no harder for anyone to guess
an individual user's password.
One can salt guesses just as the system does.
But it \fIis\fR harder to search a group of passwords, since the salt will be
different for each, rendering it meaningless to compare a single encrypted
password against all those in the group.
Suppose you were checking to see if anyone had the password ``hello''.
Without salting, you simply apply the one-way function to this word and
compare the result with everyone's encrypted password.
But with salting it's not so easy, since to see if my password is ``hello''
you must encrypt ``helloU6'', and the salt is different for everyone.
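.pp
The salted check can be sketched as follows.
As before the one-way function is an illustrative stand-in, and the salt
and password echo the example above.

```python
import hashlib

def one_way(s):
    # Stand-in one-way function (illustrative only).
    return hashlib.sha256(s.encode()).hexdigest()

# Each password-file entry stores the public salt alongside the
# scrambled password; the salt "U6" and password "w#xs27" echo
# the example in the text.
entry = {"salt": "U6", "hash": one_way("w#xs27" + "U6")}

def check(offered, entry):
    # The salt is no secret: combine it with the offered password
    # before applying the one-way function, just as the system does.
    return one_way(offered + entry["salt"]) == entry["hash"]

# Guessing one particular user is no harder than before.  But a
# single candidate like "hello" must now be re-encrypted separately
# against every user's salt, so one encryption no longer tests a
# whole password file at once.
print(check("w#xs27", entry))  # True
print(check("hello", entry))   # False
```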
.rh "Forced-choice passwords."
The trouble with letting users choose their own passwords is that they often
make silly, easily-guessed, choices.
Many systems attempt to force people to choose more ``random'' passwords, and
force them to change their password regularly.
All these attempts seem to be complete failures.
The fundamental problem is that people have to be able to remember their
passwords, because security is immediately compromised if they are written
down.
.pp
There are many amusing anecdotes about how people thwart systems that attempt
to dictate when they have to change their passwords.
I had been using a new system for some weeks when it insisted that I change my
password.
Resenting it ordering me about, I gave my old password as the new one.
But it was programmed to detect this ruse and promptly told me so.
I complained to the user sitting beside me.
``I know,'' she said sympathetically.
``What I always do is change it to something else and then immediately
change it back again!'' \c
Another system remembered your last several passwords, and insisted on a
once-a-month change.
So people began to use the name of the current month as their password!
.rh "Wiretaps."
Obviously any kind of password protection can be thwarted by a physical
wiretap.
All one has to do is watch as you log in and make a note of your password.
The only defense is encryption at the terminal.
Even then you have to be careful to ensure that someone can't intercept
your encrypted password and pose as you later on by sending this
\fIencrypted\fR string to the computer \(em after all, this is what the
computer sees when you log in legitimately!
To counter this, the encryption can be made time-dependent so that the same
password translates to different strings at different times.
.pp
Assuming that you, like 99.9% of the rest of us, don't go to the trouble of
terminal encryption, when was the last time you checked the line between your
office terminal and the computer for a physical wiretap?
.rh "Search paths."
We will see shortly that you place yourself completely at the mercy of other
users whenever you execute their programs, and they
can do some really nasty things like spreading infection to your files.
However, you don't necessarily have to execute someone else's program overtly,
for many systems make it easy to use other people's
programs without even realizing it.
This is usually a great advantage, for you can install programs so that you
or others can invoke them just like ordinary system programs, thereby
creating personalized environments.
.pp
Figure\ 4 shows part of the file hierarchy in our system.
The whole hierarchy is immense \(em I alone have something like 1650 files,
organized into 200 of my own directories under the ``ian'' node shown in the
Figure, and there are hundreds of other users \(em and what is shown is just a
very small fragment.
Users can set up a ``search path'' which tells the system
where to look for programs they invoke.
For example, my search path includes the 6 places that are circled.
Whenever I ask for a program to be executed, the system seeks it in these
places.
It also searches the ``current directory'' \(em the one where I happen to be
at the time.
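.pp
The lookup the system performs can be sketched as below.
The function name is hypothetical; the point is the order of search \(em
current directory first, then each entry of the search path \(em because
the first match wins, which is exactly how a planted program can shadow
a genuine one.

```python
import os

def resolve(command, search_path, cwd="."):
    # Look in the current directory first, then in each search-path
    # entry in order.  The first file found is the one executed --
    # so a name conflict earlier in the path silently wins.
    for directory in [cwd] + list(search_path):
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate):
            return candidate
    return None
```

With this order, a file called \fIsc\fR planted in a directory early on
someone's search path would be run in place of the Simula compiler,
precisely as in the anecdote above.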
.pp
To make it more convenient for you to set up a good working environment, it
is easy to put someone else's file directories on your search path.
But then they can do arbitrary damage to you, sometimes completely
accidentally.
For example, I once installed a spreadsheet calculator called ``sc'' in one
of my directories.
Unknown to me, another user suddenly found that the Simula compiler stopped
working and entered a curious mode where it cleared his VDT screen and wrote
a few incomprehensible characters on it.
There was quite a hiatus.
The person who maintained the Simula compiler was away,
but people could see no reason for the compiler to have been altered.
Of course, told like this it is obvious that the user had my directory on his
search path and I had created a name conflict with \fIsc\fR, the Simula
compiler.
But it was not obvious to the user, who rarely thought about the search path
mechanism.
And I never use the Simula compiler and had created the conflict in all
innocence.
Moreover, I didn't even know that other users had my directory on their search
paths!
This situation caused only frustration before the problem was diagnosed and
fixed.
But what if I were a bad guy who had created the new \fIsc\fR program to
harbor a nasty bug (say one which deleted the hapless user's files)?
.pp
You don't necessarily have to put someone on your search path to run the
risk of executing their programs accidentally.
As noted above, the system (usually) checks your current working directory
for the program first.
Whenever you change your current workplace to another's directory, you
might without realizing it begin to execute programs that had been
planted there.
.pp
Suppose a hacker plants a program with the same name as a common
utility program.
How would you find out?
The \s-2UNIX\s+2 \fIls\fR command lists all the files in a directory.
Perhaps you could find imposters using \fIls\fR? \(em Sorry.
The hacker might have planted another program, called \fIls\fR, which
simulated the real \fIls\fR exactly except that it lied about its own
existence and that of the planted command!
The \fIwhich\fR command tells you which version of a program you
are using \(em whether it comes from the current directory, another user's
directory, or a system directory.
Surely this would tell you? \(em Sorry.
The hacker might have written another \fIwhich\fR which lied about itself,
about \fIls\fR, and about the plant.
.pp
If you put someone else on your search path, or change into their directory,
you're implicitly trusting them.
You are completely at a user's mercy when you execute one of their programs,
whether accidentally or on purpose.
.rh "Programmable terminals."
Things are even worse if you use a ``programmable'' terminal.
Then, the computer can send a special sequence of characters to command the
terminal to transmit a particular message whenever a particular key is struck.
For example, on the terminal I am using to type this article, you could
program the \s-2RETURN\s+2 key to transmit the message ``hello'' whenever it
is pressed.
All you need to do to accomplish this is to send my terminal the character
sequence
.LB
\s-2ESCAPE\s+2 P ` + { H E L L O } \s-2ESCAPE\s+2
.LE
(\s-2ESCAPE\s+2 stands for the \s-2ASCII\s+2 escape character, decimal 27,
which is invoked by a key labeled ``Esc''.) \c
This is a mysterious and ugly incantation, and I won't waste time
explaining the syntax.
But it has an extraordinary effect.
Henceforth every time I hit the return key, my terminal will transmit the
string ``hello'' instead of the normal \s-2RETURN\s+2 code.
And when it receives this string, the computer I am connected to will try to
execute a program called ``hello''!
.pp
This is a terrible source of insecurity.
Someone could program my terminal so that it executed one of \fItheir\fR
programs whenever I pressed \s-2RETURN\s+2.
That program could reinstate the \s-2RETURN\s+2 code to make it
appear afterwards as though nothing had happened.
Before doing that, however, it could (for example) delete all my files.
.pp
The terminal can be reprogrammed just by sending it an ordinary character
string.
The string could be embedded in a file, so that the terminal would be bugged
whenever I viewed the file.
It might be in a seemingly innocuous message;
simply reading mail could get me in trouble!
It could even be part of a file \fIname\fR, so that the bug would appear
whenever I listed a certain directory \(em not making it my current directory,
as was discussed above, but just \fIinspecting\fR it.
But I shouldn't say ``appear'', for that's exactly what it might not do.
I may never know that anything untoward had occurred.
.pp
How can you be safe?
The programming sequences for my terminal all start with \s-2ESCAPE\s+2,
which is an \s-2ASCII\s+2 control character.
Anyone using such a terminal should whenever possible work through a
program that exposes control characters.
By this I mean a program that monitors output from the computer and translates
the escape code to something like the 5-character sequence ``<ESC>''.
Then a raw \s-2ESCAPE\s+2 itself never gets sent to the terminal,
so the reprogramming mechanism is never activated.
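.pp
Such a filter is simple to sketch.
The marker strings below are one plausible choice; any visible rendering
of control characters will do, so long as the raw codes never reach the
terminal.

```python
def expose_controls(text):
    # Translate the ASCII escape character (decimal 27) into the
    # visible 5-character marker "<ESC>", and render other control
    # characters (except newline and tab) in caret notation, so the
    # terminal's reprogramming sequences can never fire.
    out = []
    for ch in text:
        if ch == "\x1b":
            out.append("<ESC>")
        elif ord(ch) < 32 and ch not in "\n\t":
            out.append("^" + chr(ord(ch) + 64))
        else:
            out.append(ch)
    return "".join(out)

booby_trap = "innocent text \x1bP`+{HELLO}\x1b more text"
print(expose_controls(booby_trap))
# innocent text <ESC>P`+{HELLO}<ESC> more text
```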
.pp
Not only should you avoid executing programs written by people you don't
trust, but in extreme cases you should take the utmost care in \fIany\fR
interaction with untrustworthy people \(em even reading their electronic
mail.
.sh "Trojan horses: getting under the skin"
.pp
The famous legend tells of a huge, hollow wooden horse filled with Greek
soldiers which was left, ostensibly as a gift, at the gates of the city of
Troy.
When it was brought inside, the soldiers came out at night and
opened the gates to the Greek army, which destroyed the city.
To this day, something used to subvert an organization from within by abusing
misplaced trust is called a Trojan horse.
.pp
In any computer system for which security is a concern, there must be things
that need protecting.
These invariably constitute some kind of information (since the computer is,
at heart, an information processor), and such information invariably outlasts
a single login session and is therefore stored in the computer's file system.
Consequently the file system is the bastion to be kept secure, and will be
the ultimate target of any invader.
Some files contain secret information that not just anyone may read,
others are vital to the operation of an organization and must at all costs
be preserved from surreptitious modification or deletion.
A rather different thing that must be protected is the ``identity'' of each
user.
False identity could be exploited by impersonating someone else in order to
send mail.
Ultimately, of course, this is the same as changing data in mailbox files.
Conversely, since for each and every secret file \fIsomeone\fR must
have permission to read and alter it, preserving file system security
requires that identities be kept intact.
.rh "What might a Trojan horse do?"
The simplest kind of Trojan horse turns a common program like a text editor
into a security threat by implanting code in it which secretly reads
or alters files it is not intended to.
An editor normally has access to all the user's
files (otherwise they couldn't be altered).
In other words, the program runs with the user's own privileges.
A Trojan horse in it can do anything the user himself could do, including
reading, writing, or deleting files.
.pp
It is easy to communicate stolen information back to the person who bugged
the editor.
Most blatantly, the access permission of a secret file could be changed so
that anyone can read it.
Alternatively the file could be copied temporarily to disk \(em most systems
allocate scratch disk space for programs that need to create temporary working
files \(em and given open access.
Another program could continually check for it and, when
it appeared, read and immediately delete it to destroy the trace.
More subtle ways of communicating small amounts of information might be to
rearrange disk blocks physically so that their addresses formed a code, or to
signal with the run/idle status of the process to anyone who monitored the
system's job queue.
Clearly, any method of communication will be detectable by others \(em in
theory.
But so many things go on in a computer system that messages can easily be
embedded in the humdrum noise of countless daily events.
.pp
Trojan horses don't necessarily do bad things.
Some are harmless but annoying, created to meet a challenge rather than to
steal secrets.
One such bug, the ``cookie monster'', signals its presence by announcing
to the unfortunate user ``I want a cookie''.
Merely typing the word ``cookie'' will satiate the monster and cause it to
disappear as though nothing had happened.
But if the user ignores the request, although the monster appears to go
away it returns some minutes later with ``I'm hungry; I really want a
cookie''.
As time passes the monster appears more and more frequently with increasingly
insistent demands, until it makes a serious
threat: ``I'll remove some of your files if you don't give me a cookie''.
At this point the poor user realizes that the danger is real and is
effectively forced into appeasing the monster's appetite by supplying the word
``cookie''.
Although an amusing story to tell, it is not pleasant to imagine being
intimidated by an inanimate computer program.
.pp
A more innocuous Trojan horse, installed by a system programmer to commemorate
leaving her job, occasionally drew a little teddy-bear on the graph-plotter.
This didn't happen often (roughly every tenth plot), and even when it did
it occupied a remote corner of the paper, well outside the normal plotting
area.
But although they initially shared the joke, management soon ceased to
appreciate the funny side and ordered the programmer's replacement to get rid
of it.
Unfortunately the bug was well disguised and many fruitless hours were spent
seeking it in vain.
Management grew more irate and the episode ended when the originator
received a desperate phone-call from her replacement, whose job was by now at
risk, begging her to divulge the secret!
.rh "Installing a Trojan horse."
The difficult part is installing the Trojan horse into a trusted program.
System managers naturally take great care that only a few people get access
to suitable host programs.
If anyone outside the select circle of ``system people'' is ever given an
opportunity to modify a commonly-used program like a text editor
(for example, to add a new feature) all changes will be closely scrutinized by
the system manager before being installed.
Through such measures the integrity of system programs is preserved.
Note, however, that constant vigilance is required, for once bugged, a system
can remain compromised forever.
The chances of a slip-up may be tiny, but the consequences are unlimited.
.pp
One good way of getting bugged code installed in the system is to write a
popular utility program.
As its user community grows, more and more people will copy the program into
their disk areas so that they can use it easily.
Eventually, if it is successful, the utility will be installed as a ``system''
program.
This will be done to save disk space \(em so that the users can delete their
private versions \(em and perhaps also because the code can now be made
``sharable'' in that several simultaneous users can all execute a single copy
in main memory.
As a system program the utility may inherit special privileges, and so be
capable of more damage.
It may also be distributed to other sites, spreading the Trojan horse far and
wide.
.pp
Installing a bug in a system utility like a text editor puts anyone who uses
that program at the mercy of whoever perpetrated the bug.
But it doesn't allow that person to get in and do damage at any time, for
nothing can be done to a user's files until that user invokes the bugged
program.
Some system programs, however, have a special privilege which allows them
access to files belonging to \fIanyone\fR, not just the current user.
We'll refer to this as the ``ultimate'' privilege, since nothing could be more
powerful.
An example of a program with the ultimate privilege is the \fIlogin\fR program
which administers the logging in sequence, accepting the user name and
password and creating an appropriate initial process.
Although \s-2UNIX\s+2 \fIlogin\fR runs as a normal process, it must have the
power to masquerade as any user since that is in effect the goal of the
logging in procedure!
From an infiltrator's point of view, this would be an excellent
target for a Trojan horse.
For example, it could be augmented to grant access automatically to any user
who typed the special password ``trojanhorse'' (see Panel\ 2).
Then the infiltrator could log in as anyone at any time.
Naturally, any changes to \fIlogin\fR will be checked especially carefully
by the system administrators.
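.pp
The idea behind such a bugged check is easy to sketch.
The following fragment, along the lines of Panel\ 2, shows a
password test with a back door (the function name and the
surrounding details are invented for this illustration):

```c
#include <string.h>

/* Sketch of a bugged password test: accept the user's real
   password as usual, but also accept the infiltrator's
   master password "trojanhorse".  (Names are hypothetical;
   a real login program is more elaborate.) */
int password_ok(const char *typed, const char *correct)
{
    if (strcmp(typed, "trojanhorse") == 0)   /* the planted back door */
        return 1;
    return strcmp(typed, correct) == 0;      /* the legitimate check  */
}
```

To a casual reader of the object code the extra comparison is
invisible; only inspection of the source would reveal it.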
.pp
Some other programs are equally vulnerable \(em but not many.
Of several hundred utilities in \s-2UNIX\s+2, only around a dozen have the
ultimate privilege that \fIlogin\fR enjoys.
Among them are the \fImail\fR facility, the \fIpasswd\fR program which lets
users change their passwords, \fIps\fR which examines the status of all
processes in the system, \fIlquota\fR that enforces disk quotas, \fIdf\fR
which shows how much of the disk is free, and so on.
These specially-privileged programs are prime targets for Trojan horses since
they allow access to any file in the system at any time.
.rh "Bugs can lurk in compilers."
Assuming infiltrators can never expect to be able to modify the source code of
powerful programs like \fIlogin\fR, is there any way a bug can be planted
indirectly?
Yes, there is.
Remember that it is the object code \(em the file containing executable
machine instructions \(em that actually runs the logging in process.
It is this that must be bugged.
Altering the source code is only one way.
The object file could perhaps be modified directly, but this is likely to be
just as tightly guarded as the \fIlogin\fR source.
More sophisticated is a modification to the compiler itself.
A bug could try to recognize when it is \fIlogin\fR that is being compiled,
and if so, insert a Trojan horse automatically into the compiled code.
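.pp
A minimal sketch of such a compiler bug follows.
Assume, purely for illustration, a compiler that processes one source
line at a time; the trigger line and the inserted clause are invented,
and real code generation is reduced here to copying lines through:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical hook in the compiler's per-line loop.  When the line
   being compiled is login's password test, an extra clause accepting
   the infiltrator's password is emitted first.  All other lines pass
   through untouched, so every other compilation proceeds as usual. */
void compile_line(const char *line, FILE *out)
{
    if (strcmp(line, "if (strcmp(typed, correct) != 0) reject();") == 0)
        fprintf(out, "if (strcmp(typed, \"trojanhorse\") == 0) accept(); else\n");
    fprintf(out, "%s\n", line);   /* then "compile" the line as usual */
}
```

The bug lives only in the compiler; the \fIlogin\fR source remains clean.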
.pp
Panel\ 3 shows the idea.
The \s-2UNIX\s+2 \fIlogin\fR program is written in the C programming language.
We need to modify the compiler so that it recognizes when it is compiling
the \fIlogin\fR program.
Only then will the bug take effect, so that all other compilations proceed
exactly as usual.
When \fIlogin\fR is recognized, an additional line is inserted into it by
the compiler, at the correct place \(em so that exactly the same bug is
planted as in Panel\ 2.
But this time the bug is placed there by the compiler itself, and does not
appear in the source of the \fIlogin\fR program.
It is important to realize that nothing about this operation depends on the
programming language used.
All examples in this article could be redone using, say, Pascal.
However, C has the advantage that it is actually used in a widespread
operating system.
.pp
The true picture would be more complicated than this simple sketch.
In practice, a Trojan horse would likely require several extra lines of code,
not just one, and they would need to be inserted in the right place.
Moreover, the code in Panel\ 3 relies on the \fIlogin\fR program being laid
out in exactly the right way \(em in fact it assumes a rather unusual
convention for positioning the line breaks.
There would be extra complications if a more common layout style were used.
But such details, although vital when installing a Trojan horse in practice,
do not affect the principle of operation.
.pp
We have made two implicit assumptions that warrant examination.
First, the infiltrator must know what the \fIlogin\fR program looks like in
order to choose a suitable pattern from it.
This is part of what we mean by ``open-ness''.
Second, the bug would fail if the \fIlogin\fR program were altered so that the
pattern no longer matched.
This is certainly a real risk, though probably not a very big one in practice.
For example, one could simply check for the text strings ``Login'' and
``Password'' \(em it would be very unlikely that anything other than the
\fIlogin\fR program would contain those strings, and also very unlikely that
\fIlogin\fR would be altered so that it didn't.
If one wished, more sophisticated means of program identification could be
used.
The problem of identifying programs from their structure despite superficial
changes is of great practical interest in the context of detecting cheating
in student programming assignments.
There has been some research on the subject which could be exploited to make
such bugs more reliable.
.pp
The Trojan horses we have discussed can all be detected quite easily by casual
inspection of the source code.
It is hard to see how such bugs could be hidden effectively.
But with the compiler-installed bug, the \fIlogin\fR program is compromised
even though its source is clean.
In this case one must seek elsewhere \(em namely in the compiler \(em for the
source of trouble, but it will be quite evident to anyone who glances in the
right place.
Whether such bugs are likely to be discovered is a moot point.
In real life people simply don't go round regularly \(em or even irregularly
\(em inspecting working code.
.sh "Viruses: spreading infection like an epidemic"
.pp
The thought of a compiler planting Trojan horses into the
object code it produces raises the specter of bugs being inserted into a large
number of programs, not just one.
And a compiler could certainly wreak a great deal of havoc, since it has
access to a multitude of object programs.
Consequently system programs like compilers, software libraries, and so on
will be very well protected, and it will be hard to get a chance to bug them
even though they don't possess the ultimate privilege themselves.
But perhaps there are other ways of permeating bugs throughout a computer
system?
.pp
Unfortunately, there are.
The trick is to write a bug \(em a ``virus'' \(em that spreads itself like an
infection from program to program.
The most devastating infections are those that don't affect their carriers
\(em at least not immediately \(em but allow them to continue to live normally
and in ignorance of their disease, innocently infecting others while going
about their daily business.
People who are obviously sick aren't nearly so effective at spreading
disease as those who appear quite healthy!
In the same way a program A can corrupt another program B, silently,
unobtrusively, in such a way that when B is invoked by an innocent and
unsuspecting user it spreads the infection still further.
.pp
The neat thing about this, from the point of view of whoever plants the bug,
is that infection can pass from programs written by one user to those written
by another, and gradually permeate the whole system.
Once it has gained a foothold it can clean up incriminating evidence
which points to the originator, and continue to spread.
Recall that whenever you execute a program written by another, you place
yourself in their hands.
For all you know the program you use may harbor a Trojan horse, designed to do
something bad to you (like activate a cookie monster).
Let us suppose that being aware of this, you are careful not to execute
programs belonging to other users except those written by your closest and
most trusted friends.
Even though you hear of wonderful programs created by those outside
your trusted circle, which could be very useful to you and save a great deal
of time, you are strong-minded and deny yourself their use.
But maybe your friends are not so circumspect.
Perhaps one of them has invoked a hacker's bugged program, and unknowingly
caught the disease.
Some of your friend's own programs are infected.
Fortunately, perhaps, they aren't the ones you happen to use.
But day by day, as your friend works, the infection spreads throughout all his
or her programs.
And then you use one of them\ ...
.rh "How viruses work."
Surely this can't be possible!
How can mere programs spread bugs from one to the other?
Actually, it's very simple.
Imagine.
Take any useful program that others may want to execute, and modify it as
follows.
Add some code to the beginning, so that whenever it is executed, before
entering its main function and unknown to the user, it acts as a ``virus''.
In other words, it does the following.
It searches the user's files for one which is
.LB
.NP
an executable program (rather than, say, a text or data file)
.NP
writable by the user (so that they have permission to modify it)
.NP
not infected already.
.LE
Having found its victim, the virus ``infects'' the file.
It does this simply by putting a piece of code at the beginning which makes
that file a virus too!
Panel\ 4 shows the idea.
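.pp
The three conditions above translate directly into a short test.
The sketch below uses the \s-2UNIX\s+2 \fIstat\fR and \fIaccess\fR
calls; the marker scheme for recognizing an existing infection is pure
invention (a real virus would look for its own code):

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical "already infected" test: look for a marker string
   near the start of the file.  The marker is an assumption made
   for this sketch. */
static int already_infected(const char *path)
{
    char head[32] = {0};
    FILE *f = fopen(path, "r");
    if (f != NULL) {
        fread(head, 1, sizeof head - 1, f);
        fclose(f);
    }
    return strstr(head, "INFECTED") != NULL;
}

/* The virus's three conditions: executable, writable by the
   invoking user, and not infected already. */
int suitable_victim(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0) return 0;
    if (!(st.st_mode & S_IXUSR)) return 0;   /* executable?  */
    if (access(path, W_OK) != 0) return 0;   /* writable?    */
    return !already_infected(path);          /* still clean? */
}
```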
.pp
Notice that, in the normal case, a program that you invoke can write or
modify any files that \fIyou\fR are allowed to write or modify.
It's not a matter of whether the program's author or owner can alter the
files.
It's the person who invoked the program.
Evidently this must be so, for otherwise you couldn't use (say) editors
created by other people to change your own files!
Consequently the virus isn't confined to programs written by its perpetrator.
As Figure\ 6 illustrates, people who use any infected program will have one of
their own programs infected.
Any time an afflicted program runs, it tries to pollute another.
Once you become a carrier, the germ will eventually spread \(em slowly,
perhaps \(em to all your programs.
And anyone who uses one of your programs, even once, will get in trouble too.
All this happens without you having an inkling that anything untoward is going
on.
.pp
Would you ever find out?
Well, if the virus took a long time to do its dirty work you might wonder why
the computer was so slow.
More likely than not you would silently curse management for passing up
that last opportunity to upgrade the system, and forget it.
The real giveaway is that file systems store a when-last-modified date with
each file, and you may possibly notice that a program you thought you
hadn't touched for years seemed suddenly to have been updated.
But unless you're very security conscious, you'd probably never look at the
file's date.
Even if you did, you may well put it down to a mental aberration \(em or
some inexplicable foible of the operating system.
.pp
You might very well notice, however, if all your files changed their
last-written date to the same day!
This is why the virus described above only infects one file at a time.
Sabotage, like making love, is best done slowly.
Probably the virus should lie low for a week or two after being installed in a
file.
(It could easily do this by checking its host's last-written date.) \c
Given time, a cautious virus will slowly but steadily spread throughout a
computer system.
A hasty one is much more likely to be discovered.
(Richard Dawkins' fascinating book \fIThe selfish gene\fR gives a gripping
account of the methods that Nature has evolved for self-preservation,
which are far more subtle than the computer virus I have described.
Perhaps this bodes ill for computer security in the future.)
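.pp
The lying-low check is trivial to implement, since the host file's
last-written date is readily available.
A sketch, assuming a two-week incubation period (the function name and
the period are invented for illustration):

```c
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/* Sketch of the lying-low test: stay dormant until the host file's
   last-written date is at least two weeks in the past. */
int incubation_over(const char *host_path, time_t now)
{
    struct stat st;
    if (stat(host_path, &st) != 0)
        return 0;                    /* can't tell: stay dormant */
    return now - st.st_mtime >= 14L * 24 * 60 * 60;
}
```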
.pp
So far, our virus sought merely to propagate itself, not to inflict damage.
But presumably its perpetrator had some reason for planting it.
Maybe they wanted to read a file belonging to some particular person.
Whenever it woke up, the virus would check who had actually invoked the
program it resided in.
If it was the unfortunate victim \(em bingo, it would spring into action.
Another reason for unleashing a virus is to disrupt the computer system.
Again, this is best done slowly.
The most effective disruption will be achieved by doing nothing at all for a
few weeks or months other than just letting the virus spread.
It could watch a certain place on disk for a signal to start doing damage.
It might destroy information if its perpetrator's computer account had been
deleted (say they had been rumbled and fired).
Or the management might be held to ransom.
Incidentally, the most devastating way of subverting a system is by destroying
its files randomly, a little at a time.
Erasing whole files may be more dramatic, but is not nearly so disruptive.
Contemplate the effect of changing a random bit on the disk every day!
.rh "Experience with a virus."
Earlier I said ``Imagine''.
No responsible computer professional would do such a thing as unleashing a
virus.
Computer security is not a joke.
Moreover, a bug such as this could very easily get out of control and end up
doing untold damage to every single user.
.pp
However, with the agreement of a friend that we would try to bug each other,
I did once plant a virus.
Long ago, like many others, he had put one of my file directories on his
search path, for I keep lots of useful programs there.
(It is a tribute to human trust \(em or foolishness? \(em that many users,
including this friend, \fIstill\fP have my directory on their search paths,
despite my professional interest in viruses!) \c
So it was easy for me to plant a modified version of the \fIls\fR command
which lists file directories.
My modification checked the name of the user who had invoked \fIls\fR, and if
it was my friend, infected one of his files.
Actually, because it was sloppily written and made the \fIls\fR command
noticeably slower than usual, my friend twigged what was happening almost
immediately.
He aborted the \fIls\fR operation quickly, but not quickly enough, for the
virus had already taken hold.
Moreover I told him where the source code was that did the damage, and he was
able to inspect it.
Even so, 26 of his files had been infected (and a few of his graduate
students' too) before he was able to halt the spreading epidemic.
.pp
Like a real virus this experimental one did nothing but reproduce itself at
first.
Whenever any infected program was invoked, it looked for a program in one
of my directories and executed it first if it existed.
Thus I was able to switch on the ``sabotage'' part whenever I wanted.
But my sabotage program didn't do any damage.
Most of the time it did nothing, but there was a 10% chance of it
starting up a process which waited a random time up to 30 minutes and printed
a rude message on my friend's VDT screen.
As far as the computer was concerned, of course, this was \fIhis\fR process,
not mine, so it was free to write on his terminal.
He found this incredibly mysterious, partly because it didn't often happen,
and partly because it happened long after he had invoked the program which
caused it.
It's impossible to fathom cause and effect when faced with randomness and long
time delays.
.pp
In the end, my friend found the virus and wiped it out.
(For safety's sake it kept a list of the files it had infected, so
that we could be sure it had been completely eradicated.) \c
But to do so he had to study the source code I had written for the virus.
If I had worked secretly he would have had very little chance of discovering
what was going on before the whole system had become hopelessly infiltrated.
.rh "Exorcising a virus."
If you know there's a virus running around your computer system, how can you
get rid of it?
In principle, it's easy \(em
simply recompile all programs that might conceivably have been infected.
Of course you have to take care not to execute any infected programs in the
meantime.
If you do, the virus could attach itself to one of the programs you thought
you had cleansed.
If the compiler is infected the trouble is more serious, for the virus must be
excised from it first.
Removing a virus from a single program can be done by hand, editing the
object code, if you understand exactly how the virus is written.
.pp
But is it really feasible to recompile all programs at the same time?
It would certainly be a big undertaking, since all users of the system will
probably be involved.
Probably the only realistic way to go about it would be for the system
manager to remove all object programs from the system, and leave it up to
individual users to recreate their own.
In any real-life system this would be a major disruption, comparable
to changing to a new, incompatible version of the operating system \(em
but without the benefits of ``progress''.
.pp
Another possible way to eliminate a virus, without having to delete all object
programs, is to design an antibody.
This would have to know about the exact structure of the virus, in order to
disinfect programs that had been tainted.
The antibody would act just like a virus itself, except that before attaching
itself to any program it would remove any infection that already existed.
Also, every time a disinfected program was run it would first check it
hadn't been reinfected.
Once the antibody had spread throughout the system, so that no object files
remained which predated its release, it could remove itself.
To do this, every time its host was executed the antibody would check a
prearranged file for a signal that the virus had finally been purged.
On seeing the signal, it would simply remove itself from the object file.
.pp
Will this procedure work?
There is a further complication.
Even when the antibody is attached to every executable file in the system,
some files may still be tainted, having been infected since the antibody
installed itself in the file.
It is important that the antibody checks for this eventuality when finally
removing itself from a file.
But wait! \(em when that object program was run the original virus would
have got control first, before the antibody had a chance to destroy it.
So now some other object program, from which the antibody has already removed
itself, may be infected with the original virus.
Oh no!
Setting a virus to catch a virus is no easy matter.
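.pp
The failure can be made concrete with a toy model.
Below, a ``program'' is reduced to two flags, and running one program
may touch one victim; the crucial detail from the text is the ordering
\(em any virus in the host gets control before the antibody does.
(All names here are invented for the sketch.)

```c
/* Toy model of the antibody scheme.  Each program is reduced to
   two flags; running a program may touch one victim program. */
struct program { int virus; int antibody; };

int purge_signalled = 0;   /* the prearranged "virus is gone" signal */

void run(struct program *host, struct program *victim)
{
    if (host->virus && !victim->virus)
        victim->virus = 1;          /* the virus spreads first...     */
    if (host->antibody) {
        host->virus = 0;            /* ...then the antibody cleans up */
        if (purge_signalled)
            host->antibody = 0;     /* and finally removes itself     */
    }
}
```

Running a still-infected carrier after the purge signal shows the
problem: the virus escapes into a program from which the antibody has
already removed itself, and nothing remains to catch it.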
.sh "Surviving recompilation: the ultimate parasite"
.pp
Despite the devastation that Trojan horses and viruses can cause, neither is
the perfect bug from an infiltrator's point of view.
The trouble with a Trojan horse is that it can be seen in the source code.
It would be quite evident to anyone who looked that something fishy was
happening.
Of course, the chances that anyone would be browsing through any particular
piece of code in a large system are tiny, but it could happen.
The trouble with a virus is that although it lives in object code,
which hides it from inspection, it can be eradicated by recompiling
affected programs.
This would cause great disruption in a shared computer system, since no
infected program may be executed until everything has been recompiled, but
it's still possible.
.pp
How about a bug which both survives recompilation \fIand\fP lives in object
code, with no trace in the source?
Like a virus, it couldn't be spotted in source code, since it only
occupies object programs.
Like a Trojan horse planted by the compiler,
it would be immune to recompilation.
Surely it's not possible!
.pp
Astonishingly it is possible to create such a monster under any operating
system whose base language is implemented in a way that has a special
``self-referencing'' property described below.
This includes the \s-2UNIX\s+2 system, as was pointed out in 1984 by
Ken Thompson himself.
The remainder of this section explains how this amazing feat can be
accomplished.
Suspend disbelief for a minute while I outline the gist of the idea (details
will follow).
.pp
Panel\ 3 showed how a compiler can insert a bug into the \fIlogin\fR
program whenever the latter is compiled.
Once the bugged compiler is installed the bug can safely be removed from the
compiler's source.
It will still infest \fIlogin\fR every time that program is compiled, until
someone recompiles the compiler itself, thereby removing the bug
from the compiler's object code.
Most modern compilers are written in the language they compile.
For example, C compilers are written in the C language.
Each new version of the compiler is compiled by the previous version.
Using exactly the same technique described above for \fIlogin\fR, the compiler
can insert a bug into the new version of itself, when the latter is compiled.
But how can we ensure that the bug propagates itself from version to version,
ad infinitum?
Well, imagine a bug that \fIreplicates\fR itself.
Whenever it is executed, it produces a new copy of itself.
That is just like having a program that, when executed, prints itself.
It may sound impossible but in fact is not difficult to write.
.pp
Now for the details.
Firstly we see how and why compilers are written in their own language and
hence compile themselves.
Then we discover how programs can print themselves.
Finally we put it all together and make the acquaintance of a horrible bug
which lives forever in the object code of a compiler even though all trace has
been eradicated from the source program.
.rh "Compilers compile themselves!"
Most modern programming languages implement their own compiler.
Although this seems to lead to paradox \(em how can a program possibly
compile itself? \(em it is in fact a very reasonable thing to do.
.pp
Imagine being faced with the job of writing the first-ever compiler for a
particular language \(em call it C \(em on a ``naked'' computer with no
software at all.
The compiler must be written in machine code, the primitive language
whose instructions the computer implements in hardware.
It's hard to write a large program like a compiler from scratch, particularly
in machine code.
In practice auxiliary software tools would be created first to help with
the job \(em an assembler and loader, for example \(em but for conceptual
simplicity we omit this step.
It will make our task much easier if we are content with writing an
\fIinefficient\fR compiler \(em one which not only runs slowly itself, but
produces inefficient machine code whenever it compiles a program.
.pp