-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path02-bullets.Rmd
1577 lines (1296 loc) · 104 KB
/
02-bullets.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
\addtocounter{chapter}{1}\setcounter{section}{0}\specialchapt{CHAPTER 2. AUTOMATIC MATCHING OF BULLET LANDS}
\begin{center}
A paper submitted to the \textbf{Annals of Applied Statistics}. \\
Eric Hare, Heike Hofmann, Alicia Carriquiry
\textbf{Abstract}
\end{center}
In 2009, the National Academy of Sciences published a report questioning the scientific validity of many forensic methods including firearm examination. Firearm examination is a forensic tool used to help the court determine whether two bullets were fired from the same gun barrel. During the firing process, rifling, manufacturing defects, and impurities in the barrel create striation marks on the bullet. Identifying these striation markings in an attempt to match two bullets is one of the primary goals of firearm examination. We propose an automated framework for the analysis of the 3D surface measurements of bullet lands that first transcribes the markings into a 2D plotting framework. This makes identification of matches easier and allows for a quantification of both matches and matchability of barrels. The automatic matching routine we propose manages to (a) correctly identify lands (the surface between two bullet grooves) with too much damage to be suitable for comparison, and (b) correctly identify all 10,384 land-to-land matches of the James Hamby study [@hamby:2009].
\newpage
```{r, fig.keep='all', cache=FALSE, echo=FALSE, eval=TRUE, message=F, warning=F}
#rm(list=ls())
#wd <- getwd()
library(extrafont)
library(knitr)
imgdir <- "Figure/"
codedir <- "code"
datadir <- "images/Hamby (2009) Barrel/bullets/"
options(replace.assign=TRUE,scipen=3, digits=2)
bstats <- read.csv("data/data-25-25/bullet-stats-old.csv")
scrubPath <- function(x) {
splits <- strsplit(as.character(x), split="/")
last <- sapply(splits, function(x) x[length(x)])
gsub(".x3p","", last)
}
library(RColorBrewer)
library(ggplot2)
library(scales)
library(dplyr)
library(bulletr)
library(grid)
library(gridExtra)
library(zoo)
library(tidyr)
library(rpart)
library(rpart.plot)
library(xtable)
library(sm)
library(reshape2)
library(randomForest)
```
# Introduction
Firearm examination is a forensic tool used to help the court determine whether two bullets were fired from the same gun barrel. This process has broad applicability in terms of convictions in the United States criminal justice system. Firearms identification has long been considered an accepted and reliable procedure, but in the past ten years has undergone more significant scrutiny. In 2005, in *United States vs. Green*, the court ruled that the forensic expert could not confirm that the bullet casings came from a specific weapon with certainty, but could merely "describe" other casings which are similar. Further court cases in the late 2000s expressed caution about the use of firearms identification evidence [@giannelli:2011].
In 2009, the National Academy of Sciences published a report [@NAS:2009] questioning the scientific validity of many forensic methods including firearm examination. The report states that "[m]uch forensic evidence -- including, for example, bite marks and firearm and toolmark identification is introduced in criminal trials without any meaningful scientific validation, determination of error rates, or reliability testing to explain the limits of the discipline."
Rifling, manufacturing defects, and impurities in a barrel create striation marks on the bullet during the firing process. These marks are assumed to be unique to the barrel, as described in a 1992 AFTE article [@afte:1992]. "The theory of identification as it pertains to the comparison of toolmarks enables opinions of common origin to be made when the *unique surface contours* of two toolmarks are in sufficient agreement". The article goes on to state that "Significance is determined by the comparative examination of two or more sets of surface contour patterns comprised of individual peaks, ridges and furrows."
From a statistical standpoint, identification of the gun that fired the bullet(s) requires that we compare the probabilities of observing matching striae under the competing hypotheses that the gun fired, or did not fire, the crime scene bullet. If indeed the uniqueness assumption is plausible, the latter probability approaches zero.
Current firearm examination practice relies mostly on visual assessment and comparison of striation. Indeed, the AFTE Theory of Identification (https://afte.org/about-us/what-is-afte/afte-theory-of-identification) explicitly requires that examiners evaluate the strength of similarity between two samples relative to other comparisons they may have carried out in the past. An attempt to quantify the degree of similarity consists in counting the number of consecutively matching striae (CMS) between two bullets, first proposed by @biasotti:1959. This approach has two drawbacks, however. First, determining matching striae is still a subjective activity. Second, as discussed by @miller:1998, the number of CMS may be high even if the bullets were not fired by the same gun.
Here, we focus on the question of defining a metric that can be used to objectively compare two bullets. We propose a framework which allows for the automatic analysis of the surface topologies of bullets, and the transcription of the individual characteristics into a 2D plotting framework.
We work with images from the James Hamby Consecutively Rifled Ruger Barrel Study [@hamby:2009]. Ten consecutively rifled Ruger P-85 pistol barrels were obtained from the manufacturer and fired to produce 20 known test bullets and 15 unknown bullets for comparison. 3D topographical images of each bullet were obtained using a NanoFocus lens at 20x magnification and made publicly available on the NIST Ballistics Database Project\footnote{\url{http://www.nist.gov/forensics/ballisticsdb/hamby-consecutively-rifled-barrels.cfm}} in a format called x3p (XML 3-D Surface Profile). The x3p format conforms to the ISO5436-2 standard\footnote{\url{http://sourceforge.net/p/open-gps/mwiki/X3p/}}, implemented to provide a simple and standard conforming way to exchange 2D and 3D profile data. It was adopted by the OpenFMC (Open Forensic Metrology Consortium\footnote{\url{http://www.openfmc.org/}}), a group of academic, industry, and government firearm forensics researchers whose aim is to establish best practices for researchers using metrology in forensic science. We have developed an open-source package for analyzing bullet lands written in R [@R]. This package is called bulletr [@bulletr] and enables direct reading and manipulation of x3p files. It also implements all of the methods we propose in this paper. A different package exists for reading x3p files called x3pr [@x3pr] developed by Petraco (2014), but it is not designed to carry out calculations like the ones we propose after the x3p files have been read.
\begin{figure}[H]
\centering
\includegraphics[width=\linewidth]{images/sidex3p.png}
\caption{View of the data along the circumference of the bullet (circular segment of about 30 degrees).}
\label{fig:sidex3p}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=\linewidth]{images/topx3p.png}
\caption{Frontal view of a bullet land (lower end of the view is the bottom of the bullet).}
\label{fig:topx3p}
\end{figure}
```{r, echo=FALSE}
typical_width <- read.csv("csvs/grooves.csv") %>%
summarise(length = mean(groove_right - groove_left)) %>%
as.numeric %>%
round(digits = 2)
```
Each fired bullet is provided in the form of a set of six x3p files, where each file is a surface scan between adjacent grooves on the bullet, called a "land". In the Hamby data, typical length (groove-to-groove) of a land is about `r typical_width` micrometers or `r typical_width / 1000` millimeters. For notational simplicity, we refer to a particular land of a bullet as bullet X-Y, where X is the bullet identifier, and Y is the land number. An example of plotting one of these lands is given in Figures \ref{fig:sidex3p} and \ref{fig:topx3p}. These figures show side and top profiles of the land respectively. The tilt of the lines to the left in Figure \ref{fig:topx3p} is not an artifact, but a direct and expected consequence of the spin induced by the rifling during the firing process. Depending on whether a barrel is rifled clockwise or counter-clockwise, the striations have a left or right tilt. The direction of the rifling is a class characteristic, i.e. a feature that pertains to a particular class of firearms, and is not unique at the individual barrel/bullet level.
The typical number and width of striation markings on bullets varies significantly depending on the gun barrel. For instance, a Smith and Wesson barrel with a land-width of 2.4 millimeters contained an average 60 striae, with an average width of about 0.08 millimeters [@chu:2011].
The purpose of our paper is to present an automatic matching routine that allows for a completely objective assessment of the strength of a match between two bullet lands. While we assess the performance of the algorithm in terms of a binary decision of match vs. non-match using a 50% probability cut-off, our primary goal is to highlight the features that are statistically associated with matches and non-matches, and to provide a quantitative assessment of this association. In a real-world application of our algorithm, the raw scores would need further analysis and scrutiny, and it is likely a 50% cut-off would be an inappropriate choice on the basis of reasonable doubt.
Our algorithm is fully open source and available on GitHub [@bulletr]. This transparency allows for a greater understanding of the individual steps involved in the bullet matching process, and allows other forensic examiners, as well as outside observers, to examine the factors that discriminate between known bullet matches and non-matches. We have chosen to perform the matching on a land-to-land level, rather than bullet-to-bullet level. Although doing so introduces an implicit assumption of independence between lands, assuming independence only serves to make the task more challenging.
The remainder of this paper is structured as follows: We first briefly review some earlier work. We then discuss two methods of modeling the class structure of the bullet surfaces. Finally, we proceed to describing an automatic matching routine which we evaluate on the bullets made available through the Hamby study.
# Previous Work
There have been attempts to develop automatic or semi-automatic matching protocols, but most have focused on breech face and firing pin marks [e.g., @riva:2014] or discuss a single attribute for comparison [e.g., @vorburger:2011, @chu:2011]. Still others refer to proprietary algorithms [@roberge:2006]. We briefly review some of this earlier work in what follows.
The original paper on the complete Hamby study already reports the successful use of several computer-assisted methods. However, aside from a zero false positive rate, false-negative error rates for bullets are not given nor are error rates for land-to-land matches mentioned.
@lock2013significance proposed an approach to quantify similarity of toolmarks. Their algorithm determines an optimal matching window between two toolmark signatures, and then performs a set of both coordinated and independent shifts. Given a match, the coordinated shifts would be expected to yield correlation values higher than those obtained from independent shifts. This is assessed using a Mann-Whitney U Statistic.
A procedure for bullet matching using the BulletTrax3D system is described in @roberge:2006. Their study used a different set of ten consecutively rifled barrels; matches are identified based on a bullet-to-bullet correlation score. The authors state that this process `could be automated', but no implementation of the algorithm is available.
Modern automated techniques using 3D images have also been proposed by e.g. @riva:2014. However, the authors focused on cartridge cases and not bullets. This might seem like a trivial distinction, but it has implications for the development of the algorithm. The algorithm performs alignment of striae by rotation of the XY plane, which is not generalizable to bullets in which the XY plane is not flat.
Other work on 3D images has been described by @petraco:2012, who also focus on cartridge cases, as well as screwdriver striation patterns, and by others [e.g., @chu:2011, @chu:2010, @vorburger:2011].
# Bullet Signatures
To analyze the striation pattern, we extract a **bullet profile** [@ma:2004] by taking a cross section of the surface measurements at a fixed height $x$ along the bullet land. Figure \ref{fig:fixedX} shows a plot of the side profile of a bullet land. It can be seen that the global structure of the land dominates the appearance of the plot. The grooves can be clearly identified on the left and right side, and the curvature of the surface is the most visible feature in the middle.
\begin{figure}[hbtp]
\centering
```{r fixedX, echo=FALSE, warning=FALSE, message=FALSE, fig.height=2, fig.width=6, out.width='0.65\\textwidth'}
cols = c(alpha("grey60", alpha=0.6), alpha("black", 0.5))
br111 <- read_x3p(paste(datadir,"Br1 Bullet 1-5.x3p", sep = "/"))
dbr111 <- fortify_x3p(br111)
pars <- data.frame(getCircle(dbr111$y, dbr111$value))
dbr111$theta <- acos((dbr111$y-pars$x0)/pars$radius)/pi*180
dbr111 <- dbr111 %>% mutate(
xpred = cos(theta/180*pi)*pars$radius + pars$x0,
ypred = sin(theta/180*pi)*pars$radius + pars$y0
)
qplot(data=subset(dbr111, x <= 100*1.5625^2 & x >= 99*1.5625^2), y, value, geom="line", size=I(1)) +
geom_line(aes(x=xpred, y=ypred, group=x),
colour="grey30", size=0.25) +
# ylab(expression(paste("Surface Measurements (in ",mu,m,")", sep=""))) +
ylab("") +
theme_bw() +
theme(legend.position="bottom") #+ coord_equal()
```
\caption{\label{fig:fixedX}Side profile of the surface measurements (in~$\mu m$) of a bullet land at a fixed height $x$. Note that the global features dominate any deviations, corresponding to the individual characteristics of striation marks.}
\end{figure}
The smooth curve on the plot represents a segment of a perfect circle with the same radius as the bullet. While the circle is an obvious first choice for fitting the structure, it does not completely capture the bullet surface after it was fired. A discussion of a circular fit and the remaining residual structure can be found in Section \ref{cylindrical-fit}.
Instead of a circular fit, we use multiple loess fits to model the overall structure and extract the bullet markings.
## Identifying Groove Locations
We first identify the location of the left and right grooves in the image. The grooves are assumed to contain no information relevant for determining matches. They also dominate the structure, and therefore need to be removed.
Fortunately, the location and appearance of the grooves in the surface profiles is quite consistent.
Surface measurements reach local maxima around the peak of the groove at either end of the range of $y$, and we can then follow the descent of the surface measurements inwards to the valley of the groove.
The location of the valleys mark the points at which we trim the image. The procedure can be described as follows:
1. At a fixed height $x$ extract a bullet's profile (Figure \ref{fig:loess_step1}, with $x = 243.75\mu m$).
2. For each $y$ value, smooth out any deviations occurring near the minima by twice applying a rolling average with a pre-set \emph{smoothing factor} $s$. (Figure \ref{fig:loess_step3}, smoothing factor $s = 35$ corresponding to 55$\mu m$).
3. Determine the location of the peak of the left groove by finding the first doubly-smoothed value $y_i$ that is the maximum within its smoothing window (e.g. such that $y_i > y_{i - 1}$ and $y_i > y_{i + 1}$, where $i$ is between 1 and $\lfloor s/2 \rfloor$). We call the location of this peak $p_{\ell}$ (see Figure \ref{fig:loess_step47}).
4. Similarly, determine the location of the valley of the left groove by finding the first double-smoothed $y_j$ that is the minimum within its smoothing window. Call the location of this valley $v_{\ell}$.
5. Reverse the order of the $y$ values and repeat the previous two steps to find the peak and valley of the right groove, $(p_{r}, v_{r})$.
6. Trim the surface measurements to values within the two grooves (i.e.\ remove all records with $y_i < v_{\ell}$ and $y_i > v_{r}$) (see Figure \ref{fig:loess_step47}).
\begin{figure}[hbtp]
\centering
\begin{subfigure}[t]{\textwidth}\centering
\caption{\label{fig:loess_step1}Step 1 of identifying groove locations: For a fixed height ($x = 243.75\mu m$) surface measurements for bullet 1-5 are plotted across the range of $y$.}
```{r loess_step1, echo=FALSE, fig.width=10, fig.height=3.5, out.width='.5\\textwidth', warning=FALSE, message=FALSE}
br111_bullet <- get_crosscut(paste(datadir,"Br1 Bullet 1-5.x3p", sep = "/"))
br121_bullet <- get_crosscut(paste(datadir,"Br1 Bullet 2-1.x3p", sep = "/"))
br122_bullet <- get_crosscut(paste(datadir,"Br1 Bullet 2-2.x3p", sep = "/"))
br123_bullet <- get_crosscut(paste(datadir,"Br1 Bullet 2-3.x3p", sep = "/"))
br124_bullet <- get_crosscut(paste(datadir,"Br1 Bullet 2-4.x3p", sep = "/"))
br125_bullet <- get_crosscut(paste(datadir,"Br1 Bullet 2-5.x3p", sep = "/"))
br126_bullet <- get_crosscut(paste(datadir,"Br1 Bullet 2-6.x3p", sep = "/"))
p1 <- qplot(y, value, data = br111_bullet, size=I(1)) + theme_bw() + ylab("Surface measurements")
p1
```
\end{subfigure}
\begin{subfigure}[t]{\textwidth}\centering
\caption{\label{fig:loess_step3}Step 2 of identifying groove locations: The surface measurements are smoothed twice with a smoothing factor of $s = 35$. The orange rectangle shows an example of the smoothing window. Valleys and peaks are detected, if they are not within the same window.}
```{r loess_step3, echo=FALSE, fig.width=10, fig.height=3.5, out.width='.5\\textwidth', warning=FALSE, message=FALSE}
subdata <- br111_bullet
smoothfactor <- 35
value_filled <- na.fill(subdata$value, "extend")
smoothed <- rollapply(value_filled, smoothfactor, function(x) mean(x))
smoothed_truefalse <- rollapply(smoothed, smoothfactor, function(x) mean(x))
xloc=250
p3 <- ggplot() +
annotate("rect",xmin=xloc-17*1.5625, xmax=xloc+18*1.5625, ymin=80, ymax=200, fill=alpha("orange", alpha=0.5)) +
geom_point(aes(x=(1:length(smoothed_truefalse))*1.5625, y=smoothed_truefalse)) + theme_bw() +
geom_vline(xintercept=xloc, colour="grey50") + xlab("y") + ylab("Surface measurements")
p3
```
\end{subfigure}
\begin{subfigure}[t]{\textwidth}\centering
\caption{\label{fig:loess_step47}Steps 3 -- 6 of identifying groove locations: After smoothing the surface measurements extrema on the left and right are detected (marked by vertical lines, red indicating peaks and blue indicating valleys). Values outside the blue boundaries are removed (shown in grey)}
```{r loess_step47, echo=FALSE, fig.width=10, fig.height=3.5, out.width='.5\\textwidth', warning=FALSE, message=FALSE}
lengthdiff <- length(subdata$value) - length(smoothed_truefalse)
peak_ind_smoothed <- head(which(rollapply(smoothed_truefalse, 3, function(x) which.max(x) == 2)), n = 1)
peak_ind <- peak_ind_smoothed + floor(lengthdiff / 2)
groove_ind <- head(which(rollapply(tail(smoothed_truefalse, n = -peak_ind_smoothed), 3, function(x) which.min(x) == 2)), n = 1) + peak_ind
peak_ind2_smoothed_temp <- head(which(rollapply(rev(smoothed_truefalse), 3, function(x) which.max(x) == 2)), n = 1)
peak_ind2_temp <- peak_ind2_smoothed_temp + floor(lengthdiff / 2)
groove_ind2_temp <- head(which(rollapply(tail(rev(smoothed_truefalse), n = -peak_ind2_smoothed_temp), 3, function(x) which.min(x) == 2)), n = 1) + peak_ind2_temp
peak_ind2 <- length(subdata$value) - peak_ind2_temp + 1
groove_ind2 <- length(subdata$value) - groove_ind2_temp + 1
p4 <- qplot(subdata$y[subdata$y < subdata$y[groove_ind]], subdata$value[subdata$y < subdata$y[groove_ind]], colour=I("grey60"), alpha=I(.25)) +
ylim(c(min(subdata$value[!is.nan(subdata$value)]) - 25, max(subdata$value[!is.nan(subdata$value)]) + 25)) +
xlim(c(min(subdata$y) - 25, max(subdata$y) + 25)) +
geom_point(aes(x, y), data = data.frame(x = subdata$y[subdata$y > subdata$y[groove_ind2]], y = subdata$value[subdata$y > subdata$y[groove_ind2]]), colour=I("grey60"), alpha=I(.25)) +
geom_point(inherit.aes = FALSE, aes(x, y), data = data.frame(x = subdata$y[subdata$y < subdata$y[groove_ind2] & subdata$y > subdata$y[groove_ind]], y = subdata$value[subdata$y < subdata$y[groove_ind2] & subdata$y > subdata$y[groove_ind]])) +
theme_bw() +
geom_vline(xintercept = subdata$y[peak_ind], colour = "red") +
geom_vline(xintercept = subdata$y[groove_ind], colour = "blue") +
geom_vline(xintercept = subdata$y[peak_ind2], colour = "red") +
geom_vline(xintercept = subdata$y[groove_ind2], colour = "blue") +
xlab("y") + ylab("Surface measurements")
p4
```
\end{subfigure}
\caption{Overview of all six steps of the smoothing algorithm to identify and remove grooves from the bullet images.}
\end{figure}
The smoothing factor $s$ introduced in the algorithm represents the window size to use for a rolling average. Higher values of $s$ therefore lead to more smoothing. Empirically, a value of $s = 35$ for the smoothing factor seems to work well (the smoothing factor is further discussed in Section \ref{varying-smoothing-factor}). It is important to note that the smoothing pass is done twice. That is, the smoothed data are once again smoothed by computing a new rolling average with the same smoothing factor. This bears some similarities to the ideas of John Tukey in his book Exploratory Data Analysis, where he describes a smoothing process called ``twicing" in which a second pass is made on the residuals computed from the first pass and then added back to the result [@tukey:1977]. This has the effect of introducing a bit more variance back into the smoothed data.
We instead performed a second smoothing pass on the smoothed data, which has the effect of weighting observations near the center of the window the highest, with the weights linearly dropping off as we reach either end of the smoothing window.
## Removing Curvature
Next, we fit a loess regression to the data. Loess regression [@cleveland:1979] is based on the assumption that the relationship between two random variables $X$ and $Y$ can be described in the form of a smooth, continuous function $f$ with $y_i = f(x_i) + \varepsilon_i$ for all values $i = 1, ..., n$. The function $f$ is approximated via locally weighted polynomial regressions. Parameters of the estimation are $\alpha$, the proportion of all points included in the fit (here, $\alpha = 0.75$), the weighting function and the degree of the polynomial (here, we fit a quadratic regression).
The main idea of locally weighted regression is to use a weighting routine that emphasizes the effect of points close to the fitting location and de-emphasizes the effect of points as they are further away. The weighting function used here is the tricubic function $w(d) = \left(1 - d^3\right)^3$, for $d \in [0,1]$ and $w(d) = 0$ otherwise. Here, $d$ is defined as the distance between $x_i$ and the location of the fit $x_o$ and the maximum distance of the range of the $x$-values for span $\alpha$ in $x_o$.
Figure \ref{fig:loess_fit} shows the loess fit, in blue, overlaid on the processed image of bullet 1-5. The fit seems to do a reasonable job of capturing the structure of the image.
Figure \ref{fig:loess_resid} shows the residuals from this fit. These residuals are called the **signature** of bullet 1-5.
\begin{figure}[hbtp]
\centering
\begin{subfigure}[b]{.49\textwidth}\centering
\caption{\label{fig:loess_fit} Loess fit for bullet 1-5.}
```{r loess_fit, echo=FALSE, fig.width=10, fig.height=3.5, out.width='\\textwidth', warning=FALSE, message=FALSE}
br111.groove <- get_grooves(br111_bullet)
my.loess <- fit_loess(br111_bullet, br111.groove)
my.loess$fitted + ylab("Surface measurements")
```
\end{subfigure}
\begin{subfigure}[b]{.49\textwidth}\centering
\caption{\label{fig:loess_resid} Residuals of loess fit for bullet 1-5.}
```{r boot_loess, echo=FALSE, fig.width=10, fig.height=3.5, out.width='\\textwidth', warning=FALSE}
#boot.loess <- boot_fit_loess(br111_bullet, br111.groove)
#poly <- with(boot.loess, data.frame(x = c(y,rev(y)), y=c(high, rev(low))))
#qplot(x=x, y=y, geom="polygon", fill=I("steelblue"), data=poly) +
ggplot() +
geom_line(aes(x=y, y=resid), data=my.loess$data) +
ylab("Residuals of loess fit") + xlab("y") + theme_bw()
```
\end{subfigure}
\caption{Fit and residuals of a loess fit to bullet 1-5 (Barrel~1). The residuals define the {\it signature} of bullet 1-5. %Blue areas around the residuals show 95\% bootstrap confidence intervals (based on $B=1,000$ bootstrap samples).
}
\end{figure}
# Automatic Matching
Applying the loess fit to a range of different signatures (see Figure \ref{fig:manualmatch-rgl} for signatures extracted at heights between 50$\mu m$ and 150$\mu m$) shows the 3D striation marks from two bullets. Signatures of bullet~1 are shown on the left (all extracted from heights below 100$\mu m$) and signatures of bullet 2 are shown on the right (extracted at heights above 100$\mu m$). Signatures are manually aligned, resulting in many of the striation marks to continuously pass from one side to the other. Visually, this allows for an easy assessment of these two bullet lands as a match. However, this match relies on visual inspection and is therefore subjective. The goal of this section is to eliminate the need for a visual inspection during the matching process and replace it by an automatic algorithm. This also allows for a quantification of the strength of the match.
\begin{figure}[hbtp]
\centering
\includegraphics[width=.65\linewidth]{images/matchup-rgl-copy.png}
\caption{\label{fig:manualmatch-rgl}3D view of the manually adjusted side-by-side comparison of bullet~1-5 and bullet 2-1 after removing the curvature. Bullet 2-1 is shaded light grey in the background.}
\end{figure}
In this section, we describe the algorithm for matching signatures first, and the impact of parameter choices in the subsections thereafter.
## Algorithm
```{r twolands100, echo=FALSE}
images <- file.path(datadir, dir(datadir))
images <- images[grep(" ", images)]
im1 <- "images/Hamby (2009) Barrel/bullets/Br1 Bullet 1-5.x3p"
im2 <- "images/Hamby (2009) Barrel/bullets/Br1 Bullet 2-1.x3p"
#lof <- processBullets(paths = images[c(5,7)], x = 100)
subLOFx1 <- processBullets(read_x3p(im1), name = im1, x = 100)
subLOFx2 <- processBullets(read_x3p(im2), name = im2, x = 100)
#subLOFx1 <- subset(lof, bullet=="Br1 Bullet 1-5")
#subLOFx2 <- subset(lof, bullet=="Br1 Bullet 2-1")
subLOFx1$y <- subLOFx1$y + 23*1.5625 # working now!!!
lofX <- rbind(data.frame(subLOFx1), data.frame(subLOFx2))
# smooth
lofX <- lofX %>% group_by(bullet) %>% mutate(
l30 = smoothloess(y, resid, span = 0.03)
)
# cut at .75
threshold <- .75
lofX$r05 <- threshold* sign(lofX$l30) * as.numeric(abs(lofX$l30) > threshold)
lofX$type <- factor(lofX$r05)
levels(lofX$type) <- c("groove", NA, "peak")
```
```{r twolands100adjust, echo=FALSE}
images <- file.path(datadir, dir(datadir))
images <- images[grep(" ", images)]
#lof <- processBullets(paths = images[c(5,7)], x = 100)
subLOFx1 <- processBullets(read_x3p(im1), name = im1, x = 100)
subLOFx2 <- processBullets(read_x3p(im2), name = im2, x = 100)
#subLOFx1 <- subset(lof, bullet=="Br1 Bullet 1-5")
#subLOFx2 <- subset(lof, bullet=="Br1 Bullet 2-1")
subLOFx1$y <- subLOFx1$y - min(subLOFx1$y)
subLOFx2$y <- subLOFx2$y - min(subLOFx2$y)
ccf <- ccf(subLOFx1$resid, subLOFx2$resid, lag.max=100, plot=FALSE)
lag <- ccf$lag[which.max(ccf$acf)]
subLOFx1$y <- subLOFx1$y - lag*1.5625
lofY <- rbind(data.frame(subLOFx1), data.frame(subLOFx2))
```
```{r twolands100match, echo=FALSE, dependson='twolands100'}
matches <- lofX %>% group_by(y) %>% summarise(
potential = (length(unique(type)) == 1),
allnas = sum(is.na(type))/n(),
type1 = na.omit(type)[1],
type = paste(type, sep="|", collapse="|"),
n = n()
)
matches$id <- cumsum(matches$allnas == 1) + 1
matches$lineid <- as.numeric(matches$allnas != 1) * matches$id
isMatch <- function(id, type) {
if (id[1] == 0) return(FALSE)
# browser()
types <- strsplit(type, split = "|", fixed=TRUE)
t1 <- sapply(types, function(x) x[1])
t2 <- sapply(types, function(x) x[2])
if (all(t1 == "NA")) return(FALSE)
if (all(is.na(t2))) return(FALSE)
if (all(t2 == "NA")) return(FALSE)
peak <- length(grep("peak", c(t1, t2))) > 0
groove <- length(grep("groove", c(t1, t2))) > 0
if (peak & groove) return(FALSE)
return(TRUE)
}
lines <- matches %>% group_by(lineid) %>% summarise(
meany = mean(y, na.rm=T),
miny = min(y, na.rm=T),
maxy = max(y, na.rm=T) + 1.5625,
match = isMatch(lineid, type),
type = type1[1]
)
lines <- subset(lines, lineid != 0)
```
\begin{figure}[hbtp]
\centering
\begin{subfigure}[t]{\textwidth}\centering
\caption{Loess smooth of signatures at a height of $x = 100\mu m$ (span is 0.03).\label{fig:smooth}}{%
```{r smooth, echo=FALSE, dependson='twolands100', fig.width=7, fig.height=3.5, out.width='.65\\textwidth', warning=FALSE}
lofX$bulletname <- scrubPath(lofX$bullet)
qplot(data=lofX, y, resid, colour=I("grey30"), size=I(0.75), geom="line", group=bulletname) +
geom_line(aes(y=l30), colour="grey70", size=0.5) +
facet_grid(bulletname~.) +
# scale_colour_manual("", values=cols) +
theme_bw() + ylab("") +
theme(legend.position="none")
```
}
\end{subfigure}
\begin{subfigure}[t]{\textwidth}\centering
\caption{Using a rolling median peaks and valleys are identified for each signature. Peaks and valleys on the signature correspond to striation marks on the bullet's surface. \label{fig:smoothcutb}}{%
```{r smoothcut, echo=FALSE, dependson='twolands100', fig.width=7, fig.height=3.5, out.width='.65\\textwidth', warning=FALSE}
#images <- file.path(datadir, dir(datadir))
#lof <- processBullets(paths = images[c(5,7)], x = 100)
subLOFx1 <- processBullets(read_x3p(im1), name = im1, x = 100)
subLOFx2 <- processBullets(read_x3p(im2), name = im2, x = 100)
lof <- rbind(subLOFx1, subLOFx2)
lof <- bulletSmooth(lof)
bAlign = bulletAlign(lof)
lofX <- bAlign$bullet
b12 <- unique(lof$bullet)
peaks1 <- get_peaks(subset(lofX, bullet==b12[1]), smoothfactor = 25)
peaks2 <- get_peaks(subset(lofX, bullet == b12[2]), smoothfactor = 25)
peaks1$lines$bullet <- b12[1]
peaks2$lines$bullet <- b12[2]
peaks <- rbind(peaks1$lines, peaks2$lines)
peaks$bulletname <- scrubPath(peaks$bullet)
lofX$bulletname <- scrubPath(lofX$bullet)
ggplot() + theme_bw() +
geom_rect(aes(xmin=xmin, xmax=xmax, fill=factor(type)), ymin=-6, ymax=6,
data=peaks, alpha=0.2) +
geom_vline(aes(xintercept=extrema, colour=factor(type)),
data= peaks, alpha=0.7) +
scale_colour_brewer(palette="Set2") +
scale_fill_brewer(palette="Set2") +
theme(legend.position="none") +
facet_grid(bulletname~.) +
geom_line(aes(x=y, y=l30, group=bulletname), data=lofX) +
ylab(expression(paste("Signatures (in ",mu,"m)", sep="")))
```
}
\end{subfigure}
\begin{subfigure}[t]{\textwidth}\centering
\caption{Rectangles in the back identify a striation mark on one of the bullets. Matching striation marks are indicated by color filled rectangles and marked by an `o'. Mismatches are filled in grey and marked by an `x'. \label{fig:smoothcutd}}{%
```{r smoothmatch, echo=FALSE, dependson='twolands100match', fig.width=7, fig.height=2.75, out.width='.65\\textwidth', warning=FALSE}
peaks1$lines$bullet <- b12[1]
peaks2$lines$bullet <- b12[2]
lines <- striation_identify(peaks1$lines, peaks2$lines)
ggplot() +
geom_rect(aes(xmin = xmin, xmax = xmax, fill=factor(type)), ymin = -6, ymax=6.5, data = lines, alpha=0.2, show.legend = FALSE) +
theme(legend.position="bottom") +
geom_text(aes(x = meany), y= -5.5, label= "x", data = subset(lines, !match)) +
geom_text(aes(x = meany), y= -5.5, label= "o", data = subset(lines, match)) +
ylim(c(-6,6.5)) + theme_bw() +
geom_line(data=lofX, aes(x=y, y=l30, group=bulletname, linetype=bulletname)) +
scale_linetype_discrete("") +
scale_colour_manual("", values=cols) +
scale_fill_brewer("", palette="Set2", na.value=alpha("grey60", 0.5)) +
theme(legend.position = c(1,1.1), legend.justification=c(1,1),
legend.background = element_rect(fill=alpha('white', 0.4))) +
ylab(expression(paste("Signatures (in ",mu,"m)", sep=""))) +
xlab("y")
```
}
\end{subfigure}
\caption{\label{fig:match}Matching striation marks: smooth (a), identify peaks and valley (b), and match peaks and valleys between signatures (c).}
\end{figure}
Figure \ref{fig:match} gives an overview of the automated matching routine:
We first identify a stable region for each bullet land and extract the signature at the lowest height in this region, because typically, individual characteristics are best expressed at the lower end of the bullet, near the base (see Section \ref{signature-intensities} for a more detailed discussion). All of the other steps are done on pairs of bullet lands:
1. **Smooth the two signatures** using a loess with a very small span (see Figure \ref{fig:smooth}).
2. Use cross-correlation to **find the best alignment** of the two signatures: shift one of the signatures by the lag indicated by the cross-correlation function (see Figure \ref{fig:ccf} for the cross-correlation function and Figure \ref{fig:crosscutX} for the resulting shift).
3. Using a rolling average, **identify peaks and valleys** for each of the signatures. We then define an interval around the location of the extrema on each side as one third of the distance to the location of the next extrema (see Figure \ref{fig:smoothcutb}). Peaks and valleys constitute the *striation marks* on the bullet.
4. **Match striations across signatures:** based on the intervals around the extrema as defined above, we identify common intervals as the areas in which two or more of the individual intervals overlap: a joint interval is defined as the smallest interval that encompasses all of the overlapping intervals. A joint interval is then called a match(ing stria) between the signatures, if all of the intervals are of the same type of extrema, i.e. they are either all peaks or all valleys. In Figure \ref{fig:match} all matches are shown as color-filled rectangles corresponding to their type of extrema (peaks are shown in orange, and valleys in green). Non-matching intervals are left grey.
5. **Extract features from the aligned signatures and the matches between them:** many different features can be extracted from the aligned signatures. Here, we describe a few of the ones that can be found in the literature and some that we found to be of practical relevance:
i. Maximal number of CMS (consecutive matching striae), and, similarly, the number of consecutively non-matching striae (CNMS),
ii. Number of matches and non-matches,
iii. The value of the cross-correlation function (ccf) between the aligned signatures [@vorburger:2011],
iv. Average difference $D$ between signatures, defined as the Euclidean vertical distance between surface measurements of aligned signatures. Let $f(t)$ and $g(t)$ be smoothed, aligned signatures:
\[
D^2 = \frac{1}{\text{\#}t}\sum_t \left[f(t) - g(t)\right]^2,
\]
v. The sum $S$ of average absolute heights of matched extrema: for each of the two matched stria, compute the average of the absolute heights of the peaks or valleys. $S$ is then defined as the sum of all these averages.
The difference $D$ between signatures is here defined as the Euclidean distance (in $\mu m$). In the paper by @ma:2004, distance is defined as a measure relative to the first signature, which serves as a comparison reference and is therefore a unitless quantity.
Counting the maximal number of CMS is part of the current practice to identify bullet matches [@nichols:1997, @nichols:2003, @nichols:2003b].
In the example of Figure \ref{fig:match}, the number of consecutive matching striations (CMS) is fifteen, a high number suggestive of a match between the lands. Note that the definition of CMS we use does not match the one given in @thompson:2013. There, CMS is defined only in terms of matching peaks without regarding valleys. Additionally, peaks in @thompson:2013 are used only if they can be identified and matched `within a tolerable range' between lands. The definition given here is computationally less complex, but should yield highly correlated values, because of the requirement to only consider signatures from a stable region in the land (see Section \ref{impact-of-bullet-height} for further details on stability of regions). In the Hamby study, the definition of CMS by @thompson:2013 leads to approximately half of the values of CMS defined in this paper (with a correlation coefficient between the values of the two definitions of about 0.92).
For lead bullets, such as used in the Hamby study, @biasotti:1959 considered four or more consecutive peaks (corresponding to eight or more consecutive lines in our definition) to be sufficient evidence of a match.
Determining a threshold such that CMS values above the threshold indicate a match with high reliability is beyond the scope of this work, even though it is critically important in practice. We provide some ideas in the next section, but first we assess the robustness of the matching algorithm to different choices of the parameter values.
## Horizontal Alignment
Signatures of each of the two lands, 1-5 and 2-1, in Figure \ref{fig:manualmatch-rgl} are shown in Figure \ref{fig:cross100} extracted at a height of $x = 100\mu m$. Striation marks show up in these representations as peaks and valleys. The individual characteristics are prominent and, again, suggest a match between the lands. A horizontal shift of one of the signatures (result shown in Figure \ref{fig:crosscutX}) emphasizes the strong similarities between signatures.
\begin{figure}[hbtp]
\centering
\begin{subfigure}[b]{.49\textwidth}\centering
\caption{Raw bullet land signatures.\label{fig:crosscut}}{%
```{r crosscuts, echo=FALSE, dependson='twolands100', fig.width=7, fig.height=3, out.width='\\textwidth'}
lof$bulletname <- scrubPath(lof$bullet)
qplot(y, resid, group=bulletname, colour=bulletname, data=lof,
geom="line") +
scale_colour_manual("", values=cols) + theme_bw() +
theme(legend.position = c(1, 1.1), legend.justification=c("right", "top"),
legend.background = element_rect(fill=alpha('white', 0.4))) +
ylab("Residuals of loess fit")
```
}
\end{subfigure}
\begin{subfigure}[b]{.49\textwidth}\centering
\caption{Aligned signatures.\label{fig:crosscutX}}{%
```{r crosscutsX, echo=FALSE, dependson='twolands100', fig.width=7, fig.height=3, out.width='\\textwidth'}
lofY$bulletname <- scrubPath(lofY$bullet)
qplot(y+min(lof$y), resid, group=bulletname, colour=bulletname, data=lofY,
geom="line") + xlab("y") +
scale_colour_manual("", values=cols) + theme_bw() +
theme(legend.position = c(1, 1.1), legend.justification=c("right", "top"),
legend.background = element_rect(fill=alpha('white', 0.4))) +
ylab("Residuals of loess fit")
```
}
\end{subfigure}
\caption{\label{fig:cross100}Signatures of bullets 1-5 and 2-1 taken at heights of $x = 100\mu m$. A horizontal shift of the values of bullet 1-5 to the right shows the similarity of the striation marks.}
\end{figure}
For this alignment we use the cross-correlation function to find a maximal amount of agreement between the signatures [@bachrach:2002, @chu:2010, @vorburger:2011, @thompson:2013].
This horizontal shift is based on the cross-correlation between the two signatures: let $f(t)$ and $g(t)$ define the signature values at $t$, where $t$ are locations between 0~$\mu m$ and about 2500 $\mu m$, 1.5625 $\mu m$ apart.
The cross-correlation between $f$ and $g$ at lag $k$ is then defined as
\[
(f * g) (k) = \sum_t f(t+k) g(t),
\]
with suitably defined limits for the summation.
\begin{figure}[hbtp]
\centering
```{r ccf, echo=FALSE, fig.width = 8, fig.height = 3, out.width = '.65\\textwidth'}
lag <- ccf$lag[which.max(ccf$acf)]
dframe <- data.frame(lag = ccf$lag, yend=ccf$acf)
ggplot(data = dframe) +
geom_segment(aes(x = lag, xend=lag, yend=yend), y = 0, colour="grey50") +
theme_bw() + ylab("Correlation") + xlab("Lag k") +
geom_segment(x = lag, xend = lag, y = 0, yend = max(ccf$acf), colour="black") +
scale_x_continuous(breaks = c(-100, -50, lag, 0, 50, 100),
minor_breaks = c(-75,-25,25, 75))
```
\caption{\label{fig:ccf}Cross-correlation function between the two signatures shown in Figure \ref{fig:crosscut} at lags between -100 and 100. The correlation is maximized at a lag of -17, indicating the largest amount of agreement between the signatures. Figure \ref{fig:crosscutX} shows the lag-shifted signatures.}
\end{figure}
## Impact of Bullet Height
The height at which signatures are extracted for a comparison between bullet lands matters -- signatures taken from heights that are further apart, show more pronounced differences between the signatures.
This poses both a caveat to matching attempts as well as an opportunity for quality control: we have to be aware of the height that was used in a matching. Visually, matches degrade if the signatures upon which the match is based are from heights further than 200$\mu m$ apart (see Section \ref{cross-correlation-at-multiple-heights} for more discussion). However, we can extract signatures from multiple heights of the same bullet land for an initial assessment of its quality. By comparing signatures from heights that are not too far apart -- 25$\mu m$ to 50$\mu m$ -- we get an indication whether the signatures come from a rapidly changing section of the surface, indicative of a break-off or some other damage, or from a stable section, where we have a reasonable expectation of finding matches to other signatures. In the approach here, we keep increasing the height $x$ at which the signature is taken until we find a section with a stable pattern. This process is shown in Figure \ref{fig:crosscuts2} at the example of bullet 1-1 from barrel 3, where `stability' is defined as two aligned signatures from heights chosen 25$\mu m$ apart having a cross-correlation of at least 0.95.
\begin{figure}[hbtp]
\centering
```{r crosscuts-vary-b31, echo=FALSE, fig.width = 12, fig.height = 7, out.width = '\\textwidth', warning = FALSE}
paths <- file.path(datadir, dir(datadir))
paths <- paths[grep("Br[0-9]", paths)]
crosscuts <- seq(100,250, by=25)
lof <- processBullets(read_x3p(paths[37]), name = paths[37], x = crosscuts)
lof$bullet <- paste(scrubPath(lof$bullet), lof$x)
reslist <- lapply(1:length(crosscuts[-1]), function(i) {
b2 <- subset(lof, x %in% crosscuts[i:(i+1)])
lofX <- bulletSmooth(b2)
bAlign <- bulletAlign(lofX)
lofX <- bAlign$bullet
b12 <- unique(b2$bullet)
peaks1 <- get_peaks(subset(lofX, bullet==b12[1]), smoothfactor = 25)
peaks2 <- get_peaks(subset(lofX, bullet == b12[2]), smoothfactor = 25)
# threshold <- bulletPickThreshold(lofX, thresholds = seq(0.3, 1.5, by = 0.05))
# lines <- striation_identify(lofX, threshold = threshold)
peaks1$lines$bullet <- b12[1]
peaks2$lines$bullet <- b12[2]
lines <- striation_identify(peaks1$lines, peaks2$lines)
maxCMS <- maxCMS(lines$match==TRUE)
list(maxCMS = maxCMS, ccf = bAlign$ccf, lines=lines, bullets=lofX)
})
ccfs <- sapply(reslist, function(res) res$ccf)
lop <- lapply(reslist, function(res) {
ggplot() +
theme_bw() +
geom_rect(aes(xmin=xmin, xmax=xmax, fill=factor(type)), show.legend=FALSE, ymin=-6, ymax=5.5, data=res$lines, alpha=0.2) +
geom_line(aes(x = y, y = l30, linetype = bullet), data = res$bullets) +
scale_linetype_discrete("") +
# scale_colour_brewer("", palette="Set1", na.value=alpha("grey50", alpha=0.5)) +
scale_fill_brewer("", palette="Set2", na.value=alpha("grey50", alpha=0.5)) +
theme(legend.position = c(1,1.2), legend.justification=c(1,1),
legend.background = element_rect(fill=alpha('white', 0.4))) +
ylim(c(-6,6.5)) + ylab("") + xlab("") +
geom_text(aes(x = meany), y= -5.5, label= "x", data = subset(res$lines, !match)) +
geom_text(aes(x = meany), y= -5.5, label= "o", data = subset(res$lines, match))
})
grid.arrange(lop[[1]], lop[[2]], lop[[3]],
lop[[4]], lop[[5]], lop[[6]],
ncol=2)
```
\caption{\label{fig:crosscuts2}Signatures for barrel 3, bullet 1-1 extracted from varying heights. Initially, the match between signatures taken at heights 25$\mu m$ apart is affected strongly by some break off at the bottom of the bullet. At a level of $175\mu m$ the bullet's signature stabilizes. For this land, matches should not be attempted at lower heights.}
\end{figure}
## Varying Smoothing Factor
As mentioned earlier, the algorithm for detecting peaks and valleys depends on the selection of a smoothing window, called the smoothing factor or span. A smoothing factor of $k$ means that the $k$ closest observations to $x_o$ are considered for a fit for $x_o$. Because surface measurements are recorded at an equidistant resolution (here, of 1.5625$\mu m$), we decided to only consider odd smoothing factors $2k + 1$, which means that the $k$ observations to the left and right of $x_o$ are considered for a local fit of $x_o$. For detecting and removing the grooves prior to fitting a loess regression we selected a smoothing factor of 35, while for detecting the peaks/valleys of the loess residuals a smoothing factor of 25 seems more appropriate.
Figure \ref{fig:varysmooth} displays the peaks and valleys detected in the same signature at smoothing factors of 5, 25, and 45, respectively. The dark line corresponds to the smoothed values, while the grey line in the back shows the raw signature. The choice of smoothing factor is a classical decision of a bias/variance trade-off. It is immediately clear that a small smoothing factor like 5 is a poor choice. It results in a significant amount of noise in the data such that even just a point or two can skew the rolling average enough for a peak or valley to be detected. Given that striation widths are typically much larger, we are in effect muddying the waters by performing such minimal smoothing. Another consideration is that the smoothing should not fall below the resolution of the equipment at which the surface measurements are taken -- so as to not introduce artifacts in the analysis.
A larger smoothing factor on the other hand (like 45), seems to be a more plausible option. Most of the peaks/valleys present which are detected by a smoothing factor of 25 are also detected at 45. However, some notable issues arise. Notice that the valley on the right hand side of the image is smoothed out, and thus not detected. On the left hand side, a double peak is detected - that might be a questionable decision - but there are several peaks in the middle, that are smoothed out, for example the peak at around $y = 750$. That is, in many cases, large windows are smoothing out some of the structure that we wish to see. Furthermore, it can be seen that the peaks/valleys are often shifted relative to their position in the original loess residuals, or in the smoothed data with smaller smoothing factors.
\begin{figure}[hbtp]
\centering
```{r smoothfac1, echo=FALSE, fig.width=7, fig.height=5.2, out.width='.8\\textwidth', warning=FALSE}
br111.groove <- get_grooves(br111_bullet)
br111.loess <- fit_loess(br111_bullet, br111.groove)
peakslist <- lapply(c(5, 25, 45), function(s) {
peaks <- get_peaks(br111.loess$data, smoothfactor = s)
peaks$lines$smoothfactor = s
peaks$plot$data$smoothfactor = s
peaks
})
peaks <- plyr::ldply(peakslist, function(x) x$lines)
smooths <- plyr::ldply(peakslist, function(x) x$plot$data)
ggplot() + theme_bw() +
geom_rect(aes(xmin=xmin, xmax=xmax, fill=factor(type)), ymin=-6, ymax=6,
data=peaks, alpha=0.2) +
geom_vline(aes(xintercept=extrema, colour=factor(type)), data= peaks, alpha=.7) +
scale_colour_brewer(palette="Set2") +
scale_fill_brewer(palette="Set2") +
theme(legend.position="none") +
facet_grid(smoothfactor~., labeller="label_both") +
geom_line(aes(x=y, y=resid, group=x), data=br111.loess$data,
colour="grey50") +
geom_line(aes(x=y, y=smoothed), data=smooths) +
ylab(expression(paste("Signature values (in ",mu,"m)", sep="")))
```
\caption{\label{fig:varysmooth} Peak/valley detection at smoothing factors of 5, 25, and 45, respectively. Note that a smoothing factor of 5 yields enough noise that many very minimal overlapping peaks and valleys are detected, while a smoothing factor of 45 might over-smooth and cause the peaks/valleys to either end disappear or shift horizontally from their original position in the signature.}
\end{figure}
# Evaluation
In order to get a better understanding of how the matching algorithm works in known matches and non-matches, we investigate its performance using the James Hamby study data. As a first step, we automatically assess the quality of each of the lands by checking that we can identify a stable region on each land. For this, we compute the cross-correlation of signatures extracted from heights 25$\mu m$ apart. For a stable region, we require a minimum of 0.95 for the cross correlation. Four lands from different bullets are flagged as problematic in this respect. A visual inspection (see Figure \ref{fig:fourflags}) shows that each one of these lands has scratch marks across the surface, also known as `tank rash'.
\begin{figure}
\centering
\begin{subfigure}[t]{.49\textwidth}\centering
\caption{Barrel 6 Bullet 2-1}
\includegraphics[width=\textwidth]{images/br6-2-1-grey.png}
\end{subfigure}
\begin{subfigure}[t]{.49\textwidth}\centering
\caption{Barrel 9 Bullet 2-4}
\includegraphics[width=\textwidth]{images/br9-2-4-grey.png}
\end{subfigure}
\begin{subfigure}[t]{.49\textwidth}\centering
\caption{Unknown Bullet B-2}
\includegraphics[width=\textwidth]{images/b-2-grey.png}
\end{subfigure}
\begin{subfigure}[t]{.49\textwidth}\centering
\caption{Unknown Bullet Q-4}
\includegraphics[width=\textwidth]{images/q-4-grey.png}
\end{subfigure}
\caption{\label{fig:fourflags}Images of the four lands that got flagged during the quality assessment. All of them show scratch marks (tank rash) across the striation marks from the barrel. They are excluded from the remainder of the analysis.}
\end{figure}
We exclude these four lands from further matching considerations and run all remaining lands from the unknown bullets against all remaining lands from known bullets for matches, i.e. we are comparing $15 \times 6 -2 = 90 - 2 = 88$ lands from unknown bullets against $2 \times 10 \times 6 -2 = 120 - 2 = 118$ lands from known bullets, yielding a total of $10,384$ land-to-land comparisons. Out of these comparisons, there are 172 known matches (KM), while the rest are known non-matches (KNM). Ideally, results look like the results in Figure \ref{fig:hamby-perfect}: Figure \ref{fig:hamby-perfect}a shows the distribution of the number of maximum consecutive matching striae between land C-3 and all 118 lands from known bullets. Two lands show a high CMS. These correspond to the known matches with C-3, shown in Figures \ref{fig:hamby-perfect}b and \ref{fig:hamby-perfect}c.
Unfortunately, not all results are as clear cut.
It might not be reasonable to assume that we can match all lands, but the idea is to try to maximize the number of matches to get an overview of what we might be able to expect from an automated match.
\begin{figure}[hbtp]
\begin{subfigure}[t]{\textwidth}\centering
\caption{Maximal number of CMS between unknown bullet C-3 and all of the other 118 considered (known) lands. For two lands the number of maximum CMS is high. }
```{r cms, echo=FALSE, fig.width=7, fig.height=3.5, out.width='.5\\textwidth'}
load("data/data-25-25/unkn9.RData")
cmsdist <- sapply(reslist, function(x) x$maxCMS)
qplot(cmsdist, geom="bar") + theme_bw() + xlab("Number of CMS")
```
\end{subfigure}
\begin{subfigure}[b]{.49\textwidth}\centering
\caption{Overlaid signatures of C-3 and the land with the top matching CMS.}
```{r top, echo=FALSE, fig.width=7, fig.height=3.25, out.width='\\textwidth', warning = FALSE}
res <- reslist[[which.max(cmsdist)]]
#res <- reslist[[which.max(cmsdist[-which.max(cmsdist)])]] # number 2
res$bullets$bullet <- scrubPath(res$bullets$bullet)
print(ggplot() +
theme_bw() +
geom_rect(aes(xmin=xmin, xmax=xmax, fill=factor(type)), ymin=-6, ymax=5, data=res$lines, alpha=0.2, show.legend=FALSE) +
geom_line(aes(x = y, y = l30, linetype = bullet), data = res$bullets) +
scale_linetype_discrete("") + ylab("") +
scale_fill_brewer("", palette="Set2", na.value=alpha("grey50", alpha=0.5)) +
theme(legend.position = c(1,1), legend.justification=c(1,1)) +
ylim(c(-6,6)) +
geom_text(aes(x = meany), y= -5.5, label= "x", data = subset(res$lines, !match)) +
geom_text(aes(x = meany), y= -5.5, label= "o", data = subset(res$lines, match)))
```
\end{subfigure}
\begin{subfigure}[b]{.49\textwidth}\centering
\caption{Top 2 match with C-3 based on CMS.}
```{r top2, echo=FALSE, fig.width=7, fig.height=3.25, out.width='\\textwidth', warning = FALSE}
res <- reslist[[which(cmsdist==14)]] # number 2
res$bullets$bullet <- scrubPath(res$bullets$bullet)
print(ggplot() +
theme_bw() +
geom_rect(aes(xmin=xmin, xmax=xmax, fill=factor(type)), ymin=-6, ymax=5, data=res$lines, alpha=0.2, show.legend=FALSE) +
geom_line(aes(x = y, y = l30, linetype = bullet), data = res$bullets) +
scale_linetype_discrete("") + ylab("") +
scale_fill_brewer("", palette="Set2", na.value=alpha("grey50", alpha=0.5)) +
theme(legend.position = c(1,1), legend.justification=c(1,1)) +
ylim(c(-6,6)) +
geom_text(aes(x = meany), y= -5.5, label= "x", data = subset(res$lines, !match)) +
geom_text(aes(x = meany), y= -5.5, label= "o", data = subset(res$lines, match)))
```
\end{subfigure}
\caption{\label{fig:hamby-perfect}Showcase scenario when matching with CMS works very well. Unfortunately the matches are not always that convincing.}
\end{figure}
Figure \ref{fig:cms} shows the strong connection between the maximal number of consecutive striae and matches in the Hamby study. All 42 pairs of lands with at least thirteen CMS in common are matches.
\begin{figure}[hbtp]
\centering
\begin{minipage}[t]{.47\textwidth}
```{r cms-bars, echo=FALSE, fig.width=7, fig.height=3.5, out.width='\\textwidth'}
ggplot(data=subset(bstats, !flagged)) +
geom_bar(aes(x=factor(CMS))) + theme_bw() +
theme(legend.position="bottom") +
xlab("maximum CMS")
```
\end{minipage}
\begin{minipage}[t]{.52\textwidth}
```{r cms-spines, echo=FALSE, fig.width=7, fig.height=4, out.width='\\textwidth'}
bstats$km <- c( "Known non-match", "Known match")[as.numeric(bstats$match)+1]
ggplot(data=subset(bstats, !flagged)) +
geom_bar(aes(x=factor(CMS), fill=km), position="fill") +
theme(legend.position="bottom") +
scale_fill_brewer("", palette="Paired") +
xlab("maximum CMS") + ylab("Proportion")
```
\end{minipage}
\caption{\label{fig:cms}Distribution of maximal CMS (left). Conditional barchart (Hummel 1996) on the right: heights show probability of match/non-match given a specific CMS. All land-to-land comparisons with at least 13 CMS are matches.}
\end{figure}
There are two things that should be noted at this point: the automated algorithm finds a relatively high number of CMS even for non-matches. On average, there are `r mean(bstats$CMS[!bstats$match], na.rm=T)` maximal CMS between known non-matches (with a standard deviation of `r sd(bstats$CMS[!bstats$match])`). Known matches share on average `r mean(bstats$CMS[bstats$match])` maximal CMS, with a standard deviation of `r sd(bstats$CMS[bstats$match])`. While the probability for a match increases with the number of maximal CMS, a large number of maximal CMS by itself is not indicative of a match, as was previously pointed out by @miller:1998. Figure \ref{fig:mismatch} shows a known mismatch between two lands that share twelve consecutively matched striae. Visually we can easily tell that these two lands do not match well.
\begin{figure}[hbtp]
\centering
```{r strange-res, echo=FALSE, fig.width=7, fig.height=3.25, out.width='.65\\textwidth', warning=FALSE}
load("data/data-25-25/unkn47.RData")
res <- reslist[[106]]
res$bullets$bullet <- scrubPath(res$bullets$bullet)
print(ggplot() +
theme_bw() +
geom_rect(aes(xmin=xmin, xmax=xmax, fill=factor(type)), ymin=-6, ymax=5, data=res$lines, alpha=0.2, show.legend=FALSE) +
geom_line(aes(x = y, y = l30, linetype = bullet), data = res$bullets) +
scale_linetype_discrete("") + ylab("") +
scale_fill_brewer("", palette="Set2", na.value=alpha("grey50", alpha=0.5)) +
theme(legend.position = c(1,1), legend.justification=c(1,1)) +
ylim(c(-6,6)) +
geom_text(aes(x = meany), y= -5.5, label= "x", data = subset(res$lines, !match)) +
geom_text(aes(x = meany), y= -5.5, label= "o", data = subset(res$lines, match)))
```
\caption{\label{fig:mismatch}Known mismatch with a relatively large number of maximal consecutive matching striae (twelve) in the middle. The pattern in the middle does look surprisingly similar, however the outer ends of the signatures easily reveals this comparison as mismatch.}
\end{figure}
For smaller numbers of CMS, the percentage of false positives quickly increases. However, if we take other features of the image into account, we can increase the number of correct matches considerably: Figure \ref{fig:densities} gives an overview of the densities of all of the features derived earlier, for known matches (KM) and known non-matches (KNM). The densities of almost all of the features show strong differences between matches and non matches. For example, a high amount of cross-correlation between two signatures is indicative of a match -- in the Hamby study, only known matches have a cross-correlation of 0.75 or higher. There are 97 land-to-land comparisons with a cross-correlation that high.
\begin{figure}[hbtp]
\centering
```{r density-overview, echo=FALSE, warning=FALSE, fig.width=12.5, fig.height=6.25, out.width='\\textwidth'}
features <- c("CMS", "CNMS", "num.matches", "num.nonmatches", "D", "S", "ccf")
densities <- plyr::ldply(features, function(x) {
xr <- range(bstats[,x])
xx <- seq(xr[1], xr[2], length.out=500)
densKM <- sm.density(bstats[,x][bstats$match], display="none",
eval.points=xx, weights=NA, method="normal")
densKNM <- sm.density(bstats[,x][!bstats$match], display="none",
eval.points=xx, weights=NA, method="normal")
dframe <- data.frame(var=x, x = xx, KM=densKM$estimate, KNM = densKNM$estimate)
dframe <- rbind(data.frame(var=x, x = xr[1], KM=0, KNM=0),
dframe, data.frame(var=x, x = xr[2], KM=0, KNM=0))
dframe$order <- 1:nrow(dframe)
dframe
})
dm <- melt(densities, measure.var=c("KM", "KNM"))
dm$var <- factor(dm$var, levels=features)
levels(dm$var)[3:4] <- c("#matches", "#non-matches")
levels(dm$variable) <- c("Known matches (KM)", "Known non-matches (KNM)")
dm$varLabel <- dm$var
levels(dm$varLabel) <- c(
"Consecutive Matching Striae (CMS)",
"Consecutive Non-Matching Striae (CNMS)",
"#matches",
"#non-matches",
"Average difference (D)",
"Sum of peaks (S)",
"Cross-correlation function (ccf)")
qplot(x, value, group=variable, geom="polygon", data=dm, fill=variable,
alpha=I(0.6), colour=I("grey20")) +
facet_wrap(~varLabel, ncol=4, scales="free") +
theme_bw() + ylab("") + xlab("") +
scale_fill_brewer("Bullet Land Pairs", palette="Paired") +
theme(legend.position=c(1,0),
legend.justification = c("right", "bottom"),
legend.background = element_rect(colour="grey75"),
axis.title.y=element_blank(),
plot.margin = unit(c(0,0,0,0), unit="cm"))
```
\caption{\label{fig:densities}Overview of all the marginal densities for features described in Section \ref{algorithm}. Shifts in the mode of the density functions between known matches and known non-matches indicate the variable's predictive power in distinguishing matches and non-matches. Predictive power is shown in more detail in Figure \ref{fig:rocs}.}
\end{figure}
\begin{figure}[hbtp]
\centering
```{r rocs-overview,echo=FALSE, warning=FALSE, fig.width=12.5, fig.height=6.25, out.width='\\textwidth'}
# plot false positive against false negative
# prob of false positive: P(match | KNM)
# prob of false negative: P(non-match | KM)
# for X < c: we get probability for P(X < c | KNM) and P(X < c | KM)
features <- c("CMS", "CNMS", "num.matches", "num.nonmatches", "D", "S", "ccf")
errors <- plyr::ldply(features, function(x) {
xx <- unique(bstats[,x])
if (length(xx) > 500) {
xr <- range(bstats[,x])
xx <- seq(xr[1], xr[2], length.out=500)
}
# upper and lower rule:
# upper: match is defined as X > xx
# lower: match is defined as X < xx
subbstats <- subset(bstats, flagged == FALSE)
subbstats$match <- factor(subbstats$match)
errors <- plyr::ldply(xx, function(cc) {
idx <- which(subbstats[,x] >= cc)
upper <- data.frame(xtabs(~match, data= subbstats[idx,,drop=FALSE])/
xtabs(~match, data= subbstats))
idx <- which(subbstats[,x] <= cc)
lower <- data.frame(xtabs(~match, data= subbstats[idx,,drop=FALSE])/
xtabs(~match, data= subbstats))
data.frame(value = cc, match=lower[,1], lower=lower[,2], upper=upper[,2])
})
errors$variable <- x
errors
})
rocs.lower <- dcast(errors, variable+value~match, value.var="lower")
rocs.lower$type <- "lower"
rocs.upper <- dcast(errors, variable+value~match, value.var="upper")
rocs.upper$type <- "upper"
rocs <- rbind(rocs.lower, rocs.upper)
set.seed(20140105)
aucs <- plyr::ldply(features, function(x) {
subbstats <- subset(bstats, flagged == FALSE)
pos.scores <- sample(subbstats[which(subbstats$match),x], 50000, replace=TRUE)
neg.scores <- sample(subbstats[which(!subbstats$match),x], 50000, replace=TRUE)
data.frame(x, upper=mean(pos.scores > neg.scores),
lower = mean(pos.scores < neg.scores))
})
aucs <- melt(aucs, measure.var=c("lower", "upper"))
names(aucs) <- c("variable", "type", "auc")
rocs <- merge(rocs, aucs, by=c("variable", "type"))
rocs$variable <- reorder(rocs$variable, -rocs$auc, min)
levels(rocs$variable)[3:4] <- c("#non-matches", "#matches")
rocs$varLabel <- rocs$variable
levels(rocs$varLabel) <- c(
"Sum of peaks (S)",
"Cross-correlation function (ccf)",
"#non-matches",
"#matches",
"Consecutive Non-Matching Striae (CNMS)",
"Consecutive Matching Striae (CMS)",
"Average difference (D)"
)
labels <- unique(subset(rocs, auc > 0.5)[,c("variable", "auc")])
labels$labels <- sprintf("AUC: %.2f", labels$auc)
labels$type <- NA
eer <- rocs %>% filter(auc > 0.5) %>%
mutate(differror = abs(1-`TRUE` - `FALSE`))
eer <- eer %>% group_by(variable) %>% mutate(
minerror = differror==min(differror),
labels = sprintf("EER: %.2f", 0.5*(`FALSE`+1-`TRUE`))
) %>% filter(minerror==TRUE)
ggplot(data=subset(rocs, auc > 0.5)) +
geom_ribbon(aes(x=`FALSE`, ymax=`TRUE`, ymin=0), alpha=0.2) +
geom_line(aes(x=`FALSE`,y=`TRUE`))+ theme_bw() +
xlab("False negative rate") +
ylab("False positive rate") + facet_wrap(facets=~varLabel, ncol=4) +
geom_label(x=1, y=0, aes(label=labels), data=labels, hjust=1, vjust=0) +
geom_abline(linetype=2) +
geom_point(aes(x=`FALSE`, y=`TRUE`), data=eer, size=2.5) +
geom_label(x=1, y=0.125, aes(label=labels), data=eer, hjust=1, vjust=0)
```
\caption{\label{fig:rocs}ROC curves for all of the features described in Section \ref{algorithm}. Variables are sorted according to their area under the curve (AUC). The equal error rate (EER) is marked by a point on the ROC curve. Except for the distance $D$ between signatures, all individual features derived from the surface measurements and the aligned striation marks are more predictive than the maximal CMS.}
\end{figure}
All of the features in Figure \ref{fig:densities} show large, if not significant, differences between matches and non-matches. The predictive power of each one of these features is shown in the form of the Receiving Operating Characteristic (ROC) curves in Figure \ref{fig:rocs}. The features are arranged in descending order according to the area under the curve (AUC).
The dots mark the equal error rate, i.e. the location on the ROC curve, where false positive and false negative error rates are the same. The smaller the value, the better. We see that in this instance a low equal error rate (EER) goes hand in hand with high predictive power as measured in AUC.
The feature with the highest individual predictive power is $S$, the sum of the average heights of two signatures at peaks and valleys. The maximal number of CMS is only in the seventh position here. The overall high AUC values indicate that we can successfully employ machine learning methods to distinguish matches from non-matches.
Using recursive partitioning, we fit a decision tree [@breiman:1984, @rpart, @rpart.plot] to predict matches between lands based on features derived from the image files. The resulting tree is shown in Figure \ref{fig:tree}. A total of `r sum(bstats$match[bstats$pred>0.5])` lands is being matched correctly. Interestingly, the number of consecutive matching striae does not feature in this evaluation.
Instead of CMS, cross-correlation (ccf) between the signatures is very important in the matching process by the decision tree. Aside from cross correlation, the total number of matches is also included in the decision rule.
Between cross-correlation and CMS, cross-correlation has higher predictive power. This does not contradict earlier findings emphasizing the value of CMS on visual assessments of bullet matches: in those papers, assessments were based on purely visual inspection of either actual bullets or 2D microscopic images of bullets.
Neither one of these methods allows for an assessment of cross-correlations. This is one of the benefits of switching to a digitized version of the images that preserves the 3D surface structure. The findings about the discriminating power of cross-correlation are consistent with the results of the study by @ma:2004. However, in that study, the authors did not consider the number of matches and non-matches.
\begin{figure}[hbtp]
\centering
```{r tree, echo=FALSE, fig.width=7, fig.height=4, out.width='.7\\textwidth'}
vals = alpha(brewer.pal(3, name="Paired"), alpha=0.5)
names(bstats)[9:10] <- c("#matches", "#non-matches")
includesVar <- setdiff(names(bstats), c("b1", "b2", "data", "resID", "id.x", "id.y", "pred", "forest", "bullet", "span", "crosscutdist", "flagged", "km"))
#cc <- read.csv("csvs/crosscuts-25.csv")
# cc$bullet <- gsub(".*//","",cc$path)
# cc$bullet <- gsub(".x3p", "", cc$bullet)
# bstats$flagged <- bstats$b2 %in% cc$bullet[which(is.na(cc$cc))] |
# bstats$b1 %in% cc$bullet[which(is.na(cc$cc))]
#excludesObs <- which(bstats$flagged)
rp1 <- rpart(match~., subset(bstats, !flagged)[,includesVar]) # doesn't include cms at all !!!!
per <- rp1$frame$yval # predictive probability for each node in the tree
prp(rp1, extra = 101, box.col=vals[as.numeric(per > 0.5)+1])
bstats$pred <- predict(rp1, newdata=bstats)
bstats$pred[bstats$flagged] <- NA
```
\caption{\label{fig:tree}Decision tree of matching bullets based on recursive partitioning. The rectangular nodes are the leaves, giving a short summary consisting of the number of observations in the leaf (bottom left), the corresponding percentage of the total (bottom right). The number at the top shows the fraction of these observations that are a match. A 1 or a 0 therefore indicate a homogeneous (or perfect) node. }
\end{figure}
```{r rforest, echo=FALSE, message=FALSE}
set.seed(20160105)
names(bstats)[9:10] <- c("num.matches", "num.nonmatches")
includesVar <- setdiff(names(bstats), c("b1", "b2", "data", "resID", "id.x", "id.y", "pred", "forest", "bullet", "span", "crosscutdist", "flagged", "km"))
rtrees <- randomForest(factor(match)~., data=subset(bstats, !flagged)[,includesVar], ntree=300)
errors <- data.frame(rtrees$err.rate)
errors$id <- 1:nrow(errors)
bstats$forest <- predict(rtrees, type="prob", newdata=bstats)[,2]
bstats$forest[bstats$flagged] <- NA
```
Another benefit of the digitized version of the images is that we can apply several hundred decision trees to combine in a random forest [@breiman:2001, @randomForest]. For each of the trees in a random forest, only two thirds of the observations are used for fitting, while the remaining third is used to evaluate the tree's predictive power and accuracy, or its reverse, the error rate. Because errors are determined from the one third of held-back observations, this error rate is called the out-of bag (OOB) error.
Figure \ref{fig:oob} shows the cumulative out-of-bag error (OOB) rate for 300 trees.
\begin{figure}[hbtp]
\centering
```{r oob, echo=FALSE, fig.width=6, fig.height=3, out.width='.7\\textwidth'}
qplot(id, OOB, geom="line", data=errors) + theme_bw() + ylim(c(0,NA)) + xlab("Number of trees") + ylab("Out of Bag Error (OOB)")
```
\caption{\label{fig:oob}Cumulative out-of-bag error rate of a random forest fit to predict land-to-land matches from image features.}
\end{figure}
After about 100 trees, the error rate of land-to-land comparisons stabilizes at 0.0039. This is a weighted average between false positive error rate of 0.0001 and an error rate of false negatives of 0.2267. This out-of-bag error rate is over-estimating the actual error in the Hamby study: here, the final random forest based on 300 trees is able to correctly predict all known matches and non-matches (see Figure \ref{fig:tree-forest}).
Note that this error rate is based on land-to-land comparisons and is expected to be much lower for bullet-to-bullet comparisons. In the case of the Hamby data, even a single tree results in an overall error rate of zero, if we require that a match of two bullets occurs when at least two of the bullet's lands are matched. This makes the errors in the automated approach smaller than the human error in the Hamby study. Out of the 507 participants who returned results, eight (out of $15 \times 507 = 7,605$) bullets were not matched conclusively, corresponding to a rate of 0.0011.
For the Hamby data, error rates based on bullet-to-bullet matches do not carry a lot of weight because of the small size of the study: fifteen unknown bullets are successfully matched to two pairs of ten bullets. Matching bullets can only be tested realistically in a much bigger experiment.
Another thing to note about the random forest's error rates is that they are based on probability cutoffs of 0.5, i.e. whenever the predicted probability of a match exceeds 0.5, a match is declared. Basing this decision on a threshold fixed at 0.5 may not be the best approach. In practice, examiners are allowed a third option of 'inconclusive'. On a probability spectrum of outcomes we could therefore introduce an interval of 'inconclusive' results in the middle of the spectrum -- which turns out to be unnecessary in the Hamby study, because, here, the results from the random forest are very clear cut. Figure \ref{fig:tree-forest} shows a comparison of the predicted probabilities of a match by the tree and the random forest. As expected, the random forest provides a more realistic estimate of the uncertainty in the classification.
\begin{figure}[hbtp]
\centering
```{r tree:forest, echo=FALSE, fig.width=5, fig.height=2.5, out.width='.6\\textwidth', warning = FALSE}
predm <- melt(bstats, measure.var=c("pred", "forest"))
levels(predm$variable) <- c("Tree", "Forest")
predm$match <- c("KNM", "KM")[as.numeric(predm$match)+1]
predm$barrel1 <- gsub("(.*) .*-[0-9]+", "\\1", predm$b1)
predm$barrel2 <- gsub("(.*)-[0-9]+", "\\1", predm$b2)
qplot(data=predm, value, match, geom="jitter", colour=match, shape=match) + theme_bw() + facet_grid(variable~.) +
scale_colour_brewer(palette="Paired", labels=c("KNM", "KM")) +
scale_shape_discrete(labels=c("KNM", "KM")) + ylab("") +
xlab("Predicted probability of a match") +
theme(legend.position="none")
```
\caption{\label{fig:tree-forest}Prediction results from the tree and the forest. Using a cut-off probability of 0.5 the forest correctly predicts every single comparison. Compared to the tree, the forest's prediction probabilities are shrunk towards either end of the prediction range.}
\end{figure}
Besides resulting in a probabilistic quantification of matches, random forests also provide an assessment of the importance of each of the features derived from the bullets' 3D topological surface measurements. Figure \ref{fig:importance} shows an overview of the importance of each variable measured as the mean decrease in the Gini index when the variable in question is included in a tree (for the exact values please refer to Section \ref{table-of-feature-importance}).
\begin{figure}
```{r featimp, echo=FALSE, fig.width=7, fig.height=2, out.width='.7\\textwidth', fig.align='center'}
imp <- data.frame(importance(rtrees))
imp$Variable <- row.names(imp)
imp$Variable[c(7,8)] <- c("matches", "mismatches")
imp1 <- subset(imp, !(Variable %in% c("x1", "sd.D", "S")))
imp2 <- subset(imp, Variable %in% c("x1", "sd.D", "S"))
imp3 <- subset(imp, !(Variable %in% c("x1", "x2", "sd.D", "lag")))
qplot(MeanDecreaseGini,1, data=imp3, geom="point") +
geom_text(aes(y=1.2, label=Variable), angle=45, hjust=0, data=subset(imp3, Variable != "S")) +
geom_text(aes(y=1.2, label=Variable), angle=45, vjust=1.2, hjust=0, data=subset(imp3, Variable == "S")) +