#!/home/boulgakov/anaconda2/bin/python
"""
Fluorosequencing Experiment Library (flexlibrary) manages fluorosequencing
image data, and facilitates combining this data to extract conclusions from
experiments.
flexlibrary implements three core classes: Image, Spot, and Experiment.
Image is the fundamental class in flexlibrary. Each fluorosequencing image is
an Image object. This object contains the image itself, and associated metadata
-- such as filepath, date, experimental conditions, etc.
Fluorosequencing images are analyzed to find and analyze labeled peptides.
Labeled peptides are observed as point sources under the microscope, each
observable as a spot with a characteristic point spread function (PSF). Each
spot is a Spot object identified by its source image, and its spatial
coordinates (h, w) in that image.
Information about the state of a peptide is to be inferred by analyzing its
spot's PSF. Comparing a peptide's PSFs across a series of images may yield
further information about its state, or -- if the images are temporally
separated -- changes in its state.
Defining a relationship between a set of images inherently defines a
relationship between co-localized spots in those images. For example: if ten
images are snapshots of a single field, each taken after a successive Edman
reaction, then the co-localized spots in those images are the same labeled
peptides observed through the experiment. Another example: if three images are
three color channels of the same field taken at one time, then the co-localized
spots reflect the state of orthogonal labels on each peptide. Such a
relationship is defined by an Experiment.
The Experiment class is a parent class from which various Experiment subclasses
are derived. Each subclass represents a specific, predefined relationship
between a set of images. For example, there is a TimetraceExperiment that is
used to look at continuous time traces of a field. Each subclass of Experiment
comes with its own useful methods that perform analysis on the Experiment's set
of Images and their associated Spots.
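For example, a minimal usage sketch (the file path is hypothetical; it
assumes pflib has already produced a PSF pkl alongside the image):

    image_object, n_discarded = Experiment.easy_load_processed_image(
        '/path/to/field.png')  #hypothetical path
    for spot in image_object.spots:
        print spot.h, spot.w, spot.photometry(method='mexican_hat')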
"""
import sys
sys.path.insert(0, '/home/proteanseq/pflib')
import pflib
from scipy.misc import imread
import os.path
import tempfile
import os
import numpy as np
from scipy.spatial.distance import euclidean
import glob
import cPickle
from phase_correlate import phase_correlate
import math
import time
import logging
import random
import csv
from scipy.ndimage.measurements import center_of_mass
from scipy.signal import medfilt
import photutils
import multiprocessing
import stepfitting_library
#if the calling application does not have logging setup, flexlibrary will log
#to NullHandler by default
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())
class Spot(object):
"""
A Spot is a square of pixels in an image. It may contain a luminescent spot
caused by a point source.
The square must be an odd number of pixels on each side in size; there must
be a unique center pixel.
A Spot object stores only the coordinates and size of its square. To access
the actual pixels, use Spot.image_slice().
Attributes:
parent_Image: Image object this Spot exists inside of.
        h: The vertical coordinate of the center pixel of the square area.
        w: The horizontal coordinate of the center pixel of the square area.
size: Length of Spot square side in pixels.
gaussian_fit: Tuple characterizing the Gaussian fit of the Spot's PSF.
The tuple definition is the same as that of the tuples stored as
            values in the dictionary returned by pflib.find_peptides.
gaussian_fit can remain None. For convenience, here is the tuple
order from pflib.find_peptides: (h_0, w_0, H, A, sigma_h, sigma_w,
theta, sub_img, fit_img, rmse, r_2, s_n). Note that sub_img is not
dynamically updated if parent_Image changes.
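
    Example (a sketch; img is assumed to be an Image instance whose array
    is large enough to contain the square):

        spot = Spot(img, h=100, w=150, size=5)
        pixels = spot.image_slice()  #5x5 Numpy array centered on (100, 150)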
"""
def __init__(self, parent_Image, h, w, size, gaussian_fit=None):
self.parent_Image = parent_Image
if size % 2 == 0:
raise AttributeError("Spot.size must be odd.")
self.size = size
if not (0 <= h - (size - 1) / 2 and
h + (size - 1) / 2 < parent_Image.image.shape[0] and
0 <= w - (size - 1) / 2 and
w + (size - 1) / 2 < parent_Image.image.shape[1]):
            #The integer center is out of bounds; accept the Spot only if the
            #subpixel Gaussian-fit center places the whole square in-image.
            if (gaussian_fit is None or
                    not ((size - 1) / 2 <=
                         gaussian_fit[0] <
                         parent_Image.image.shape[0] - (size - 1) / 2 and
                         (size - 1) / 2 <=
                         gaussian_fit[1] <
                         parent_Image.image.shape[1] - (size - 1) / 2)):
raise AttributeError("Spot area of size " + str(size) +
" at " + str((h, w)) + " with " +
"gaussian_fit " + str(gaussian_fit) +
" does not fit into " +
"parent_Image.image.shape of " +
str(parent_Image.image.shape))
self.h, self.w = h, w
self.gaussian_fit = gaussian_fit
def image_slice(self, radius=None):
"""
Get the Spot's square of pixels. This is sliced from parent_Image at
the time the method is called; changes in parent_Image will reflect
themselves in the returned slice.
Optionally, this function can also return a larger slice of
parent_Image. However, the function does not guarantee that a square of
the desired size will be returned because it is possible that the
desired slice exceeds image bounds.
Arguments:
radius: If not None, returns a square slice of the desired radius.
Returns:
Spot's square of pixels as a Numpy array.
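        Example:
            For a Spot with size=5 at (h, w) = (10, 20), image_slice()
            returns parent_Image.image[8:13, 18:23] (assuming the slice
            lies within image bounds).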
"""
if radius is None:
radius = (self.size - 1) / 2
return self.parent_Image.image[max(0, self.h - radius):
min(self.parent_Image.image.shape[0],
self.h + radius + 1),
max(0, self.w - radius):
min(self.parent_Image.image.shape[1],
self.w + radius + 1)]
def valid_slice(self, radius=None):
"""Is the slice of requested radius contained within parent image."""
if radius is None:
radius = (self.size - 1) / 2
wanted_slice_size = 2 * radius + 1
        img_slice = self.image_slice(radius=radius)  #avoid shadowing slice()
        return img_slice.shape[0] == img_slice.shape[1] == wanted_slice_size
def simple_photometry_metric(self, return_invalid=True):
"""
Photometry is measurement of the electromagnetic flux of an object.
Returns:
The sum of all pixel values in this Spot.
"""
if not return_invalid and not self.valid_slice():
return None
else:
return np.sum(self.image_slice())
def mexican_hat_photometry_metric(self, brim_size=6, radius=9,
return_invalid=True):
"""
Similar to simple_photometry_metric, but adjusts for background
intensity around a peptide.
Arguments:
            brim_size: Width, in pixels, of the brim used to estimate local
                background.
            radius: Total radius of the square to use as the hat. If None,
                (self.size - 1) / 2 is used as the natural radius. If the
                radius is overridden to be greater than self.size, this
                function does not guarantee that all pixels within that
                radius are within the image boundary, and hence the area used
                for photometry may be truncated.
        Returns:
            sum(crown pixels) - len(crown pixels) * median(brim pixels); the
            integrated intensity of the central crown, with the local
            background (estimated as the brim's median) subtracted.
"""
logger = logging.getLogger()
logger.debug("Spot.mexican_hat_photometry_metric: locals() = " +
str(locals()))
if radius is None:
radius = (self.size - 1) / 2
if not return_invalid and not self.valid_slice(radius=radius):
photometry = None
else:
diameter = 2 * radius + 1
crown_pixels = []
brim_pixels = []
for (h, w), p in np.ndenumerate(self.image_slice(radius=radius)):
if (brim_size <= h < diameter - brim_size and
brim_size <= w < diameter - brim_size):
crown_pixels.append(p)
else:
brim_pixels.append(p)
logger.debug("Spot.mexican_hat_photometry_metric: "
"crown_pixels = " + str(crown_pixels))
logger.debug("Spot.mexican_hat_photometry_metric: brim_pixels = " +
str(brim_pixels))
photometry = (sum(crown_pixels) -
len(crown_pixels) * np.median(brim_pixels))
return photometry
def gaussian_volume_photometry_metric(self, scaling=10**6, default=0,
return_invalid=True):
"""
An alternative photometry metric based on the gaussian_fit.
Returns:
The volume of the two dimensional gaussian. In pflib.find_peptides
terms: A * sigma_h * sigma_w
"""
if not return_invalid and not self.valid_slice():
photometry = None
elif self.gaussian_fit is None:
#raise AttributeError("Cannot give gaussian_volume_metric without "
# "a self.gaussian_fit")
photometry = default
else:
photometry = (float(scaling) * self.gaussian_fit[3] *
self.gaussian_fit[4] * self.gaussian_fit[5])
return photometry
def gaussian_sigmas_photometry_metric(self, scaling=10**6,
return_invalid=True):
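        """
        A photometry metric proportional to the product of the fitted
        Gaussian's sigmas: scaling * sigma_h * sigma_w in pflib.find_peptides
        terms. Falls back to a large negative sentinel (-10**9) when there is
        no gaussian_fit.
        """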
if not return_invalid and not self.valid_slice():
photometry = None
        elif self.gaussian_fit is None:
photometry = -10**9
else:
photometry = (float(scaling) * self.gaussian_fit[4] *
self.gaussian_fit[5])
return photometry
def sextractor_photometry_metric(self, radius=3, box_size=10,
filter_size=10, return_invalid=True,
**kwargs):
"""
        Aperture photometry via photutils, after subtracting a background
        estimated with photutils' SExtractor-style Background.
"""
if not return_invalid and not self.valid_slice(radius=radius):
photometry = None
else:
background = self.parent_Image.get_photometry_background(
box_size=box_size,
filter_size=filter_size,
method='sextractor')
aperture = self.parent_Image.get_photometry_aperture(spot=self,
radius=radius)
background_array = background.background
photometry = float(photutils.aperture_photometry(
self.parent_Image.image - background_array,
aperture)['aperture_sum'])
return photometry
def maximum_photometry_metric(self, radius=5, top=1,
background_adjust='none',
return_invalid=True):
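        """
        Photometry as the sum of the `top` brightest pixel values within the
        given radius, optionally background-adjusted.

        background_adjust options:
            'none': use raw pixel values.
            'additive': subtract the parent image's median pixel value from
                every pixel first.
            'multiplicative': not implemented.
        """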
if not return_invalid and not self.valid_slice(radius=radius):
photometry = None
else:
r = np.sort(np.ravel(self.image_slice(radius=radius)))
if background_adjust == 'none':
pass
elif background_adjust == 'additive':
median = np.median(self.parent_Image.image)
r = r - median
elif background_adjust == 'multiplicative':
                raise NotImplementedError("Not sure what to do if median is 0."
                                          " This may be a poor metric.")
else:
raise ValueError(str(background_adjust) + " is not a valid "
"option.")
photometry = float(np.sum(r[-top:]))
return photometry
def photometry(self, method='mexican_hat', photometry_method=None,
return_invalid=True, **kwargs):
"""
If return_invalid is True, return photometry even if it cannot be
calculated correctly. If return_invalid is False, return None instead
of photometry.
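
        Example (a sketch; remaining kwargs are forwarded to the selected
        metric):
            value = spot.photometry(method='sextractor', radius=3)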
"""
logger = logging.getLogger()
if photometry_method is not None:
method = photometry_method
if method == 'mexican_hat':
photometry = self.mexican_hat_photometry_metric(
return_invalid=return_invalid, **kwargs)
elif method == 'gaussian_volume':
photometry = self.gaussian_volume_photometry_metric(
return_invalid=return_invalid, **kwargs)
elif method == 'simple':
photometry = self.simple_photometry_metric(
return_invalid=return_invalid, **kwargs)
elif method == 'sextractor':
photometry = self.sextractor_photometry_metric(
return_invalid=return_invalid, **kwargs)
elif method == 'maximum':
photometry = self.maximum_photometry_metric(
return_invalid=return_invalid, **kwargs)
elif method == 'sigmas':
photometry = self.gaussian_sigmas_photometry_metric(
return_invalid=return_invalid, **kwargs)
else:
raise ValueError("Uknown method specified.")
logger.debug("Spot.photometry_metric: photometry = " + str(photometry))
return photometry
def illumina_s_n(self):
return pflib.illumina_s_n(self.image_slice())
class Image(object):
"""
A fluorosequencing image, its metadata, and its Spots.
The image is represented as a two-dimensional Numpy array, with dimensions
height x width. Pixel coordinates are indexed by positive integer pairs
[h, w]. The coordinate system origin is in the upper left-hand corner, is
0-indexed, and with h(eight) and w(idth) the vertical and horizontal
indeces from the origin respectively. For subpixel resolution, the
coordinate system is extended to floating point pairs [h, w]. Pixel values
are unsigned integers, with black (i.e. no photons detected) assigned 0.
Bit depth is assumed to be at most 64 bits.
Image metadata is a dictionary that can contain arbitrary key:value pairs
as needed to document the image. Image does not inherently enforce any
rules about what this metadata can be, except for reserved keys as
follows:
1. 'filepath' keyword: value indicates where the image is stored as an
image file. Currently, only those files that ImageMagick 'convert'
can convert into PNG are supported. Files with a '.png' or '.PNG'
suffix are assumed to be PNG files that can be read directly,
otherwise conversion is attempted. The intermediate PNG is not
saved. Absolute paths are strongly recommended; use relative paths
at own risk, as flexlibrary functions will write to filenames based
on this metadata entry!
Spots in the image are stored as a list of Spot objects; see their class
docstring for further documentation.
Attributes:
image: Numpy array representation of the image. Can be passed directly
            at instantiation. If not passed, 'filepath' must be passed in
            metadata and a valid image be read from it. If both image and the
            'filepath' metadatum are passed, image is taken as the matrix. No
            safety check is performed to ensure the file's contents match the
            array.
metadata: Dictionary containing image metadata. Can have any content
valid in Python. Reserved keywords as described above are assumed
to be used as indicated.
spots: List of Spot instances. See their class documentation. Can be
passed at initialization or added later, either by appending the
list or calling Spot-generating functions for an Image.
photometry_background: A dictionary of photutils.Background objects
used to cache this image's background information. They are
computed once to allow rapid photutils-based photometry
computations for each Spot. Stored as a dictionary to allow caching
of backgrounds computed based on different parameters. The three
parameters are: (box_size, filter_size, method). They are passed to
            photutils.background.Background(box_shape=(box_size, box_size),
                                            filter_shape=(filter_size,
                                                          filter_size),
                                            method=method).
Tuples of parameters as above are the keys in this dictionary. The
resulting Background objects are the values.
photometry_apertures: Dictionary of lists of
photutils.CircularApertures, one for each Spot.
photometry_apertures is designed to memoize apertures of different
radii. Each key in the dictionary is a radius. Its corresponding
value is a list of photutils.CircularApertures, each of that
radius. The order of apertures corresponds to self.spots, such that
the i'th member of the list is the aperture for the i'th spot in
self.spots. This is computed once to allow rapid photutils-based
photometry computations for each Spot.
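
    Example (a sketch; the file path is hypothetical):

        img = Image(metadata={'filepath': '/data/field_001.png'})
        img.find_gaussian_psfs()
        print len(img.spots)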
"""
def __init__(self, image=None, metadata=None, spots=None,
photometry_background=None, photometry_apertures=None):
self.metadata = {}
if metadata is not None:
self.metadata = metadata
self.image = None
if image is not None:
self.image = image
        elif 'filepath' in self.metadata:
try:
tmppath = None #used to clean up tempfiles in finally block
                if (os.path.splitext(self.metadata['filepath'])[1]
                        in ('.png', '.PNG')):
                    self.image = imread(self.metadata['filepath'])
                else:
                    #tempfile.mkstemp returns an OS-level file descriptor,
                    #not a file object
                    tmpfd, tmppath = tempfile.mkstemp()
                    os.close(tmpfd) #close before ImageMagick writes to it
                    pflib.convert_image(self.metadata['filepath'],
                                        output_path=tmppath)
                    self.image = imread(tmppath)
            finally:
                #make sure tmppath is cleaned up regardless whether an
                #exception is raised or not
                if tmppath is not None and os.path.isfile(tmppath):
os.remove(tmppath)
else:
raise AttributeError("Image.image must be defined: it was neither "
"passed at initialization nor given a "
"filepath to be read from.")
self.spots = []
if spots is not None:
self.spots = spots
if photometry_background is not None:
self.photometry_background = photometry_background
else:
self.photometry_background = {}
if photometry_apertures is not None:
self.photometry_apertures = photometry_apertures
else:
self.photometry_apertures = {}
def find_gaussian_psfs(self, pflib_args=None, spots_append=True):
"""
Apply pflib.find_peptides to self.image.
Arguments:
pflib_args: Arguments to be passed to pflib.find_peptides. Useful
to modify its default search parameters.
spots_append: If True, will append any found PSFs as Spots to
self.spots. Does not check if existing Spots are identical. If
not True, will overwrite self.spots to contain only those Spots
found by this method call.
Returns:
Number of Spots found by pflib.find_peptides during this run. If
spots_append is True, this means that number of members at the end
of self.spots were appended.
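
        Example (a sketch): to replace, rather than append to, any existing
        Spots:
            n_found = img.find_gaussian_psfs(spots_append=False)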
"""
#using empty dictionaries/lists as a default argument is a bug factory
#because if they are modified, these modifications persist as the
#defaults for all subsequent calls. Better to use None and case it.
if pflib_args is None:
pflib_args = {}
new_fits = pflib.find_peptides(self.image, **pflib_args)
if not spots_append:
self.spots = []
for (h, w), new_fit in new_fits.iteritems():
int_h, int_w = int(round(h)), int(round(w))
spot = Spot(self, int_h, int_w, 5, gaussian_fit=new_fit)
self.spots.append(spot)
return len(new_fits)
def get_photometry_background(self, box_size=10, filter_size=10,
method='sextractor'):
"""Returns self.photometry_background. Computes it if it's None."""
if (box_size, filter_size, method) not in self.photometry_background:
box_shape = (box_size, box_size)
filter_shape = (filter_size, filter_size)
bkg = photutils.background.Background(data=self.image,
box_shape=box_shape,
filter_shape=filter_shape,
method=method)
self.photometry_background.setdefault((box_size,
filter_size,
method), bkg)
return self.photometry_background[(box_size, filter_size, method)]
def get_photometry_aperture(self, spot, radius=3):
"""
Returns aperture corresponding to spot. Creates all apertures for the
requested radius if they've not yet been made.
"""
if radius not in self.photometry_apertures:
#Remember in photutils the coordinates are swapped.
self.photometry_apertures.setdefault(radius,
[photutils.CircularAperture([s.w, s.h], r=radius)
for s in self.spots])
elif len(self.photometry_apertures[radius]) != len(self.spots):
#Must re-compute if self.spots has been changed.
self.photometry_apertures[radius] = \
[photutils.CircularAperture([s.w, s.h], r=radius)
for s in self.spots]
return_aperture = None
for i, s in enumerate(self.spots):
if spot is s:
return_aperture = self.photometry_apertures[radius][i]
break
else:
raise Exception("argument spot not in self.spots; locals() = " +
str(locals()))
assert return_aperture is not None
return return_aperture
def append_photometry_apertures(self):
"""
A Spot was just appended to self.spots. Append to all radius entries in
        self.photometry_apertures to match.
"""
h, w = self.spots[-1].h, self.spots[-1].w
for radius, apertures in self.photometry_apertures.iteritems():
apertures.append(photutils.CircularAperture([w, h], r=radius))
return self.photometry_apertures
class Experiment(object):
"""
The base class for all Experiments. It does not define any kind of
experiment itself, and hence is never instantiated. It is useful as a
repository of methods useful for all Experiment subclasses.
"""
@staticmethod
def easy_load_processed_image(image_filepath, psf_pkl_filepath=None,
load_psfs=True):
"""
Utility function to load a pre-processed image and its PSF pkl files as
produced by pflib into Image and Spot objects.
Arguments:
image_filepath: Path to PNG image file.
psf_pkl_filepath: Path to PSF pkl file for the image, as produced
by pflib. If None, follows pflib._psfs_filename and uses glob
                to try to find the pkl at image_filepath + '*_psfs_*.pkl'.
load_psfs: If False, does not load PSF pkl files. The resulting
Image objects have no Spots.
Returns:
Creates and returns the Image object with all its Spot objects to
describe the image and the PSFs contained in the pkl. Also returns
number of Spots not loaded from the pkl file due to errors.
"""
image = imread(image_filepath)
image_object = Image(image=image,
metadata={'filepath': image_filepath}, spots=None)
discarded_spots = 0
if load_psfs:
if psf_pkl_filepath is None:
pkl_files = sorted(glob.glob(image_filepath + '*_psfs_*.pkl'))
if len(pkl_files) == 0:
raise ValueError("For image_filepath = " + image_filepath +
" psf_pkl_filepath passed as None when " +
"no pkl files available.")
psf_pkl_filepath = pkl_files[-1]
psfs = cPickle.load(open(psf_pkl_filepath))
spot_objects = []
for (h, w), gaussian_fit in psfs.iteritems():
(h_0, w_0, H, A, sigma_h, sigma_w, theta, sub_img, fit_img,
rmse, r_2, s_n) = gaussian_fit
try:
int_h, int_w = int(round(h)), int(round(w))
new_spot = Spot(parent_Image=image_object,
h=int_h, w=int_w,
size=fit_img.shape[0], gaussian_fit=gaussian_fit)
spot_objects.append(new_spot)
except Exception as e:
logger.info("flexlibrary.easy_load_processed_image: "
"Ignoring Spot due to Spot.__init__ exception.")
logger.exception(e, exc_info=True)
discarded_spots += 1
image_object.spots = spot_objects
return image_object, discarded_spots
@staticmethod
def accumulate_offsets(offsets):
"""
Given a sequence of image offsets with respect to the directly
preceding image, convert to offsets with respect to the first frame.
Arguments:
offsets: List of offsets of successive images, each with respect
                to the directly preceding one.
Returns:
List of offsets for those with respect to the first image.
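        Example:
            accumulate_offsets([(0, 0), (1, 2), (3, -1)]) returns
            [(0, 0), (1, 2), (4, 1)].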
"""
logger = logging.getLogger()
if offsets[0] != (0, 0):
raise ValueError("The first image's offset must be (0, 0) by "
"definiton.")
cumulative_offsets = []
for f, offset in enumerate(offsets):
cumulative_h_offset = sum([offset[0]
for offset in offsets[:f + 1]])
cumulative_w_offset = sum([offset[1]
for offset in offsets[:f + 1]])
cumulative_offsets.append((cumulative_h_offset,
cumulative_w_offset))
logger.debug("accumulate_offsets: offsets = " + str(offsets))
logger.debug("accumulate_offsets: cumulative_offsets = " +
str(cumulative_offsets))
return cumulative_offsets
@staticmethod
def get_cumulative_offset(offsets, f, g=0):
"""Get cumulative offset for frame f, with respect to frame g."""
cf = Experiment.accumulate_offsets(offsets)[f]
cg = Experiment.accumulate_offsets(offsets)[g]
return (cf[0] - cg[0], cf[1] - cg[1])
@staticmethod
def round_coordinates(h, w):
return int(round(h)), int(round(w))
@staticmethod
def apply_offset(coordinates, offset):
h, w = coordinates
d_h, d_w = offset
return h + d_h, w + d_w
@staticmethod
def unapply_offset(offset_coordinates, offset):
h, w = offset_coordinates
d_h, d_w = offset
return h - d_h, w - d_w
@staticmethod
def offset_frame_coordinates(offsets, coordinate, f, g):
"""Given coordinate in frame g, find its coordinate in frame f."""
gf_offset = Experiment.get_cumulative_offset(offsets=offsets, f=f, g=g)
return Experiment.apply_offset(coordinate, gf_offset)
@staticmethod
def discard_dropouts(spots, spot_cumulative_offsets,
frame_cumulative_offsets, image_shape, spot_radius=0):
"""
Discard Spots that align to outside of a sequence of frames.
When tracking spots through frames, stage drift may place some of them
(or their former position, if they are no longer visible) outside of
the field of view at some point. We want to ignore them. This function
filters out spots whose position would end up outside of the field of
view of a sequence of frames. Because some functions use coordinates
rounded to the nearest whole number, discard_dropouts also eliminates
any points whose offset coordinates would round up past the size of the
image (i.e. its upper cutoff is 0.5 below the image boundary).
Arguments:
            spots: List of Spot objects.
spot_cumulative_offsets: List of offsets corresponding to each Spot's
frame, with respect to the primary field of view.
frame_cumulative_offsets: List of cumulative offsets for all frames
through which the Spot's position will be monitored.
image_shape: Shape of the frame's Numpy array.
spot_radius: Discard spots within this distance of the edge.
Returns:
The original list of Spots, with those that would be out of frame
at any point removed, and the number of Spots discarded.
"""
logger = logging.getLogger()
filtered_spots = []
number_discarded = 0
for i, spot in enumerate(spots):
#oh, ow are the coordinates of the spot in frame 0's coordinate sys
oh, ow = Experiment.apply_offset((spot.h, spot.w),
spot_cumulative_offsets[i])
logger.debug("discard_dropouts: oh, ow = " + str((oh, ow)))
#track spot position through each frame, check to see if it is ever
#outside the field of view
for offset in frame_cumulative_offsets:
gh, gw = Experiment.unapply_offset((oh, ow), offset)
logger.debug("discard_dropouts: gh, gw = " + str((gh, gw)))
if not (spot_radius <= gh <
image_shape[0] - 0.5 - spot_radius
and
spot_radius <= gw <
image_shape[1] - 0.5 - spot_radius):
number_discarded += 1
logger.debug("discard_dropouts: discarded")
break
else:
logger.debug("discard_dropouts: NOT discarded")
filtered_spots.append(spot)
return filtered_spots, number_discarded
@staticmethod
def greedy_particle_tracking(frame_spots, frame_shape, candidate_radius=2,
offsets=None, spot_radius=0):
"""
Track Spots across frames.
One of the basic problems flexlibrary needs to solve is associating
spots that are co-localized across frames. Co-localization across
frames implies persistence through time and/or across multiple
channels. Hence, this is essentially the act of tracking spots.
Tracking spots across frames is a particle tracking problem under the
simplifying condition that the particles cannot move further than a
very small distance. Therefore, a very simple and very greedy algorithm
should be sufficient for our needs.
Each nonempty frame has a set of Spots, each uniquely defined by their
(h, w) coordinates. For this algorithm, all that matters are the Spots'
coordinates.
For each Spot, the goal of the algorithm is to associate it to a Spot
in the subsequent frame, such that it is plausible that it is the same
particle. Two complicating factors arise: (1) a Spot may not be present
in a succeeding frame, and (2) multiple Spots in the first frame may be
potentially assigned to one Spot in the succeeding frame.
The algorithm approaches this task as follows, comparing two
consecutive frames at a time:
1. For each Spot, all Spots in the following frame that are within
a given radius are considered potential candidates. For
convenience, we call Spots in the first frame "ancestors", and
Spots in the second frame "descendants".
            2. For each ancestor-descendant pair, the Euclidean distance is
               computed.
            3. All ancestor-descendant pairs are sorted by their Euclidean
               distances in ascending order.
            4. Shorter distance indicates a better pair. The sorted list of
               ancestor-descendant pairs is traversed. If a pair in which
               neither the ancestor nor the descendant has been assigned is
               encountered, the assignment is made and all subsequent pairs
               that involve the ancestor or the descendant are ignored.
            5. Any ancestors that do not have a descendant assigned to them
because either (a) no descendants are found within the given
radius or (b) all potential descendants were paired with nearer
ancestors, are considered not to have any descendants in that
frame. Their coordinates are merged with ancestor PSFs for
repeating the algorithm for the next frame and its successor.
            6. For each Spot, its succeeding chain of pairwise ancestor-
               descendant assignments is its path across the frames. This is
               true even if the Spot skips one or more frames.
Due to stage drift, captured image fields may be offset. Providing a
sequence of image alignments allows Spot tracking in this scenario.
However, any Spots that would align to positions outside of any frames
in the sequence are ignored and not tracked.
Arguments:
            frame_spots: An iterable of iterables of Spots:
                [[frame0_Spot1, ..., frame0_SpotI, ...], #frame 0
...,
[frameN_Spot1, ..., frameN_SpotJ, ...], #frame N
... ]
Each frame for which Spots are to be tracked is summarized as
an iterable of Spot objects (containing their coordinates
(h, w)). In the above list notation of these iterables, frameN
                indicates it's the N'th frame and SpotI indicates it is the
I'th Spot in the frame. Spot objects for each frame are not
assumed to have any particular order. The sequence of frames,
however, is assumed to follow the order in which the Spots will
be tracked. It is assumed that there are no Spots within two
pixels of each other in any one frame.
frame_shape: This method assumes all frames have the same shape.
frame_shape should be the frame's shape tuple (h, w) as given
by one of the parent_Image.image.shape.
candidate_radius: Radius within which candidates in the following
frame are paired with precursors. The default is set to 2
because our peptides shouldn't move more than that. Change only
after thinking about it really hard.
offsets: Specifies offsets (delta_h, delta_w) between frames.
spot_radius: Discard spots within this distance of any frame's
edge.
Returns:
A tuple of the following items:
1. The traces themselves as a list of lists of Spot objects:
                [[Spot1_frame0, Spot1_frame1, ..., Spot1_frameI, ...], #Spot 1
                 ...,
                 [SpotN_frame0, SpotN_frame1, ..., SpotN_frameJ, ...], #Spot N
... ]
Each Spot's existence in each frame is represented by the Spot
object in that frame. Hence, for each spot being tracked across
frames, its existence is a sequence of Spot objects. Each of
these sequences is represented as a list, and a list of these
lists is the object returned. In those cases where the spot is
not present in a frame, None is used as a placeholder in the
list. Thus, all of the Spot lists are the same length, matching
the number of frames that were to be processed.
2. Number of Spots discarded and not incorporated into any traces
due to stage drift.
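
        Example (a sketch; imgs is assumed to be a list of Image objects,
        all of the same shape, whose Spots have already been found):

            traces, n_discarded = Experiment.greedy_particle_tracking(
                [img.spots for img in imgs],
                frame_shape=imgs[0].image.shape)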
"""
logger = logging.getLogger()
if offsets is None:
            offsets = [(0, 0) for f in range(len(frame_spots))]
#cumulative offsets are useful in some contexts
cumulative_offsets = Experiment.accumulate_offsets(offsets)
        #This algorithm speeds things up by binning all Spot objects into the
        #Numpy array frame_bins by rounding their (h, w) to the nearest whole
        #number. The behavior of the Python round function is the reason this
        #method requires all Spots to be at least two pixels apart.
#
#frame_bins is the central data structure used within this function.
#For each frame, frame bins has a corresponding, identically-shaped
#numpy array. Each pixel in this array contains the dictionary
#
#{'spt': spot,
# 's_L': (own_frame, own_h, own_w),
# 'a_L': (prior_frame, prior_h, prior_w),
# 'd_L': (next_frame, next_h, next_w)}
#
#spt is the Spot object whose (h, w) coordinates round to this pixel.
#If there is no Spot in this pixel, then all four items -- spt, and all
#three tuples -- are set to None. The three tuples in the pixel perform
#the functions of a doubly-linked linked list that connects Spots
#across frames. The first tuple indicates the coordinates of the spot
#itself: its frame #, and its rounded (h, w) coordinates. The second
#tuple indicates the coordinates of the ancestor spot in frame_bins.
        #The third tuple does the same for the spot's descendant. If there is
        #no ancestor or descendant, then the corresponding tuple remains a
        #None object.
#The objective of this method is then to essentially fill in the
#ancestor and descendant links such that the PSFs are tracked across
#frames.
frame_bins = [np.full(frame_shape,
{'spt': None, 's_L': None,
'a_L': None, 'd_L': None},
dtype=np.object)
for frame in frame_spots]
#ancestor_cache stores those Spots who do not have descendants
ancestor_cache = np.full(frame_shape,
{'spt': None, 's_L': None,
'a_L': None, 'd_L': None},
dtype=np.object)
#ignore spots that would be outside of the first frame when aligned;
#using filtered_spots as temporary variable so as to not modify
#frame_spots that could be being used elsewhere
filtered_spots = [[] for frame in frame_spots]
total_discarded_spots = 0
for f, frame in enumerate(frame_spots):
filtered_spots[f], number_discarded = \
Experiment.discard_dropouts(spots=frame,
spot_cumulative_offsets=[cumulative_offsets[f]
for x in frame_spots[f]],
frame_cumulative_offsets=cumulative_offsets,
image_shape=frame_shape,
spot_radius=spot_radius)
total_discarded_spots += number_discarded
frame_spots = filtered_spots #stop using filtered_spots as temp var
#Populate frame_bins with all Spots.
for f, frame in enumerate(frame_spots):
for spt in frame:
h, w = Experiment.apply_offset((spt.h, spt.w),
cumulative_offsets[f])
rh, rw = int(round(h)), int(round(w))
assert all([frame_bins[f][rh, rw]['spt'] is None,
frame_bins[f][rh, rw]['s_L'] is None,
frame_bins[f][rh, rw]['a_L'] is None,
frame_bins[f][rh, rw]['d_L'] is None]), \
(str((rh, rw)) + " is already filled in frame_bins[" +
str(f) + "]")
frame_bins[f][rh, rw] = {'spt': spt, 's_L': (f, rh, rw),
'a_L': None, 'd_L': None}
#Main loop: go through all the frames, and make the best connections
for f, frame in enumerate(frame_bins):
if f == 0:
continue #skip first frame; it won't have any ancestor links
else:
#Merge Spots from prior frame into ancestor_cache
for (rh, rw), fbin in np.ndenumerate(frame_bins[f - 1]):
(spt, s_L, a_L, d_L) = (fbin['spt'], fbin['s_L'],
fbin['a_L'], fbin['d_L'])
if spt is None:
continue
assert rh == s_L[1] and rw == s_L[2], \
"s_L and (rh, rw) mismatch"
o_h, o_w = Experiment.apply_offset((spt.h, spt.w),
cumulative_offsets[f - 1])
logger.debug("greedy_particle_tracking: (spt.h, spt.w) " +
str((spt.h, spt.w)) + " at frame f = " + str(f))
logger.debug("greedy_particle_tracking: " +
"cumulative_offsets[f] = " +
str(cumulative_offsets[f - 1]) +
" at frame f = " + str(f))
logger.debug("greedy_particle_tracking: (o_h, o_w) = " +
str((o_h, o_w)) + " at frame f = " + str(f))
ro_h, ro_w = int(round(o_h)), int(round(o_w))
logger.debug("greedy_particle_tracking: (ro_h, ro_w) = " +
str((ro_h, ro_w)) + " at frame f = " + str(f))
assert (rh, rw) == (ro_h, ro_w), \
(str(rh) + ", " + str(rw) + " != " + str(ro_h) + ", " +
str(ro_w))
logging.debug("greedy_particle_tracking: " +
"ancestor_cache[rh, rw] = " +
str(ancestor_cache[rh, rw]) +
" at frame f = " + str(f))
#I am removing this assertion because I came across cases
#where there are two very closely competing ancestors from
#different frames
#assert all([ancestor_cache[rh, rw]['spt'] is None,
# ancestor_cache[rh, rw]['s_L'] is None,
# ancestor_cache[rh, rw]['a_L'] is None,
# ancestor_cache[rh, rw]['d_L'] is None]), \
# str((rh, rw)) + " already filled in ancestor_cache " + \
# " (s_L, a_L, d_L) = " + str((s_L, a_L, d_L)) + \
# " at f = " + str(f)
ancestor_cache[rh, rw] = {'spt': spt,
's_L': (f - 1, rh, rw),
'a_L': None, 'd_L': None}
logger.debug("greedy_particle_tracking: Setting " +
"ancestor_cache[" + str(rh) + ", " + str(rw) +
"] to s_L = " + str((f - 1, rh, rw)))
            #used to sort all ancestor-descendant pairs by Euclidean distance
ancestor_descendant_pairs = []
for (ah, aw), abin in np.ndenumerate(ancestor_cache):
(a_spt, as_L, aa_L, ad_L) = (abin['spt'], abin['s_L'],
abin['a_L'], abin['d_L'])
if a_spt is None:
continue
                assert ad_L is None, "ancestor_cache shouldn't have descendants"
assert ah == as_L[1] and aw == as_L[2], \
"as_L and (ah, aw) mismatch"
aaf = as_L[0]
frame_slice = frame[max(ah - candidate_radius - 2, 0):
ah + candidate_radius + 3,
max(aw - candidate_radius - 2, 0):
aw + candidate_radius + 3]
for (dh, dw), dbin in np.ndenumerate(frame_slice):
(d_spt, ds_L, da_L, dd_L) = (dbin['spt'], dbin['s_L'],
dbin['a_L'], dbin['d_L'])
if d_spt is None:
assert ds_L is None and da_L is None and dd_L is None,\
"spt is None, but has links!"
continue
if ah - candidate_radius - 2 > 0:#sliced close to the edge?
dh += ah - candidate_radius - 2
if aw - candidate_radius - 2 > 0:#sliced close to the edge?
dw += aw - candidate_radius - 2
assert dh == ds_L[1] and dw == ds_L[2], \
"ds_L and (dh, dw) mismatch"
ddf = ds_L[0]
distance = euclidean(
Experiment.apply_offset((a_spt.h, a_spt.w),
cumulative_offsets[as_L[0]]),
Experiment.apply_offset((d_spt.h, d_spt.w),
cumulative_offsets[f]))
if distance < candidate_radius:
ancestor_descendant_pairs.append((a_spt, aaf, ah, aw,
d_spt, ddf, dh, dw,
distance))
ancestor_descendant_pairs = sorted(ancestor_descendant_pairs,
key=lambda x:x[8])
for (a_spt, aaf, ah, aw,
d_spt, ddf, dh, dw,
distance) in ancestor_descendant_pairs:
if ancestor_cache[ah, aw] == {'spt': None, 's_L': None,
'a_L': None, 'd_L': None}:
assert frame_bins[aaf][ah, aw]['d_L'] is not None, \
"Unpaired ancestor was removed from ancestor_cache."
logger.debug("greedy_particle_tracking: frame[" + str(dh) +
", " + str(dw) + "]" + " skipped because " +
"ancestor has been paired. Frame f = " +
str(f) + ", distance = " + str(distance))
continue #This indicates that ancestor has been paired
elif frame[dh, dw]['a_L'] is not None:
logger.debug("greedy_particle_tracking: frame[" + str(dh) +
", " + str(dw) + "]" + " skipped because " +
"descendant has been paired. Frame f = " +
str(f) + ", distance = " + str(distance))
continue #This indicates that descendant has been paired
else:
assert frame[dh, dw]['a_L'] is None, \
"Descendant being paired more than once."
frame[dh, dw]['a_L'] = (aaf, ah, aw)
logger.debug("greedy_particle_tracking: frame[" + str(dh) +
", " + str(dw) + "][a_L] = " +
str((aaf, ah, aw)) + ", frame = " + str(f) +
", distance = " + str(distance))
assert frame_bins[aaf][ah, aw]['d_L'] is None, \
"Ancestor being paired more than once."
frame_bins[aaf][ah, aw]['d_L'] = (ddf, dh, dw)
logger.debug("greedy_particle_tracking: frame[" + str(ah) +
", " + str(aw) + "][d_L] = " +
str((ddf, dh, dw)) + ", frame = " + str(f) +
", distance = " + str(distance))
ancestor_cache[ah, aw] = {'spt': None, 's_L': None,
'a_L': None, 'd_L': None}
logger.debug("greedy_particle_tracking: " +
"ancestor_cache[" + str(ah) + ", " + str(aw) +
"] set to None. Frame f = " + str(f) +
", distance = " + str(distance))
#Should now have all Spots linked. Extract them by following links.
#traces will be the links of Spots that this method returns
traces = []
#first find all head Spots throughout all frame_bins
heads = []
for f, frame in enumerate(frame_bins):
for (h, w), fbin in np.ndenumerate(frame):
(spt, s_L, a_L, d_L) = (fbin['spt'], fbin['s_L'],
fbin['a_L'], fbin['d_L'])
if spt is not None and a_L is None:
heads.append((spt, s_L, a_L, d_L))
#now follow the segments from all heads
for spt, s_L, a_L, d_L in heads:
assert s_L is not None, \
"s_L cannot be None for a head spot " + str((spt, s_L, a_L, d_L))
#prepend Nones if psf is not in the first frame