-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathChanges.txt
186 lines (155 loc) · 7.83 KB
/
Changes.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
List of changes to JMultiSliceLib
!( this copy is not up to date, further dev was done off-sides the GitHub master )
!( this copy is partially updated to provide more functions to the Dr. Probe GUI )
2022-04-14: 0.46 JB
- Added pixel-shift beam-tilt to incident wave function offset.
This can be used to do precession diffraction calculations while
scanning. Take care to set the de-scan paramaters accordingly,
otherwise the output remains tilted. The tilt is applied before
scan shift and defocus, thus a tilted wavefunction is scanned and
defocussed with respective consequences.
- Inverted beam tilt action. The beam tilt given determines now the
point in Fourier-space at which the average incident beam direction
is placed. The same effect is caused by shifting the probe aperture.
- Separated incident probe aperture and phase plate application.
2020-06-10: 0.45 JB
- Added and tested code for separating elastic and tds channels
for STEM, diffraction and probe imaging.
- Added multi-thread code for the multislice calculations. The
code runs now with a hybrid setup.
2020-06-02: 0.44 JB
- Added code to calculate object transmission functions from
atomic structure data (single thread & multi thread)
(CPU and GPU)
- Added code to handle atomic structure data with file I/O.
- Tested the code for calculating object transmission
functions.
2020-04-23: 0.43 JB & EW
- Modified integration on GPU by reduction running directly on
wave function data (EW)
- Modified detector readout on GPU to adopt on detection setup
and either use precalculated power or direct integration from
wave function data, for integarting and imaging detectors.
- pre-implemented FFT callbacks in the forward FT. The callbacks
are not used yet, because this is only supported on Linux64
with the static cufft library. Maybe we activate this later,
if Nvidia releases a static library for all operating systems.
2020-04-07: 0.42 JB
- Added internal pattern averaging
2020-03-05: 0.41 JB
- Fixed a bug causing artifacts of the low-loss inelastic
scattering code with non-square grid sizes.
2020-02-20: 0.40 JB
- Improved speed by replacing some frequent CUDA mallocs by
static allocations in the integrated detection scheme
(contribution by E. Westphal, FZ-Jülich)
2020-02-19: 0.39 JB
- Added an allocation memory check before the actual allocation
call on the CUDA device to prevent persistence of out of memory
errors into kernel launch.
2020-01-13: 0.38 JB
- Resolved a memory management bug in the results handling.
- Moved the CUDA code compilation to CUDA Toolkit v.10.2.
The package should be shipped with 'cufft64_10.dll' or the
library must be made available on the target machine by installing
the respective CUDA Toolkit version.
2019-12-19: 0.36 JB
- Removed all remaining MS Windows specfic includes and declarations.
- Changed random number generator calls to <random> functions.
- Started test project JMSBench1 for benchmarking JMultiSlice.
2019-12-16: 0.35 JB
- Removed a few MS Windows specfic includes and declarations.
2019-11-19: 0.34 JB
- Added utility functions for pre-calculation settings
SetGPUPgrLoading: set phase grating loading scheme to switch
between pre-loading and on-demand loading
GetDetetionSliceNum: returns number of readout planes
- Modified InitCore to be more predictable in memory usage.
2019-10-30: 0.33 JB
- Added memory cleanup calls for MKL closing leaks.
FreeLibMem: call this before closing the calling process.
- Fixed issues with the plasmon scattering code.
2019-10-18: 0.32 JB
- Added plasmon scattering Monte-Carlo to CPU & GPU multislice.
SetPlasmonMC: call to trigger and setup plasmon scattering
2019-10-15: 0.31 JB
- Added automatism to copy phase grating data from host memory
on demand in case of insufficient memory on device for holding
all phase gratings. This can affect the GPU multislice, leading
to increased computation time by approximately a factor of 2.
2019-10-02: 0.30 JB
- Added CJFFTMKLcore for CPU FFT calculations. This code links to
the Intel Math Kernel Library
(mkl_intel_lp64.lib;mkl_sequential.lib;mkl_core.lib)
Make sure to add the MKL include path and the 64-bit library
path to the directory lists of the JMultislice Project and the
include path to the calling project.
The library object code will get very large (~600MB), the final
binary if the calling code will increase by about 30 MB.
Speed gain can be significant for CPU calculations on smaller
scales (<1k pixel) while not very significant on larger problems.
2019-09-16: 0.25 JB
- Modified the aperture functions to consistency with MSA and
internally between different aperture function models.
2019-05-24: 0.24 JB
- Fixed a minor scaling issue in the sensitivity profile.
2019-05-03: 0.24 JB
- Added probe-forming aperture asymmetry.
2019-01-08: 0.23 JB
- Modified the butterfly summation unroll and fallback scenario.
- Added a class for calculating atomic form factors using the
parameterization of Weickenmeier and Kohl, including absorptive
form factors (still not tested).
2018-12-21: 0.22 JB fix
- Bug fix in the 2-fold summation routine.
2018-12-20: 0.22 JB
- Implemented a strided two-fold butterfly float summation speeding
up the integration of diffraction detector signal by a significant
ammount. The routine is slightly faster than straight summation
but doesn't require error correction (which was 4 times more
expensive than a straight float summation.)
2018-12-18: 0.21 JB
- Added diffraction de-scan (global m_dif_descan == 0: off, 1: on)
per thread de-scan parameters: m_d/h_dif_ndescanx/y in [pixel]
De-scan shifts the diffraction pattern before accumulation.
Use DiffractionDescan(true/false) to turn the de-scan on/off.
Use SetDiffractionDescan(whichcode, ndescanx, ndescany, iThread)
to set de-scan values of the thread before calling the multislice
routines. The de-scan values should be identical to the scan values.
- Added more device and host SetData and GetData functions (CJFFTCUDAcore).
- Added external array transform interfaces FT_h and IFT_h (CJFFTWcore).
- Added wave function accumulation and readout. The detection channels
for wave functions are triggered in the DetectorSetup() and GetResult()
interfaces with flags _JMS_DETECT_WAVEREAL and _JMS_DETECT_WAVEFOURIER.
Accumulating many multislice runs with these flags will sum up over
wave functions. Renormalization of these results yields wave functions
of the elastic channel which can be used to separate elastic images and
TDS images from the total intensities recorded with the detection flags
_JMS_DETECT_IMAGE and _JMS_DETECT_DIFFRACTION, respectively. Due to
the potential hybrid multi-threading, the final separation is not
provided by JMultiSlice and should be done by the calling code.
- Extended the LIB interface .
2018-12-15: 0.20 JB
- Added a module for probe aberrations, apertures, and wavefunctions
- Added functions to set incoming wave functions directly
2018-11-12: 0.15 JB
- Implementation of propagators and detector functions
- Fixed bugs in the multislice chain discovered when using with the Dr. Probe GUI
2018-07-11: 0.14 JB
- Added functions using smaller phase gratings than wave-functions
- Added periodically repeated sub-frame multiplications to CPU and GPU cores
- Added slice thickness data handling
- Added pure GPU memory info function
- Added function to calculate propagtors (Fresnel and planar)
- Added GNU GPL 3 License information to all source files.
2018-05-04: 0.13 JB
- Changed the implementation to a Class CJMultiSlice since the previous
static scope implementation turned out to be unusable in a multi-thread
scheme.
2018-05-02: 0.12 JB
- Optimized incident wave function offset calculations for GPU code.
2018-04-30: 0.11 JB
- Multiple bug removals.
- Added masked integrating detector readout functions.
2018-04-26: 0.1 JB
- First core and interface implementations. Working!