Georgia Tech CS-6476: Spring 2022
If you're studying Computer Vision or Reinforcement Learning, parameter tuning is probably causing you some angst. The goal here is to give you a python execution harness for parameter tuning that's easy to use and minimally disruptive to your CV code. You get to skip learning about openCV's HighGUI and Trackbars APIs, and focus instead on the joys of particle filtering. Here's a 5 minute introduction to the essentials (the code for this example is in `example.py`). Importing this component and copying 3 lines into your code will get you the tuner ThetaUI.
This function and its invocation...

```python
def find_circle(image, radius):
    # your implementation
    return results

if __name__ == "__main__":
    find_circle(image, 42)
```
... hooked up to ThetaUI, become:

```python
# new import
import TunedFunction

# new decorator, and a 'tuner' param
@TunedFunction()
def find_circle(image, radius, tuner=None):
    # your implementation

    # new line of code to display an updated image in ThetaUI
    if tuner is not None:
        tuner.image = updated_image
    return results
```
Your (unchanged) invocation from `main` now shows ThetaUI: a python tkinter GUI with a spinbox called 'radius' which ranges from 0 to 42.
- the window title shows the name of your tuned function,
- various parts of the status bar tell you:
  - the image title (when you pass in a file name),
  - the frame number (when you pass in a carousel of images),
  - code timing in h, m, s and process time,
  - whether the image displayed was sampled/interpolated for this display,
  - whether there were exceptions during execution (click when red to view exceptions).
- the menus let you traverse the carousel, start a grid search, save results and images, etc.
- on the left, the tree shows you json representing your invocation: args, and results.
- the picture is your image once you are done processing it
  - typically this shows the last couple of images you've specified; that number is configurable.
Each time you change a parameter, ThetaUI calls your code `find_circle()` with a new value for 'radius'.
And that, folks, is pretty much it. Here's a good stopping point; try this out on your CV code.
There's more to ThetaUI, like:

- it runs a systematic grid search over the space of your args (exhausts the search space),
- tagging args (note when theta is cold/warm/on-the-money),
- json serialization of invocation trees.

So... read on...
- Decorate the function you want to tune (referred to as `target`) with `@TunedFunction()`, and add a 'tuner' param to its signature. (Note: there should be no other decorator on `target`.)
- Begin tuning by calling `target`. `@TunedFunction` creates an instance of ThetaUI (passed to `target` via the `tuner` param). You are now in the tuning loop:
  - Switch to the Tuner GUI and adjust the trackbars.
  - Tuner will invoke your function on each change made to a trackbar.
  - Set `tuner.image` to the processed image from within `target`. This refreshes the display in Tuner's GUI.
- End your tuning session by pressing Esc (or any non-function key).
To restore normal operation of your function, comment out or delete the @TunedFunction() decorator.
Positional and keyword parameters (not varargs or varkwargs) in your function signature are candidates for tuning. If your launch call passes an int, boolean, list or dict to any of these, then that parameter is tuned; the others are passed through to your function unchanged (i.e., they are automatically "pinned"). Images, for example, can't be tuned, so np.ndarray arguments are passed through to your function unchanged. Tuples of 3 ints also work, and are interpreted in a special way.
If you want to skip tuning (aka "pin") some parameters in your `target`'s signature, you have the following options; choose what works best for your workflow. Note that it's the type of the argument passed in your launch call that drives Tuner behavior, not the annotation on the parameters.
```python
# image is passed through, radius is tuned - min 0, max 50
find_circle(image, radius=50)

# same as above
find_circle(image, 50)

# radius is tuned with values ranging between 20 and 50
find_circle(image, (50, 20))

# radius is tuned and the slider selects among [10, 50, 90]
find_circle(image, [10, 50, 90])

# radius is tuned and target receives one of [10, 50, 90].
# The difference is that Tuner GUI displays "small", "med", "large"
j = {"small": 10, "med": 50, "large": 90}
find_circle(image, radius=j)
```
- `int`: The trackbar's max is set to the int value passed in.
- `tuple`: The trackbar's `(max, min, default)` are taken from the tuple.
- `boolean`: The trackbar will have two settings, `0, 1`, which correspond to `False, True`. The default value is whatever you have passed in. Tuner will call target with one of `False, True` depending on trackbar selection.
- `list`: This is a good way to specify non-int values of some manageable length. Strings, floats, tuples all go in lists.
  - The trackbar will have as many ticks as there are items in the list.
  - Changing the trackbar selects the corresponding item from the list.
  - The argument passed to target is the list item.

  E.g., when your launch call passes `['dog','cat','donut']` to the `radius` parameter, Tuner will:
  - create a trackbar with 3 positions,
  - call target passing one of `['dog','cat','donut']` to `radius`, whichever you've selected with the trackbar.
- `dict` or json object: Very similar to list above. `obj[key]` is returned in the arg to target.

Trivially, `[(3,3), (5,5), (7,7)]` is a list you might use for tuning the `ksize` parameter of `cv2.GaussianBlur()`.
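The type rules above can be sketched as a dispatch on the argument's type. `trackbar_spec` is a hypothetical helper for illustration, not Tuner's actual code:

```python
def trackbar_spec(arg):
    """Return (tick_count, values) for a tuned argument, mirroring the
    type rules above. Hypothetical helper - not part of Tuner's API."""
    if isinstance(arg, bool):        # test bool before int: bool subclasses int
        return 2, [False, True]
    if isinstance(arg, int):         # max is the int passed in; range is 0..max
        return arg + 1, list(range(arg + 1))
    if isinstance(arg, tuple):       # (max, min, default) taken from the tuple
        mx, mn = arg[0], arg[1]
        return mx - mn + 1, list(range(mn, mx + 1))
    if isinstance(arg, list):        # one tick per item; the item is the arg
        return len(arg), arg
    if isinstance(arg, dict):        # ticks show keys; obj[key] is the arg
        return len(arg), list(arg.values())
    return None                      # e.g. np.ndarray: passed through ("pinned")

ticks, vals = trackbar_spec([10, 50, 90])   # 3 ticks, selecting among the items
```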
Consider the json below. Passing that to a parameter would create a trackbar that switches amongst "gs", "gs_blur" and "gs_blur_edge". When `target` is invoked in a tuning call, the argument passed in is the json corresponding to the selected key.
```python
preprocessing_defs = {
    "gs": {
        "img_mode": "grayscale",
        "blur": {"apply": False},
        "edge": {"detect": False},
    },
    "gs_blur": {
        "img_mode": "grayscale",
        "blur": {"apply": True, "ksize": (5, 5), "sigmaX": 2},
        "edge": {"detect": False},
    },
    "gs_blur_edge": {
        "img_mode": "grayscale",
        "blur": {"apply": True, "ksize": (5, 5), "sigmaX": 2},
        "edge": {"detect": True, "threshold1": 150, "threshold2": 100, "apertureSize": 5},
    },
}
```
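Inside `target`, the selected definition arrives as a plain dict, so consuming it is ordinary key lookup. A sketch, with a hypothetical `preprocess` helper; real code would call `cv2.cvtColor`, `cv2.GaussianBlur` and `cv2.Canny` where the comments indicate:

```python
def preprocess(image, pp, tuner=None):
    """List the steps named in one preprocessing definition.
    `pp` is whichever dict the trackbar selected, e.g. preprocessing_defs["gs_blur"]."""
    steps = []
    if pp["img_mode"] == "grayscale":
        steps.append("grayscale")              # cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    if pp["blur"]["apply"]:
        steps.append("blur ksize=%s" % (pp["blur"]["ksize"],))   # cv2.GaussianBlur
    if pp["edge"]["detect"]:
        steps.append("canny")                  # cv2.Canny with threshold1/threshold2
    return steps

gs_blur = {
    "img_mode": "grayscale",
    "blur": {"apply": True, "ksize": (5, 5), "sigmaX": 2},
    "edge": {"detect": False},
}
preprocess(None, gs_blur)   # ['grayscale', 'blur ksize=(5, 5)']
```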
- F1 : runs a grid search
- F2 : saves the image
- F3 : saves your Invocation Tree
- F8 - F10 : tags and saves your Invocation Tree (see below).
- ...hook up Tuner and invoke your function to tune it
- ...save your observations (tags) along with theta
- ...and finally, come back and analyse the Invocation Tree saved to your output file to narrow in on your ideal theta
Saving behavior is determined principally by a couple of statics in TunerConfig.
TunerConfig.output_dir: by default this is set to `./wip`. Change this before you use the other functions of Tuner.
TunerConfig.save_style: This should be set to some valid combination of the flags found in `constants.SaveStyles`. The default is to overwrite the contents of the output file on each run, and to only save when explicitly asked to.
The following are always tracked, although only serialized to file under certain circumstances:

- args: The set of args to an invocation.
- results: This could be explicitly set by your code like so: `tuner.results = ...`. If you do not set this value, tuner captures the values returned by `target` and saves them as long as they are json serializable.
- errored: Whether an error took place during `target` invocation.
- error: These are execution errors encountered during `target` invocation. BTW, the most recent call is first in this formatted list, not last as you would expect from typical python output.
- [insert your tag here]: A complete list of all the custom tags, with the value set to False unless you explicitly tag the invocation, in which case the particular tag(s) are set to `True`.
- You explicitly save - F3.
- You tag an invocation.
- An exception was encountered during the invocation.
- The title of the image from your carousel (see explicit instantiation below), defaulting to 'frame'
- The invocation key (what you see is the md5 hash of theta)
- args (contains each element of theta)
- results (contains the saved or captured results of `target`)
- the custom tags that you set up in Tuner GUI, defaulting to False
- errored
- errors (contains your execution exceptions)
This feature runs through the cartesian product of the parameter values you have set up. `target` is invoked with each theta, and Tuner waits indefinitely for your input before it proceeds to the next theta.
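Conceptually, the search is just `itertools.product` over each parameter's tick values. A sketch with made-up grids, standing in for the real Tuner loop:

```python
from itertools import product

# made-up parameter grids: each entry is the list of tick values for one param
grids = {"radius": range(18, 23), "threshold": [100, 150, 200]}

def grid_search(target, grids):
    """Invoke target once per theta in the cartesian product of all grids."""
    names = list(grids)
    for combo in product(*grids.values()):
        theta = dict(zip(names, combo))
        target(**theta)        # the real Tuner pauses here for your tag/keypress

calls = []
grid_search(lambda **theta: calls.append(theta), grids)
len(calls)   # 5 radii x 3 thresholds = 15 invocations
```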
Here's my workflow:
- I start with a small range of inputs, and let Tuner search through that space.
- When Tuner waits for input, I tag the current set of args (e.g., 'avoid' or 'close'); or just 'press any key'. I can also hit Esc to cancel the grid search.
- After I've run through the cart (cartesian product of all arguments), I query the (json) output file to find my theta, or something close.
This is about as much code as I can give you without running afoul of the GA Tech Honor Code. We can spitball some ideas to help you get more value out of the data that's captured if you follow the "Search-Inspect-Tag" workflow I've outlined above.
- If you find a number of 'close' thetas, build a histogram of the various args to EACH param, using only thetas that are 'close'. That should highlight a useful arg to that param :)
- Implement a Kalman Filter to help you narrow the grid search.
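That first query is a few lines over the saved json. A sketch against a made-up invocation tree, whose field names mirror the serialization list above:

```python
from collections import Counter

# shape mimics the output file: invocation key -> record with args and tags
tree = {
    "a1": {"args": {"radius": 40, "sigma": 2}, "close": True},
    "b2": {"args": {"radius": 42, "sigma": 5}, "close": True},
    "c3": {"args": {"radius": 42, "sigma": 2}, "close": True},
    "d4": {"args": {"radius": 90, "sigma": 9}, "close": False},
}

def arg_histogram(tree, param, tag="close"):
    """Count the values of one param across invocations carrying the given tag."""
    return Counter(rec["args"][param] for rec in tree.values() if rec.get(tag))

arg_histogram(tree, "radius")   # Counter({42: 2, 40: 1}) -> 42 looks promising
```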
Here's another good stopping point. Read on for more fine grained control.
With explicit instantiation, you give up the convenience of automatic trackbar GUI configuration, but gain more control over features. If you like the UX of `@TunedFunction`, see the benefits section below to determine if it's worth it to wade through the rest of this.
Instead of TunedFunction, you import ThetaUI and TunerConfig. ThetaUI is the facade you work with. You could ignore TunerConfig if the default settings (e.g. when and where to save) work for you.
Workflow:
- Import ThetaUI.
- Instantiate tuner, choosing between one and two functions to watch: `main` and `downstream`.
- Each func must accept a `tuner` param with a default value of None.
- Make calls to `tuner.track()`, `track_boolean()`, `track_list()` or `track_dict()` to define tracked/tuned parameters to `main`.
- Make a call to `tuner.begin()`, or to `tuner.grid_search()`. Each of these calls accepts a carousel. You do not use a launch call, as you did with `TunedFunction()`.
  - This launches tuner, and then, as usual, each change to a trackbar results in a tuning call to `target`.
  - Tuner passes args to formal parameters which match a tracked parameter by name.
  - All tracked parameters are also accessible off `tuner`, e.g., `tuner.radius`. This enables you to tune variables that are not part of the formal arguments to your function. Wondering if you should set `reshape=True` in a call to `cv2.resize()`? Well, just add a tracked parameter for that (without adding a parameter to your function), and access its value off `tuner`. The idea is to keep your function signature the same as what the auto-grader would expect, minimizing those 1:00am exceptions that fill one with such bonhomie. These args are also accessible as a dict via `tuner.args`.
- Set `tuner.image` to the processed image before you return.
- Optionally, set `tuner.results` to something that is json serializable before you return.
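The workflow above, condensed into one runnable sketch. ThetaUI's real signatures live in its docstrings; the class below is a stand-in stub I've invented so the call pattern runs without the library - only the shape of the calls (`track`, `begin`, reading params off `tuner`) mirrors this section:

```python
import inspect

class ThetaUI:  # stand-in stub for this sketch; the real class ships with Tuner
    def __init__(self, main, downstream=None):
        self.main, self.downstream, self._tracked = main, downstream, {}

    def track(self, name, max, min=0, default=None):
        self._tracked[name] = default if default is not None else min

    def begin(self, carousel=None):
        # the real Tuner shows the GUI and re-invokes main on each trackbar
        # change; here we make a single tuning call with the defaults
        params = inspect.signature(self.main).parameters
        by_name = {k: v for k, v in self._tracked.items() if k in params}
        self.main(image=None, tuner=self, **by_name)

    def __getattr__(self, name):   # tracked params are accessible off tuner
        return self._tracked[name]

def find_circle(image, radius, tuner=None):
    # 'sigma' is tracked but deliberately NOT in this signature - read it off tuner
    found = {"radius": radius, "sigma": tuner.sigma}
    if tuner is not None:
        tuner.results = found      # keep results json serializable
    return found

tuner = ThetaUI(find_circle)
tuner.track("radius", 50, default=42)
tuner.track("sigma", 9, default=2)
tuner.begin()
```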
You cannot mix Tuner with partials and decorators (things blow up unpredictably); just the func, please. You could have two distinct functions called by Tuner: `main` (called first) and `downstream` (called after `main`).
- There's only one set of trackbars; these appear on `main`'s window.
- Args (other than tuner) are not curried into `downstream`, so set defaults.
- When `downstream` accesses `tuner.image`, it gets a fresh copy of the current image being processed. To get the image processed by `main`, access `tuner.main_image`.
- `tuner.image` and `tuner.results` set from `main` are displayed in the main window (the one with the trackbars). `tuner.image` and `tuner.results` set in `downstream` are displayed in the `downstream` window, which does not have trackbars. Usually, the downstream image obscures the main one on first show; you'll need to move it out of the way.
- Tuner will save images separately on F2, but will combine the results of both, along with args (tuned parameters), and write them to one json file when you press F3. Remember to keep your results compatible with json serialization.
- Use the helper call `tuner.carousel_from_images()` to set up a carousel. This takes 2 lists.
  - The first is the list of names of parameters in `target` that take images. `target` might work with multiple images, and this list is where you specify the name of each parameter that expects an image.
  - The second is a list of image files (full path name). Each element of this list should be a tuple of file names.
    - If `target` works with 2 images, then each element of this second list must be a tuple of two image paths.
    - If it works with three images, then each element must be a tuple of three image paths, et cetera.
- When Tuner is aware of image files, it uses the file name in ThetaUI's window title (instead of just 'frame').
- You can specify openCV imread codes to be used when reading files.
- A video file can be used as a frame generator [untested as of April 2021].
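The two lists pair up frame by frame. `build_carousel` below is a hypothetical helper that sketches the pairing rule, not Tuner's actual implementation:

```python
def build_carousel(image_params, files):
    """Pair each tuple of file paths with the image parameter names:
    one path per image parameter, per frame (mirrors the rule above)."""
    frames = []
    for paths in files:
        if len(paths) != len(image_params):
            raise ValueError("each tuple needs one path per image parameter")
        frames.append(dict(zip(image_params, paths)))
    return frames

# a target(left_img, right_img, ...) working through two stereo pairs
frames = build_carousel(
    ["left_img", "right_img"],
    [("a_L.png", "a_R.png"), ("b_L.png", "b_R.png")],
)
# frames[0] -> {'left_img': 'a_L.png', 'right_img': 'a_R.png'}
```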
- Being able to tune hyper-parameters, or other control variables, without having them be parameters to your function. This keeps your signature what your auto-grader expects. Once ascertained, you should remove these from Tuner.
- Process a carousel of images, remembering settings between images.
- Insert a thumbnail into the main image (set `tuner.thumbnail` before you set `tuner.image`). This is useful, e.g., when you are matching templates. You could do this with `@TunedFunction()` as well.
- View the results of two processes in side by side windows. A few use cases for side-by-side comparison of images:
  - Show your pre-processing output in `main`, and traffic sign identification output in `downstream`.
  - `match_template()` output in one vs. `harris_corners()` output in the other.
  - What your noble code found, vs. what the built-in CV functions found (I find this view particularly revealing, also, character building).
- Controlling aspects of `tuner.grid_search()`. Please see the docstrings for more information.
- You get to control whether the GUI returns list items vs. list indices, keys vs. dict objects, etc.
- You get to create tuners by spec'ing them in json.
- Finally, as anyone who has written a decorator knows, things can get squirrelly when exceptions take place within a partial... you could avoid that whole mess with explicit instantiation of ThetaUI.
Apart from the few differences above, ThetaUI and `TunedFunction()` will give you pretty much the same UX. If none of the above are dealbreakers for you, stick with the decorator.
Your experience of this GUI is going to be determined by the versions of various components: OpenCV, and the Qt backend. Tuner does take advantage of a couple of features of the Qt backend, but those are guarded in `try` blocks, so you shouldn't bomb.
If you're in CS-6476, you've installed `opencv-contrib-python`. If not, might I suggest...
If you don't see the status bar in Tuner GUI, you are missing `opencv-contrib-python`.
If you don't see the overlay menu, you are missing the Qt backend.
The accompanying `example.py` illustrates some uses. Refer to the docstrings for ThetaUI's interface for details. Play around, and let me know if you think of ways to improve this.
I've debugged this thing extensively, but I haven't had the time to bullet proof it. It will behave if your arguments are well behaved; but caveat emptor...
Arguments curried into your functions follow usual call semantics, so modifying those args will have the usual side effects. Accessing `tuner.image` always gives you a fresh copy of the image, but this is the exception. (tl;dr: work on a copy of the `image` parameter, not directly on it, or else side effects will accumulate...)
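In practice, that means the first line of `target` copies its image argument (a minimal sketch):

```python
def find_circle(image, radius, tuner=None):
    img = image.copy()   # mutate the copy; the original is re-used across tuning calls
    # ... detect/threshold/annotate img here, never image ...
    if tuner is not None:
        tuner.image = img
    return img
```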
Don't forget to remove the @TunedFunction() decorator; the auto-grader won't much care for it :)
It's only licensed the way it is to prevent commercial trolling. For all other purposes...
Fork it, make something beautiful.