Autonomous Art Generation
The robot will autonomously detect objects and perform style transfers on images to make interesting interpretations of the world it is taking in.
The robot will be equipped with an ultrasound sensor to aid in the initial detection of objects.
- 2 gear motors
- 1 ultrasonic distance sensor
- 1 switch
- 1 motor driver
- jumper wires
- 1 SparkFun RedBoard
Does the Ultrasound sensor detect an obstruction?
- yes: back up until there is enough distance to perform a 180-degree rotation, rotate 180 degrees, then back up while continuously taking images of the prospective object until an obstruction is detected from behind. Then randomly select a new direction to travel in and repeat.
- no: continue driving straight
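The behaviour above can be sketched as a small state machine. This is only an illustration written in Python for readability; the state names and the `next_state` function are hypothetical, and the mapping of behaviours to state numbers 1 to 5 is an assumption based on how the states are referred to elsewhere in this document.

```python
from enum import Enum


class State(Enum):
    DRIVE_FORWARD = 1        # drive straight until an obstruction is detected
    BACK_UP = 2              # back away to make room for the turn
    ROTATE_180 = 3           # turn around
    REVERSE_AND_CAPTURE = 4  # back up while taking images
    ROTATE_RANDOM = 5        # turn to a randomly selected new direction


def next_state(state, obstruction, enough_clearance, rotation_done):
    """Return the state that follows `state` given the current sensor flags."""
    if state is State.DRIVE_FORWARD:
        return State.BACK_UP if obstruction else State.DRIVE_FORWARD
    if state is State.BACK_UP:
        return State.ROTATE_180 if enough_clearance else State.BACK_UP
    if state is State.ROTATE_180:
        return State.REVERSE_AND_CAPTURE if rotation_done else State.ROTATE_180
    if state is State.REVERSE_AND_CAPTURE:
        # An obstruction detected from behind ends the image capture run.
        return State.ROTATE_RANDOM if obstruction else State.REVERSE_AND_CAPTURE
    if state is State.ROTATE_RANDOM:
        return State.DRIVE_FORWARD if rotation_done else State.ROTATE_RANDOM
    raise ValueError(f"unknown state: {state!r}")
```

Each call takes the current state and the latest sensor flags, so the same function can be polled once per control loop iteration.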
Distance travelled
We must be able to measure how far the robot has travelled backwards and forwards:
- Ultrasound sensor: when travelling in areas where an obstruction is detectable, we can use the ultrasound sensor to measure the distance travelled (units: inches)
- used in state 2 to measure distance backed up from obstruction
- Measure time travelled: using rough estimates, we can determine the distance travelled from how long the motors were rotating and in which direction (units: ms)
- system travels at an approximate rate of 0.05 inches/ms
- used in state 1, and state 4 to measure the distance travelled based on time of motor activation
distance = (TravelTimeRef * VELOCITY_I_PER_MS) + UltraSoundRef
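As a minimal sketch of the formula above, assuming the travel time is given in milliseconds and the ultrasound reference in inches (the function name and argument names are illustrative, not taken from the project code):

```python
VELOCITY_I_PER_MS = 0.05  # approximate speed quoted above, in inches per millisecond


def distance_travelled(travel_time_ms, ultrasound_ref_in=0.0):
    """Estimate distance in inches from motor-on time plus an ultrasound reference."""
    return travel_time_ms * VELOCITY_I_PER_MS + ultrasound_ref_in
```

For example, `distance_travelled(200, 3.0)` combines 200 ms of motor time with a 3 inch ultrasound reference.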
With every sequence of travel, the relative position of the system will also depend on the trajectory of the system prior to engaging in forward or backwards motion.
- Measure rotation time: a very crude estimate of trajectory can be obtained by measuring the amount of time the system spends rotating (units: degrees)
- system rotates at approximately 10 degrees/ms
- used in states 3 and 5 to determine the amount of rotation performed by the system
With the distance travelled and trajectory, we can determine the system's position relative to its starting point. This is done using some basic trig, adding up the trajectories component-wise.
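The component-wise sum can be sketched as follows. This is an illustrative dead-reckoning routine, not the project's code; it assumes each movement is recorded as a rotation in degrees followed by a signed distance in inches (negative for backwards travel), and that the robot starts at the origin facing along the x axis.

```python
import math


def integrate_path(segments):
    """Sum (rotation_deg, distance_in) segments into a position and heading.

    Returns (x, y, heading_deg) relative to the starting point.
    """
    x = y = 0.0
    heading = 0.0
    for rotation_deg, distance_in in segments:
        # Apply the rotation first, then travel along the new heading.
        heading = (heading + rotation_deg) % 360
        x += distance_in * math.cos(math.radians(heading))
        y += distance_in * math.sin(math.radians(heading))
    return x, y, heading
```

Because both the distance and rotation estimates are time-based, errors accumulate with every segment, so the result is only a rough position.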
The image processing will occur simultaneously with image collection. The system operates in the following way:
- For each image collected, perform object detection
- if object detected, perform style transfer and store artistic interpretation where original was stored
- if no object detected, remove image from model
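The steps above can be sketched as a single loop. The `detect` and `stylize` callables are hypothetical placeholders for the object detection and style transfer models, and the image store is assumed here to be a plain dictionary keyed by image name.

```python
def process_images(images, detect, stylize):
    """Replace images containing an object with a stylized version; drop the rest.

    `images` maps a name to image data and is modified in place.
    """
    for name in list(images):  # copy the keys so entries can be deleted safely
        if detect(images[name]):
            # Store the artistic interpretation where the original was stored.
            images[name] = stylize(images[name])
        else:
            del images[name]
    return images
```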
Object Detection
The current implementation of the Wandering Artist uses the CenterNet model, trained on the MS-COCO dataset. The model is obtained through the TensorFlow Object Detection API and is used as-is, with no further training on new data. The MS-COCO dataset contains 80 object categories, so the Wandering Artist has a fairly wide range of objects to potentially detect (more info at
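Models from the TensorFlow Object Detection API return a dictionary that includes `detection_boxes`, `detection_classes` and `detection_scores`. A minimal sketch of deciding whether an image contains an object is shown below; it assumes the batch dimension has already been removed and the values converted to plain sequences, and the 0.5 score threshold is an assumption rather than a value from this project.

```python
def confident_detections(detections, min_score=0.5):
    """Return (class_id, score, box) tuples whose score is at least `min_score`."""
    return [
        (int(class_id), float(score), list(box))
        for class_id, score, box in zip(
            detections["detection_classes"],
            detections["detection_scores"],
            detections["detection_boxes"],
        )
        if score >= min_score
    ]


def has_object(detections, min_score=0.5):
    """True if at least one detection clears the score threshold."""
    return bool(confident_detections(detections, min_score))
```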
Style Transfer
The style transfer is performed using Van Gogh's The Starry Night for all examples shown. The style transfer uses the first 5 blocks of the VGG-19 model, trained on the ImageNet dataset ( These blocks contain information about high-level relationships, and so prove useful in the style transfer operation.
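This document does not state which style transfer formulation is used. A common approach compares Gram matrices of the feature maps taken from each VGG block, and the sketch below assumes that approach. It uses plain Python lists in place of tensors: `features` is a list of spatial positions, each holding one activation per channel.

```python
def gram_matrix(features):
    """Channel-by-channel correlation of activations, averaged over positions."""
    n_positions = len(features)
    n_channels = len(features[0])
    return [
        [
            sum(pos[i] * pos[j] for pos in features) / n_positions
            for j in range(n_channels)
        ]
        for i in range(n_channels)
    ]


def style_loss(generated_features, style_features):
    """Mean squared difference between the two Gram matrices."""
    g = gram_matrix(generated_features)
    s = gram_matrix(style_features)
    n = len(g)
    return sum((g[i][j] - s[i][j]) ** 2 for i in range(n) for j in range(n)) / (n * n)
```

In a full implementation this loss would be computed for each of the chosen blocks and minimised with respect to the generated image.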
Problems with Computation
The deep learning image manipulation techniques described above are relatively computationally heavy and therefore create bottlenecks in the system's ability to perform them in real time. Because of this, these tasks are performed separately from the image collection process.
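One way to keep slow processing from blocking image collection is to hand frames to a worker through a queue. The sketch below is an assumption about how such a separation could look, not a description of this project's code; `capture_frames` and `process` are hypothetical stand-ins for the camera loop and the detection plus style transfer step.

```python
import queue
import threading


def run_pipeline(capture_frames, process):
    """Collect frames on the calling thread while a worker processes them."""
    pending = queue.Queue()
    results = []

    def worker():
        while True:
            frame = pending.get()
            if frame is None:  # sentinel: collection has finished
                break
            results.append(process(frame))

    thread = threading.Thread(target=worker)
    thread.start()
    for frame in capture_frames:
        pending.put(frame)  # returns immediately, so collection is never stalled
    pending.put(None)
    thread.join()
    return results
```

With a single worker the results come back in capture order.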
- More sensors will allow for more accurate internal models of the environment
- The models should be used to influence the system's actions, for example by modifying the probabilities associated with the random perturbations so that it does not head in directions that have already been explored.
- Later implementations should include more deep learning models to create different "lifelike" features. A natural language processing model would be useful in creating realistic captions for the images, or could even tell a story with the images.
- Reinforcement learning could be used in the collection of images; the goal of the system would be to maximize object detection and centre the object in the frame.
- Faster deep learning models could enable real-time object detection and style transfer.
- Simple image analysis could be used to extract properties of images, such as dominant colours, to further improve NLP possibilities.
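As an illustration of the idea of biasing the random direction changes away from explored areas, the sketch below weights each candidate heading by how rarely it has been chosen before. The function name, the heading buckets and the `1 / (1 + visits)` weighting are all assumptions made for this example.

```python
import random


def choose_heading(visit_counts, rng=random):
    """Pick a heading at random, favouring headings with fewer previous visits.

    `visit_counts` maps a heading in degrees to the number of times it was taken.
    """
    headings = list(visit_counts)
    weights = [1.0 / (1 + visit_counts[h]) for h in headings]
    return rng.choices(headings, weights=weights, k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible for testing.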