This project brings the MQTT protocol via WiFi to generic host devices running MicroPython but lacking a WiFi interface. A cheap ESP8266 board running firmware from this repository supplies WiFi connectivity.
It is designed to be resilient, coping with WiFi or broker outages and ESP8266 failures as transparently as possible.
Connection between the host and the ESP8266 is via five GPIO lines. The means of communication, and justification for it, is documented here. It is designed to be hardware independent, requiring three output lines and two inputs. It uses no hardware-specific features like timers, interrupts, special code emitters or machine code. Nor does it make assumptions about processor speed. It should be compatible with any hardware running MicroPython and having five free GPIO lines.
The driver is event driven using uasyncio for asynchronous programming. Applications can run unaffected by delays experienced on the ESP8266.
This document assumes familiarity with the umqtt and uasyncio libraries.
Unofficial guides may be found via these links:
umqtt FAQ.
uasyncio tutorial.
The ESP8266 operates in station mode. The host interface supports the MQTT functionality provided in the official umqtt library. It aims to keep the link to the broker open continuously, enabling applications which seldom or never publish to receive messages. The host implements a watchdog to reboot the ESP8266 in the event of fatal errors or crashes.
This project is no longer maintained and has been archived prior to development of a replacement. The Pyboard client requires uasyncio V2 with firmware V1.12.
This is obsolete. A new version will be released to use uasyncio V3, with a view to improving the API and easing portability.
V0.22 Jan 2018/April 2020
Now uses the resilient MQTT library. The ESP8266 is now rebooted only in the event of ESP8266 failure such as a fatal input buffer overflow. The resilient library has some significant bugfixes.
Allows custom args to `subscribe` and `wifi_handler` callbacks.
API Changes
The Pyboard code now uses the new task cancellation functionality in uasyncio.
User programs will need to be adapted to use the `@asyn.cancellable` decorator: see `pb_simple.py`.
Test status
Testing was performed using a Pyboard V1.0 as the host. The following boards have run as ESP8266 targets: Adafruit Feather Huzzah, Adafruit Huzzah and WeMos D1 Mini.
Testing was performed using a local broker and a public one.
I have had no success with SSL/TLS. This may be down to inexperience on my part so if anyone can test this I would welcome a report. Please raise an issue - including to report a positive outcome :).
1. Wiring Connections between host and ESP8266.
2. The Host Software on the host.
 2.1 Files
 2.1.1 Dependencies
 2.1.2 Test programs
 2.2 Quick start guide
 2.3 The MQTTlink class The host API.
 2.3.1 Constructor
 2.3.2 Methods
 2.3.3 Class Method
 2.3.4 The user_start callback
 2.3.5 Intercepting status values
 2.4 Application design
 2.4.1 User coroutines
 2.4.2 WiFi Link Behaviour
3. The ESP8266 Installing and modifying the ESP8266 build.
 3.1 Installing the precompiled build Quickstart.
 3.2 Files For users wishing to modify the ESP8266 code.
 3.3 Pinout
4. Mode of operation How it works under the hood.
 4.2 Protocol
 4.2.1 Initialisation
 4.2.2 Running
5. Limitations
 5.1 Speed
 5.2 Reliability
6. References
Connections to the ESP8266 are as follows.
In the table below Feather refers to the Adafruit Feather Huzzah reference board or to the Huzzah with serial rather than USB connectivity. Mini refers to the WeMos D1 Mini. Pyboard refers to any Pyboard version. Pins are for the Pyboard test programs, but host pins may be changed at will in `net_local.py`.
ESP8266 signal | Feather | Mini | Pyboard | Host signal |
---|---|---|---|---|
mckin | 12 | D6 | Y6 | sckout |
mrx | 13 | D7 | Y5 | stx |
mtx | 14 | D5 | Y7 | srx |
mckout | 15 | D8 | Y8 | sckin |
reset | reset | rst | Y4 | reset |
Host and target must share a common ground. They need not share a common power source - the order in which they are powered up is not critical.
Note on the reset connection. The default `net_local.py` instantiates the pin with `Pin.OPEN_DRAIN` because some boards have a capacitor to ground. On a low to high transition a push-pull pin could cause spikes on the power supply. The truly paranoid might replace the reset wire with a 100Ω resistor to limit current when the pin goes low.
The MQTT API is via the `MQTTlink` class described below.
The first two files originate from the micropython-async library. For convenience all files are provided here.
- `asyn.py` Synchronisation primitives.
- `syncom.py` Bitbanged communication library.
- `pbmqtt.py` Python MQTT interface.
- `status_values.py` Numeric constants shared between user code, the ESP8266 firmware and `pbmqtt.py`, including status values sent to host from ESP8266.
- `net_local.py` This enables custom settings to be shared between projects. Edit this for WiFi credentials; also for MQTT parameters and host pin numbers if these differ from the defaults.
- `pb_simple.py` Minimal publish/subscribe test. A remote client can turn the Pyboard green LED on and off and can display regular publications from the host.
- `pbmqtt_test.py` Demonstrates the ramcheck facility.
- `pbrange.py` Tests WiFi range and demos operation near the limit of range using the Pyboard LEDs for feedback.
- `pb_status.py` Demonstrates the interception of status messages.
Bash scripts to periodically publish to the above test programs. Adapt to your broker address.
- `pubtest` For `pb_simple.py` and `pbmqtt_test.py`.
- `pubtest_range` For `pbrange.py`.
Ensure you have a working MQTT broker on a known IP address, and that you have PC client software. This document assumes `mosquitto_pub` and `mosquitto_sub` as clients. For test purposes it's best to start with the broker on the local network - `mosquitto` is recommended as broker. The public brokers referenced here may also be used. Clients may be run on any connected PC.
Modify `net_local.py` to match your MQTT broker address, WiFi SSID and password.
Copy the above dependencies to the Pyboard. Install the supplied firmware to the ESP8266 (see section 3.1).
Copy `pb_simple.py` to the Pyboard and run it. Assuming the broker is on 192.168.0.9, on a PC run:
mosquitto_sub -h 192.168.0.9 -t result
The test program publishes an incrementing count every 10 seconds under the "result" topic. It subscribes to a topic "green" and responds to messages "on" or "off" to control the state of the Pyboard green LED. To test this run
mosquitto_pub -h 192.168.0.9 -t green -m on
mosquitto_pub -h 192.168.0.9 -t green -m off
This provides the host API. MQTT topics and messages are strings restricted to 7-bit ASCII characters with `ord()` values in range 1..126 inclusive.
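The character restriction can be checked before calling the API. The following is a minimal sketch; the function name `is_valid_mqtt_text` is hypothetical and is not part of `pbmqtt.py`.

```python
def is_valid_mqtt_text(text):
    """Return True if every character has an ord() value in 1..126 inclusive."""
    return all(1 <= ord(ch) <= 126 for ch in text)


print(is_valid_mqtt_text("green"))      # plain ASCII topic
print(is_valid_mqtt_text("caf\u00e9"))  # contains a character outside the range
```

Note that an empty string passes this check; whether empty topics or messages are acceptable is a separate question which this sketch does not address.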
This takes a single mandatory argument which is a dictionary of args. Default values are defined in `pbmqtt.py`. User overrides may be provided in `net_local.py` or in the application. Dictionary entries are as follows (defaults in parens):
Hardware related:
- `reset` A `Signal` instance associated with the reset output pin. (Y4)
- `stx` Initialised output pin. (Y5)
- `sckout` Initialised output pin with value 0. (Y6)
- `srx` Initialised input pin. (Y7)
- `sckin` Initialised input pin. (Y8)
- `timeout` Duration of ESP8266 watchdog (secs). If the ESP8266 crashes, after this period the ESP8266 will be hard-reset. (10 secs)
- `fast` Run ESP8266 at 160MHz (recommended). (`True`)
Callback:
- `user_start` A callback to run when communication link is up. Mandatory.
- `args` Optional args for above. (`()`)
The `user_start` callback runs when the link between the boards has initialised. This is where subscriptions are registered and publishing coros are launched. Its use is covered in detail below.
WiFi parameters:
- `ssid` Mandatory. No default.
- `password` Mandatory. No default.
- `use_default_net` Use default network if possible. (`True`) If `True`, tries to connect to the network stored on the ESP8266. If this fails, it will connect to the specified network. If `False`, ignores the saved LAN. The specified LAN becomes the new default.
MQTT parameters:
- `broker` IP address of broker. Mandatory. No default.
- `mqtt_user` Username. (`''`)
- `mqtt_pw` Password. (`''`)
- `ssl` Use SSL. (`False`)
- `ssl_params` Repr of dict. (`repr({})`)
- `port` If 0 uses the default MQTT port. (0)
- `keepalive` Broker keepalive time (secs). (60)
- `ping_interval` Time between broker pings (secs). (0) (0 == use default)
- `max_repubs` Max number of qos==1 republications before reconnection is initiated. (4)
- `clean_session` Behaviour after an outage. (`True`)
The Clean Session flag controls behaviour of qos == 1 messages from the broker after a WiFi outage which exceeds the broker's keepalive time (MQTT spec section 3.1.2.4).
If set, such messages from the broker during the outage will be lost. If cleared the broker will send them once connectivity is restored. This presents a hazard in that the ESP8266 WiFi stack has a buffer which can overflow if messages arrive in quick succession. This could result in an ESP8266 crash with a consequent automatic reboot, in which case some of the backlog will be lost.
The client pings the broker up to four times in the `keepalive` period. In the case of applications which publish rarely or never, pinging more frequently speeds the detection of outages. The `ping_interval` parameter enables this to be accomplished. The default value of 0 results in standard behaviour.
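The relationship between `keepalive` and `ping_interval` can be illustrated as follows. This is a sketch only: it assumes that the standard behaviour amounts to one ping every quarter of the keepalive period, which is an interpretation of "up to four times" rather than something confirmed by the driver code. The function name is hypothetical.

```python
def effective_ping_interval(keepalive, ping_interval=0):
    """Return the assumed time in seconds between broker pings.

    A ping_interval of 0 selects the default, taken here to be keepalive / 4.
    """
    if ping_interval > 0:
        return ping_interval
    return keepalive / 4


print(effective_ping_interval(60))     # default behaviour with keepalive == 60
print(effective_ping_interval(60, 5))  # faster outage detection
```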
Optional RTC synchronisation:
- `rtc_resync` (secs). (-1) 0 == disable. -1 == Synchronise once only at startup. If interval > 0 the ESP8266 will periodically retrieve the time from an NTP server and send the result to the host, which will adjust its RTC. The local time offset specified below will be applied.
- `local_time_offset` If the host's RTC is to be synchronised to an NTP server, this allows an offset to be added. Unit is hours. (0)
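The meaning of the two RTC settings can be summarised in code. This is an illustrative sketch with hypothetical function names; it assumes the time value is expressed in seconds, and it does not show how the driver actually sets the RTC.

```python
def resync_mode(rtc_resync):
    """Describe the behaviour selected by an rtc_resync value."""
    if rtc_resync == 0:
        return "disabled"
    if rtc_resync == -1:
        return "once at startup"
    if rtc_resync > 0:
        return "every {} s".format(rtc_resync)
    raise ValueError("rtc_resync must be -1, 0 or a positive interval")


def apply_local_offset(utc_secs, local_time_offset=0):
    """Add an offset given in hours to a time given in seconds."""
    return utc_secs + int(local_time_offset * 3600)


print(resync_mode(-1))
print(apply_local_offset(1000, 1))
```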
Broker/network response:
- `response_time` Max expected time in secs for the broker to respond to a qos == 1 publication. If this is exceeded the message is republished with the dup flag set.
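The interaction between `response_time` and `max_repubs` can be sketched as a retry loop. This is not the code running on the ESP8266: `send` and `wait_for_ack` are hypothetical callables standing in for the transmission of a message and for waiting up to `response_time` for the broker's acknowledgement.

```python
def publish_qos1(send, wait_for_ack, max_repubs=4):
    """Publish once, then republish with dup set until acknowledged.

    send(dup) transmits the message; wait_for_ack() returns True if the
    acknowledgement arrived within response_time. Returns False when
    max_repubs republications have failed, at which point the caller
    would initiate reconnection.
    """
    send(False)  # first transmission, dup flag clear
    if wait_for_ack():
        return True
    for _ in range(max_repubs):
        send(True)  # republication, dup flag set
        if wait_for_ack():
            return True
    return False


sent = []
acks = iter([False, False, True])  # simulated broker: third attempt succeeds
ok = publish_qos1(sent.append, lambda: next(acks))
print(ok, sent)
```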
Verbosity:
- `verbose` Pyboard prints diagnostic messages. (`False`)
- `debug` ESP8266 prints diagnostic messages. (`False`)
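The way defaults and user overrides combine can be sketched with plain dictionaries. The names `DEFAULTS`, `MANDATORY` and `build_config` are hypothetical and the code is not taken from `pbmqtt.py`. The default values are those listed above; the pin entries are omitted because they require hardware objects, and `response_time` is omitted because no default is stated.

```python
DEFAULTS = {
    "timeout": 10, "fast": True, "args": (),
    "use_default_net": True,
    "mqtt_user": "", "mqtt_pw": "", "ssl": False, "ssl_params": repr({}),
    "port": 0, "keepalive": 60, "ping_interval": 0, "max_repubs": 4,
    "clean_session": True,
    "rtc_resync": -1, "local_time_offset": 0,
    "verbose": False, "debug": False,
}

MANDATORY = ("user_start", "ssid", "password", "broker")


def build_config(overrides):
    """Return the defaults updated with overrides, checking mandatory keys."""
    config = dict(DEFAULTS)  # copy so the defaults are never modified
    config.update(overrides)
    missing = [key for key in MANDATORY if key not in config]
    if missing:
        raise ValueError("missing mandatory entries: " + ", ".join(missing))
    return config


config = build_config({
    "ssid": "my_ssid", "password": "my_password",
    "broker": "192.168.0.9",
    "user_start": lambda mqtt_link: None,  # placeholder callback
    "keepalive": 120,
})
print(config["keepalive"], config["port"])
```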
- `publish` Args: topic (str), message (str), retain (bool), qos (0/1). Puts publication on a queue and returns immediately. Defaults: retain `False`, qos 0. `publish` can be called at any time, even if an ESP8266 reboot is in progress.
- `subscribe` Mandatory args: topic (str), qos (0/1), callback. Further positional args may be supplied. Subscribes to the topic. The callback will run when a publication to the topic is received. The callback args are the topic, message plus any optional args supplied. Subscriptions should be performed in the `user_start` callback to re-subscribe after an ESP8266 reboot. Multiple subscriptions may have separate callbacks.
- `wifi` No args. Returns `True` if WiFi and broker are up. See note below.
- `pubq_len` No args. Returns the length of the publication queue.
- `rtc_syn` No args. Returns `True` if the RTC has been synchronised to an NTP time server.
- `wifi_handler` Mandatory arg: a callback. Further positional args may be supplied. The callback will run each time the WiFi status changes. Callback arg is a `bool` followed by any user supplied args. The `bool` indicates if WiFi is up and the broker is accessible. It is first called with `True` after the `user_start` callback completes. See note below.
- `status_handler` Arg: a coroutine. Overrides the default status handler. The coro takes two args, the `MQTTlink` instance and the status value.
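The callback convention described for `subscribe` (topic, message, then any user supplied args) can be illustrated with a small stand-in class. `SubscriptionTable` is hypothetical and is not the driver's implementation; it matches topics exactly and ignores qos.

```python
class SubscriptionTable:
    """Map topics to callbacks and dispatch incoming publications."""

    def __init__(self):
        self._subs = {}

    def subscribe(self, topic, qos, callback, *args):
        # Store any extra positional args so they can be passed back later.
        self._subs[topic] = (qos, callback, args)

    def dispatch(self, topic, message):
        """Run the callback for topic. Returns False if not subscribed."""
        entry = self._subs.get(topic)
        if entry is None:
            return False
        _, callback, args = entry
        callback(topic, message, *args)
        return True


led_state = {}

def led_callback(topic, message, name):
    # name is the user supplied arg given at subscription time
    led_state[name] = (message == "on")

table = SubscriptionTable()
table.subscribe("green", 1, led_callback, "green_led")
table.dispatch("green", "on")
print(led_state)
```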
Detection of outages can be slow depending on application code. The client pings the broker, but infrequently. Detection will occur if a publication fails provoking automatic reconnection attempts. The `ping_interval` config value may be used to speed detection.
Methods intended for debug/test:
- `running` No args. Returns `True` if WiFi and broker are up and system is running normally.
- `command` Takes an arbitrary number of positional args, formats them and sends them to the ESP8266. Currently the only supported command is `MEM` with no args. This causes the ESP8266 to return its memory usage, which the host driver will print. This was to check for memory leaks. None have been observed. See `pbmqtt_test.py`.
`will` Args: topic (str), msg (str), retain, qos. Set the last will. Must be called before instantiating the `MQTTlink`. Defaults: retain `False`, qos 0.
This callback runs when broker connectivity is first established. In the event of an ESP8266 crash, the Pyboard will reset it; the callback will subsequently run again.
Its purpose is to register subscriptions and to launch coros which use the API. MQTT message processing begins on the callback's return so it should run to completion quickly.
Coroutines launched by it which communicate with the ESP8266 should have provision to be cancelled if connectivity with the ESP8266 is lost. This can occur if the ESP8266 crashes. The technique for doing this relies on the cancellation API in `asyn.py` and is shown here (taken from `pb_simple.py`).
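import asyn  # synchronisation primitives supplied with this project

qos = 1  # QoS used for the test publications (0 or 1)

@asyn.cancellable
async def publish(_, mqtt_link, tim):
    count = 1
    while True:
        # Args: topic, message, retain, qos
        mqtt_link.publish('result', str(count), False, qos)
        count += 1
        await asyn.sleep(tim)  # Use asyn.sleep for fast cancellation response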
Note the use of `asyn.sleep()` for delays of more than around 1s. This speeds the response to task cancellation, which would otherwise be pending until an `asyncio.sleep()` had elapsed.
See `pb_simple.py` and the synchronisation primitives docs.
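The reason a sliced sleep responds faster can be sketched as below. This is not the source of `asyn.sleep()`, whose internals are not shown in this document; it is one plausible approach, written for standard `asyncio` rather than uasyncio. Under a scheduler where cancellation only takes effect when the task is next scheduled, dividing a long delay into short slices bounds the wait to one slice. The names `sleep_slices` and `sliced_sleep` are hypothetical.

```python
import asyncio


def sleep_slices(t, granularity=0.1):
    """Split t seconds into a list of slice durations in milliseconds."""
    if granularity <= 0:
        raise ValueError("granularity must be positive")
    t_ms = int(round(t * 1000))
    g_ms = max(1, int(round(granularity * 1000)))
    n, rem = divmod(t_ms, g_ms)
    return [g_ms] * n + ([rem] if rem else [])


async def sliced_sleep(t, granularity=0.1):
    """Sleep for t seconds, yielding to the scheduler after every slice."""
    for ms in sleep_slices(t, granularity):
        await asyncio.sleep(ms / 1000)


print(sleep_slices(1.25))  # twelve 100ms slices followed by one of 50ms
asyncio.run(sliced_sleep(0.02, 0.01))
```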
A typical reason for interception is to handle fatal errors on initial startup, for example where the WiFi network or broker is unavailable. Options might be to prompt for user intervention or pausing for a period before rebooting.
The ESP8266 can send numeric status values to the host. These are defined and documented in `status_values.py`. The default handler specifies how a network connection is established after a reset. Initially, if the ESP8266 fails to connect to the default LAN stored in its flash memory, it attempts to connect to the network specified in `INIT`. On ESP8266 reboots (caused by a crash) it saves flash wear by avoiding the specified LAN; it waits 30 seconds and reboots again.
The behaviour in response to status messages may be modified by replacing the default handler with a user supplied coroutine as described in 2.3.2 above; the test program `pb_status.py` illustrates this.
The driver waits for the handler to terminate, then responds in a way dependent on the status value. If it was a fatal error the ESP8266 will be rebooted. For informational values execution will continue.
The return value from the coroutine is ignored except in response to a `SPECNET` message. If it returns 1 the driver will attempt to connect to the specified network. If it returns 0 it will reboot the ESP8266.
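A user supplied status handler following this convention might look like the sketch below. The numeric value given to `SPECNET` is a placeholder (the real constants live in `status_values.py`), `make_status_handler` is a hypothetical helper, and the policy shown (permit the specified network once, thereafter pause and request a reboot) only approximates the default behaviour described above.

```python
import asyncio

SPECNET = 1  # placeholder value; use the constant from status_values.py


def make_status_handler(reboot_delay=30):
    """Return a coroutine function taking (mqtt_link, status)."""
    allow_specified = True

    async def handler(mqtt_link, status):
        nonlocal allow_specified
        if status == SPECNET:
            if allow_specified:
                allow_specified = False
                return 1  # ask the driver to try the specified network
            await asyncio.sleep(reboot_delay)  # pause before rebooting
            return 0  # ask the driver to reboot the ESP8266
        return None  # return value is ignored for other status values

    return handler


handler = make_status_handler(reboot_delay=0)  # zero delay for demonstration
first = asyncio.run(handler(None, SPECNET))
second = asyncio.run(handler(None, SPECNET))
print(first, second)
```

The handler would be registered with the `status_handler` method of the `MQTTlink` instance.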
Where possible user coroutines should periodically yield to the scheduler with a nonzero delay. An `asyncio.sleep(secs)` or `asyncio.sleep_ms(ms)` will reduce competition with the bitbanging communications, minimising any impact on throughput. Issue a zero delay (or `yield`) only when a fast response is required.
The implicit characteristics of radio links mean that WiFi is subject to outages of arbitrary duration: RF interference may occur, or the unit may move out of range of the access point.
This driver aims to handle outages as transparently as possible. If an outage occurs the ESP8266 signals this to the driver, signalling again when connectivity is restored. These events may be trapped by intercepting the status messages (see `pbrange.py`) or - simpler - by using the `wifi_handler` (`pb_simple.py`).
During an outage publications will be queued. An ongoing qos==1 publication will be delayed until connectivity is restored. Messages from the broker with qos==1 will be queued by the broker and will be received when connectivity recovers. This will end when the broker's keepalive time expires, when any last will is published. Whether the qos==1 messages are retransmitted then depends on the state of the `Clean Session` flag in `net_local.py`.
Note that the ESP8266 vendor network stack has a buffer which can overrun if messages are sent in rapid succession. If you encounter lost messages and see `LmacRxBlk:1` on the ESP8266 UART this is the cause.
To use the precompiled build, follow the instructions in 3.1 below. The remainder of the ESP8266 documentation is for those wishing to modify the ESP8266 code.
The firmware toggles pin 0 to indicate that the code is running. Pin 2 is driven low when broker connectivity is present. On the reference board this results in the blue LED indicating connectivity status and the red LED flashing while running.
Since the Pyboard and the ESP8266 communicate via GPIO pins the UART/USB interface is available for checking status messages and debugging.
You will need the esptool utility which runs on a PC. It may be found here. Under Linux after installation you will need to assign executable status. On my system:
sudo chmod a+x /usr/local/bin/esptool.py
Erase the flash with
esptool.py --port /dev/ttyUSB0 --baud 115200 erase_flash
Then, from the project directory, issue
esptool.py --port /dev/ttyUSB0 --baud 115200 write_flash --verify --flash_size=detect -fm qio 0 firmware-combined.bin
These args for the reference board may need amending for other hardware.
In the precompiled build all modules are implemented as frozen bytecode. The precompiled build's modules directory comprises the following:
- The uasyncio library (including collections directory, `errno.py`, `logging.py`).
- `mqtt.py` Main module.
- `mqtt_as.py` Asynchronous MQTT module.
- `syncom.py` Bitbanged communications driver.
- `status_values.py` Numeric status codes.
- `_boot.py` Modified to create main.py in filesystem (see below).
If flash space is limited unused drivers may be removed from the project's `modules` directory. The following standard files are required:
- `flashbdev.py`
- `inisetup.py`
The `mqtt` module needs to auto-start after a hard reset. This requires a `main.py` file. If the standard `_boot.py` is used you will need to create the file as below and copy it to the filesystem:
import mqtt
The modified `_boot.py` in this repository removes the need for this step, enabling the firmware image to be flashed to an erased flash chip. After boot, if `main.py` does not exist, it is created in the filesystem.
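The "create if absent" step can be sketched as follows. This is not the code in the repository's `_boot.py`; it is a minimal illustration with a hypothetical function name, demonstrated here in a temporary directory rather than on the ESP8266 filesystem.

```python
import os
import tempfile


def ensure_main(path="main.py", content="import mqtt\n"):
    """Create path with the given content unless it already exists.

    Returns True if the file was created, False if it was already present.
    """
    try:
        os.stat(path)
        return False
    except OSError:
        with open(path, "w") as f:
            f.write(content)
        return True


with tempfile.TemporaryDirectory() as tmpdir:
    target = os.path.join(tmpdir, "main.py")
    created_first = ensure_main(target)   # file is absent, so it is created
    created_again = ensure_main(target)   # file exists, so it is left alone
    with open(target) as f:
        written = f.read()

print(created_first, created_again)
```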
This is defined in `net_local.py` and passed to the `Channel` constructor in `mqtt.py`. Pin 15 is used for mckout because this has an on-board pull down resistor. This ensures that the ESP8266 clock line is zero while the host asserts Reset: at that time GPIO lines are high impedance. If the pin lacks a pull down one should be supplied. A value of 10KΩ or thereabouts will suffice.
This describes the basic mode of operation for anyone wishing to modify the host or target code. The host sends commands to the ESP8266 target, which returns responses. The target is responsible for keeping the link to the broker open and reconnecting after outages. It handles qos==1 messages, checking for the correct `PUBACK` and sending duplicate messages if necessary. If a subscribed message is received it informs the host which runs the callback.
In the event of an outage the publication response message from the target will be delayed until the outage has ended and reconnection has occurred.
The host and target communicate by a symmetrical bidirectional serial protocol. At the hardware level it is full-duplex, synchronous and independent of processor speed. At the software level it is asynchronous. In this application the unit of communication is a string. When a `SynCom` is instantiated it does nothing until its asynchronous `start` method is launched. This takes a coroutine as an argument. It waits for the other end of the link to start, synchronises the interface and launches the coro.
In the case of the host this runs forever except on error when it terminates. The host has a means of issuing a hardware reset to the target, triggered by the coro terminating. The `SynCom` instance resets the target, waits for synch, and re-launches the coro (`SynCom` start method).
The ESP8266 has no means of resetting the host, so there is no reason for its coro (`main_task`) to end.
The interface also provides a means for the host to detect if the ESP8266 has crashed or locked up. To process incoming messages it issues
chan_state = channel.any()
A result of `None` means that the channel has timed out which is a result of ESP8266 failure. In this instance the coro quits causing the ESP8266 to be reset.
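The shape of such a loop can be sketched with a stand-in channel. Only `any()` and its `None` result come from the text above; `FakeChannel`, its `get()` method, the scripted input and `host_loop` are hypothetical devices used to make the example runnable on a PC.

```python
import asyncio


class FakeChannel:
    """Scripted stand-in: items are a message, 0 (nothing yet) or None (timeout)."""

    def __init__(self, script):
        self._script = list(script)

    def any(self):
        if not self._script or self._script[0] is None:
            return None  # channel timed out
        if self._script[0] == 0:
            self._script.pop(0)
            return 0  # nothing waiting on this poll
        return 1  # one message is waiting

    def get(self):
        return self._script.pop(0)


async def host_loop(channel, handle, poll_interval=0.001):
    """Process messages until the channel times out, then return."""
    while True:
        chan_state = channel.any()
        if chan_state is None:
            return  # quitting is what leads to the ESP8266 being reset
        if chan_state:
            handle(channel.get())
        else:
            await asyncio.sleep(poll_interval)  # nonzero delay between polls


received = []
asyncio.run(host_loop(FakeChannel(["a", 0, "b", None, "c"]), received.append))
print(received)
```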
The host instantiates an `MQTTlink` object, which creates `channel`, a `SynCom` instance. The channel's `start` method is launched with the `MQTTlink` instance's own `start` method as the coro argument. This will run every time the ESP8266 starts. If it returns it will cause an ESP8266 reset once user coros have aborted.
The host can send commands to the ESP8266 which replies with a status response. The ESP8266 can also send unsolicited status messages. When a command is sent the host waits for a response as described above, handling a `None` response. The string is parsed into a command - typically `STATUS` - and an action, a list of string arguments. In the case of `STATUS` messages the first of these args is the status value.
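The parsing step can be sketched as below. The wire format is not specified in this document, so the separator character `SEP` is an assumption made for illustration, as are the function names and the lower-case command string.

```python
SEP = "\x7f"  # assumed field separator; not confirmed by this document


def parse_message(line):
    """Split a received string into a command and a list of string args."""
    parts = line.split(SEP)
    return parts[0], parts[1:]


def status_value(action):
    """Return the status value, the first arg of a status message, as an int."""
    return int(action[0])


command, action = parse_message(SEP.join(["status", "5"]))
print(command, status_value(action))
```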
Status messages are first passed to the `do_status` method which performs some basic housekeeping and provides optional 'verbose' print messages. It returns the (possibly amended) status value as an integer. It then waits on the asynchronous method `s_han` which by default is `default_status_handler`. This can be overridden by the user.
Each time the `start` method runs it behaves as follows. If the user has set up a will, it sends a `will` command to the ESP8266 and waits for a status response.
Assuming success it then sends an `init` command with the `INIT` parameters which causes the ESP8266 to connect to the WiFi network and then to the broker. The initialisation phase ends when the ESP8266 sends a `RUNNING` status to the host, when `_running` is set (by `do_status`). In the meantime the ESP8266 will send other status messages:
- `DEFNET` It is about to try the default network in its flash ROM.
- `SPECNET` It has failed to connect to this LAN and wants to connect to the one specified in `INIT`. Unless the status handler has been overridden `default_status_handler` ensures this is done on the first boot only.
- `BROKER_CHECK` It is about to connect to the broker.
- `BROKER_OK` Broker connection established.
Once running it launches the user supplied coroutine. It also launches a coro to handle publications: the `_publish` asynchronous method. It triggers the wifi callback to indicate readiness; the initialisation phase is now complete and it enters the running phase.
This continuously running loop exits only on error when the ESP8266 is to be rebooted. It waits on incoming messages from the ESP8266 (terminating on `None` which indicates a watchdog timeout).
The ESP8266 can send various messages, some such as `SUBSCRIPTION` asynchronously in response to a broker message and others such as a `PUBOK` status in response to having processed a qos == 1 'publish' message from the host. Unsolicited messages are:
- `SUBSCRIPTION` A message published to a user subscription was received.
- `TIME`, value The ESP8266 has contacted a timeserver and has received this time value.
- `STATUS`, `WIFI_UP`
- `STATUS`, `WIFI_DOWN`
- `STATUS`, `UNKNOWN` This should never occur. ESP8266 has received an unknown command from the host or is failing to respond correctly. The driver reboots it.
Expected messages are:
- `MEM`, free, allocated Response to a 'mem' command.
- `STATUS`, `PUBOK` Response to a qos == 1 publication.
User publications are placed on a queue which is serviced by the host's `_publish` coroutine. When it issues a publication it informs the ESP8266 and sets a flag. This locks out further publications until a `PUBOK` is received from the ESP8266. In the case of qos==1 this occurs when the broker sends a PUBACK with the correct PID. A `PUBOK` clears the flag, re-enabling publications which resume if any are queued. See `pub_free()`.
In the case of a qos==0 publication the ESP8266 will respond with `PUBOK` immediately as no response is expected from the broker.
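The lock-out mechanism can be modelled in a few lines. This sketch is event driven for brevity, whereas the driver services its queue from the `_publish` coroutine; the class `PublishQueue`, its `pubok` method and the `send` callable are hypothetical.

```python
from collections import deque


class PublishQueue:
    """Queue publications, allowing only one to be outstanding at a time."""

    def __init__(self, send):
        self._send = send        # callable which passes a message to the ESP8266
        self._queue = deque()
        self._pub_free = True    # cleared while a PUBOK is awaited

    def publish(self, topic, msg, retain=False, qos=0):
        self._queue.append((topic, msg, retain, qos))
        self._service()

    def pubok(self):
        """Call when a PUBOK arrives: re-enable publications."""
        self._pub_free = True
        self._service()

    def _service(self):
        if self._pub_free and self._queue:
            self._pub_free = False  # lock out further publications
            self._send(*self._queue.popleft())

    def __len__(self):
        return len(self._queue)


sent = []
queue = PublishQueue(lambda *args: sent.append(args))
queue.publish("result", "1", qos=1)  # sent at once
queue.publish("result", "2", qos=1)  # held until the first is acknowledged
held = len(queue)
queue.pubok()                        # releases the second publication
print(held, sent)
```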
There is a potential for overloading the ESP8266 if the publication queue fills during an outage. The `_publish` coro pauses after completion of a publication before sending another. It also implements a timeout where no response arrives from the ESP8266 when the network is available; in this case the ESP8266 is assumed to have failed and is reset.
The performance of MQTT can be limited by the connection to the broker, which can be slow if the broker is on the internet. This implementation is also constrained by the performance of the serial interface. Under operational conditions this was measured at 118 chars/sec (chars are 7-bit).
In applications such as data logging this is not usually an issue. If latency matters, keep topic names and messages short and (if possible) use a broker on the LAN.
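The effect of message length on latency can be estimated from the measured serial throughput. This is rough arithmetic only: the function name is hypothetical, any framing overhead on the link is unknown and defaults to zero here, and WiFi and broker delays are not included.

```python
CHARS_PER_SEC = 118  # measured throughput of the serial interface


def link_transfer_time(topic, message, overhead_chars=0):
    """Estimate the seconds needed to pass a publication over the serial link."""
    return (len(topic) + len(message) + overhead_chars) / CHARS_PER_SEC


print(link_transfer_time("result", "42"))        # short topic and message
print(link_transfer_time("result", "x" * 112))   # 118 chars in total
```

Halving the combined length of topic and message halves this component of the latency, which is why short names are recommended when latency matters.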
Latency will degrade if using qos==1 on a poor WiFi link, because retransmissions will occur. If WiFi connectivity fails, the delay will persist for the duration of the outage.
Under good conditions latency can be reduced to around 250ms.
The ESP8266 is prone to unexplained crashes. In trials of extended running these occurred about once every 24 hours. The ESP8266 UART produced repeated `LmacRxBlk:1` messages, locking the scheduler and provoking the Pyboard to reboot it. Such a reboot normally occurs without data loss.
The system can fail to recover from a crash in the following circumstances. If the broker sends qos==1 messages at a high enough rate, during the ESP8266 reboot the broker accumulates a backlog. When connectivity is restored the broker floods the ESP8266 and its buffer overflows. If the broker's backlog continues to grow this can result in an endless boot loop.
As noted above a backlog of qos==1 messages and consequent flooding can also occur if the ESP8266 moves out of WiFi range for a long enough period.
In testing where qos==1 messages were sent at a rate of every 20s the system was stable and recovered without data loss from the occasional ESP8266 crash.
mqtt introduction
mosquitto server
mosquitto client publish
mosquitto client subscribe
MQTT spec
python client for PC's
Unofficial MQTT FAQ