Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use PROGMEM for lookup tables on AVR platforms #202

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

StarGate01
Copy link
Contributor

This PR moves the static sbox, rsbox and Rcon substitution boxes from the SRAM into the Flash memory on AVR microcontrollers, which is typically larger. This frees up half a kilobyte of RAM.

The PROGMEM attribute is used to specify the memory location, and the pgm_read_byte to access the Flash memory.

@kokke
Copy link
Owner

kokke commented Dec 20, 2021

Hi @StarGate01 and thanks for your interest in contributing :)

I feel that this change is a bit too platform-specific and conflicts a bit with the "portable"-part of the description. I see the point for sure, but I'd rather not support platform-specific code.

@DamonHD
Copy link

DamonHD commented Dec 20, 2021

Note that this was valuable for us when we did it in OTAESGCM for AVR and it may be useful in other MCUs where you can keep the constants in Flash rather than copying them to RAM.

Maybe if this was reworked into something a bit more generic for that wider set of cases, which just happens to come with a first full implementation for AVR?

Rgds

Damon

@kokke
Copy link
Owner

kokke commented Dec 20, 2021

Hi @DamonHD - long time since our last encounter :) I hope you're well and good and that the pandemic hasn't messed up your Christmas.

I assume you're thinking of some macro, like when a project defines "API_CALL" and puts it in front of the public/exported functions, to define calling-conventions etc.?

I see the point and could probably be convinced to change my mind, but I can't quickly come up with something good myself.

@DamonHD
Copy link

DamonHD commented Dec 20, 2021

Hi!

Covid is definitely still complicating my life generally, including Christmas, but I am still alive and well: thank you!

That is indeed the sort of thing I had in mind, and no I don't know what it should look like to cover more than the AVR case. However I can say that our code builds and runs on AVR and non-AVR (eg on our laptops for unit tests), and pretty much the only kludge required is like this IIRC:

#if defined(__AVR_ARCH__) || defined(ARDUINO_ARCH_AVR) // Atmel AVR only.
#include <avr/pgmspace.h>
#else
// Kludge code to treat PROGMEM as part of uniform memory space.
#define PROGMEM
inline uint8_t pgm_read_byte(const uint8_t *p) { return(*p); }
#endif

See:

https://github.com/opentrv/OTAESGCM/blob/master/content/OTAESGCM/utility/OTAESGCM_OTAES128AVR.cpp

With use of PROGMEM etc where needed.

That may then be a clean enough cut for you now, and allow another more general scheme to be wedged in later if needed.

Rgds

Damon

@kokke
Copy link
Owner

kokke commented Dec 20, 2021

I'm glad to hear you're coping @DamonHD :) Denmark (where I live) has been setting infection "high-scores" almost daily for a few weeks. Hope they figure something out soon 🤞 But hey I can still work and buy groceries etc. so I shouldn't really complain...

It's not a bad idea at all, but I would like to reduce the number of #ifdefs if possible.

Assuming you understood my "explanation" of the idea/concept, would you be willing to give it a shot @StarGate01 ?

@StarGate01
Copy link
Contributor Author

Thank you for the suggestions!

I agree that a platform-specific implementation kind of goes against the idea of being portable. However, I think we can work out a solution that we can all agree on.

The implementation by @DamonHD is cool, because specialized code has to be implemented in one place only - as opposed to my initial implementation. I don't think the generic implementation of pgm_read_byte has any impact on runtime once it is inlined properly.

I suggest to create an additional header file (e.g. platform_optimizations.h), where specific optimizations for accessing those static memory sections are implemented. This way the logic implementation in aes.c is separated from the flash memory handling.

What do you think? I'll go ahead and update this PR.

@kokke
Copy link
Owner

kokke commented Dec 21, 2021

I suggest to create an additional header file (e.g. platform_optimizations.h), where specific optimizations for accessing those static memory sections are implemented. This way the logic implementation in aes.c is separated from the flash memory handling.

I think we should keep the macro in aes.h for now, since it's the only platform-optimization.
In case other such platform-optimizations are contributed, we can make a new header for them.

@StarGate01
Copy link
Contributor Author

StarGate01 commented Dec 22, 2021

I implemented the macro in aes.h, the code checks for __AVR_ARCH__ to enable it. I did not want to use any Arduino-specific macros. The rest of the implementation is pretty much the same as @DamonHD suggested.

Other platforms that wish to optimize memory use could then define the PROGMEM attribute (for example, on AVR it is __attribute__((__progmem__))), and the inline uint8_t pgm_read_byte(const uint8_t* p) function - if the compiler does not provide it in the first place anyway.

I successfully ran the tests on my host machine (64bit x86), and checked the generated assembly to verify the inlining.

@kokke
Copy link
Owner

kokke commented Dec 22, 2021

Looking good @StarGate01 - thanks for contributing :)

I successfully ran the tests on my host machine (64bit x86), and checked the generated assembly to verify the inlining.

I intend to do the same for ARM (thumb/thumb2) just FYI, but otherwise I think this looks very good now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants