Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize function layout in hps_accel project #323

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

j-piecuch
Copy link

This PR makes the LDSCRIPT make variable overridable from a project, and uses a project-specific linker script in hps_accel that prevents instruction cache conflicts between ConvPerChannel4x4() and LoadInput(). It also fixes a small bug in hps_accel/Makefile.

To avoid the possibility of the LoadInput function conflicting with
ConvPerChannel4x4 (its caller) in the instruction cache, lay them out
sequentially in the final binary.

Signed-off-by: Jakub Piecuch <[email protected]>
If we don't supply an explicit TARGET to make, it should be set to the default
value in proj.mk, which is digilent_arty. However, this assignment of the
default value is currently done _after_ the ifeq in hps_accel/Makefile, so the
clock frequency isn't reduced to 75MHz even though it should.

Signed-off-by: Jakub Piecuch <[email protected]>
{
_ftext = .;
*(.text.start)
*/conv_accel.o(.text.*ConvPerChannel4x4*)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this file may have mixed spaces and tabs for indents, it would be good to fix that up for clarity.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but first let's decide whether this linker script should be here at all.

@@ -0,0 +1,65 @@
INCLUDE output_format.ld
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add the copyright header to this file? Just grab it from one of the others. "Copyright the CFU-Playground authors" etc.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I no longer think that providing custom linker scripts for each project is a viable solution, so this file will probably be removed. Please see my latest comments for an alternative approach.

*(.text .stub .text.* .gnu.linkonce.t.*)
_etext = .;
} > main_ram
/*} > rom */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm why are these commented out?

Actually now that I think about it, it looks like this linker script is correct for Arty (where we run the whole CFU-Playground application from RAM) but won't work on HPS boards (where we run the application from flash)? I guess for HPS boards you have to uncomment these lines instead, right?

Is there a way we can have a single linker script that will place things in the right regions in all cases?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This linker script is essentially copy-pasted from common/ld/linker.ld, including the fragments which are commented out.
You're right, this script isn't suitable for every board. I was not aware of the platform-specific scripts in common/_hps etc.
I will write down some ideas on how to solve this in a comment outside of this conversation.

@tcal-x
Copy link
Collaborator

tcal-x commented Oct 15, 2021

@j-piecuch @danc86 I should point out that we use an overlay system for the default linker script for HPS: https://github.com/google/CFU-Playground/tree/main/common/_hps/hps/ld . I guess here, the override LDSCRIPT might need to choose between two or more different scripts depending on the PLATFORM:TARGET.

@j-piecuch
Copy link
Author

@danc86 @tcal-x apologies, I was not aware of the existing overlay system.
In that case, we should probably come up with some other way of letting projects override the order of functions in the .text section.

One simple, but somewhat hacky solution is to use sed to patch the linker script once it has been copied to the build directory. For this to work, each linker script would need to include some specific string that tells us where to insert the overrides.
For instance, the .text section in common/ld/linker.ld script could look like this:

	.text :
	{
		_ftext = .;
		*(.text.start)
		/* TEXT_OVERRIDES */
		*(.text .stub .text.* .gnu.linkonce.t.*)
		_etext = .;
	} > main_ram

In the project's Makefile, we could define a make variable that contains the overrides:

TEXT_OVERRIDES := */conv_accel.o(.text.*ConvPerChannel4x4*) */blocks.o(.text.*LoadInput*)

We can perform the substitution in the recipe for build-dir in proj.mk:

.PHONY: build-dir
build-dir: $(BUILD_DIR)/src tflite-micro-src $(BUILD_DIR_EXTRA_DEP) 
	@echo "build-dir: copying source to build dir"
	$(COPY) $(COMMON_DIR)/*              $(BUILD_DIR)
	$(COPY) $(MLCOMMONS_SRC_DIR)/*       $(BUILD_DIR)/src
	$(COPY) $(SAXON_SRC_DIR)/riscv.h     $(BUILD_DIR)/src
	$(COPY) $(SRC_DIR)/*                 $(BUILD_DIR)/src
	$(RM)			             $(BUILD_DIR)/_*
# Overlay platform / target specific changes.
ifneq ($(wildcard $(COMMON_DIR)/_$(PLATFORM)/$(TARGET)/*),)
	$(COPY) $(COMMON_DIR)/_$(PLATFORM)/$(TARGET)/* $(BUILD_DIR)
endif
# Allow project makefile to override placement of functions in the .text section
	sed -i -e '/.*TEXT_OVERRIDES.*/c\$(TEXT_OVERRIDES)' $(BUILD_DIR)/ld/linker.ld

A somewhat less hacky way to accomplish the same goal would be to use some macro/templating engine like jinja or the C preprocessor.

@alanvgreen
Copy link
Collaborator

It's good to know that this approach works, but let's hold off on introducing specific hacks until the code is more settled. I think LoadInput() may be deleted soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants