[URGENT] Reducing our usage of GitHub Runners #14376

Open
lupyuen opened this issue Oct 17, 2024 · 58 comments · Fixed by #14377, apache/nuttx-apps#2750, #14386, apache/nuttx-apps#2753 or #14400

@lupyuen
Member

lupyuen commented Oct 17, 2024

Hi All: We have an ultimatum to reduce (drastically) our usage of GitHub Actions. Or our Continuous Integration will halt totally in Two Weeks. Here's what I'll implement within 24 hours for nuttx and nuttx-apps repos:

  1. When we submit or update a Complex PR that affects All Architectures (Arm, RISC-V, Xtensa, etc): CI Workflow shall run only half the jobs. Previously the CI Workflow ran arm-01 to arm-14; now it will run only arm-01 to arm-07. (This will reduce GitHub Cost by 32%)

  2. When the Complex PR is Merged: CI Workflow will still run all jobs arm-01 to arm-14

    (Simple PRs with One Single Arch / Board will build the same way as before: arm-01 to arm-14)

  3. For NuttX Admins: Our Merge Jobs are now at github.com/NuttX/nuttx. We shall have only Two Scheduled Merge Jobs per day

    I shall quickly Cancel any Merge Jobs that appear in nuttx and nuttx-apps repos. Then at 00:00 UTC and 12:00 UTC: I shall start the Latest Merge Job at nuttxpr. (This will reduce GitHub Cost by 17%)

  4. macOS and Windows Jobs (msys2 / msvc): They shall be totally disabled until we find a way to manage their costs. (GitHub charges 10x premium for macOS runners, 2x premium for Windows runners!)

    Let's monitor the GitHub Cost after disabling macOS and Windows Jobs. It's possible that macOS and Windows Jobs are contributing a huge part of the cost. We could re-enable and simplify them after monitoring.

    (This must be done for BOTH nuttx and nuttx-apps repos. Sadly the ASF Report for GitHub Runners doesn't break down the usage by repo, so we'll never know how much macOS and Windows Jobs are contributing to the cost. That's why we need CI: Disable all jobs for macOS and Windows #14377)

    (Wish I could run NuttX CI Jobs on my M2 Mac Mini. But the CI Script only supports Intel Macs sigh. Buy a Refurbished Intel Mac Mini?)
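
As a rough illustration of rules 1 and 2 above (not the actual logic in the CI Workflow files), the job selection could be sketched in Python like this. The helper name select_arm_jobs and its parameters are hypothetical, made up for this example:

```python
from __future__ import annotations


def select_arm_jobs(is_complex_pr: bool, is_merge: bool, total: int = 14) -> list[str]:
    """Return the arm-NN jobs to run (hypothetical helper, not taken from arch.yml)."""
    all_jobs = [f"arm-{n:02d}" for n in range(1, total + 1)]
    # A Complex PR that is being submitted or updated runs only the first half.
    if is_complex_pr and not is_merge:
        return all_jobs[: total // 2]
    # Merges, and Simple PRs that touch the Arm arch, still run every job.
    return all_jobs
```

With the default of 14 jobs, a Complex PR update would get arm-01 to arm-07, while a merge would still get arm-01 to arm-14.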

We have done an Analysis of CI Jobs over the past 24 hours:

https://docs.google.com/spreadsheets/d/1ujGKmUyy-cGY-l1pDBfle_Y6LKMsNp7o3rbfT1UkiZE/edit?gid=0#gid=0

Many CI Jobs are Incomplete: We waste GitHub Runners on jobs that eventually get superseded and cancelled

Screenshot 2024-10-17 at 1 18 14 PM

When we Halve the CI Jobs: We reduce the wastage of GitHub Runners

Screenshot 2024-10-17 at 1 15 30 PM

Scheduled Merge Jobs will also reduce wastage of GitHub Runners, since most Merge Jobs don't complete (only 1 completed yesterday)

Screenshot 2024-10-17 at 1 16 16 PM

See the ASF Policy for GitHub Actions

lupyuen added a commit to lupyuen2/wip-nuttx that referenced this issue Oct 17, 2024
This PR disables all CI Jobs for macOS and Windows, to reduce GitHub Cost. Details here: apache#14376
lupyuen added a commit to lupyuen2/wip-nuttx-apps that referenced this issue Oct 17, 2024
This PR disables all CI Jobs for macOS and Windows, to reduce GitHub Cost. Details here: apache/nuttx#14376
@lupyuen
Member Author

lupyuen commented Oct 17, 2024

As commented by @xiaoxiang781216:

> can we reduce the boards on the Linux host to keep macOS/Windows? it's very easy to break these hosts without this basic coverage.

I suggest that we monitor the GitHub Cost after disabling macOS and Windows Jobs. It's possible that macOS and Windows Jobs are contributing a huge part of the cost. We could re-enable and simplify them after monitoring.

@raiden00pl
Contributor

One of the methods proposed (by @btashton, if I remember correctly) is to replace many simple configurations for some boards (mostly for peripheral testing) with one large jumbo config that activates everything possible.
This won't work for chips with low memory, but it will save some CI resources anyway.

@lupyuen
Member Author

lupyuen commented Oct 17, 2024

@raiden00pl Yep I agree. Or we could test a complex target like board:lvgl?

@lupyuen
Member Author

lupyuen commented Oct 17, 2024

Here's another comment about macOS and Windows by @yamt: #14377 (comment)

@yamt
Contributor

yamt commented Oct 17, 2024

sorry, let me ask a dumb question.
what plan are we using? https://github.com/pricing
is apache paying for it?

@lupyuen
Member Author

lupyuen commented Oct 17, 2024

> what plan are we using? https://github.com/pricing

@yamt It's probably a special plan negotiated by ASF and GitHub? It's not mentioned in the ASF Policy for GitHub Actions: https://infra.apache.org/github-actions-policy.html

I find this "contract" a little strange. Why are all ASF Projects subjected to the same quotas? And why can't we increase the quota if we happen to have additional funding?

Update: More info here: https://cwiki.apache.org/confluence/display/INFRA/GitHub+self-hosted+runners

If your project uses GitHub Actions, you share a queue with all other Apache projects using Github Actions, which can quickly lead to frustration for everyone involved. Builds can be stuck in "queued" for 6+ hours.

One option (if you want to stick with GitHub and don't want to use the Infra-managed Jenkins) is for your project to create its own self-hosted runners, which means your jobs will run on a virtual machine (VM) under your project's control. However this is not something to tackle lightly, as Infra will not manage or secure your VM - that is up to you.

Update 2: This sounds really complicated. I'd rather use my own Mac Mini to execute the NuttX CI Tests, once a day?

@yamt
Contributor

yamt commented Oct 17, 2024

> what plan are we using? https://github.com/pricing
>
> @yamt It's probably a special plan negotiated by ASF and GitHub? It's not mentioned in the ASF Policy for GitHub Actions: https://infra.apache.org/github-actions-policy.html

do you know if the macos/windows premium applies as usual?
the policy page seems to have no mention about it.

> I find this "contract" a little strange. Why are all ASF Projects subjected to the same quotas? And why can't we increase the quota if we happen to have additional funding?

yea, i guess projects have very different sizes/demands.
(i feel nuttx is using too much anyway though :-)

@TimJTi
Contributor

TimJTi commented Oct 17, 2024

> ...I'd rather use my own Mac Mini to execute the NuttX CI Tests, once a day?

Is there any merit in "farming out" CI tests to those with boards? I think there was a discussion about NuttX owning a suite of boards but not sure where that got to - and would depend on just 1 or 2 people managing it.

As an aside, is there a guide to self-running CI? As I work on a custom board it would be good for me to do this occasionally but I have no idea where to start!

@lupyuen
Member Author

lupyuen commented Oct 17, 2024

@TimJTi Here's how I do daily testing on Milk-V Duo S SBC: https://lupyuen.github.io/articles/sg2000a

@TimJTi
Contributor

TimJTi commented Oct 17, 2024

> @TimJTi Here's how I do daily testing on Milk-V Duo S SBC: https://lupyuen.github.io/articles/sg2000a

And I just RTFM...the "official" guide is here so I'll review both and hopefully get it working - and submit any tweaks/corrections/enhancements I find are needed to the NuttX "How To" documentation

@jerpelea
Contributor

jerpelea commented Oct 17, 2024 via email

@michallenc
Contributor

michallenc commented Oct 17, 2024

> @TimJTi Here's how I do daily testing on Milk-V Duo S SBC: https://lupyuen.github.io/articles/sg2000a
>
> And I just RTFM...the "official" guide is here so I'll review both and hopefully get it working - and submit any tweaks/corrections/enhancements I find are needed to the NuttX "How To" documentation

These work, but they do not describe the entire CI, just how to run pytest checks for the sim:citest configuration.

@cederom
Contributor

cederom commented Oct 17, 2024

Yes, let's cut what we can (but keep at least minimal functional configure, build, and syntax testing) and see what the cost reduction is. We need to show Apache we are working on the problem. So far optimizations did not cut the usage and we are in danger of losing all CI :-(

On the other hand, it seems unfair to share the same CI quota as small projects. NuttX is a fully featured RTOS working on ~1000 different devices. In order to keep project code quality we need the CI.

Maybe it's time to rethink / redesign from scratch the CI test architecture and implementation?

@cederom
Contributor

cederom commented Oct 17, 2024

Another problem is that people very often send unfinished, undescribed PRs, then update them without a comment or request, which triggers the whole big CI process several times :-(

Some changes are sometimes required and we cannot avoid that; this is part of the process. But maybe we can make something more "adaptive", so only minimal CI is launched by default, preferably only in the area that was changed. Then, with all approvals, we can manually trigger one final big check before the merge?

Long story short: We can switch CI test runs to manual trigger for now to see how it reduces costs. I would see two buttons to start Basic and Advanced (maybe also Full = current setup) CI.
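
To make the "only in the area that was changed" idea a bit more concrete, here is a minimal Python sketch. It assumes that arch-specific code lives under arch/<arch>/ and boards/<arch>/ in the nuttx repo; the function name affected_archs and the rule "anything else means full CI" are hypothetical choices for this example, not how the CI Workflow actually decides:

```python
from __future__ import annotations


def affected_archs(changed_paths: list[str]) -> set[str] | None:
    """Guess which architectures a PR touches (hypothetical helper).

    Returns a set of arch names, or None when a path falls outside
    arch/<arch>/ and boards/<arch>/ (treated here as "run the Full CI").
    """
    archs: set[str] = set()
    for path in changed_paths:
        parts = path.split("/")
        if len(parts) >= 3 and parts[0] in ("arch", "boards"):
            archs.add(parts[1])  # e.g. arch/arm/... -> "arm"
        else:
            return None  # shared code: cannot narrow down the build
    return archs
```

A PR that only changes files under arch/arm/ and boards/arm/ would map to {"arm"}, so a Basic run could build just the Arm jobs, while a change under sched/ would fall back to the Full run.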

@lupyuen
Member Author

lupyuen commented Oct 17, 2024

@cederom Maybe our PRs should have a Mandatory Field: Which NuttX Config to build, e.g. rv-virt:nsh. Then the CI Workflow should do tools/configure.sh rv-virt:nsh && make. Before starting the whole CI Build?

@cederom
Contributor

cederom commented Oct 17, 2024

> @cederom Maybe our PRs should have a Mandatory Field: Which NuttX Config to build, e.g. rv-virt:nsh. Then the CI Workflow should do tools/configure.sh rv-virt:nsh && make. Before starting the whole CI Build?

People often can't fill in even one single sentence to describe Summary, Impact, Testing :D This may be detected automatically.. or we can just see which architecture is the cheapest one and use it for all basic tests..?

@raiden00pl
Contributor

> Another problem is that people very often send unfinished, undescribed PRs, then update them without a comment or request, which triggers the whole big CI process several times :-(

Often contributors use CI to test all configurations instead of testing changes locally. On one hand I understand this, because compiling all configurations on a local machine takes a lot of time; on the other hand I'm not sure CI is for this purpose (especially when we have limits on its use).

> @cederom Maybe our PRs should have a Mandatory Field: Which NuttX Config to build, e.g. rv-virt:nsh. Then the CI Workflow should do tools/configure.sh rv-virt:nsh && make. Before starting the whole CI Build?

It won't work. Users are lazy, and in order to choose what needs to be compiled correctly, you need a comprehensive knowledge of the entire NuttX, which is not that easy.
The only reasonable option is to automate this process.

@cederom
Contributor

cederom commented Oct 17, 2024

So it looks like for now, where dramatic steps need to be taken, we need to mark all PRs as drafts and start CI by hand when we are sure all is ready for merge? o_O

@jerpelea
Contributor

jerpelea commented Oct 17, 2024 via email

xiaoxiang781216 pushed a commit to apache/nuttx-apps that referenced this issue Oct 17, 2024
This PR disables all CI Jobs for macOS and Windows, to reduce GitHub Cost. Details here: apache/nuttx#14376
lupyuen added a commit to lupyuen2/wip-nuttx that referenced this issue Oct 17, 2024
When we submit or update a Complex PR that affects All Architectures (Arm, RISC-V, Xtensa, etc): CI Workflow shall run only half the jobs. Previously the CI Workflow ran `arm-01` to `arm-14`; now it will run only `arm-01` to `arm-07`.

When the Complex PR is Merged: CI Workflow will still run all jobs `arm-01` to `arm-14`

Simple PRs with One Single Arch / Board will build the same way as before: `arm-01` to `arm-14`

This is explained here: apache#14376

Note that this version of `arch.yml` has diverged from `nuttx-apps`, since we are unable to merge apache#14377
@lupyuen
Member Author

lupyuen commented Oct 21, 2024

What if we could run the CI Jobs on our own Ubuntu PCs? Without any help from GitHub Actions?

I'm experimenting with a "Build Farm" at home (refurbished PC) that runs NuttX CI Jobs all day non-stop 24 x 7:

  • Check out master branch of nuttx, run CI Job arm-01
  • Wait for arm-01 to complete (roughly 1.5 hours)
  • Check out master branch of nuttx, run CI Job arm-02
  • Wait for arm-02 to complete (roughly 1.5 hours)
  • Do the same until arm-14, then loop back to arm-01
  • Here's the CI Output Log

How does it work?

  • run-job.sh will run a single CI Job, by calling the NuttX Docker Image, which is called by...
  • run-ci.sh looping forever through arm-01 to arm-14, running the job, searching for errors and uploading the logs
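
The loop described above can be sketched in Python as follows. This is only an outline of the control flow, not a port of run-ci.sh or run-job.sh: the names job_names and cycle_jobs are hypothetical, and run_job is a stand-in callable for whatever actually runs one CI Job (calling the NuttX Docker Image, searching for errors, uploading the logs):

```python
from __future__ import annotations

import itertools
from typing import Callable, Iterator


def job_names(first: int = 1, last: int = 14) -> list[str]:
    """Build the list arm-01 ... arm-14."""
    return [f"arm-{n:02d}" for n in range(first, last + 1)]


def cycle_jobs(
    jobs: list[str],
    run_job: Callable[[str], bool],
    max_runs: int | None = None,
) -> Iterator[tuple[str, bool]]:
    """Run the jobs one after another, looping back to the first job.

    run_job is a stand-in for the real job runner. With max_runs=None the
    generator never ends, like a Build Farm that runs 24 x 7.
    """
    runs = itertools.cycle(jobs)
    if max_runs is not None:
        runs = itertools.islice(runs, max_runs)
    for job in runs:
        # Each job finishes (roughly 1.5 hours on the real farm) before the next starts.
        yield job, run_job(job)
```

For a dry run, something like `list(cycle_jobs(job_names(), lambda job: True, max_runs=15))` walks through arm-01 to arm-14 and then wraps around to arm-01 again.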

ci2-title

lupyuen added a commit to lupyuen2/wip-nuttx that referenced this issue Oct 21, 2024
This PR splits the CI Build Job sim-01 and adds sim-03:

Before the Split: Simulator Jobs take up to 1.5 hours to complete
- sim-01 (1 hour 31 mins): adb, citest, lvgl, matter
- sim-02 (28 mins): posix_test, sqlite

After the Split: Simulator Jobs will complete within 1 hour
- sim-01 (58 mins): adb, citest
- sim-02 (35 mins): lvgl, matter
- sim-03 (28 mins): posix_test, sqlite

This will help us comply with the ASF Policy for GitHub Actions, as explained here: apache#14376
@lupyuen
Member Author

lupyuen commented Oct 21, 2024

8 Days to Diwali: Will our CI Servers go Dark? Sorry we're not sure, because the ASF Infra Reports are Down (sigh). But I think we briefly hit a peak of 21 Full-Time GitHub Runners, which is still within our target of 25 Full-Time Runners.

Screenshot 2024-10-22 at 6 07 29 AM

xiaoxiang781216 pushed a commit that referenced this issue Oct 22, 2024
@lupyuen
Member Author

lupyuen commented Oct 22, 2024

ASF Infra Reports are still down. But now we have our own Live Metrics for Full-Time GitHub Runners! (reload for updates)

(Live Image) (Live Log)

This shows the number of Full-Time Runners for the Day, computed since 00:00 UTC. (Remember: We should keep this below 25)

  • Date: We compute the Full-Time Runners for today's date only (UTC)
  • Elapsed Hours: Number of hours elapsed since 00:00 UTC
  • GitHub Job Hours: Duration of all nuttx and nuttx-apps GitHub Jobs (cancelled / completed / failed). This data is available only AFTER the job has been cancelled / completed / failed (might be a lag of 1.5 hours). This is the Elapsed Job Duration, it doesn't say that we're running 8 smaller jobs in parallel, that's why we need...
  • GitHub Runner Hours: Number of GitHub Runners * Job Duration, which is effectively the Chargeable Minutes by GitHub. We compute this as 8 * GitHub Job Hours. This is averaged from past data. (Remember: One GitHub Runner will run One Single Sub-Job, like arm-01)
  • Full-Time GitHub Runners: Equals GitHub Runner Hours / Elapsed Hours. It answers "How many GitHub Runners, running Full-Time, would it take to consume the GitHub Runner Hours?" (We should keep this below 25 per day, per week, per month, etc)

How it works:

  • compute-github-runners.sh calls GitHub API to add up the Duration of All Completed GitHub Jobs for today. Then it extrapolates the Number of Full-Time GitHub Runners. (1 GitHub Job Hour roughly equals 8 GitHub Runner Hours, which equals 8 Full-Time Runners Per Hour)
  • run.sh calls the script above to render the Full-Time GitHub Runners as a PNG (with ImageMagick)
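
The arithmetic behind the Full-Time Runners figure can be written out as a short Python sketch. It is not the code of compute-github-runners.sh; the function names are hypothetical, the factor of 8 and the limit of 25 come from the description above, and the sample numbers are invented for illustration:

```python
def full_time_runners(job_hours: float, elapsed_hours: float,
                      runners_per_job: float = 8.0) -> float:
    """Estimate Full-Time GitHub Runners from GitHub Job Hours.

    runners_per_job=8 is the averaged factor quoted above
    (1 GitHub Job Hour roughly equals 8 GitHub Runner Hours).
    """
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    runner_hours = runners_per_job * job_hours  # GitHub Runner Hours
    return runner_hours / elapsed_hours         # Full-Time GitHub Runners


def quota_share(runners: float, quota: float = 25.0) -> float:
    """Fraction of the 25 Full-Time Runners target that is in use."""
    return runners / quota


# Invented example: 30 GitHub Job Hours recorded by 12:00 UTC.
print(full_time_runners(job_hours=30, elapsed_hours=12))  # 20.0
```

In this made-up case the estimate is 20 Full-Time Runners, which is 80% of the target of 25.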

@xiaoxiang781216
Contributor

@lupyuen the new number is very small, should we try restoring some macOS/msys2/windows ci?

@lupyuen
Member Author

lupyuen commented Oct 22, 2024

@xiaoxiang781216 Let's monitor for the rest of the day. Towards the end of the day, the number of Full-Time Runners will probably jump to 21. (Like yesterday)

We could mirror the NuttX Repo to another GitHub Org account and run the Windows and macOS Jobs there (so they won't add to our quota). We just need minor changes to build.yml and arch.yml: #14407 (at the bottom of the doc)

@xiaoxiang781216
Contributor

xiaoxiang781216 commented Oct 22, 2024

could we use this https://github.com/nuttx account, which is our official mirror?

@lupyuen
Member Author

lupyuen commented Oct 22, 2024

@xiaoxiang781216 The macOS and Windows Builds are now running in our NuttX Mirror: https://github.com/NuttX/nuttx/actions/workflows/build.yml

I made 2 fixes to enable the macOS and Windows Builds: build.yml and arch.yml

Lemme figure out how to automate this 🤔

Update: This script will enable macOS and Windows Builds for our NuttX Mirror. Our Merge Jobs are now at github.com/NuttX/nuttx.

Daily at 00:00 UTC and 12:00 UTC: I click Sync Fork > Discard Commits then I run enable-macos-windows.sh.

(Remember: We run this script to kill the Merge Jobs on the old nuttx and nuttx-apps repos. Otherwise the GitHub Runners will spike!)

@simbit18
Contributor

@lupyuen Thank you for this amazing work!

@lupyuen
Member Author

lupyuen commented Oct 22, 2024

7 Days to Event Horizon: Yesterday we consumed 4 Full-Time GitHub Runners (too low?). Let's keep this number below 25 Full-Time Runners, so our CI Servers won't sink into the black hole!

  • Last 5 days: We consumed 13 Full-Time Runners. That's below ASF's Limit of 25 Full Time Runners. So we're still on target! I shared the good news with ASF Infra Team in my email (Update: They responded yay!)

  • Which means I shall Freeze our CI Workflow. Wait for ASF Infra Team to approve on 30 Oct. Then we decide how to reorg our Build Targets.

  • Our Merge Jobs are now at github.com/NuttX/nuttx. (Includes macOS and Windows Builds)

  • How It Happens: Daily at 00:00 UTC and 12:00 UTC, I click Sync Fork > Discard Commits. Then I run this script to enable the macOS and Windows Builds.

  • Don't Forget: I'm still running this script to kill the Merge Jobs on the old nuttx and nuttx-apps repos. (Otherwise the GitHub Runners will spike!)

The numbers from yesterday are a little sus, lemme verify (they lost data during outage?)

Screenshot 2024-10-23 at 7 07 18 AM

Alternatively: We have our cool purple report (14 Full-Time Runners for yesterday)...

(Live Image) (Live Log)

Thanks everyone for the kudos! :-)

@lupyuen
Member Author

lupyuen commented Oct 23, 2024

6 Days to Serenity: Yesterday we consumed 13 Full-Time GitHub Runners. (That's half of the ASF Quota for GitHub Runners) Everything is coming up nicely (even ASF agrees), thanks to everyone! 🙏

Screenshot 2024-10-24 at 6 11 46 AM

Past 7 Days: Our consumption of GitHub Runners has dropped from 76 down to 20 (yay!)

Screenshot 2024-10-24 at 6 37 02 AM

Our Cool Purple Report now turns Orange when the load goes up (yep our estimated data is a little higher than the ASF data, just to be safe)

(Live Image) (Live Log)

lupyuen added a commit to lupyuen2/wip-nuttx-apps that referenced this issue Oct 24, 2024
CI Build Job sim-02 was disabled to reduce our usage of GitHub Runners, to comply with ASF Policy: apache/nuttx#14376 (comment)

However this causes the Scheduled Merge Job to fail, due to reduced CI Checks: https://github.com/NuttX/nuttx/actions/runs/11490041505/job/31980056690#step:7:465

This PR re-enables sim-02 when we create or update a Complex PR.
lupyuen added a commit to lupyuen2/wip-nuttx that referenced this issue Oct 24, 2024
lupyuen added a commit to apache/nuttx-apps that referenced this issue Oct 24, 2024
xiaoxiang781216 pushed a commit that referenced this issue Oct 24, 2024
@lupyuen
Member Author

lupyuen commented Oct 24, 2024

5 Days to Freedom: Yesterday we consumed 14 Full-Time GitHub Runners. That's 56% of the ASF Quota for Full-Time Runners. Looking good!

stbenn pushed a commit to stbenn/nuttx that referenced this issue Oct 25, 2024
Initial STM32H5 Commit

Initial commit of what I deemed essential files for bringing up the STM32H5. src/stm32h5/hardware files were edited by me, but need review. files in src/stm32h5 all need review and edits. include/stm32h5 files need review, some were edited by me.

Add Nucleo-H563ZI Folder

Add the board folder for the nucleo-h563zi. Right now this is largely a copy of the stm32l562e-dk configuration. Some files may be deleted in the future. Also made minor modifications to arch/arm/src/stm32h5/Kconfig file.

hardware/stm32h562xx_rcc.h update

Finished register and bit mapping for STM32H5 RCC

Rename hardware/stm32h5_rcc.h

Renamed stm32h562xx_rcc.h to stm32h5_rcc.h. The RCC register is the same for all versions of the STM32H5.

Defined rcc_enableperipherals functions

Defined all the functions within rcc_enableperipherals. Getting started on stm32h5_stdclockconfig.

Incremental STM32H5 RCC Updates

Incremental Updates apache#2

Added stm32h5_lse.c and stm32h5_lsi.c files. Incremental updates to board.h, stm32h5xx_rcc.c, and hardware/stm32h5_rcc.h

Incremental Updates apache#3

Added stm32h5_hsi48.c and stm32h5_hsi48.h files. Incremental updates to board.h, stm32h5xx_rcc.c, and hardware/stm32h5_rcc.h. Renamed hardware crs file. Fixed lse.c and lsi.c for STM32H5.

Incremental Updates apache#4

Updated setting of VOS for STM32H5. Added HSIDIV definition to hardware/stm32h5_rcc.h for potential of changing HSIDIV from default. Changed board.h to use HSI of 32 MHz, which is the default. We still set SYSCLK to the max of 250MHz.

First STM32H5 PWR Commit

Rewrote hardware/stm32h5_pwr.h. Added stm32h5_pwr.c and stm32h5_pwr.h. Made minor changes to RCC files based on PWR peripheral.

PWR Peripheral Changes

Removed enablesmps function. LDO or SMPS is decided by hardware. Removed enablepwrclk. There is no PWREN for the STM32H5. Rewrote adjustvcore. vcore must be adjusted incrementally.

Incremental Updates apache#5

Changed stm32 to stm32h5 in pwr.c. Added additional logic for selecting PLL sources. Added additional logic for enabling LSE or LSI. Set VCORE properly with stm32h5_pwr.c function. Fixes to adjustvcore function.

STM32H5 Power and RCC cleanup

Fixed some errors with private functions and incorrect preprocessor variables. Changed adjustvcore to not select intermediate VOS levels. Figure 49 in RM shows changing directly from VOS3 to VOS1. Added function adjustvos_ext for externally supplied VCORE. However I'm not sure if VOS should be incremented, then voltage incremented, then frequency incremented, or if VOS should be incremented one by one to final setting, then adjust voltage, then frequency. adjustvos does the former. Won't be used in stdclockconfig.

STM32H5 serial update

This commit primarily adds functionality taken from the stm32g4 lpuart implementation. The template I used, from the stm32l5, already had the LPUART in there but did not calculate the baud correctly. Added more USARTS and UARTS supported by STM32H5. Minor changes to chip.h, stm32h5_start.c, and Kconfig.

STM32H5 Serial Update apache#2

Added support for additional USARTS and UARTS on STM32H5. Other minor serial updates.

Build Fixes

Various fixes to get the stm32h5 arch to build. Many changes to follow. But for now, Nuttx builds.

Remove unnecessary hardware files from STM32H5 directory

More build changes

Even more build fixes

Minor fixes in stm32h5_rcc.c and stm32h5_pwr.c. Changed nucleo-h563zi defconfig to use std clock config. This resulted in errors that were fixed here. Also added stm32h5_lse.c and stm32h5_lsi.c to Make.defs.

Removed legacy pinmap. It is deprecated and should not be used on new designs.

Confirmed hardware crs and i2c files are correct. Will keep them for now.

IRQ info for STM32H52, STM32H53, STM32H56, STM32H57

libcxx: fix compile error

                 from ServiceManager.cpp:17:
/home/ligd/platform/dev/apps/external/android/frameworks/native/libs/binder/ndk/include_cpp/android/binder_to_string.h:71:24: error: expected nested-name-specifier before numeric constant
   71 |     template <typename _U>
      |                        ^~
/home/ligd/platform/dev/apps/external/android/frameworks/native/libs/binder/ndk/include_cpp/android/binder_to_string.h:71:24: error: expected ‘>’ before numeric constant
In file included from /home/ligd/platform/dev/apps/external/android/frameworks/native/libs/binder/aidl/android/os/ConnectionInfo.h:3,
                 from /home/ligd/platform/dev/apps/external/android/frameworks/native/libs/binder/aidl/android/os/IServiceManager.h:3,
                 from /home/ligd/platform/dev/apps/external/android/frameworks/native/libs/binder/aidl/android/os/BnServiceManager.h:4:
/home/ligd/platform/dev/apps/external/android/frameworks/native/libs/binder/ndk/include_cpp/android/binder_to_string.h:72:56: error: no matching function for call to ‘declval<1>()’
   72 |     static auto _test(int) -> decltype(std::declval<_U>().toString(), std::true_type());
      |                                        ~~~~~~~~~~~~~~~~^~
In file included from /home/ligd/platform/dev/nuttx/include/libcxx/__type_traits/is_convertible.h:18,

Signed-off-by: ligd <[email protected]>

libc string:Separate code.

Separate the code that follows the BSD license into independent files.

Signed-off-by: yangguangcai <[email protected]>

arch/sim/cmake: remove the host specific -U when HOSTSRCS

fix macos compile hostfs.c compile issue.
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.0.sdk/usr/include/_string.h:131:62: error: expected function body after function declarator
  131 | char    *stpncpy(char *__dst, const char *__src, size_t __n) __OSX_AVAILABLE_STARTING(__MAC_10_7, __IPHONE_4_3);
      |                                                              ^

Signed-off-by: buxiasen <[email protected]>

Revert "libc/lib_bzero:Add bzero prototype."

This reverts commit 908814a.

On macOS, memset is automatically optimized to bzero, which caused a dead loop. As we are not using bzero, redefining the macro should be able to cover the requirements.

Signed-off-by: buxiasen <[email protected]>

arch/arm64: vector table may be far away from arm64_fatal_handle

use 33-bit (+/-4GB) pc-relative addressing to load
the address of arm64_fatal_handle

Signed-off-by: lipengfei28 <[email protected]>

sim: fix asan address space conflict

Modify the starting position of the elf segment to 0x5000000

==2561587==Shadow memory range interleaves with an existing memory mapping. ASan cannot proceed correctly. ABORTING.
==2561587==ASan shadow was supposed to be located in the [0x1ffff000-0x3fffffff] range.
==2561587==Process memory map follows:

Signed-off-by: yinshengkai <[email protected]>

arm64/toolchains:Add the following kasan compilation options

Signed-off-by: wangmingrong1 <[email protected]>

remove unused variable 'cpu_freq'

Signed-off-by: lipengfei28 <[email protected]>

drivers/timers/arch_alarm.c: Remove ndelay_accurate

Using ONESHOT_CURRENT retrieves the tick number multiplied by tick time; thus
it doesn't give the accurate monotonic time - it is quantized by
the tick time. This cannot be used as a ndelay timer, it would always loop
at least to the end of the ongoing tick.

Revert the up_udelay to use the original "coarse" looping. The "accurate" udelay,
if such is needed, should either be done under arch specific code, or there should be
a function for getting the accurate time that is available for all the platforms.

Signed-off-by: Jukka Laitinen <[email protected]>

boards/imx93-evk: Define CONFIG_BOARD_LOOPSPERMSEC

Use value measured with 1.8GHz CPU speed

Signed-off-by: Jukka Laitinen <[email protected]>

arch/x86_64:Fix variable used before assignment

Signed-off-by: liwenxiang1 <[email protected]>

arch/arm64: vector table 2K align

Signed-off-by: lipengfei28 <[email protected]>

arm/build: suppress LOAD RWX linker warning

Add --no-warn-rwx-segments in case of RAM boot mode to linker to
suppress the below warning:
"nuttx has a LOAD segment with RWX permissions"

Signed-off-by: Jinliang Li <[email protected]>

arch/arm64/src/imx9/imx9_lpspi.c: Fix 9-16 bit transfers

Signed-off-by: Jukka Laitinen <[email protected]>

arch/arm64/src/imx9/imx9_lpspi.c: Small cache operation optimization

There is no need to invalidate the RX buffer before every transfer.
It never gets dirty, so it is good to invalidate it initially after allocation,
and after each transfer.

Signed-off-by: Jukka Laitinen <[email protected]>

libxx: C++ low level library select LIBSUPCXX by default.

Signed-off-by: cuiziwei <[email protected]>

nuttx/sim: Fix m64 build error.

LD:  nuttx
 nuttx.rel: in function `ff_dct32_float_sse2':
 (.text+0x66f9e): relocation truncated to fit: R_X86_64_32S against symbol `ff_cos_32' defined in .bss.ff_cos_32 section in nuttx.rel
 (.text+0x66fa7): relocation truncated to fit: R_X86_64_32S against symbol `ff_cos_32' defined in .bss.ff_cos_32 section in nuttx.rel
 (.text+0x672a6): relocation truncated to fit: R_X86_64_32S against symbol `ff_cos_16' defined in .bss.ff_cos_16 section in nuttx.rel
 (.text+0x672ae): relocation truncated to fit: R_X86_64_32S against symbol `ff_cos_16' defined in .bss.ff_cos_16 section in nuttx.rel
 nuttx.rel: in function `ff_imdct_calc_sse':
 (.text+0x67905): relocation truncated to fit: R_X86_64_32S against symbol `ff_cos_64' defined in .bss.ff_cos_64 section in nuttx.rel
 (.text+0x67948): relocation truncated to fit: R_X86_64_32S against symbol `ff_cos_128' defined in .bss.ff_cos_128 section in nuttx.rel
 (.text+0x67988): relocation truncated to fit: R_X86_64_32S against symbol `ff_cos_256' defined in .bss.ff_cos_256 section in nuttx.rel
 (.text+0x679c8): relocation truncated to fit: R_X86_64_32S against symbol `ff_cos_512' defined in .bss.ff_cos_512 section in nuttx.rel
 (.text+0x67a08): relocation truncated to fit: R_X86_64_32S against symbol `ff_cos_1024' defined in .bss.ff_cos_1024 section in nuttx.rel
 (.text+0x67a48): relocation truncated to fit: R_X86_64_32S against symbol `ff_cos_2048' defined in .bss.ff_cos_2048 section in nuttx.rel
 (.text+0x67a88): additional relocation overflows omitted from the output

Signed-off-by: cuiziwei <[email protected]>

tls.h: list.h should depend on CONFIG_PTHREAD_ATFORK

Signed-off-by: ligd <[email protected]>

bluetooth: fix bt missing header files nuttx/wqueue.h

Signed-off-by: ligd <[email protected]>

lib_gdbstub: fix container of

Signed-off-by: buxiasen <[email protected]>
Signed-off-by: ligd <[email protected]>

container_of: fix compile failure caused by list.h not supporting container_of

Signed-off-by: ligd <[email protected]>

nuttx/arch:Enabling ARCH_MATH_H is required when compiling sim with the 13.2 version of the toolchain.

Signed-off-by: cuiziwei <[email protected]>
Signed-off-by: ligd <[email protected]>

arm/stm32f401rc-rs485: Add support to WS2812 addressable LED

Signed-off-by: Rodrigo Sim <[email protected]>

syslog: Don't allow blocking when in signal handler

Blocking while running a signal handler is not advisable; instead, write
the log string character by character.

There is also a potential for a deadlock, as discussed in apache#6618

Note: querying for rtcb->sigdeliver is not 100% ideal, as it only tells
_if_ a signal handler has been queued, not if it is running. However, it
makes syslog safe / usable which is a debug feature anyhow.

boards/risc-v: Remove ref to riscv_internal.h

`riscv_internal.h` is a private chip level header file,
and it should not be included in the board files.

Signed-off-by: Huang Qi <[email protected]>

boards/esp32s3: Merge MCUboot and "simple-boot" linker scripts

To make it easier to keep the linker scripts updated for both
MCUboot and "simple-boot", this commit merges them into a single
linker script with macros to enable/disable specific sections.

task_exit.c: Add missing sched_note_stop()

A regression from apache#13728: sched_note_stop() is never called for tasks
that exit normally via exit().

nuttx: Add LIBSUPCXX_TOOLCHAIN to link the prebuilt library provided by the toolchain.

Signed-off-by: cuiziwei <[email protected]>

serial/gdbstub:Adjust serial port gdbstub Kconfig dependencies

Signed-off-by: anjiahao <[email protected]>

gdbstub:fix typo

Signed-off-by: anjiahao <[email protected]>

coredump: coredump_add_memory_region needs to use flags

Signed-off-by: anjiahao <[email protected]>

arm64: fix fvp smp failing to boot

reason:
we should give a busy wait addr

This commit fixes the regression from apache#13640

Signed-off-by: hujun5 <[email protected]>

CI: Enable sim-02 build when we create or update a Complex PR

CI Build Job sim-02 was disabled to reduce our usage of GitHub Runners, to comply with ASF Policy: apache#14376 (comment)

However this causes the Scheduled Merge Job to fail, due to reduced CI Checks: https://github.com/NuttX/nuttx/actions/runs/11490041505/job/31980056690#step:7:465

This PR re-enables sim-02 when we create or update a Complex PR.

arch/Kconfig: remove ARCH_MATH_H if LIBCXX

Because some libraries do require a full libm implementation.

Signed-off-by: zhanghongyu <[email protected]>

Documentation: migrate README.txt from boards and fixes for mps boards

migrate some README.txt from boards/ and fixes for mps boards rst

samv7: fix QSPI build

Commit 313d6df caused the following build error:

CC:  fixedmath/lib_b16atan2.c
chip/sam_qspi.c: In function 'qspi_memory':
chip/sam_qspi.c:1552:7: warning: implicit declaration of function 'IS_ALIGNED' [-Wimplicit-function-declaration]
 1552 |       IS_ALIGNED((uintptr_t)meminfo->buffer, 4) &&
      |       ^~~~~~~~~~
In file included from chip/sam_qspi.c:41:
chip/sam_qspi.c: In function 'qspi_alloc':
chip/sam_qspi.c:1591:21: warning: implicit declaration of function 'ALIGN_UP' [-Wimplicit-function-declaration]
 1591 |   return kmm_malloc(ALIGN_UP(buflen, 4));

This was caused by missing include of nuttx.h header defining ALIGN_UP
and IS_ALIGNED.

Signed-off-by: Michal Lenc <[email protected]>
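
For readers unfamiliar with the two helpers named in the warnings above, the following is a rough Python sketch of the usual power-of-two alignment arithmetic. The function names `is_aligned` and `align_up` are illustrative only; the actual macro definitions in nuttx.h are not shown in this thread and may differ.

```python
def is_aligned(value: int, alignment: int) -> bool:
    """Return True if value is a multiple of alignment (assumed power of two)."""
    return (value & (alignment - 1)) == 0


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (assumed power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


# Mirrors the two call sites in the build log: a 4-byte alignment check
# on a buffer address, and rounding a buffer length up to 4 bytes.
print(is_aligned(0x20000004, 4))  # True
print(align_up(13, 4))            # 16
```

Both helpers only work as written when the alignment is a power of two, which holds for the value 4 used in the log.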

mmcsd: SDIO_CAPS_4BIT_ONLY set buswidth MMCSD_SCR_BUSWIDTH_4BIT

uint8_t buswidth:4;              /* Bus widths supported (SD only) */

Signed-off-by: zhangshoukui <[email protected]>

armv8m/clang.cmake: add armv8m clang config

The equivalent Makefile logic is implemented in arch/arm/src/armv8-m/Toolchain.defs as follows:
ifeq ($(CONFIG_ARM_TOOLCHAIN_CLANG),y)

  ifeq ($(CONFIG_ARCH_CORTEXM23),y)
    TOOLCHAIN_CLANG_CONFIG = armv8m.main_soft_nofp
  else ifeq ($(CONFIG_ARCH_CORTEXM33),y)
    ifeq ($(CONFIG_ARCH_FPU),y)
      TOOLCHAIN_CLANG_CONFIG = armv8m.main_hard_fp
    else
      TOOLCHAIN_CLANG_CONFIG = armv8m.main_soft_nofp
    endif
  else ifeq ($(CONFIG_ARCH_CORTEXM35P),y)
    ifeq ($(CONFIG_ARCH_FPU),y)
      TOOLCHAIN_CLANG_CONFIG = armv8m.main_hard_fp
    else
      TOOLCHAIN_CLANG_CONFIG = armv8m.main_soft_nofp
    endif
  else ifeq ($(CONFIG_ARCH_CORTEXM55),y)
    ifeq ($(CONFIG_ARCH_FPU),y)
      TOOLCHAIN_CLANG_CONFIG = armv8.1m.main_hard_fp
    else
      TOOLCHAIN_CLANG_CONFIG = armv8.1m.main_soft_nofp_nomve
    endif
  else ifeq ($(CONFIG_ARCH_CORTEXM85),y)
    ifeq ($(CONFIG_ARCH_FPU),y)
      TOOLCHAIN_CLANG_CONFIG = armv8.1m.main_hard_fp
    else
      TOOLCHAIN_CLANG_CONFIG = armv8.1m.main_soft_nofp_nomve
    endif
  endif

Signed-off-by: wangmingrong1 <[email protected]>
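
The decision table in the Makefile fragment above can be summarised as a small lookup. This is only an illustrative Python sketch, not the CMake code from the commit; the function name `clang_config` and the short core names are hypothetical, while the returned strings are copied from the fragment.

```python
# (hard-float config, soft-float config) per core, as listed in the fragment.
_FPU_DEPENDENT = {
    "CORTEXM33": ("armv8m.main_hard_fp", "armv8m.main_soft_nofp"),
    "CORTEXM35P": ("armv8m.main_hard_fp", "armv8m.main_soft_nofp"),
    "CORTEXM55": ("armv8.1m.main_hard_fp", "armv8.1m.main_soft_nofp_nomve"),
    "CORTEXM85": ("armv8.1m.main_hard_fp", "armv8.1m.main_soft_nofp_nomve"),
}


def clang_config(core, fpu):
    """Return the TOOLCHAIN_CLANG_CONFIG value for a core, or None if unlisted."""
    if core == "CORTEXM23":
        # The fragment selects the soft-float config regardless of ARCH_FPU.
        return "armv8m.main_soft_nofp"
    if core in _FPU_DEPENDENT:
        hard, soft = _FPU_DEPENDENT[core]
        return hard if fpu else soft
    return None


print(clang_config("CORTEXM55", fpu=False))  # armv8.1m.main_soft_nofp_nomve
```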

Writing documentation related to SPI slave.

Fix build issues

Fix xtensa build error with choice LIBSUPCXX by default.

Signed-off-by: cuiziwei <[email protected]>

sim/cmake: compatible when nuttx COMPILE_OPTIONS is not set yet

Signed-off-by: buxiasen <[email protected]>

Fix cdcncm printf formatter compiler warning

esp32s3: Increase the init task stack size when using NSH

After recent changes on nuttx-apps (not limited to, but related
to nuttx-apps#2738, for instance), the stack usage for the NSH
task increased, causing stack overflows under specific situations
(when running `ps` command, for instance). This commit increases
the init task stack size to avoid it. Please note that, even before
these changes, the stack usage of the NSH task was around 90%, so
increasing its stack size was already recommended.

kconfig: Add link parameters that can print remaining memory information

LD: nuttx
Memory region         Used Size  Region Size  %age Used
           flash:      284272 B       512 KB     54.22%
           sram1:       13296 B         2 MB      0.63%
           sram2:          0 GB         2 MB      0.00%
CP: nuttx.hex
CP: nuttx.bin

Signed-off-by: wangmingrong1 <[email protected]>
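
As a quick sanity check of the "%age Used" column above, the percentages can be reproduced from the used and region sizes. This is a minimal Python sketch; it assumes the linker reports KB and MB in binary units (1 KB = 1024 bytes), which is the interpretation that matches the printed figures. The helper name `percent_used` is made up for illustration.

```python
KB = 1024
MB = 1024 * KB


def percent_used(used_bytes, region_bytes):
    """Return the share of a memory region that is occupied, in percent."""
    return 100.0 * used_bytes / region_bytes


# (used bytes, region size) taken from the linker output above.
regions = {
    "flash": (284272, 512 * KB),
    "sram1": (13296, 2 * MB),
    "sram2": (0, 2 * MB),
}

for name, (used, size) in regions.items():
    print(f"{name}: {percent_used(used, size):.2f}%")
# flash: 54.22%
# sram1: 0.63%
# sram2: 0.00%
```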

Fixed selection of irq file.

Added flash.ld script to nucleo-h563zi/scripts folder. Changed Make.defs to use it. Minor change to Kconfig regarding flash configurations.

Various changes

Fix include guards.
@stbenn

stbenn commented Oct 25, 2024

@lupyuen It looks like I made a mistake with some commit messages, which caused our branch to be referenced in a few issues in the apache repo. My apologies. I believe I have removed the commit message references, but if there is anything else I need to do to fix this, please let me know and I will get on it ASAP.

@lupyuen
Member Author

lupyuen commented Oct 25, 2024

@stbenn No worries thanks :-)

@lupyuen
Member Author

lupyuen commented Oct 25, 2024

4 Days to Festivity: Yesterday we consumed 13 Full-Time GitHub Runners (half of the ASF Quota for GitHub Runners)...

Screenshot 2024-10-26 at 7 32 10 AM

Past 7 Days: We used an average of 9 Full-Time GitHub Runners...

Screenshot 2024-10-26 at 7 37 14 AM

So we're on track to make ASF very happy on 30 Oct! Let's monitor today...

(Live Image) (Live Log)

@cederom
Contributor

cederom commented Oct 26, 2024

Thank you @lupyuen for your amazing work!! Have a good calm weekend :-) :-)

@lupyuen
Member Author

lupyuen commented Oct 26, 2024

3 Days to Tranquility: Yesterday was a quiet Saturday (no more Release Builds yay!). We consumed only 4 Full-Time GitHub Runners...

Screenshot 2024-10-27 at 6 08 34 AM

Let's hope today will be a peaceful Sunday...

(Live Image) (Live Log)

@lupyuen
Member Author

lupyuen commented Oct 27, 2024

Something strange about Network Timeouts in our Docker Workflows: First Run fails while downloading something from GitHub:

Configuration/Tool: imxrt1050-evk/libcxxtest,CONFIG_ARM_TOOLCHAIN_GNU_EABI
curl: (28) Failed to connect to github.com port 443 after 134188 ms: Connection timed out
make[1]: *** [libcxx.defs:28: libcxx-17.0.6.src.tar.xz] Error 28

Second Run fails again, while downloading NimBLE from GitHub:

Configuration/Tool: nucleo-wb55rg/nimble,CONFIG_ARM_TOOLCHAIN_GNU_EABI
curl: (28) Failed to connect to github.com port 443 after 134619 ms: Connection timed out
make[2]: *** [Makefile:55: /github/workspace/sources/apps/wireless/bluetooth/nimble_context] Error 2

Third Run succeeds. Why do we keep seeing these errors: GitHub Actions with Docker, can't connect to GitHub itself?

Is something misconfigured in our Docker Image? But the exact same Docker Image runs fine on my own Build Farm. It doesn't show any errors.

Is GitHub Actions starting our Docker Container with the wrong MTU (Network Packet Size)? 🤔

Meanwhile I'm running a script to Restart Failed Jobs on our NuttX Mirror Repos: restart-failed-job.sh
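
Restarting whole jobs works around the timeouts from the outside. An alternative would be to retry the individual download with a growing delay, which might look roughly like the Python sketch below. This is not the content of restart-failed-job.sh, which is not shown here; the `retry` helper and the `flaky_download` stand-in are hypothetical.

```python
import time


def retry(action, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call action() until it succeeds or the attempts run out.

    The delay doubles after each failure (exponential backoff).
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError as exc:  # TimeoutError and ConnectionError are OSErrors
            if attempt == attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            print(f"attempt {attempt} failed ({exc}); retrying in {delay:.0f}s")
            sleep(delay)


# Stand-in for a download that times out twice and then succeeds,
# like the three workflow runs described above.
calls = []


def flaky_download():
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("connection timed out")
    return "ok"


result = retry(flaky_download, sleep=lambda seconds: None)  # skip real sleeping
print(result, len(calls))
```

Passing `sleep` as a parameter keeps the example fast to run; a real script would leave the default so the backoff actually waits.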

@lupyuen
Member Author

lupyuen commented Oct 27, 2024

2 Days to Transcendence: Yesterday we consumed 10 Full-Time GitHub Runners. We peaked briefly at 21 while compiling a few NuttX Apps.

Screenshot 2024-10-28 at 6 16 33 AM

Let's keep on monitoring thanks!

(Live Image) (Live Log)
