@michaellee1019 commented Nov 11, 2025

Two improvements to the reload process:

  • Adds an automatic retry to the shell copy. I've seen various scenarios where the shell service can fail, and I think the reload process should be defensive about it. When a copy fails because of network conditions or user error, a subsequent copy to the part can succeed once the part is online.

  • Allows copying an existing local build artifact to the same part again, or to another part, using --no-build. Before this change, --no-build was used only by reload-local for the same purpose. I reorganized the folders to omit the part-id from the file path, so we now persist the latest binary for each platform in the same folder. I feel like it's more organized this way.

Testing:

The retry is displayed to the user, and the CLI correctly connects and copies the file after a few failures. To simulate this, I restarted the part while the copy was happening.

michaellee@ROBOT-MYHLX2FTJN working-wheel % viam module reload --part-id 9e373bb2-8024-4cbc-ad12-fae3791499c0
 …  Preparing for build...
 ✓     → Module is registered (0s)                                                                                                           
 ✓     → Source code archive created (0s)                                                                                                    
 ✓     → Source code uploaded (1s)                                                                                                           
 ✓  Prepared for build (2s)
 …  Building...
 ✓     → Build started (ID: 70b2348e-db5b-4c2e-a5b8-cf0c301b3f75) (5s)                                                                       
 ✓     → Spin up environment (24s)                                                                                                           
 ✓     → Install Viam CLI (2s)                                                                                                               
 ✓     → Download repository archive (2s)                                                                                                    
 ✓     → Build module (56s)                                                                                                                  
 ✓     → Uploading artifacts (6s)                                                                                                            
 ✓  Built (1m34s)
 …  Reloading to part...
 ✓     → Build artifact downloaded (22s)                                                                                                     
 ✓     → Shell service already exists (0s)                                                                                                   
 ⠼     → Uploading package... (1m17s)
 ✗       → Upload attempt 1/6...: could not connect to machine part: rpc error: code = Unavailable desc = requestID=06ed9fa8a10c4fd26c5abdc39a5d135b: rpc error: code = Unavailable desc = host appears to be offline; ensure machine is online and try again; context deadline exceeded; mDNS query failed to find a candidate
 ✗       → Upload attempt 2/6...: could not connect to machine part: rpc error: code = Unavailable desc = requestID=c8a9de0e879e8e1734be2305c76657b7: rpc error: code = Unavailable desc = host appears to be offline; ensure machine is online and try again; context deadline exceeded; mDNS query failed to find a candidate
 ✗       → Upload attempt 3/6...: could not connect to machine part: rpc error: code = Unavailable desc = requestID=ae31b49ef24ac827ad72d95a1af8111e: rpc error: code = Unavailable desc = host appears to be offline; ensure machine is online and try again; context deadline exceeded; mDNS query failed to find a candidate
 ✓       → Upload attempt 4 succeeded (57s)                                                                                                  
 ✓     → Package uploaded (4m50s)
 ✓     → Module added to part (1s)                                                                                                           
 ✓  Reloaded to part (5m13s)

Output when all attempts to copy fail (the attempt limit was temporarily lowered to two to speed up manual testing):

michaellee@ROBOT-MYHLX2FTJN working-wheel % viam module reload --part-id 9e373bb2-8024-4cbc-ad12-fae3791499c0 --no-build
Info: Starting reload onto part with existing artifact at: reload-dist/linux-arm64.tar.gz...
 ✓     → Shell service already exists (0s)                                                                                                   
 ⠹     → Uploading package... (1m17s)
 ✗       → Upload attempt 1/2...: could not connect to machine part: rpc error: code = Unavailable desc = requestID=0f3be2005138eaafc48796128c820d9c: rpc error: code = Unavailable desc = host appears to be offline; ensure machine is online and try again; context deadline exceeded; mDNS query failed to find a candidate
 ✗       → Upload attempt 2/2...: could not connect to machine part: rpc error: code = Unavailable desc = requestID=a25385c8eca7069762c76896934f66d0: rpc error: code = Unavailable desc = host appears to be offline; ensure machine is online and try again; context deadline exceeded; mDNS query failed to find a candidate
 ✗     → Uploading package...: all 2 upload attempts failed. You can retry the copy later, skipping the build step with: viam module reload --no-build --part-id 9e373bb2-8024-4cbc-ad12-fae3791499c0
 ✗  Reloading to part...
Error: All 2 upload attempts failed. You can retry the copy later, skipping the build step with: viam module reload --no-build --part-id 9e373bb2-8024-4cbc-ad12-fae3791499c0

Output with --no-build option:

michaellee@ROBOT-MYHLX2FTJN working-wheel % viam module reload --part-id 9e373bb2-8024-4cbc-ad12-fae3791499c0 --no-build
Info: Starting reload onto part with existing artifact at: reload-dist/linux-arm64.tar.gz...
 ✓     → Shell service already exists (0s)                                                                                                   
 ✓     → Package uploaded (52s)                                                                                                              
 ✓     → Module already exists on part (1s)                                                                                                  
 ✓     → Module restarted successfully (5s)                                                                                                  
 ✓  Reloaded to part

@viambot added the safe to test label Nov 11, 2025
@viambot added and removed the safe to test label Nov 11, 2025
@michaellee1019 marked this pull request as ready for review November 11, 2025 23:49
@michaellee1019 requested a review from a team as a code owner November 11, 2025 23:49