reduce amount of data loaded during atx validation #6326

Draft
wants to merge 41 commits into
base: develop
d71f60f
reduce hare committee to 50 (#5793)
dshulyak Jul 8, 2024
219c3b3
update changelog for v1.6.2
poszu Jul 8, 2024
b611b1f
Merge pull request #6105 from spacemeshos/backport/1.6/5793
poszu Jul 8, 2024
a67ef78
api: Add go-http-metrics to collect API metrics (#6099)
kacpersaw Jul 6, 2024
44307e1
Update CHANGELOG
kacpersaw Jul 10, 2024
eb7bf9f
Merge pull request #6111 from spacemeshos/http-metrics-backport
kacpersaw Jul 10, 2024
b081ff3
api: v2alpha1: Add vesting, vault and drain vault contents to Transac…
kacpersaw Jul 10, 2024
a2a399b
Update CHANGELOG
kacpersaw Jul 10, 2024
e9c3554
Update CHANGELOG
kacpersaw Jul 10, 2024
4e44ec1
Merge pull request #6118 from spacemeshos/backport/1.6/6112
kacpersaw Jul 10, 2024
2f80519
Backport 6155 - Upgrade to support 8.0 Mio ATXs (#6123)
fasmat Jul 10, 2024
7a2ba9a
Fix ATX syncer hangs (#6137)
poszu Jul 15, 2024
e64a1dc
update changelog for v1.6.3
poszu Jul 16, 2024
69618c8
Merge pull request #6138 from spacemeshos/backport/1.6/6137
poszu Jul 16, 2024
7dfddaf
Backport #6107: Cache poet certifier info (#6166)
fasmat Jul 22, 2024
dc2b018
Backport #6152: fix(fetch): close completed channel once (#6167)
fasmat Jul 22, 2024
7279e67
Backport #6142: ATX handler rejects invalid ATXs on pubsub lvl (#6168)
fasmat Jul 22, 2024
e6bc4b6
Backport #6185: optimize mempool (#6186)
acud Jul 30, 2024
05b6b00
Fix merge tool not ignoring OS files (#6187) (#6190)
fasmat Jul 30, 2024
7a16b18
[backport] Reducing log spam (#6128) (#6191)
poszu Jul 30, 2024
52b14e6
Backport swagger release workflow for v1.6 branch (#6200)
fasmat Aug 1, 2024
334d6cc
[backport] poet caching opts (#6201)
poszu Aug 1, 2024
bbcfad4
fallback to PoW if cannot recertify for poet (#6197)
poszu Aug 1, 2024
88813a3
update changelog
poszu Aug 1, 2024
2d031c4
Merge pull request #6202 from spacemeshos/backport/1.6/fix-poet-pow-f…
poszu Aug 1, 2024
cf74f71
use singleflight to deduplicate ATX processing (#6106) (#6211)
poszu Aug 5, 2024
e4839b8
api: v2alpha1: Add labels_per_unit to NetworkService.Info endpoint (#…
kacpersaw Aug 6, 2024
95df96f
Update changelog
kacpersaw Aug 6, 2024
f07c531
tidy
kacpersaw Aug 6, 2024
ca92135
Merge pull request #6216 from spacemeshos/backport/1.6/6213
kacpersaw Aug 6, 2024
8d594d0
Backport 6241: Speed up ATX cache warmup (#6246)
fasmat Aug 12, 2024
e1b52c2
removed redundant contextual validation in ATX V1 handler (#6243) (#6…
fasmat Aug 12, 2024
cc52738
Backport 6248: Fix response data slice too small (#6249)
fasmat Aug 13, 2024
da10d43
Backport 6248: Fix response data slice too small (#6249)
fasmat Aug 13, 2024
5362d6e
Fix CHANGELOG for v1.6 branch (#6250)
fasmat Aug 13, 2024
e8a0dec
Merge branch 'tmp' into v1.6
fasmat Aug 13, 2024
faf78f5
[backport] api: v2alpha1: Use subquery instead of left join for trans…
kacpersaw Aug 21, 2024
8ca900d
backport #6278: silence logs (#6280)
acud Aug 22, 2024
172f4cf
Backport 6268: Update time measurement of metrics for proposal builde…
fasmat Aug 22, 2024
d87adec
Backport 6279: Use ATXData during tortoise init (#6281)
fasmat Aug 22, 2024
f1a6929
followup for profiling with stacks
dshulyak Sep 12, 2024
72 changes: 72 additions & 0 deletions .github/workflows/api-swagger-ui.yml
@@ -0,0 +1,72 @@
name: Build and Push Swagger-UI to R2 testnet-api-docs.spacemesh.network

env:
  go-version: "1.22"

on:
  release:
    types: [created]

jobs:
  check-version:
    runs-on: ubuntu-22.04
    outputs:
      go-sm-api-version: ${{ steps.go-sm-api-version.outputs.GO_SM_API_VERSION }}
    steps:
      - name: Checkout target repository on last release
        uses: actions/checkout@v4

      - name: Set up Go
        uses: actions/setup-go@v4
        with:
          check-latest: true
          go-version: ${{ env.go-version }}

      - name: Extract dependency version
        id: go-sm-api-version
        run: |
          version=$(go list -m 'github.com/spacemeshos/api/release/go' | awk '{print $2}')
          echo "GO_SM_API_VERSION=$version" > $GITHUB_OUTPUT

  deploy:
    runs-on: ubuntu-22.04
    needs: check-version
    steps:
      - uses: actions/checkout@v4
        with:
          repository: spacemeshos/api
          path: api
          fetch-depth: 0
          ref: 'refs/tags/${{ needs.check-version.outputs.go-sm-api-version }}'

      - name: upload to testnet
        uses: jakejarvis/s3-sync-action@master
        with:
          args: --acl public-read --follow-symlinks --delete
        env:
          AWS_S3_BUCKET: ${{ secrets.CLOUDFLARE_TESTNET_API_BUCKET }}
          AWS_ACCESS_KEY_ID: ${{ secrets.CLOUDFLARE_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.CLOUDFLARE_SECRET_ACCESS_KEY }}
          SOURCE_DIR: api/release/openapi/swagger/src
          DEST_DIR: '/${{ github.event.release.tag_name }}'
          AWS_S3_ENDPOINT: https://${{ secrets.CLOUDFLARE_ACCOUNT_ID }}.r2.cloudflarestorage.com

      - name: update url json file for testnet
        working-directory: api/release/openapi/swagger/src
        run: |
          curl -o spec_urls.json https://testnet-api-docs.spacemesh.network/spec_urls.json
          new_url="{\"url\":\"https://testnet-api-docs.spacemesh.network/${{ github.event.release.tag_name }}/api.swagger.json\",\"name\":\"${{ github.event.release.tag_name }}\"}"
          jq ". += [$new_url]" spec_urls.json > tmp.json && mv tmp.json spec_urls.json

      - name: upload new testnet json file
        uses: jakejarvis/s3-sync-action@master
        with:
          args: --acl public-read --follow-symlinks --delete
        env:
          AWS_S3_BUCKET: ${{ secrets.CLOUDFLARE_TESTNET_API_BUCKET }}
          AWS_ACCESS_KEY_ID: ${{ secrets.CLOUDFLARE_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.CLOUDFLARE_SECRET_ACCESS_KEY }}
          SOURCE_DIR: api/release/openapi/swagger/src/spec_urls.json
          DEST_DIR: ''
          AWS_S3_ENDPOINT: https://${{ secrets.CLOUDFLARE_ACCOUNT_ID }}.r2.cloudflarestorage.com

91 changes: 91 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,97 @@

See [RELEASE](./RELEASE.md) for workflow instructions.

## Release v1.6.8

### Improvements

* [#6269](https://github.com/spacemeshos/go-spacemesh/pull/6269) Use sub query instead of left join to speed up
transaction list query.

* [#6278](https://github.com/spacemeshos/go-spacemesh/pull/6278) Reduce logging levels of some messages to reduce noise.

* [#6279](https://github.com/spacemeshos/go-spacemesh/pull/6279) Speed up initialization of tortoise by reducing the
number of queries to the DB.

* [#6268](https://github.com/spacemeshos/go-spacemesh/pull/6268) Fix incorrect timing collection for proposal builder.

## Release v1.6.7

### Improvements

* [#6197](https://github.com/spacemeshos/go-spacemesh/pull/6197) Fix falling back to poet PoW if recertification
failed after getting a 401 on registering.

* [#6241](https://github.com/spacemeshos/go-spacemesh/pull/6241) Improve speed of ATX cache warmup.

### Features

* [#6213](https://github.com/spacemeshos/go-spacemesh/pull/6213) Adds `labels_per_unit` to the v2alpha1 Network Info API.

## Release v1.6.6-hotfix1

### Improvements

* [#6248](https://github.com/spacemeshos/go-spacemesh/pull/6248) Fixed node not being able to handle more than 6.55M
ATXs per epoch.

## Release v1.6.6

### Improvements

* [#6198](https://github.com/spacemeshos/go-spacemesh/pull/6198) Configure default TTL for caching poet's /v1/info.

* [#6199](https://github.com/spacemeshos/go-spacemesh/pull/6199) Cache poet's /v1/pow_params.

## Release v1.6.5

### Improvements

* [#6185](https://github.com/spacemeshos/go-spacemesh/pull/6185) Optimize mempool.

* [#6187](https://github.com/spacemeshos/go-spacemesh/pull/6187) The merge tool now ignores files that are not `.key`
files in the `identities` directory when merging two nodes.

* [#6128](https://github.com/spacemeshos/go-spacemesh/pull/6128) Reduce log spam.

## Release v1.6.4

### Improvements

* [#6107](https://github.com/spacemeshos/go-spacemesh/pull/6107) Cache PoET queries between multiple identities on the
same node. This will reduce the number of requests the node makes to the PoET server during the cyclegap.

* [#6152](https://github.com/spacemeshos/go-spacemesh/pull/6152) Fixed a bug where in rare cases the node would panic
due to the closing of a closed channel in the fetcher.

* [#6142](https://github.com/spacemeshos/go-spacemesh/pull/6142) Fix node not dropping peers that are broadcasting
invalid ATXs.

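The #6152 entry above describes a panic caused by closing an already closed channel. A common way to make such a close idempotent is `sync.Once`; the sketch below is only an illustration under that assumption, and the `completion` type and its methods are hypothetical names, not the fetcher's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// completion is a hypothetical type used only for this illustration.
type completion struct {
	once sync.Once
	done chan struct{}
}

func newCompletion() *completion {
	return &completion{done: make(chan struct{})}
}

// Complete may be called from several goroutines; only the first call
// closes the channel, later calls are no-ops.
func (c *completion) Complete() {
	c.once.Do(func() { close(c.done) })
}

// demoComplete calls Complete concurrently and reports whether the
// channel ended up closed without a panic.
func demoComplete() bool {
	c := newCompletion()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Complete() // a bare close(c.done) here would panic on the second call
		}()
	}
	wg.Wait()
	_, open := <-c.done
	return !open
}

func main() {
	fmt.Println("closed exactly once:", demoComplete())
}
```
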
## Release v1.6.3

### Improvements

* [#6137](https://github.com/spacemeshos/go-spacemesh/pull/6137) Fix hanging ATX sync.

## Release v1.6.2

### Improvements

* [#5793](https://github.com/spacemeshos/go-spacemesh/pull/5793) Reduced hare committee 8x from 400 to 50 to decrease
network traffic caused by Hare.

* [#6099](https://github.com/spacemeshos/go-spacemesh/pull/6099) Adds new metrics to the API to provide insights into
the performance and behavior of the node's APIs.

* [#6115](https://github.com/spacemeshos/go-spacemesh/pull/6115) Increase the number of supported ATXs to 8.0 Mio.

### Features

* [#6112](https://github.com/spacemeshos/go-spacemesh/pull/6112) Adds vesting, vault, and drain vault contents to the
v2alpha1 Transaction API. Fixes the 'unspecified' transaction type.

## Release v1.6.1

### Improvements
30 changes: 19 additions & 11 deletions activation/activation.go
@@ -45,13 +45,17 @@ type PoetConfig struct {
RequestTimeout time.Duration `mapstructure:"poet-request-timeout"`
RequestRetryDelay time.Duration `mapstructure:"retry-delay"`
PositioningATXSelectionTimeout time.Duration `mapstructure:"positioning-atx-selection-timeout"`
+ CertifierInfoCacheTTL time.Duration `mapstructure:"certifier-info-cache-ttl"`
+ PowParamsCacheTTL time.Duration `mapstructure:"pow-params-cache-ttl"`
MaxRequestRetries int `mapstructure:"retry-max"`
}

func DefaultPoetConfig() PoetConfig {
return PoetConfig{
- RequestRetryDelay: 400 * time.Millisecond,
- MaxRequestRetries: 10,
+ RequestRetryDelay: 400 * time.Millisecond,
+ MaxRequestRetries: 10,
+ CertifierInfoCacheTTL: 5 * time.Minute,
+ PowParamsCacheTTL: 5 * time.Minute,
}
}

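The two new TTL settings default to 5 minutes. How the node applies them is not shown in this hunk, so the following is only a minimal sketch of a time-to-live cache under assumed names (`ttlCache`, `newTTLCache`, and the injected clock are all hypothetical): a value is served from memory until the TTL elapses, then fetched again.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// ttlCache is a hypothetical helper, not go-spacemesh's implementation.
// It keeps one value and refetches it once the TTL has elapsed.
type ttlCache[T any] struct {
	mu      sync.Mutex
	ttl     time.Duration
	now     func() time.Time // injectable clock so the demo needs no sleeping
	fetch   func() (T, error)
	value   T
	expires time.Time
	valid   bool
}

func newTTLCache[T any](ttl time.Duration, now func() time.Time, fetch func() (T, error)) *ttlCache[T] {
	return &ttlCache[T]{ttl: ttl, now: now, fetch: fetch}
}

// Get returns the cached value while it is fresh and refetches it otherwise.
func (c *ttlCache[T]) Get() (T, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.valid && c.now().Before(c.expires) {
		return c.value, nil
	}
	v, err := c.fetch()
	if err != nil {
		var zero T
		return zero, err // errors are not cached
	}
	c.value, c.expires, c.valid = v, c.now().Add(c.ttl), true
	return v, nil
}

// demoFetches simulates three lookups and reports how many reached the backend.
func demoFetches() int {
	clock := time.Unix(0, 0)
	fetches := 0
	cache := newTTLCache(5*time.Minute, func() time.Time { return clock }, func() (string, error) {
		fetches++
		return "pow-params", nil
	})
	cache.Get() // first call: nothing cached yet, fetches
	clock = clock.Add(4 * time.Minute)
	cache.Get() // 4 minutes later: still fresh, served from the cache
	clock = clock.Add(2 * time.Minute)
	cache.Get() // 6 minutes after the first fetch: expired, fetches again
	return fetches
}

func main() {
	fmt.Println("backend fetches:", demoFetches())
}
```
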
@@ -415,7 +419,13 @@ func (b *Builder) run(ctx context.Context, sig *signing.EdSigner) {
for _, poet := range b.poets {
eg.Go(func() error {
_, err := poet.Certify(ctx, sig.NodeID())
- if err != nil {
+ switch {
+ case errors.Is(err, ErrCertificatesNotSupported):
+ b.logger.Debug("not certifying (not supported in poet)",
+ log.ZShortStringer("smesherID", sig.NodeID()),
+ zap.String("poet", poet.Address()),
+ )
+ case err != nil:
b.logger.Warn("failed to certify poet", zap.Error(err), log.ZShortStringer("smesherID", sig.NodeID()))
}
return nil
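The hunk above replaces a plain `err != nil` check with a switch that treats one sentinel error as an expected condition. A minimal, self-contained sketch of that pattern follows; `errNotSupported` and `classify` are hypothetical stand-ins, and the returned strings stand in for the log levels used in the hunk.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotSupported is a hypothetical stand-in for a sentinel error such as
// ErrCertificatesNotSupported.
var errNotSupported = errors.New("certificates not supported")

// classify mirrors the shape of the switch in the hunk: the expected sentinel
// is reported quietly, any other error loudly, and nil is not reported.
func classify(err error) string {
	switch {
	case errors.Is(err, errNotSupported):
		return "debug"
	case err != nil:
		return "warn"
	default:
		return "none"
	}
}

// demoLevels runs classify on a wrapped sentinel, an unrelated error, and nil.
func demoLevels() [3]string {
	wrapped := fmt.Errorf("certify at poet: %w", errNotSupported)
	return [3]string{
		classify(wrapped), // errors.Is sees through the %w wrapping
		classify(errors.New("connection refused")),
		classify(nil),
	}
}

func main() {
	fmt.Println(demoLevels())
}
```

The sentinel case has to come before `err != nil`, since a wrapped sentinel also satisfies the second condition.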
@@ -500,7 +510,6 @@ func (b *Builder) BuildNIPostChallenge(ctx context.Context, nodeID types.NodeID)
// build new challenge
logger.Info("building new NiPOST challenge", zap.Uint32("current_epoch", current.Uint32()))
case err != nil:
- logger.Info("failed to load NiPoST challenge from local state", zap.Error(err))
return nil, fmt.Errorf("get nipost challenge: %w", err)
case challenge.PublishEpoch < current:
logger.Info(
@@ -519,7 +528,7 @@ func (b *Builder) BuildNIPostChallenge(ctx context.Context, nodeID types.NodeID)
}
default:
// challenge is fresh
- logger.Info("loaded NiPoST challenge from local state",
+ logger.Debug("loaded NiPoST challenge from local state",
zap.Uint32("current_epoch", current.Uint32()),
zap.Uint32("publish_epoch", challenge.PublishEpoch.Uint32()),
)
@@ -624,7 +633,7 @@ func (b *Builder) BuildNIPostChallenge(ctx context.Context, nodeID types.NodeID)
PositioningATX: posAtx,
}
}
- logger.Info("persisting the new NiPOST challenge", zap.Object("challenge", challenge))
+ logger.Debug("persisting the new NiPOST challenge", zap.Object("challenge", challenge))
if err := nipost.AddChallenge(b.localDB, nodeID, challenge); err != nil {
return nil, fmt.Errorf("add nipost challenge: %w", err)
}
@@ -685,7 +694,6 @@ func (b *Builder) PublishActivationTx(ctx context.Context, sig *signing.EdSigner
return fmt.Errorf("wait for publication epoch: %w", ctx.Err())
case <-b.layerClock.AwaitLayer(challenge.PublishEpoch.FirstLayer()):
}
- b.logger.Debug("publication epoch has arrived!", log.ZShortStringer("smesherID", sig.NodeID()))

for {
b.logger.Info(
@@ -915,13 +923,13 @@ func (b *Builder) getPositioningAtx(
return types.EmptyATXID, err
}

- b.logger.Info("found candidate positioning atx",
+ b.logger.Debug("found candidate positioning atx",
log.ZShortStringer("id", id),
log.ZShortStringer("smesherID", nodeID),
)

if previous == nil {
- b.logger.Info("selected atx as positioning atx",
+ b.logger.Info("selected positioning atx",
log.ZShortStringer("id", id),
log.ZShortStringer("smesherID", nodeID))
return id, nil
@@ -999,14 +1007,14 @@ func findFullyValidHighTickAtx(

// iterate through epochs, to get first valid, not malicious ATX with the biggest height
atxdata.IterateHighTicksInEpoch(publish+1, func(id types.ATXID) (contSearch bool) {
- logger.Info("found candidate for high-tick atx", log.ZShortStringer("id", id))
+ logger.Debug("found candidate for high-tick atx", log.ZShortStringer("id", id))
if ctx.Err() != nil {
return false
}
// verify ATX-candidate by getting their dependencies (previous Atx, positioning ATX etc.)
// and verifying PoST for every dependency
if err := validator.VerifyChain(ctx, id, goldenATXID, opts...); err != nil {
- logger.Info("rejecting candidate for high-tick atx", zap.Error(err), log.ZShortStringer("id", id))
+ logger.Debug("rejecting candidate for high-tick atx", zap.Error(err), log.ZShortStringer("id", id))
return true
}
found = &id
38 changes: 18 additions & 20 deletions activation/certifier.go
@@ -117,7 +117,15 @@ func (c *Certifier) Certificate(
case !errors.Is(err, sql.ErrNotFound):
return nil, fmt.Errorf("getting certificate from DB for: %w", err)
}
- return c.Recertify(ctx, id, certifier, pubkey)
+ cert, err = c.client.Certify(ctx, id, certifier, pubkey)
+ if err != nil {
+ return nil, fmt.Errorf("certifying POST at %v: %w", certifier, err)
+ }
+
+ if err := certifierdb.AddCertificate(c.db, id, *cert, pubkey); err != nil {
+ c.logger.Warn("failed to persist poet cert", zap.Error(err))
+ }
+ return cert, nil
})

if err != nil {
@@ -126,21 +134,11 @@ func (c *Certifier) Certificate(
return cert.(*certifierdb.PoetCert), nil
}

- func (c *Certifier) Recertify(
- ctx context.Context,
- id types.NodeID,
- certifier *url.URL,
- pubkey []byte,
- ) (*certifierdb.PoetCert, error) {
- cert, err := c.client.Certify(ctx, id, certifier, pubkey)
- if err != nil {
- return nil, fmt.Errorf("certifying POST at %v: %w", certifier, err)
- }
-
- if err := certifierdb.AddCertificate(c.db, id, *cert, pubkey); err != nil {
- c.logger.Warn("failed to persist poet cert", zap.Error(err))
+ func (c *Certifier) DeleteCertificate(id types.NodeID, pubkey []byte) error {
+ if err := certifierdb.DeleteCertificate(c.db, id, pubkey); err != nil {
+ return err
}
- return cert, nil
+ return nil
}

type CertifierClient struct {
@@ -209,7 +207,7 @@ func (c *CertifierClient) obtainPostFromLastAtx(ctx context.Context, nodeId type
}
}

- c.logger.Info("found POST in an existing ATX", zap.String("atx_id", atxid.Hash32().ShortString()))
+ c.logger.Debug("found POST in an existing ATX", zap.String("atx_id", atxid.Hash32().ShortString()))
return &nipost.Post{
Nonce: post.Nonce,
Indices: post.Indices,
@@ -222,21 +220,21 @@ func (c *CertifierClient) obtainPostFromLastAtx(ctx context.Context, nodeId type
}

func (c *CertifierClient) obtainPost(ctx context.Context, id types.NodeID) (*nipost.Post, error) {
- c.logger.Info("looking for POST for poet certification")
+ c.logger.Debug("looking for POST for poet certification")
post, err := nipost.GetPost(c.localDb, id)
switch {
case err == nil:
- c.logger.Info("found POST in local DB")
+ c.logger.Debug("found POST in local DB")
return post, nil
case errors.Is(err, sql.ErrNotFound):
// no post found
default:
return nil, fmt.Errorf("loading initial post from db: %w", err)
}

- c.logger.Info("POST not found in local DB. Trying to obtain POST from an existing ATX")
+ c.logger.Debug("POST not found in local DB. Trying to obtain POST from an existing ATX")
if post, err := c.obtainPostFromLastAtx(ctx, id); err == nil {
- c.logger.Info("found POST in an existing ATX")
+ c.logger.Debug("found POST in an existing ATX")
if err := nipost.AddPost(c.localDb, id, *post); err != nil {
c.logger.Error("failed to save post", zap.Error(err))
}
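In this file, `Certificate` now looks up a stored certificate, requests a new one only when none is found, and treats a failed write to the DB as a warning rather than an error. The sketch below shows that control flow in isolation; the `store` type, `errNotFound`, and the function parameters are hypothetical simplifications, not the real `certifierdb` API.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical stand-in for a "no row" error such as sql.ErrNotFound.
var errNotFound = errors.New("not found")

// store is a hypothetical in-memory replacement for the certificate table.
type store map[string]string

func (s store) get(id string) (string, error) {
	v, ok := s[id]
	if !ok {
		return "", errNotFound
	}
	return v, nil
}

// certificate returns the stored certificate if there is one. Otherwise it
// requests a new one and tries to persist it; a persistence failure is only
// reported, because the freshly obtained certificate is still usable.
func certificate(db store, id string, certify func(string) (string, error), persist func(id, cert string) error) (string, error) {
	cert, err := db.get(id)
	switch {
	case err == nil:
		return cert, nil
	case !errors.Is(err, errNotFound):
		return "", fmt.Errorf("getting certificate from DB: %w", err)
	}
	cert, err = certify(id)
	if err != nil {
		return "", fmt.Errorf("certifying: %w", err)
	}
	if err := persist(id, cert); err != nil {
		fmt.Println("warn: failed to persist cert:", err)
	}
	return cert, nil
}

// demoCertifyCalls looks up the same identity twice and reports how often
// the (simulated) remote certifier was contacted.
func demoCertifyCalls() int {
	db := store{}
	calls := 0
	certify := func(id string) (string, error) {
		calls++
		return "cert-for-" + id, nil
	}
	persist := func(id, cert string) error {
		db[id] = cert
		return nil
	}
	certificate(db, "node-1", certify, persist) // not stored yet: certifies and persists
	certificate(db, "node-1", certify, persist) // stored: no remote call
	return calls
}

func main() {
	fmt.Println("certify calls:", demoCertifyCalls())
}
```
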
2 changes: 1 addition & 1 deletion activation/e2e/checkpoint_test.go
@@ -188,7 +188,7 @@ func TestCheckpoint_PublishingSoloATXs(t *testing.T) {
// 3. Spawn new ATX handler and builder using the new DB
poetDb = activation.NewPoetDb(newDB, logger.Named("poetDb"))
cdb = datastore.NewCachedDB(newDB, logger)
- atxdata, err = atxsdata.Warm(newDB, 1)
+ atxdata, err = atxsdata.Warm(newDB, 1, logger)
poetService = activation.NewPoetServiceWithClient(poetDb, client, poetCfg, logger)
validator = activation.NewValidator(newDB, poetDb, cfg, opts.Scrypt, verifier)
require.NoError(t, err)
2 changes: 1 addition & 1 deletion activation/e2e/poet_test.go
@@ -40,7 +40,7 @@ func (h *HTTPPoetTestHarness) Client(
db *activation.PoetDb,
cfg activation.PoetConfig,
logger *zap.Logger,
- opts ...activation.PoetClientOpt,
+ opts ...activation.PoetServiceOpt,
) (activation.PoetService, error) {
return activation.NewPoetService(
db,
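The variadic `opts ...activation.PoetServiceOpt` parameter in this hunk has the shape of Go's functional-options pattern. The definition of `PoetServiceOpt` is not shown in this diff, so the sketch below is a generic illustration with hypothetical names (`poetService`, `withRetries`, `withCacheTTLMinutes`) and assumed defaults.

```go
package main

import "fmt"

// poetService and its options are hypothetical; they only illustrate the pattern.
type poetService struct {
	retries         int
	cacheTTLMinutes int
}

// poetServiceOpt mutates a service during construction.
type poetServiceOpt func(*poetService)

func withRetries(n int) poetServiceOpt {
	return func(s *poetService) { s.retries = n }
}

func withCacheTTLMinutes(m int) poetServiceOpt {
	return func(s *poetService) { s.cacheTTLMinutes = m }
}

// newPoetService starts from defaults (assumed values) and applies the
// options in the order they were passed.
func newPoetService(opts ...poetServiceOpt) *poetService {
	s := &poetService{retries: 10, cacheTTLMinutes: 5}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	def := newPoetService()
	custom := newPoetService(withRetries(3), withCacheTTLMinutes(1))
	fmt.Println(def.retries, def.cacheTTLMinutes)
	fmt.Println(custom.retries, custom.cacheTTLMinutes)
}
```

Callers that pass no options keep the defaults, which is why a signature like the one in the hunk can gain new options without breaking existing call sites.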