Merge branch 'accelerator' into accelerator-test
SpyCheese committed Aug 9, 2024
2 parents a2b7fd2 + c740b09 commit dbab317
Showing 18 changed files with 237 additions and 19 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/build-ton-linux-x86-64-shared.yml
@@ -7,7 +7,7 @@ jobs:
strategy:
fail-fast: false
matrix:
os: [ubuntu-20.04, ubuntu-22.04]
os: [ubuntu-20.04, ubuntu-22.04, ubuntu-24.04]
runs-on: ${{ matrix.os }}

steps:
@@ -21,7 +21,7 @@ jobs:
sudo apt-get update
sudo apt-get install -y build-essential git cmake ninja-build zlib1g-dev libsecp256k1-dev libmicrohttpd-dev libsodium-dev liblz4-dev libjemalloc-dev
- name: Install clang-16
- if: matrix.os != 'ubuntu-24.04'
run: |
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
17 changes: 17 additions & 0 deletions Changelog.md
@@ -1,3 +1,20 @@
## 2024.08 Update

1. Introduction of dispatch queues, message envelopes with transaction chain metadata, and explicitly stored msg_queue size, which will be activated by `Config8.version >= 8` and new `Config8.capabilities` bits: `capStoreOutMsgQueueSize`, `capMsgMetadata`, `capDeferMessages`.
2. A number of changes to transaction executor which will be activated for `Config8.version >= 8`:
- Check mode on invalid `action_send_msg`. Ignore action if `IGNORE_ERROR` (+2) bit is set, bounce if `BOUNCE_ON_FAIL` (+16) bit is set.
- Slightly change random seed generation to fix mix of `addr_rewrite` and `addr`.
- Fill in `skipped_actions` for both invalid and valid messages with `IGNORE_ERROR` mode that can't be sent.
- Allow unfreeze through external messages.
- Don't use user-provided `fwd_fee` and `ihr_fee` for internal messages.
3. A few issues with broadcasts were fixed: stop on receiving last piece, response to AdnlMessageCreateChannel
4. A number of fixes and improvements for emulator and tonlib: correct work with config_addr, not accepted externals, bounces, debug ops gas consumption, added version and c5 dump, fixed tonlib crashes
5. Added new flags and commands to the node, in particular `--fast-state-serializer`, `getcollatoroptionsjson`, `setcollatoroptionsjson`

Besides the work of the core team, this update is based on the efforts of @krigga (emulator), stonfi team, in particular @dbaranovstonfi and @hey-researcher (emulator), and @loeul, @xiaoxianBoy, @simlecode (typos in comments and docs).
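
The send-mode rule from item 2 can be illustrated with a minimal sketch. Only the bit values (+2 and +16) come from the changelog; the names below, the precedence when both bits are set, and the `Fail` fallback are assumptions made for illustration and are not taken from the transaction executor.

```cpp
#include <cstdint>

// Bit values as listed in the changelog; the constant names are hypothetical.
constexpr std::uint32_t kIgnoreError = 2;    // IGNORE_ERROR (+2)
constexpr std::uint32_t kBounceOnFail = 16;  // BOUNCE_ON_FAIL (+16)

// Hypothetical outcome type for an invalid action_send_msg.
enum class InvalidActionOutcome { Skip, Bounce, Fail };

// Decide how to treat an invalid action_send_msg from its send mode.
// Assumption: IGNORE_ERROR is checked first, then BOUNCE_ON_FAIL;
// with neither bit set the action phase is assumed to fail.
constexpr InvalidActionOutcome on_invalid_send_msg(std::uint32_t mode) {
  if (mode & kIgnoreError) {
    return InvalidActionOutcome::Skip;
  }
  if (mode & kBounceOnFail) {
    return InvalidActionOutcome::Bounce;
  }
  return InvalidActionOutcome::Fail;
}
```

For example, under these assumptions `on_invalid_send_msg(2 + 16)` yields `Skip`, while `on_invalid_send_msg(16)` yields `Bounce`.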



## 2024.06 Update

1. Make Jemalloc default allocator
6 changes: 6 additions & 0 deletions adnl/adnl-peer.cpp
@@ -504,6 +504,12 @@ void AdnlPeerPairImpl::create_channel(pubkeys::Ed25519 pub, td::uint32 date) {

void AdnlPeerPairImpl::process_message(const adnlmessage::AdnlMessageCreateChannel &message) {
create_channel(message.key(), message.date());
if (respond_to_channel_create_after_.is_in_past()) {
respond_to_channel_create_after_ = td::Timestamp::in(td::Random::fast(1.0, 2.0));
std::vector<OutboundAdnlMessage> messages;
messages.emplace_back(adnlmessage::AdnlMessageNop{}, 0);
send_messages(std::move(messages));
}
}
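
The change above replies to `AdnlMessageCreateChannel` at most once per randomized 1 to 2 second window. The same pattern can be sketched in isolation as follows. `JitteredCooldown` is a hypothetical helper written against the standard library only; it is not part of ADNL, and time is passed in as plain seconds so the behavior can be checked deterministically.

```cpp
#include <random>

// Hypothetical helper (not ADNL code): permits an action at most once per
// randomized cooldown window of [min_delay, max_delay) seconds.
class JitteredCooldown {
 public:
  JitteredCooldown(double min_delay, double max_delay, unsigned seed = std::random_device{}())
      : dist_(min_delay, max_delay), rng_(seed) {
  }

  // Returns true if the action may run at time `now` (in seconds) and, in
  // that case, arms the next window. The first call is always permitted
  // for non-negative `now`, because the initial deadline is 0.
  bool try_acquire(double now) {
    if (now < next_allowed_) {
      return false;
    }
    next_allowed_ = now + dist_(rng_);
    return true;
  }

 private:
  std::uniform_real_distribution<double> dist_;
  std::mt19937 rng_;
  double next_allowed_ = 0.0;
};
```

Randomizing the window, rather than using a fixed delay, keeps two peers from answering each other in lockstep.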

void AdnlPeerPairImpl::process_message(const adnlmessage::AdnlMessageConfirmChannel &message) {
1 change: 1 addition & 0 deletions adnl/adnl-peer.hpp
@@ -214,6 +214,7 @@ class AdnlPeerPairImpl : public AdnlPeerPair {
pubkeys::Ed25519 channel_pub_;
td::int32 channel_pk_date_;
td::actor::ActorOwn<AdnlChannel> channel_;
td::Timestamp respond_to_channel_create_after_;

td::uint64 in_seqno_ = 0;
td::uint64 out_seqno_ = 0;
27 changes: 16 additions & 11 deletions recent_changelog.md
@@ -1,11 +1,16 @@
## 2024.04 Update

1. Make Jemalloc default allocator
2. Add candidate broadcasting and caching
3. Limit per address speed for external messages broadcast by reasonably large number
4. Overlay improvements: fix dropping peers in small custom overlays, fix wrong certificate on missed keyblocks
5. Extended statistics and logs for celldb usage, session stats, persistent state serialization
6. Tonlib and explorer fixes
7. Flags for precise control of Celldb: `--celldb-cache-size`, `--celldb-direct-io` and `--celldb-preload-all`
8. Add validator-console command to stop persistent state serialization
9. Use `@` path separator for defining include path in fift and create-state utilities on Windows only.
## 2024.08 Update

1. Introduction of dispatch queues, message envelopes with transaction chain metadata, and explicitly stored msg_queue size, which will be activated by `Config8.version >= 8` and new `Config8.capabilities` bits: `capStoreOutMsgQueueSize`, `capMsgMetadata`, `capDeferMessages`.
2. A number of changes to transaction executor which will be activated for `Config8.version >= 8`:
- Check mode on invalid `action_send_msg`. Ignore action if `IGNORE_ERROR` (+2) bit is set, bounce if `BOUNCE_ON_FAIL` (+16) bit is set.
- Slightly change random seed generation to fix mix of `addr_rewrite` and `addr`.
- Fill in `skipped_actions` for both invalid and valid messages with `IGNORE_ERROR` mode that can't be sent.
- Allow unfreeze through external messages.
- Don't use user-provided `fwd_fee` and `ihr_fee` for internal messages.
3. A few issues with broadcasts were fixed: stop on receiving last piece, response to AdnlMessageCreateChannel
4. A number of fixes and improvements for emulator and tonlib: correct work with config_addr, not accepted externals, bounces, debug ops gas consumption, added version and c5 dump, fixed tonlib crashes
5. Added new flags and commands to the node, in particular `--fast-state-serializer`, `getcollatoroptionsjson`, `setcollatoroptionsjson`

Besides the work of the core team, this update is based on the efforts of @krigga (emulator), stonfi team, in particular @dbaranovstonfi and @hey-researcher (emulator), and @loeul, @xiaoxianBoy, @simlecode (typos in comments and docs).


1 change: 1 addition & 0 deletions tddb/td/db/RocksDb.cpp
@@ -287,6 +287,7 @@ void RocksDbSnapshotStatistics::begin_snapshot(const rocksdb::Snapshot *snapshot
}

void RocksDbSnapshotStatistics::end_snapshot(const rocksdb::Snapshot *snapshot) {
auto lock = std::unique_lock<std::mutex>(mutex_);
auto id = reinterpret_cast<std::uintptr_t>(snapshot);
auto it = id_to_ts_.find(id);
CHECK(it != id_to_ts_.end());
48 changes: 48 additions & 0 deletions tdutils/td/utils/port/Stat.cpp
@@ -413,4 +413,52 @@ Result<CpuStat> cpu_stat() {
#endif
}

Result<uint64> get_total_ram() {
#if TD_LINUX
TRY_RESULT(fd, FileFd::open("/proc/meminfo", FileFd::Read));
SCOPE_EXIT {
fd.close();
};
constexpr int TMEM_SIZE = 10000;
char mem[TMEM_SIZE];
TRY_RESULT(size, fd.read(MutableSlice(mem, TMEM_SIZE - 1)));
if (size >= TMEM_SIZE - 1) {
return Status::Error("Failed to read /proc/meminfo");
}
mem[size] = 0;
const char* s = mem;
while (*s) {
const char *name_begin = s;
while (*s != 0 && *s != '\n') {
s++;
}
auto name_end = name_begin;
while (is_alpha(*name_end)) {
name_end++;
}
Slice name(name_begin, name_end);
if (name == "MemTotal") {
Slice value(name_end, s);
if (!value.empty() && value[0] == ':') {
value.remove_prefix(1);
}
value = trim(value);
value = split(value).first;
TRY_RESULT_PREFIX(mem, to_integer_safe<uint64>(value), "Invalid value of MemTotal");
if (mem >= 1ULL << (64 - 10)) {
return Status::Error("Invalid value of MemTotal");
}
return mem * 1024;
}
if (*s == 0) {
break;
}
s++;
}
return Status::Error("No MemTotal in /proc/meminfo");
#else
return Status::Error("Not supported");
#endif
}

} // namespace td
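
The parsing step in `get_total_ram()` can be tried outside the `td` utilities with a standalone sketch. It assumes C++17 and uses only the standard library; `parse_mem_total_bytes` is a hypothetical name, and the function takes the file contents as a string rather than reading `/proc/meminfo` itself. The overflow guard mirrors the `1ULL << (64 - 10)` check in the diff.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Standalone sketch (not the td:: implementation): find the "MemTotal" line
// in /proc/meminfo-style text and return its value converted from kB to bytes.
inline std::optional<std::uint64_t> parse_mem_total_bytes(std::string_view text) {
  constexpr std::string_view key = "MemTotal:";
  while (!text.empty()) {
    // Split off one line and advance past it.
    const auto eol = text.find('\n');
    std::string_view line = text.substr(0, eol);
    text = (eol == std::string_view::npos) ? std::string_view{} : text.substr(eol + 1);

    if (line.substr(0, key.size()) != key) {
      continue;
    }
    line.remove_prefix(key.size());
    // Skip the padding between the colon and the number.
    const auto first = line.find_first_not_of(" \t");
    if (first == std::string_view::npos) {
      return std::nullopt;
    }
    line.remove_prefix(first);

    std::uint64_t kib = 0;
    const auto res = std::from_chars(line.data(), line.data() + line.size(), kib);
    if (res.ec != std::errc{} || kib >= (1ULL << 54)) {
      return std::nullopt;  // not a number, or kib * 1024 would overflow
    }
    return kib * 1024;
  }
  return std::nullopt;
}
```

Passing the text in as a parameter keeps the parser testable on any platform, whereas the function in the diff returns an error outside Linux.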
2 changes: 2 additions & 0 deletions tdutils/td/utils/port/Stat.h
@@ -64,4 +64,6 @@ Status update_atime(CSlice path) TD_WARN_UNUSED_RESULT;

#endif

Result<uint64> get_total_ram() TD_WARN_UNUSED_RESULT;

} // namespace td
4 changes: 4 additions & 0 deletions validator-engine/CMakeLists.txt
@@ -14,5 +14,9 @@ add_executable(validator-engine ${VALIDATOR_ENGINE_SOURCE})
target_link_libraries(validator-engine overlay tdutils tdactor adnl tl_api dht
rldp rldp2 catchain validatorsession full-node validator ton_validator validator
fift-lib memprof git ${JEMALLOC_LIBRARIES})
if (JEMALLOC_FOUND)
target_include_directories(validator-engine PRIVATE ${JEMALLOC_INCLUDE_DIR})
target_compile_definitions(validator-engine PRIVATE -DTON_USE_JEMALLOC=1)
endif()

install(TARGETS validator-engine RUNTIME DESTINATION bin)
91 changes: 89 additions & 2 deletions validator-engine/validator-engine.cpp
@@ -75,6 +75,10 @@
#include "block/precompiled-smc/PrecompiledSmartContract.h"
#include "interfaces/validator-manager.h"

#if TON_USE_JEMALLOC
#include <jemalloc/jemalloc.h>
#endif

Config::Config() {
out_port = 3278;
full_node = ton::PublicKeyHash::zero();
@@ -1261,6 +1265,55 @@ class CheckDhtServerStatusQuery : public td::actor::Actor {
td::Promise<td::BufferSlice> promise_;
};

#if TON_USE_JEMALLOC
class JemallocStatsWriter : public td::actor::Actor {
public:
void start_up() override {
alarm();
}

void alarm() override {
alarm_timestamp() = td::Timestamp::in(60.0);
auto r_stats = get_stats();
if (r_stats.is_error()) {
LOG(WARNING) << "Jemalloc stats error : " << r_stats.move_as_error();
} else {
auto s = r_stats.move_as_ok();
LOG(WARNING) << "JEMALLOC_STATS : [ timestamp=" << (ton::UnixTime)td::Clocks::system()
<< " allocated=" << s.allocated << " active=" << s.active << " metadata=" << s.metadata
<< " resident=" << s.resident << " ]";
}
}

private:
struct JemallocStats {
size_t allocated, active, metadata, resident;
};

static td::Result<JemallocStats> get_stats() {
size_t sz = sizeof(size_t);
static size_t epoch = 1;
if (mallctl("epoch", &epoch, &sz, &epoch, sz)) {
return td::Status::Error("Failed to refresh stats");
}
JemallocStats stats;
if (mallctl("stats.allocated", &stats.allocated, &sz, nullptr, 0)) {
return td::Status::Error("Cannot get stats.allocated");
}
if (mallctl("stats.active", &stats.active, &sz, nullptr, 0)) {
return td::Status::Error("Cannot get stats.active");
}
if (mallctl("stats.metadata", &stats.metadata, &sz, nullptr, 0)) {
return td::Status::Error("Cannot get stats.metadata");
}
if (mallctl("stats.resident", &stats.resident, &sz, nullptr, 0)) {
return td::Status::Error("Cannot get stats.resident");
}
return stats;
}
};
#endif

void ValidatorEngine::set_local_config(std::string str) {
local_config_ = str;
}
@@ -1284,6 +1337,9 @@ void ValidatorEngine::schedule_shutdown(double at) {
}
void ValidatorEngine::start_up() {
alarm_timestamp() = td::Timestamp::in(1.0 + td::Random::fast(0, 100) * 0.01);
#if TON_USE_JEMALLOC
td::actor::create_actor<JemallocStatsWriter>("mem-stat").release();
#endif
}

void ValidatorEngine::alarm() {
@@ -1485,6 +1541,18 @@ td::Status ValidatorEngine::load_global_config() {
}
validator_options_.write().set_hardforks(std::move(h));

auto r_total_ram = td::get_total_ram();
if (r_total_ram.is_error()) {
LOG(ERROR) << "Failed to get total RAM size: " << r_total_ram.move_as_error();
} else {
td::uint64 total_ram = r_total_ram.move_as_ok();
LOG(WARNING) << "Total RAM = " << td::format::as_size(total_ram);
if (total_ram >= (90ULL << 30)) {
fast_state_serializer_enabled_ = true;
}
}
validator_options_.write().set_fast_state_serializer_enabled(fast_state_serializer_enabled_);

return td::Status::OK();
}
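
The decision made in `load_global_config()` can be restated as a small pure function. This is a sketch under assumptions: the function and constant names are hypothetical, and it assumes that an explicit `--fast-state-serializer` request has already been recorded in `requested` by the time the RAM check runs. The threshold is the same `90ULL << 30` (90 GiB) used in the diff.

```cpp
#include <cstdint>

// 90 GiB, the same value as `90ULL << 30` in load_global_config().
constexpr std::uint64_t kFastSerializerRamThreshold = 90ULL << 30;

// Hypothetical pure function: the serializer is enabled if it was requested
// explicitly or if the detected RAM size reaches the threshold. `ram_known`
// is false when the RAM size could not be determined.
constexpr bool fast_serializer_enabled(bool requested, bool ram_known, std::uint64_t total_ram) {
  return requested || (ram_known && total_ram >= kFastSerializerRamThreshold);
}

// Compile-time checks of the two automatic cases.
static_assert(fast_serializer_enabled(false, true, 128ULL << 30), "128 GiB enables it");
static_assert(!fast_serializer_enabled(false, true, 64ULL << 30), "64 GiB does not");
```

A failed RAM query therefore never turns the feature on by itself, which matches the diff: the error is logged and the flag keeps its previous value.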

@@ -4365,7 +4433,7 @@ void need_scheduler_status(int sig) {
need_scheduler_status_flag.store(true);
}

void dump_memory_stats() {
void dump_memprof_stats() {
if (!is_memprof_on()) {
return;
}
@@ -4390,8 +4458,20 @@ void dump_memory_stats() {
LOG(WARNING) << td::tag("fast_backtrace_success_rate", get_fast_backtrace_success_rate());
}

void dump_jemalloc_prof() {
#if TON_USE_JEMALLOC
const char *filename = "/tmp/validator-jemalloc.dump";
if (mallctl("prof.dump", nullptr, nullptr, &filename, sizeof(const char *)) == 0) {
LOG(ERROR) << "Written jemalloc dump to " << filename;
} else {
LOG(ERROR) << "Failed to write jemalloc dump to " << filename;
}
#endif
}

void dump_stats() {
dump_memory_stats();
dump_memprof_stats();
dump_jemalloc_prof();
LOG(WARNING) << td::NamedThreadSafeCounter::get_default();
}

@@ -4632,6 +4712,13 @@ int main(int argc, char *argv[]) {
acts.push_back([&x, v]() { td::actor::send_closure(x, &ValidatorEngine::set_catchain_max_block_delay, v); });
return td::Status::OK();
});
p.add_option(
'\0', "fast-state-serializer",
"faster persistent state serializer, but requires more RAM (enabled automatically on machines with >= 90GB RAM)",
[&]() {
acts.push_back(
[&x]() { td::actor::send_closure(x, &ValidatorEngine::set_fast_state_serializer_enabled, true); });
});
auto S = p.run(argc, argv);
if (S.is_error()) {
LOG(ERROR) << "failed to parse options: " << S.move_as_error();
4 changes: 4 additions & 0 deletions validator-engine/validator-engine.hpp
@@ -237,6 +237,7 @@ class ValidatorEngine : public td::actor::Actor {
bool started_ = false;
ton::BlockSeqno truncate_seqno_{0};
std::string session_logs_file_;
bool fast_state_serializer_enabled_ = false;
bool not_all_shards_ = false;

std::set<ton::CatchainSeqno> unsafe_catchains_;
@@ -317,6 +318,9 @@ class ValidatorEngine : public td::actor::Actor {
void set_catchain_max_block_delay(double value) {
catchain_max_block_delay_ = value;
}
void set_fast_state_serializer_enabled(bool value) {
fast_state_serializer_enabled_ = value;
}
void set_not_all_shards() {
not_all_shards_ = true;
}
29 changes: 29 additions & 0 deletions validator/impl/accept-block.cpp
@@ -416,8 +416,37 @@ void AcceptBlockQuery::got_block_handle(BlockHandle handle) {
(is_masterchain() ? handle_->inited_proof() && handle_->is_applied() && handle_->inited_is_key_block()
: handle_->inited_proof_link())) {
finish_query();
return;
}
if (data_.is_null()) {
td::actor::send_closure(manager_, &ValidatorManager::get_candidate_data_by_block_id_from_db, id_, [SelfId = actor_id(this)](td::Result<td::BufferSlice> R) {
if (R.is_ok()) {
td::actor::send_closure(SelfId, &AcceptBlockQuery::got_block_candidate_data, R.move_as_ok());
} else {
td::actor::send_closure(SelfId, &AcceptBlockQuery::got_block_handle_cont);
}
});
} else {
got_block_handle_cont();
}
}

void AcceptBlockQuery::got_block_candidate_data(td::BufferSlice data) {
auto r_block = create_block(id_, std::move(data));
if (r_block.is_error()) {
fatal_error("invalid block candidate data in db: " + r_block.error().to_string());
return;
}
data_ = r_block.move_as_ok();
VLOG(VALIDATOR_DEBUG) << "got block candidate data from db";
if (data_.not_null() && !precheck_header()) {
fatal_error("invalid block header in AcceptBlock");
return;
}
got_block_handle_cont();
}

void AcceptBlockQuery::got_block_handle_cont() {
if (data_.not_null() && !handle_->received()) {
td::actor::send_closure(
manager_, &ValidatorManager::set_block_data, handle_, data_, [SelfId = actor_id(this)](td::Result<td::Unit> R) {
2 changes: 2 additions & 0 deletions validator/impl/accept-block.hpp
@@ -71,6 +71,8 @@ class AcceptBlockQuery : public td::actor::Actor {
void written_block_data();
void written_block_signatures();
void got_block_handle(BlockHandle handle);
void got_block_candidate_data(td::BufferSlice data);
void got_block_handle_cont();
void written_block_info();
void got_block_data(td::Ref<BlockData> data);
void got_prev_state(td::Ref<ShardState> state);
6 changes: 3 additions & 3 deletions validator/impl/fabric.cpp
@@ -133,9 +133,9 @@ void run_accept_block_query(BlockIdExt id, td::Ref<BlockData> data, std::vector<
td::Ref<ValidatorSet> validator_set, td::Ref<BlockSignatureSet> signatures,
td::Ref<BlockSignatureSet> approve_signatures, bool send_broadcast, bool apply,
td::actor::ActorId<ValidatorManager> manager, td::Promise<td::Unit> promise) {
td::actor::create_actor<AcceptBlockQuery>("accept", id, std::move(data), prev, std::move(validator_set),
std::move(signatures), std::move(approve_signatures), send_broadcast, apply,
manager, std::move(promise))
td::actor::create_actor<AcceptBlockQuery>(
PSTRING() << "accept" << id.id.to_str(), id, std::move(data), prev, std::move(validator_set),
std::move(signatures), std::move(approve_signatures), send_broadcast, apply, manager, std::move(promise))
.release();
}

2 changes: 1 addition & 1 deletion validator/manager.cpp
@@ -2841,7 +2841,7 @@ void ValidatorManagerImpl::alarm() {
}
alarm_timestamp().relax(log_ls_stats_at_);
if (cleanup_mempool_at_.is_in_past()) {
if (is_validator()) {
if (is_validator() || !collator_nodes_.empty()) {
get_external_messages(ShardIdFull{masterchainId, shardIdAll},
[](td::Result<std::vector<std::pair<td::Ref<ExtMessage>, int>>>) {});
get_external_messages(ShardIdFull{basechainId, shardIdAll},