Merge OSPP PR (TuGraph-family#828)
* Fix test case error

* update

* Query plan cache for cypher queries. (TuGraph-family#676)

* plan cache codebase.

* plan cache codebase.

* split plan_cache.h and plan_cache_param.

* integrate plan_cache into execution process.

* add test cases for parameterized query execution.

* fail direction

* plan cache codebase

* add more pattern for fastQueryParam

* add more pattern for fastQueryParam

* fix lint error

* remove buildit deps.

* fix bug in cypher visitor

* remove unused dir

---------

Co-authored-by: Ke Huang <[email protected]>

* basic columnar-based data structure (TuGraph-family#682)

* basic columnar-based data structure

* remove unnecessary function in cypher _string_t

* improve resizeOverflowBuffer

* add FieldType

---------

Co-authored-by: Shipeng Qi <[email protected]>

* Column record (TuGraph-family#683)

* basic columnar-based data structure

* columnar data record structure

* remove unnecessary function in cypher _string_t

* improve resizeOverflowBuffer

* add FieldType

* support FieldType

* format fix

---------

Co-authored-by: yannan-wyn <[email protected]>

* Tugraph supporting column based data done (TuGraph-family#685)

* basic columnar-based data structure

* columnar data record structure

* operators support column data

* modify runtime_context

* db support columnar-data done

* remove unnecessary function in cypher _string_t

* improve resizeOverflowBuffer

* add FieldType

* support FieldType

* initialize ColumnVector with FieldType

* format fix

* delete comments, adjust indent

* another formatting

* modify according to reviewer's comments

* add username after TODO

* TODO (username)

* TODO

* TODO

* fix potential memory leaking

* delete unlikely definition and usage

* make BATCH_SIZE an independent config

* fix coding standard

* remove whitespace

* static cast FLAGS_BATCH_SIZE

---------

Co-authored-by: yannan-wyn <[email protected]>

* Add column related testing and benchmark (TuGraph-family#686)

* basic columnar-based data structure

* columnar data record structure

* operators support column data

* modify runtime_context

* db support columnar-data done

* column related testing and benchmark

* remove unnecessary function in cypher _string_t

* improve resizeOverflowBuffer

* add FieldType

* support FieldType

* initialize ColumnVector with FieldType

* format fix

* delete comments, adjust indent

* another formatting

* modify according to reviewer's comments

* add username after TODO

* TODO (username)

* TODO

* TODO

* fix potential memory leaking

* delete unlikely definition and usage

* add testing for bitmask

* make BATCH_SIZE an independent config

* fix coding standard

* remove whitespace

* static cast FLAGS_BATCH_SIZE

* delete wrong assert

* rm reference in moveDataChunk

---------

Co-authored-by: yannan-wyn <[email protected]>

* Query compilation framework. (TuGraph-family#687)

* compilation execution framework.

* execution framework.

* remove static_var usage in data structures.

* test framework.

* generate code on current directory

* delete files after executions.

* fix bugs in test framework.

* LLVM framework

* LLVM backend

* fix compilation error.

* fix lint error.

* fix lint error.

* fix lint error.

---------

Co-authored-by: Ke Huang <[email protected]>

* revert

---------

Co-authored-by: RT_Enzyme <[email protected]>
Co-authored-by: Ke Huang <[email protected]>
Co-authored-by: Myrrolinz <[email protected]>
Co-authored-by: Shipeng Qi <[email protected]>
Co-authored-by: yannan-wyn <[email protected]>
6 people authored Dec 21, 2024
1 parent cc68bcf commit a04277c
Showing 36 changed files with 3,929 additions and 2 deletions.
2 changes: 1 addition & 1 deletion ci/images/tugraph-runtime-centos7-Dockerfile
@@ -12,7 +12,7 @@ RUN sed -e "s|^mirrorlist=|#mirrorlist=|g" \
RUN yum install -y \
libgfortran5.x86_64 \
libgomp \
- libcurl-devel.x86_64 && yum clean all
+ wget && yum clean all

# install tugraph
# specifies the path of the object storage where the installation package resides
15 changes: 15 additions & 0 deletions src/cypher/execution_plan/ops/op_all_node_scan_col.cpp
@@ -0,0 +1,15 @@
/**
* Copyright 2022 AntGroup CO., Ltd.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
*/

#include "cypher/execution_plan/ops/op_all_node_scan_col.h"
146 changes: 146 additions & 0 deletions src/cypher/execution_plan/ops/op_all_node_scan_col.h
@@ -0,0 +1,146 @@
/**
* Copyright 2022 AntGroup CO., Ltd.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
*/

#pragma once

#include "cypher/execution_plan/ops/op.h"
#include "cypher/resultset/column_vector.h"
#include "cypher/execution_plan/ops/op_config.h"

namespace cypher {

class AllNodeScanCol : public OpBase {
    /* NOTE: Nodes in pattern graph are stored in std::vector, whose reference
     * will become INVALID after reallocation.
     * TODO(anyone) Make sure no nodes are added to the pattern graph; otherwise use NodeId instead. */
    friend class LocateNodeByVid;
    friend class LocateNodeByVidV2;
    friend class LocateNodeByIndexedProp;
    friend class LocateNodeByIndexedPropV2;

    Node *node_ = nullptr;
    lgraph::VIter *it_ = nullptr;           // also can be derived from node
    std::string alias_;                     // also can be derived from node
    std::string label_;                     // also can be derived from node
    int node_rec_idx_;                      // index of node in record
    int rec_length_;                        // number of entries in a record.
    const SymbolTable *sym_tab_ = nullptr;  // build time context
    bool consuming_ = false;                // whether consumption has begun

 public:
    AllNodeScanCol(Node *node, const SymbolTable *sym_tab)
        : OpBase(OpType::ALL_NODE_SCAN, "All Node Scan"), node_(node), sym_tab_(sym_tab) {
        if (node) {
            it_ = node->ItRef();
            alias_ = node->Alias();
            modifies.emplace_back(alias_);
        }
        auto it = sym_tab->symbols.find(alias_);
        CYPHER_THROW_ASSERT(node && it != sym_tab->symbols.end());
        if (it != sym_tab->symbols.end()) node_rec_idx_ = it->second.id;
        rec_length_ = sym_tab->symbols.size();
        consuming_ = false;
    }

    OpResult Initialize(RTContext *ctx) override {
        // allocate a new record
        record = std::make_shared<Record>(rec_length_, sym_tab_, ctx->param_tab_);
        record->values[node_rec_idx_].type = Entry::NODE;
        record->values[node_rec_idx_].node = node_;
        // transaction allocated before in plan:execute
        // TODO(anyone) remove patternGraph's state (ctx)
        node_->ItRef()->Initialize(ctx->txn_->GetTxn().get(), lgraph::VIter::VERTEX_ITER);
        return OP_OK;
    }

    OpResult RealConsume(RTContext *ctx) override {
        uint32_t count = 0;
        columnar_ = std::make_shared<DataChunk>();
        while (count < FLAGS_BATCH_SIZE) {
            node_->SetVid(-1);
            if (!it_ || !it_->IsValid()) return (count > 0) ? OP_OK : OP_DEPLETED;
            if (!consuming_) {
                consuming_ = true;
            } else {
                it_->Next();
                if (!it_->IsValid()) {
                    return (count > 0) ? OP_OK : OP_DEPLETED;
                }
            }
            int64_t vid = it_->GetId();
            for (auto& property : node_->ItRef()->GetFields()) {
                const std::string& property_name = property.first;
                const lgraph_api::FieldData& field = property.second;
                if (field.type == lgraph_api::FieldType::STRING) {
                    if (columnar_->string_columns_.find(property_name) ==
                        columnar_->string_columns_.end()) {
                        columnar_->string_columns_[property_name] =
                            std::make_unique<ColumnVector>(sizeof(cypher_string_t),
                                                           FLAGS_BATCH_SIZE, field.type);
                        columnar_->property_positions_[property_name] = 0;
                    }
                    columnar_->property_vids_[property_name].push_back(vid);
                    uint32_t pos = columnar_->property_positions_[property_name]++;
                    StringColumn::AddString(
                        columnar_->string_columns_[property_name].get(), pos,
                        field.AsString().c_str(), field.AsString().size());
                } else {
                    if (columnar_->columnar_data_.find(property_name) ==
                        columnar_->columnar_data_.end()) {
                        size_t element_size = ColumnVector::GetFieldSize(field.type);
                        columnar_->columnar_data_[property_name] =
                            std::make_unique<ColumnVector>(element_size,
                                                           FLAGS_BATCH_SIZE, field.type);
                        columnar_->property_positions_[property_name] = 0;
                    }
                    columnar_->property_vids_[property_name].push_back(vid);
                    uint32_t pos = columnar_->property_positions_[property_name]++;
                    ColumnVector::InsertIntoColumnVector(
                        columnar_->columnar_data_[property_name].get(), field, pos);
                }
            }

            count++;
        }
        return OP_OK;
    }

    OpResult ResetImpl(bool complete) override {
        consuming_ = false;
        if (complete) {
            // undo method initialize()
            record = nullptr;
            // TODO(anyone) cleaned in ExecutionPlan::Execute
            if (it_ && it_->Initialized()) it_->FreeIter();
        } else {
            if (it_ && it_->Initialized()) it_->Reset();
        }
        return OP_OK;
    }

    std::string ToString() const override {
        std::string str(name);
        str.append(" [").append(alias_).append("]");
        return str;
    }

    Node *GetNode() const { return node_; }

    const SymbolTable *SymTab() const { return sym_tab_; }

    CYPHER_DEFINE_VISITABLE()

    CYPHER_DEFINE_CONST_VISITABLE()
};
} // namespace cypher
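The batching pattern in `AllNodeScanCol::RealConsume` (fill per-property columns until the batch is full or the iterator runs out, and report depletion only when nothing was produced) can be sketched in isolation. The sketch below is a simplified illustration, not TuGraph code: `ToyVertex`, `ToyChunk`, `ToyBatchScan`, and `ScanResult` are hypothetical stand-ins, and it handles integer properties only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; none of these types exist in TuGraph.
struct ToyVertex {
    std::int64_t vid;
    std::unordered_map<std::string, std::int64_t> props;  // integer properties only
};

struct ToyChunk {
    // One value vector and one vid vector per property name.
    std::unordered_map<std::string, std::vector<std::int64_t>> columns;
    std::unordered_map<std::string, std::vector<std::int64_t>> vids;
    std::size_t rows = 0;  // number of vertices scanned into this chunk
};

enum class ScanResult { kOk, kDepleted };

class ToyBatchScan {
 public:
    ToyBatchScan(std::vector<ToyVertex> vertices, std::size_t batch_size)
        : vertices_(std::move(vertices)), batch_size_(batch_size) {
        assert(batch_size_ > 0);
    }

    // Fills `out` with up to batch_size_ vertices. Returns kDepleted only when
    // no vertex was produced, mirroring `(count > 0) ? OP_OK : OP_DEPLETED`.
    ScanResult Next(ToyChunk* out) {
        *out = ToyChunk{};
        while (out->rows < batch_size_ && cursor_ < vertices_.size()) {
            const ToyVertex& v = vertices_[cursor_++];
            for (const auto& kv : v.props) {
                out->columns[kv.first].push_back(kv.second);
                out->vids[kv.first].push_back(v.vid);
            }
            ++out->rows;
        }
        return out->rows > 0 ? ScanResult::kOk : ScanResult::kDepleted;
    }

 private:
    std::vector<ToyVertex> vertices_;
    std::size_t batch_size_;
    std::size_t cursor_ = 0;
};
```

With five vertices and a batch size of 2, successive `Next` calls would yield chunks of 2, 2, and 1 rows before reporting depletion. Keeping a vid vector alongside each column, as the operator does with `property_vids_`, records which vertex each value came from when some vertices lack a property.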
20 changes: 20 additions & 0 deletions src/cypher/execution_plan/ops/op_config.cpp
@@ -0,0 +1,20 @@
/**
* Copyright 2022 AntGroup CO., Ltd.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
*/


#include "cypher/execution_plan/ops/op_config.h"

namespace cypher {
DEFINE_int64(BATCH_SIZE, 32, "The batch size for processing");
}
21 changes: 21 additions & 0 deletions src/cypher/execution_plan/ops/op_config.h
@@ -0,0 +1,21 @@
/**
* Copyright 2022 AntGroup CO., Ltd.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
*/


#pragma once
#include <gflags/gflags.h>

namespace cypher {
DECLARE_int64(BATCH_SIZE);
}
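`BATCH_SIZE` is a signed 64-bit flag, while the operators in this commit compare it against unsigned counters and pass it through `static_cast<size_t>`. A negative value cast directly to an unsigned type wraps around to a very large number, so a caller may want to sanitize the flag first. The helper below is a hypothetical sketch of that idea; `EffectiveBatchSize` and its fallback parameter are assumptions for illustration and do not appear in the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of the commit: converts a signed flag value
// into a usable batch size. Non-positive inputs fall back to a default
// instead of wrapping around when cast to an unsigned type.
inline std::size_t EffectiveBatchSize(std::int64_t flag_value,
                                      std::size_t fallback = 32) {
    assert(fallback > 0);
    if (flag_value <= 0) return fallback;
    return static_cast<std::size_t>(flag_value);
}
```

The default of 32 in this sketch matches the default given to the flag in `op_config.cpp`; whether TuGraph validates the flag elsewhere is not visible in this diff.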
15 changes: 15 additions & 0 deletions src/cypher/execution_plan/ops/op_limit_col.cpp
@@ -0,0 +1,15 @@
/**
* Copyright 2022 AntGroup CO., Ltd.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
*/

#include "cypher/execution_plan/ops/op_limit_col.h"
68 changes: 68 additions & 0 deletions src/cypher/execution_plan/ops/op_limit_col.h
@@ -0,0 +1,68 @@
/**
* Copyright 2022 AntGroup CO., Ltd.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
*/

#pragma once

#include "cypher/execution_plan/ops/op.h"
#include "cypher/execution_plan/ops/op_config.h"

namespace cypher {

class LimitCol : public OpBase {
    friend class LazyProjectTopN;
    size_t limit_ = 0;     // Max number of records to consume.
    size_t consumed_ = 0;  // Number of records consumed so far.

 public:
    explicit LimitCol(size_t limit) : OpBase(OpType::LIMIT, "Limit"), limit_(limit) {}

    OpResult Initialize(RTContext *ctx) override {
        CYPHER_THROW_ASSERT(!children.empty());
        auto &child = children[0];
        auto res = child->Initialize(ctx);
        if (res != OP_OK) return res;
        columnar_ = std::make_shared<DataChunk>();
        record = child->record;
        return OP_OK;
    }

    OpResult RealConsume(RTContext *ctx) override {
        if (consumed_ >= limit_) return OP_DEPLETED;
        CYPHER_THROW_ASSERT(!children.empty());
        auto &child = children[0];
        auto res = child->Consume(ctx);
        columnar_ = child->columnar_;
        int usable_r = std::min(static_cast<size_t>(FLAGS_BATCH_SIZE),
                                limit_ - consumed_);
        columnar_->TruncateData(usable_r);
        consumed_ += usable_r;
        return res;
    }

    OpResult ResetImpl(bool complete) override {
        consumed_ = 0;
        return OP_OK;
    }

    std::string ToString() const override {
        std::string str(name);
        str.append(" [").append(std::to_string(limit_)).append("]");
        return str;
    }

    CYPHER_DEFINE_VISITABLE()

    CYPHER_DEFINE_CONST_VISITABLE()
};
} // namespace cypher
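`LimitCol::RealConsume` advances `consumed_` by `min(BATCH_SIZE, limit_ - consumed_)`, which presumes the child delivered a full batch. If a child returns a short final batch, that count could run ahead of the rows actually emitted; whether `TruncateData` or the caller compensates is not visible in this diff. A minimal sketch of a limiter that counts the rows it really keeps might look like the following, where `ToyBatchLimit` and its use of `std::vector<int>` as a batch are hypothetical simplifications rather than TuGraph's API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical limiter, not TuGraph's LimitCol: it trims each incoming batch
// to the remaining quota and counts only the rows that are actually kept.
class ToyBatchLimit {
 public:
    explicit ToyBatchLimit(std::size_t limit) : limit_(limit) {}

    // Truncates `batch` in place. Returns false once the limit has already
    // been reached, in which case the batch is emptied.
    bool Apply(std::vector<int>* batch) {
        assert(batch != nullptr);
        if (consumed_ >= limit_) {
            batch->clear();
            return false;
        }
        const std::size_t keep = std::min(batch->size(), limit_ - consumed_);
        batch->resize(keep);
        consumed_ += keep;
        return true;
    }

    std::size_t consumed() const { return consumed_; }

 private:
    std::size_t limit_;
    std::size_t consumed_ = 0;
};
```

Using the batch's own size rather than the configured batch size keeps the running total accurate for short batches, at the cost of requiring the chunk type to expose its row count.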
15 changes: 15 additions & 0 deletions src/cypher/execution_plan/ops/op_produce_results_col.cpp
@@ -0,0 +1,15 @@
/**
* Copyright 2022 AntGroup CO., Ltd.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
*/

#include "cypher/execution_plan/ops/op_produce_results_col.h"