Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize arguments handling #127

Merged
merged 2 commits into from
Jul 18, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 15 additions & 6 deletions libsql/internal/hrana/stream_request.go
Original file line number Diff line number Diff line change
Expand Up @@ -16,13 +16,15 @@ func CloseStream() StreamRequest {
return StreamRequest{Type: "close"}
}

func ExecuteStream(sql string, params shared.Params, wantRows bool) (*StreamRequest, error) {
func ExecuteStream(sql string, params *shared.Params, wantRows bool) (*StreamRequest, error) {
stmt := &Stmt{
Sql: &sql,
WantRows: wantRows,
}
if err := stmt.AddArgs(params); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add guard in the AddArgs method ... } else if len(params.Positional()) > 0 { ... and not change argument to pointer then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but then we're not really removing special cases and we're doing additional check for 1M statements potentially.

return nil, err
if params != nil {
if err := stmt.AddArgs(*params); err != nil {
return nil, err
}
}
return &StreamRequest{Type: "execute", Stmt: stmt}, nil
}
Expand All @@ -39,15 +41,22 @@ func ExecuteStoredStream(sqlId int32, params shared.Params, wantRows bool) (*Str
}

func BatchStream(sqls []string, params []shared.Params, wantRows bool, transactional bool) (*StreamRequest, error) {
batch := &Batch{}
size := len(sqls)
if transactional {
size += 1
}
batch := &Batch{Steps: make([]BatchStep, 0, size)}
addArgs := len(params) > 0
for idx, sql := range sqls {
s := sql
stmt := &Stmt{
Sql: &s,
WantRows: wantRows,
}
if err := stmt.AddArgs(params[idx]); err != nil {
return nil, err
if addArgs {
if err := stmt.AddArgs(params[idx]); err != nil {
return nil, err
}
}
var condition *BatchCondition
if transactional {
Expand Down
6 changes: 5 additions & 1 deletion libsql/internal/http/hranaV2/hranaV2.go
Original file line number Diff line number Diff line change
Expand Up @@ -293,7 +293,11 @@ func (h *hranaV2Conn) executeStmt(ctx context.Context, query string, args []driv
}
msg := &hrana.PipelineRequest{}
if len(stmts) == 1 {
executeStream, err := hrana.ExecuteStream(stmts[0], params[0], wantRows)
var p *shared.Params
if len(params) > 0 {
p = &params[0]
}
executeStream, err := hrana.ExecuteStream(stmts[0], p, wantRows)
if err != nil {
return nil, fmt.Errorf("failed to execute SQL: %s\n%w", query, err)
}
Expand Down
7 changes: 5 additions & 2 deletions libsql/internal/http/shared/statement.go
Original file line number Diff line number Diff line change
Expand Up @@ -32,13 +32,16 @@ func ParseStatement(sql string) ([]string, []ParamsInfo, error) {
}

func ParseStatementAndArgs(sql string, args []driver.NamedValue) ([]string, []Params, error) {
stmts, _ := sqliteparserutils.SplitStatement(sql)

if len(args) == 0 {
return stmts, nil, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we return array for params here too: return stmts, make([]Params, len(stmts)), nil?
It's a bit of overhead but it should be pretty negligible, no? But the benefit is that API will be "uniform" and there will be less special cases in the dependent code...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about it but this is proportional to the size of the input which is bad. For example for 100k or 1M statements that's not great especially that this is continuous memory which is hard to find in a fragmented heap.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For 1M statements this array will occupy at most 32MB. But from what I saw locally - dump process consumes GBs of RAM to handle 1M statements or so...

So, I agree that this is not great, but I doubt if this is actually that critical...
I would rather keep API uniform but also work on truly streaming approach (but this will require more work...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Streaming is my next step. Just fixing stuff on the way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree to some extend but this is not only about memory. Without special casing the arguments handling code will execute on those empty arrays and it does some logic including conditions. I don't want to execute this for 1M statements when I don't have to.

}
parameters, err := ConvertArgs(args)
if err != nil {
return nil, nil, err
}

stmts, _ := sqliteparserutils.SplitStatement(sql)

stmtsParams := make([]Params, len(stmts))
totalParametersAlreadyUsed := 0
for idx, stmt := range stmts {
Expand Down
Loading