-
-
Notifications
You must be signed in to change notification settings - Fork 87
SQL Source
Kiba Pro SQL
source provides an efficient way to read large volumes of rows.
Currently tested against: PostgreSQL 9.3+, MySQL 5.5+, Ruby 2.4-2.7.
Requirements: make sure to add those to your Gemfile
:
-
sequel
gem -
pg
gem (if you connect to Postgres) -
mysql2
gem (if you connect to MySQL)
Here is a typical use (wrapping the Sequel connection in a block to ensure automatic close after processing):
require 'kiba-pro/sources/sql'
Sequel.connect(connection_config) do |db|
Kiba.run(Kiba.parse do
source Kiba::Pro::Sources::SQL,
database: db,
query: "SELECT * FROM items"
# SNIP
end)
end
You can also specify the query using the Sequel DSL, for instance:
source Kiba::Pro::Sources::SQL,
database: db,
query: -> (db) { db[:items].where('updated_at >= ?', last_updated_at).limit(1000) }
To stream with MySQL, just add .stream
to your query construction (no extension required):
source Kiba::Pro::Sources::SQL,
database: db,
query: -> (db) { db[:items].stream }
To read large amounts of data with Postgres, it is recommended to either use a cursor like this:
query: -> (db) { db[:items].where(...).use_cursor }
or to use streaming support (officialized in Kiba Pro v1.2) like with MySQL
# add `sequel_pg` to your `Gemfile` before
require 'sequel_pg'
# SNIP
Sequel.connect(connection_config) do |db|
db.extension(:pg_streaming)
Kiba.run(Kiba.parse do
source Kiba::Pro::Sources::SQL,
database: db,
query: -> (db) { db["SELECT * FROM items"].stream }
# SNIP
end)
end
Home | Core Concepts | Defining jobs | Running jobs | Writing sources | Writing transforms | Writing destinations | Implementation Guidelines | Kiba Pro
This wiki is tracked by git and publicly editable. You are welcome to fix errors and typos. Any defacing or vandalism of content will result in your changes being reverted and you being blocked.