GH-35589: [Ruby] Add support for JRuby (#44346)
### Rationale for this change

JRuby is a Ruby implementation based on Java. Arrow already has a Java implementation, so we can use it for JRuby.

### What changes are included in this PR?

This is not complete support. For now it can only create int8 and int32 arrays, and it does so by using the Java implementation instead of the C++ implementation. We can improve this step by step.

Note that we can build a gem for JRuby, but we won't release it for now. We need to build our gems as artifacts in CI during the release process and publish the approved gems after the release vote. If we used the current "gem build && gem push" flow for JRuby gems, we would need JRuby at release time. That is undesirable because it increases release complexity.

### Are these changes tested?

Yes, but only a few tests pass for now.

### Are there any user-facing changes?

Yes.
* GitHub Issue: #35589

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
kou authored Oct 10, 2024
1 parent 35f26c0 commit 34ce119
Showing 28 changed files with 907 additions and 131 deletions.
2 changes: 1 addition & 1 deletion c_glib/arrow-glib/array-builder.cpp
@@ -6320,7 +6320,7 @@ garrow_union_array_builder_class_init(GArrowUnionArrayBuilderClass *klass)
* garrow_union_array_builder_append_child:
* @builder: A #GArrowUnionArrayBuilder.
* @child: A #GArrowArrayBuilder for new child.
- * @filed_name: (nullable): A field name for new child.
+ * @field_name: (nullable): A field name for new child.
*
* Returns: The type ID for the appended child.
*
2 changes: 1 addition & 1 deletion c_glib/arrow-glib/array-builder.h
@@ -1820,7 +1820,7 @@ GARROW_AVAILABLE_IN_12_0
gint8
garrow_union_array_builder_append_child(GArrowUnionArrayBuilder *builder,
GArrowArrayBuilder *child,
-const gchar *filed_name);
+const gchar *field_name);

GARROW_AVAILABLE_IN_12_0
gboolean
9 changes: 2 additions & 7 deletions ruby/red-arrow/lib/arrow.rb
@@ -15,16 +15,11 @@
# specific language governing permissions and limitations
# under the License.

-require "extpp/setup"
-require "gio2"

require "arrow/version"

-require "arrow/loader"

module Arrow
class Error < StandardError
end

-Loader.load
end

+require_relative "arrow/#{RUBY_ENGINE}"
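
The new last line of `arrow.rb` picks an engine-specific entry point from `RUBY_ENGINE`. A minimal sketch of just the path interpolation follows; `engine_entry_point` is a hypothetical helper for illustration and is not part of this commit. It only builds the path string and loads nothing.

```ruby
# Hypothetical helper (not in the commit): mirrors the interpolation in
#   require_relative "arrow/#{RUBY_ENGINE}"
# without requiring any file.
def engine_entry_point(engine = RUBY_ENGINE)
  "arrow/#{engine}"
end

# RUBY_ENGINE is "ruby" on CRuby and "jruby" on JRuby.
puts engine_entry_point
puts engine_entry_point("jruby")
```

With this layout, each engine gets its own file under `lib/arrow/`, and the shared `arrow.rb` stays free of engine-specific requires.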
11 changes: 7 additions & 4 deletions ruby/red-arrow/lib/arrow/array.rb
@@ -33,9 +33,10 @@ def new(*args)
end

def builder_class
-builder_class_name = "#{name}Builder"
-return nil unless const_defined?(builder_class_name)
-const_get(builder_class_name)
+local_name = name.split("::").last
+builder_class_name = "#{local_name}Builder"
+return nil unless Arrow.const_defined?(builder_class_name)
+Arrow.const_get(builder_class_name)
end

# @api private
@@ -92,6 +93,8 @@ def equal_array?(other, options=nil)
equal_options(other, options)
end

+alias_method :size, :length

def each
return to_enum(__method__) unless block_given?

@@ -250,7 +253,7 @@ def resolve(other_array)
"[array][resolve] need to implement " +
"a feature that building #{value_data_type} array " +
"from raw Ruby Array"
-raise NotImplemented, message
+raise NotImplementedError, message
end
other_array
elsif other_array.respond_to?(:value_data_type)
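
The reworked `builder_class` strips the namespace from the class name and then looks the builder up directly on the top-level module. A self-contained sketch of that lookup follows, using a hypothetical `BuilderLookupDemo` module in place of `Arrow`; the class names are placeholders chosen for illustration only.

```ruby
# Hypothetical stand-in for the Arrow module; only the lookup logic
# mirrors the diff above.
module BuilderLookupDemo
  class Array
    def self.builder_class
      # "BuilderLookupDemo::Int8Array" -> "Int8Array"
      local_name = name.split("::").last
      builder_class_name = "#{local_name}Builder"
      return nil unless BuilderLookupDemo.const_defined?(builder_class_name)
      BuilderLookupDemo.const_get(builder_class_name)
    end
  end

  class Int8Array < Array
  end

  class Int8ArrayBuilder
  end

  # No matching builder is defined for this one.
  class StringArray < Array
  end
end

p BuilderLookupDemo::Int8Array.builder_class   # the Int8ArrayBuilder class
p BuilderLookupDemo::StringArray.builder_class # nil
```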
52 changes: 52 additions & 0 deletions ruby/red-arrow/lib/arrow/jruby.rb
@@ -0,0 +1,52 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

if File.exist?(File.join(__dir__, "..", "red-arrow_jars.rb"))
  # installed gems
  require_relative "../red-arrow_jars"
else
  # local development
  require "red-arrow_jars"
end

module Arrow
  class << self
    def allocator
      @allocator ||= org.apache.arrow.memory.RootAllocator.new
    end
  end
end

require_relative "jruby/array"
require_relative "jruby/array-builder"
require_relative "jruby/chunked-array"
require_relative "jruby/compression-type"
require_relative "jruby/csv-read-options"
require_relative "jruby/decimal128"
require_relative "jruby/decimal256"
require_relative "jruby/error"
require_relative "jruby/file-system"
require_relative "jruby/function"
require_relative "jruby/record-batch"
require_relative "jruby/record-batch-iterator"
require_relative "jruby/sort-key"
require_relative "jruby/sort-options"
require_relative "jruby/stream-listener-raw"
require_relative "jruby/table"
require_relative "jruby/writable"

require_relative "libraries"
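
`Arrow.allocator` memoizes a single allocator on the module itself. The sketch below shows the same `class << self` plus `||=` idiom in plain Ruby; `AllocatorDemo` and `FakeAllocator` are hypothetical stand-ins, since the real `org.apache.arrow.memory.RootAllocator` is only reachable under JRuby.

```ruby
module AllocatorDemo
  # Hypothetical stand-in for org.apache.arrow.memory.RootAllocator.
  class FakeAllocator
  end

  class << self
    # Created on first call, then reused for every later call.
    def allocator
      @allocator ||= FakeAllocator.new
    end
  end
end

first = AllocatorDemo.allocator
second = AllocatorDemo.allocator
puts first.equal?(second) # true: both names refer to the same object
```

Note that `||=` on an instance variable is not synchronized, so this sketch assumes the first call happens before any concurrent use.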
114 changes: 114 additions & 0 deletions ruby/red-arrow/lib/arrow/jruby/array-builder.rb
@@ -0,0 +1,114 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

require_relative "array"

module Arrow
  module ArrayBuildable
    ValueVector = org.apache.arrow.vector.ValueVector
    def buildable?(args)
      return false if args.size == 1 and args.first.is_a?(ValueVector)
      super
    end
  end

  class ArrayBuilder
    class << self
      prepend ArrayBuildable
    end

    def initialize
      @vector = self.class::Array::Vector.new("", Arrow.allocator)
      @vector.allocate_new
      @index = 0
    end

    def append_value(value)
      @vector.set(@index, value)
      @index += 1
    end

    def append_values(values, is_valids=nil)
      if is_valids
        values.zip(is_valids) do |value, is_valid|
          if is_valid
            @vector.set(@index, value)
          else
            @vector.set_null(@index)
          end
          @index += 1
        end
      else
        values.each do |value|
          @vector.set(@index, value)
          @index += 1
        end
      end
    end

    def append_nulls(n)
      n.times do
        @vector.set_null(@index)
        @index += 1
      end
    end

    def finish
      @vector.set_value_count(@index)
      vector, @vector = @vector, nil
      self.class::Array.new(vector)
    end
  end

  class Int8ArrayBuilder < ArrayBuilder
    Array = Int8Array
  end

  class Int32ArrayBuilder < ArrayBuilder
    Array = Int32Array
  end

  class FixedSizeBinaryArrayBuilder < ArrayBuilder
  end

  class Decimal128ArrayBuilder < FixedSizeBinaryArrayBuilder
  end

  class Decimal256ArrayBuilder < FixedSizeBinaryArrayBuilder
  end

  class ListArrayBuilder < ArrayBuilder
  end

  class MapArrayBuilder < ArrayBuilder
  end

  class StructArrayBuilder < ArrayBuilder
  end

  class UnionArrayBuilder < ArrayBuilder
    def append_child(child, field_name)
      raise NotImplementedError
    end
  end

  class DenseUnionArrayBuilder < UnionArrayBuilder
  end

  class SparseUnionArrayBuilder < UnionArrayBuilder
  end
end
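
The builder above keeps a write index, sets values or nulls slot by slot, and hands the vector over in `finish`. The sketch below reproduces that flow in plain Ruby so it can run without JRuby. `FakeVector` and `SketchBuilder` are hypothetical stand-ins: `FakeVector` only imitates the three vector calls the builder relies on and is not the Arrow Java API.

```ruby
# Hypothetical in-memory stand-in for an Arrow Java vector.
class FakeVector
  def initialize
    @slots = []
    @value_count = 0
  end

  def set(index, value)
    @slots[index] = value
  end

  def set_null(index)
    @slots[index] = nil
  end

  def set_value_count(count)
    @value_count = count
  end

  def to_a
    @slots.first(@value_count)
  end
end

class SketchBuilder
  def initialize
    @vector = FakeVector.new
    @index = 0
  end

  # Same shape as ArrayBuilder#append_values: an optional validity
  # list decides between set and set_null for each slot.
  def append_values(values, is_valids = nil)
    if is_valids
      values.zip(is_valids) do |value, is_valid|
        if is_valid
          @vector.set(@index, value)
        else
          @vector.set_null(@index)
        end
        @index += 1
      end
    else
      values.each do |value|
        @vector.set(@index, value)
        @index += 1
      end
    end
  end

  def append_nulls(n)
    n.times do
      @vector.set_null(@index)
      @index += 1
    end
  end

  # Fixes the length, then drops the builder's reference so the
  # finished vector is not modified by later appends.
  def finish
    @vector.set_value_count(@index)
    vector, @vector = @vector, nil
    vector
  end
end

builder = SketchBuilder.new
builder.append_values([1, 2, 3], [true, false, true])
builder.append_nulls(2)
p builder.finish.to_a # [1, nil, 3, nil, nil]
```

Because `finish` sets `@vector` to `nil`, a builder in this sketch is single-use: appending after `finish` raises `NoMethodError`.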
109 changes: 109 additions & 0 deletions ruby/red-arrow/lib/arrow/jruby/array.rb
@@ -0,0 +1,109 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

require_relative "data-type"

module Arrow
  class Array
    VectorAppender = org.apache.arrow.vector.util.VectorAppender
    VectorEqualsVisitor = org.apache.arrow.vector.compare.VectorEqualsVisitor

    attr_reader :vector

    def initialize(vector)
      @vector = vector
    end

    def ==(other_array)
      return false unless other_array.is_a?(self.class)
      VectorEqualsVisitor.vector_equals(@vector, other_array.vector)
    end

    def null?(i)
      @vector.null?(i)
    end

    def get_value(i)
      @vector.get_object(i)
    end

    def to_s
      @vector.to_s
    end

    def inspect
      super.sub(/>\z/) do
        " #{to_s}>"
      end
    end

    def close
      @vector.close
    end

    def length
      @vector.value_count
    end

    def value_data_type
      self.class::ValueDataType.new
    end

    def values
      each.to_a
    end

    def cast(other_value_data_type)
      other_value_data_type.build_array(to_a)
    end

    def is_in(values)
      raise NotImplementedError
    end

    def concatenate(other_arrays)
      total_size = length + other_arrays.sum(&:length)
      vector = self.class::Vector.new("", Arrow.allocator)
      vector.allocate_new(total_size)
      appender = VectorAppender.new(vector)
      @vector.accept(appender, nil)
      other_arrays.each do |other_array|
        other_array.vector.accept(appender, nil)
      end
      self.class.new(vector)
    end
  end

  class Int8Array < Array
    # TinyIntVector is the 8-bit integer vector; SmallIntVector is 16-bit.
    Vector = org.apache.arrow.vector.TinyIntVector
    ValueDataType = Int8DataType
  end

  class Int32Array < Array
    Vector = org.apache.arrow.vector.IntVector
    ValueDataType = Int32DataType
  end

  class FixedSizeBinaryArray < Array
  end

  class StructArray < Array
    def fields
      raise NotImplementedError
    end
  end
end
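
The base `Array` class never names a concrete vector type; it reads `self.class::Vector` and `self.class::ValueDataType`, so each subclass only has to declare constants. The sketch below shows that lookup in plain Ruby. The `Sketch*` classes and the `VectorName` and `BitWidth` constants are hypothetical; the strings record the Arrow Java vector classes for signed 8-bit and 32-bit integers as I understand them, and nothing here loads Java.

```ruby
class SketchArray
  # Resolved against the concrete subclass at call time, like
  # self.class::Vector in the diff above.
  def vector_name
    self.class::VectorName
  end

  def bit_width
    self.class::BitWidth
  end
end

class SketchInt8Array < SketchArray
  VectorName = "org.apache.arrow.vector.TinyIntVector"
  BitWidth = 8
end

class SketchInt32Array < SketchArray
  VectorName = "org.apache.arrow.vector.IntVector"
  BitWidth = 32
end

[SketchInt8Array, SketchInt32Array].each do |klass|
  array = klass.new
  puts "#{klass.name}: #{array.vector_name} (#{array.bit_width}-bit)"
end
```

Adding another integer width in this style would mean one more subclass with its own constants and no change to the base class.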
36 changes: 36 additions & 0 deletions ruby/red-arrow/lib/arrow/jruby/chunked-array.rb
@@ -0,0 +1,36 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

module Arrow
  class ChunkedArray
    def initialize(arrays)
      @arrays = arrays
    end

    def n_rows
      @arrays.sum(&:size)
    end

    def chunks
      @arrays
    end

    def get_chunk(i)
      @arrays[i]
    end
  end
end
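
`ChunkedArray#n_rows` sums `size` over its chunks, which is why the shared `array.rb` gains `alias_method :size, :length` in this commit. A plain-Ruby sketch of that interaction follows; `SketchChunk` and `SketchChunkedArray` are hypothetical stand-ins backed by ordinary Ruby arrays.

```ruby
# Hypothetical chunk that only knows its length, plus the alias.
class SketchChunk
  def initialize(values)
    @values = values
  end

  def length
    @values.length
  end

  # Without this alias, sum(&:size) below would raise NoMethodError.
  alias_method :size, :length
end

class SketchChunkedArray
  def initialize(arrays)
    @arrays = arrays
  end

  def n_rows
    @arrays.sum(&:size)
  end

  def chunks
    @arrays
  end

  def get_chunk(i)
    @arrays[i]
  end
end

chunked = SketchChunkedArray.new([
  SketchChunk.new([1, 2, 3]),
  SketchChunk.new([4, 5]),
])
puts chunked.n_rows            # 5
puts chunked.get_chunk(1).size # 2
```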