GH-38516: [Go][Parquet] Increment the number of rows written when appending a new row group (#38517)

### Rationale for this change

With this change, the `NumRows` method on `file.Writer` reports the total number of rows written across all row groups, not only the rows in the most recent one.
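
The accumulation can be illustrated with a small standalone model. This is only a sketch: `rowGroup`, `fileWriter`, `finish`, and `writeGroups` are hypothetical names invented for the example and are not the types in `parquet/file`. It assumes the writer folds the last open row group into the total when it is finished.

```go
package main

import "fmt"

// rowGroup is a hypothetical stand-in for a row group writer; it only counts rows.
type rowGroup struct {
	nrows int
}

// writeRow records one row in this row group.
func (rg *rowGroup) writeRow() { rg.nrows++ }

// fileWriter is a hypothetical stand-in for a file writer that owns
// at most one open row group at a time.
type fileWriter struct {
	nrows   int       // rows from row groups that have already been closed
	current *rowGroup // open row group, or nil
}

// appendRowGroup closes the open row group (if any), adds its row count
// to the running total, and then starts a new row group.
func (fw *fileWriter) appendRowGroup() *rowGroup {
	if fw.current != nil {
		fw.nrows += fw.current.nrows // the accumulation step described above
	}
	fw.current = &rowGroup{}
	return fw.current
}

// finish folds the final row group into the total (assumed behavior of this toy model).
func (fw *fileWriter) finish() {
	if fw.current != nil {
		fw.nrows += fw.current.nrows
		fw.current = nil
	}
}

// numRows reports the total number of rows across all closed row groups.
func (fw *fileWriter) numRows() int { return fw.nrows }

// writeGroups writes numRowGroups groups of rowsPerRG rows each
// and returns the total the writer reports afterwards.
func writeGroups(numRowGroups, rowsPerRG int) int {
	fw := &fileWriter{}
	for g := 0; g < numRowGroups; g++ {
		rg := fw.appendRowGroup()
		for r := 0; r < rowsPerRG; r++ {
			rg.writeRow()
		}
	}
	fw.finish()
	return fw.numRows()
}

func main() {
	fmt.Println(writeGroups(3, 2)) // 3 row groups of 2 rows each
}
```

In this model, removing the `fw.nrows += fw.current.nrows` line from `appendRowGroup` makes `writeGroups(3, 2)` report 2 instead of 6, because only the final row group is counted.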

### Are these changes tested?

A regression test is added that asserts that the total number of rows written matches expectations.

* Closes: #38516

Authored-by: Tim Schaub <[email protected]>
Signed-off-by: Matt Topol <[email protected]>
tschaub authored and kou committed Aug 30, 2024
1 parent 7d5ee1c commit 558c3e7
Showing 2 changed files with 3 additions and 0 deletions.
1 change: 1 addition & 0 deletions parquet/file/file_writer.go
@@ -121,6 +121,7 @@ func (fw *Writer) AppendRowGroup() SerialRowGroupWriter {

func (fw *Writer) appendRowGroup(buffered bool) *rowGroupWriter {
	if fw.rowGroupWriter != nil {
		fw.nrows += fw.rowGroupWriter.nrows
		fw.rowGroupWriter.Close()
	}
	fw.rowGroups++
2 changes: 2 additions & 0 deletions parquet/file/file_writer_test.go
@@ -97,6 +97,8 @@ func (t *SerializeTestSuite) fileSerializeTest(codec compress.Compression, expec
	writer.Close()

	nrows := t.numRowGroups * t.rowsPerRG
	t.EqualValues(nrows, writer.NumRows())

	reader, err := file.NewParquetReader(bytes.NewReader(sink.Bytes()))
	t.NoError(err)
	t.Equal(t.numCols, reader.MetaData().Schema.NumColumns())