Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ck::IO::FileReader + Docs #3807

Draft
wants to merge 15 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -50,3 +50,6 @@ charmrun
ampirun
pgm
*.swp

#Ignore the generated headers dir
src/libs/ck-libs/io/headers#
68 changes: 66 additions & 2 deletions doc/libraries/manual.rst
Original file line number Diff line number Diff line change
Expand Up @@ -974,7 +974,7 @@ The following functions comprise the interface to the library for parallel file
This method is invoked to read data asynchronously from the read session. This method returns immediately to the caller, but the
read is only guaranteed complete once the callback ``after_read`` is called. Internally, the read request is buffered
until the Buffer Chares can respond with the requested data. After the read finishes, the
after_read callback is invoked taking a ReadCompleteMsg* which points to a vector<char> buffer, the offset,
after_read callback is invoked taking a ReadCompleteMsg* which points to a char* buffer, the offset,
and the number of bytes of the read.


Expand All @@ -989,10 +989,74 @@ The following functions comprise the interface to the library for parallel file
the ``FileReadyMsg`` sent to the ``opened`` callback after a file has been
opened. This method should only be called from a single PE, once per file.

FileReader API
--------------

The FileReader API is an additional abstraction layer built on top of Ck::IO to support
streaming reads from a file and implement the callback internally. This API is designed to
match that of the c++ std::ifstream. Under the hood, when an application reads a small
number of bytes via a FileReader object,
the Buffer Chare will send a large chunk of data to the FileReader which can be buffered there
until the application requests more.

- Creating a FileReader object:

.. code-block:: c++

FileReader::FileReader(Ck::IO::Session session)

Before creating a FileReader, the Ck::IO::Session must be created (see above). This session is
passed in to the FileReader constructor.

- Reading data:

.. code-block:: c++

FileReader& FileReader::read(char* buffer, size_t num_bytes_to_read)

Read the specified number of bytes from the file opened in the session. The data is read into the buffer.
This method is blocking and returns a pointer to the FileReader object.

- Seeking:

There are two functions for seeking in the file, one for seeking from the current position, and one for seeking from a set position (like the end, or the beginning).

.. code-block:: c++

FileReader& FileReader::seekg(size_t pos)

.. code-block:: c++

FileReader& FileReader::seekg(size_t pos, std::ios_base::seekdir dir)

The options for std::ios_base::seekdir are std::ios_base::beg, std::ios_base::cur, and std::ios_base::end.

- Tell functionality:

.. code-block:: c++

size_t FileReader::tellg()

This function returns the current position in the file.

- End of file and gcount:

.. code-block:: c++

bool FileReader::eof()

This function returns true if the end of the file has been reached and false otherwise.

.. code-block:: c++

size_t FileReader::gcount()

This function returns the number of bytes read by the last read operation.


Examples
--------

For example code showing how to use CkIO for output, see ``tests/charm++/io/``.

For example code showing how to use CkIO for input, see ``tests/charm++/io_read/``.
For example code showing how to use CkIO for input (including FileReader), see ``tests/charm++/io_read/``.
158 changes: 158 additions & 0 deletions src/libs/ck-libs/io/ckio.C
Original file line number Diff line number Diff line change
Expand Up @@ -280,6 +280,8 @@ private:
public:
ReadAssembler(Session session) { _session = session; }

ReadAssembler(CkMigrateMessage* m) : CBase_ReadAssembler(m) {}

/*
* This function adds the read request to the _read_info_buffer table
* which maps a tag to a ReadInfo struct
Expand Down Expand Up @@ -308,6 +310,11 @@ public:
return _curr_read_tag - 1;
}

void pup(PUP::er& p)
{
// TODO: All files must be closed across checkpoint/restart
}

void removeEntryFromReadTable(int tag) { _read_info_buffer.erase(tag); }

/**
Expand Down Expand Up @@ -989,6 +996,157 @@ public:

int registerArray(CkArrayIndex& numElements, CkArrayID aid) { return 0; }
};
FileReader::FileReader(Ck::IO::Session session)
: _session_token(session), _offset(session.offset), _num_bytes(session.bytes)
{
}

FileReader& FileReader::read(char* buffer, size_t num_bytes_to_read)
{
if (_eofbit)
{ // no more bytes to read
_gcount = 0;
return *this;
}
size_t amt_from_cache = _data_cache.getFromBuffer(
_curr_pos, num_bytes_to_read, buffer); // get whatever data the cache has for us
_curr_pos += amt_from_cache;
if (amt_from_cache == num_bytes_to_read)
{
return *this;
}
size_t bytes_to_read_left = num_bytes_to_read - amt_from_cache;
size_t bytes_to_read = std::min(
std::max(bytes_to_read_left, _data_cache.capacity()),
(_offset + _num_bytes -
_curr_pos)); // if the read is too small, get more data to store in the buffer
if (!bytes_to_read)
{
return *this;
}
char* tmp_data_buff =
new char[bytes_to_read]; // temporary buffer that will hold all the data
ReadCompleteMsg* read_msg;
Ck::IO::read(_session_token, bytes_to_read, _curr_pos, tmp_data_buff,
CkCallbackResumeThread((void*&)read_msg));
// below will not get executed until the read is done
size_t bytes_read = read_msg->bytes;
if (bytes_read > bytes_to_read_left)
{
// if I read more bytes than what was left to read, that means I have extra bytes that
// the buffer can use
_data_cache.setBuffer(_curr_pos, bytes_read, tmp_data_buff);
_curr_pos += bytes_to_read_left;
std::memcpy(buffer + amt_from_cache, tmp_data_buff, bytes_to_read_left);
}
else
{
// too many bytes, nothing to actually cache
_curr_pos += bytes_read;
std::memcpy(buffer + amt_from_cache, tmp_data_buff, bytes_read);
}
delete[] tmp_data_buff;
if (_curr_pos >= (_offset + _num_bytes))
{
_eofbit = true; // ran out of data to read
_curr_pos = _offset + _num_bytes;
}
_gcount = std::min(bytes_read, bytes_to_read_left);
return *this;
}

// overload ! operator on filereader object
bool FileReader::operator!() const
{
// CkPrintf("In overwritten operator\n");
return false;
}

size_t FileReader::tellg() { return _curr_pos; }

FileReader& FileReader::seekg(size_t pos, std::ios_base::seekdir dir)
{
if (dir == std::ios_base::beg)
{
_curr_pos = pos + _offset;
}
else if (dir == std::ios_base::cur)
{
_curr_pos += pos;
}
else if (dir == std::ios_base::end)
{
_curr_pos = _offset + _num_bytes - pos;
}

_eofbit = false;
if (_curr_pos < _offset)
{
_curr_pos = _offset;
_eofbit = false;
}
else if (_curr_pos >= (_offset + _num_bytes))
{
_curr_pos = _offset + _num_bytes;
_eofbit = true;
}
return *this;
}

FileReader& FileReader::seekg(size_t pos)
{
_curr_pos = pos;
if (_curr_pos < _offset)
{
_curr_pos = _offset;
_eofbit = false;
}
else if (_curr_pos >= (_offset + _num_bytes))
{
_curr_pos = _offset + _num_bytes;
_eofbit = true;
return *this;
}
_eofbit = false;
return *this;
}

bool FileReader::eof() { return _eofbit; }

size_t FileReader::gcount() { return _gcount; }

FileReaderBuffer::FileReaderBuffer() { _buffer = new char[_buff_capacity]; }

FileReaderBuffer::FileReaderBuffer(size_t buff_capacity)
{
_buff_capacity = buff_capacity;
_buffer = new char[_buff_capacity];
}

void FileReaderBuffer::setBuffer(size_t offset, size_t num_bytes, char* data)
{
is_dirty = false;
_offset = offset;
_buff_size = std::min(_buff_capacity, num_bytes);
std::memcpy(_buffer, data, _buff_size); // copy the first section of bytes
}

size_t FileReaderBuffer::getFromBuffer(size_t offset, size_t num_bytes, char* buffer)
{
if (is_dirty || offset < _offset || offset >= (_offset + _buff_size))
{
return 0; // the buffer has nothing of relevance
}
size_t cached_len = std::min(offset + num_bytes, _offset + _buff_size) - offset;
std::memcpy(buffer, _buffer + (offset - _offset), cached_len);
return cached_len;
}

size_t FileReaderBuffer::capacity() {
return _buff_capacity;
}

FileReaderBuffer::~FileReaderBuffer() { delete[] _buffer; }

} // namespace IO
} // namespace Ck
Expand Down
2 changes: 1 addition & 1 deletion src/libs/ck-libs/io/ckio.ci
Original file line number Diff line number Diff line change
Expand Up @@ -128,7 +128,7 @@ entry void addSessionReadAssemblerFinished(CkReductionMsg* msg);
}

// class tht will be used to assemble a specific read call
group ReadAssembler
group[migratable] ReadAssembler
{
// stores the parameters of the read call it is tasked with building
entry ReadAssembler(Session session);
Expand Down
Loading
Loading