Golang file system abstraction tailored for AWS S3
The library provides a Golang file system abstraction tailored for AWS S3, enabling seamless streaming of binary objects along with their corresponding metadata.
Streaming is a convenient paradigm for handling large binary objects such as images and videos. Applications can effectively manage data consumption by leveraging io.Reader and io.Writer for seamless abstraction. This library employs the AWS Golang SDK v2 under the hood to facilitate access to AWS S3 through streams.
On the other hand, a file system is a method used by computers to organize and store data on storage devices. It provides a structured way to access and manage binary objects. File systems handle tasks such as creating, reading, writing, and deleting binary objects, as well as managing permissions and metadata associated with each file or directory.
The library implements the Golang File System and enhances it by adding support for writable files and type-safe metadata. The file system API is as follows:
type FileSystem interface {
	Create(path string) (File, error)
	Open(path string) (fs.File, error)
	Stat(path string) (fs.FileInfo, error)
	ReadDir(path string) ([]fs.DirEntry, error)
	Glob(pattern string) ([]string, error)
}
Notably, the interface supports reading and writing metadata associated with AWS objects using fs.FileInfo.
The filesystem abstraction operates based on two fundamental principles:
- Files are always accessed using an absolute path starting with /. This absolute path is interpreted within the context of the "mounted" file system.
- Directories always end with a / to distinguish them from files.
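The two principles above can be sketched as a pair of path checks. The helpers isAbsolute and isDirectory are hypothetical names used for illustration only; they are not functions of this library.

```go
package main

import (
	"fmt"
	"strings"
)

// isAbsolute (hypothetical helper) reports whether the path starts with "/".
func isAbsolute(path string) bool {
	return strings.HasPrefix(path, "/")
}

// isDirectory (hypothetical helper) reports whether the path denotes
// a directory, which by convention ends with "/".
func isDirectory(path string) bool {
	return strings.HasSuffix(path, "/")
}

func main() {
	for _, p := range []string{"/the/example/key", "/the/example/", "relative/key"} {
		fmt.Printf("%-18s absolute=%v directory=%v\n", p, isAbsolute(p), isDirectory(p))
	}
}
```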
The library requires Go 1.18 or later due to usage of generics.
The latest version of the library is available at its main branch. All development, including new features and bug fixes, takes place on the main branch using forking and pull requests as described in the contribution guidelines. The stable version is available via Golang modules.
Use go get to retrieve the library and add it as a dependency to your application.
go get -u github.com/fogfish/stream
Check out the examples. They cover all fundamental use cases with runnable code snippets. Below is the simplest "Hello World"-like application for reading an object from AWS S3.
import (
	"io"

	"github.com/fogfish/stream"
)

// mount s3 bucket as file system
s3fs, err := stream.NewFS(/* name of S3 bucket */)
if err != nil {
	return err
}

// open stream `io.Reader` to an object on S3
fd, err := s3fs.Open("/the/example/key.txt")
if err != nil {
	return err
}

// stream data using io.Reader interface
buf, err := io.ReadAll(fd)
if err != nil {
	return err
}

// close stream
err = fd.Close()
if err != nil {
	return err
}
See and try the examples. They cover all basic use cases of the library.
The library serves as a user-side implementation of Golang's file system abstractions defined by io/fs. It implements fs.FS, fs.StatFS, fs.ReadDirFS and fs.GlobFS. Additionally, it offers extensions for file writing: stream.CreateFS, stream.RemoveFS and stream.CopyFS.
To create a file system instance, use stream.NewFS or stream.New. The file system is configurable using the options pattern.
s3fs, err := stream.NewFS(
	/* name of S3 bucket */,
	stream.WithIOTimeout(5 * time.Second),
	stream.WithListingLimit(25),
)
To open a file for reading, use the Open function with an absolute path starting with /. The returned file descriptor is a composite of io.Reader, io.Closer and stream.Stat. Use Golang's convenient streaming methods to consume the S3 object.
r, err := s3fs.Open("/the/example/key")
if err != nil {
	return err
}
defer r.Close()

// utilize Golang's convenient streaming methods
io.ReadAll(r)
To open a file for writing, use the Create function with an absolute path starting with /. The returned file descriptor is a composite of io.Writer, io.Closer and stream.Stat. Use Golang's convenient streaming methods to update the S3 object. Once all bytes are written, it is crucial to close the stream; failing to do so causes data loss. The object is considered successfully created on S3 only if all Write operations and the subsequent Close succeed.
w, err := s3fs.Create("/the/example/key", nil)
if err != nil {
	return err
}

// utilize Golang's convenient streaming methods
io.WriteString(w, "Hello World!\n")

// close stream and handle error to prevent data loss.
err = w.Close()
if err != nil {
	return err
}
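The rule that an object exists only after every Write and the final Close succeed can be captured in a small helper. The sketch below uses only the standard library; writeAll and failingCloser are hypothetical names, and failingCloser merely simulates a stream whose upload fails when it is closed.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// writeAll (hypothetical helper) writes data and closes the stream,
// returning the first error from either step.
func writeAll(w io.WriteCloser, data string) error {
	if _, err := io.WriteString(w, data); err != nil {
		w.Close() // best effort; the write error is the one reported
		return err
	}
	return w.Close()
}

// failingCloser simulates a stream that accepts writes but fails on Close.
type failingCloser struct{}

func (failingCloser) Write(p []byte) (int, error) { return len(p), nil }
func (failingCloser) Close() error                { return errors.New("upload failed") }

func main() {
	if err := writeAll(failingCloser{}, "Hello World!\n"); err != nil {
		fmt.Println("object was not created:", err)
	}
}
```

Note that a deferred Close whose result is ignored would hide exactly this failure, which is why the error from Close is returned explicitly.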
The file system implements the interfaces fs.ReadDirFS and fs.GlobFS for traversal through objects. A classical file system organizes data hierarchically into directories, as opposed to the flat storage structure of general purpose AWS S3 (directory buckets are not supported yet). The flat structure imposes limitations on the implementation:
- It assumes a directory if the path ends with / (e.g. /the/example/key points to the object, /the/example/key/ points to the directory).
- It returns paths relative to the pattern for all found objects.
err := fs.WalkDir(s3fs, dir, func(path string, d fs.DirEntry, err error) error {
	if err != nil {
		return err
	}

	if d.IsDir() {
		return nil
	}

	// do something with file
	// path is absolute path to the file but entry is relative path
	// path == dir + d.Name()
	return nil
})
For added convenience, the file system is enhanced with stream.RemoveFS and stream.CopyFS, enabling the removal of S3 objects and the copying of objects across buckets, respectively.
fs.FileInfo is the primary container for S3 object metadata. The file system provides access to metadata either from open streams (file descriptors) or for any key.
fi, err := s3fs.Stat("/the/example/key")
if err != nil {
	return err
}
AWS S3 supports object metadata as a set of name-value pairs. It allows you to define the metadata at the time you upload the object and read it later. This library supports both system and user-controlled metadata attributes.
What sets this library apart is its encouragement for developers to utilize the Golang type system in defining object metadata. Rather than working with loosely typed name-value pairs, metadata is structured as Golang structs, promoting correctness and maintainability. This approach is facilitated through generic programming style within the library.
A Golang struct type serves as the metadata container, where each public field is transformed into a name-value pair before being written to S3. The example below defines a container built with two user-controlled attributes, Author and Chapter, and two system attributes, ContentType and ContentLanguage.
type Note struct {
	Author          string
	Chapter         string
	ContentType     string
	ContentLanguage string
}
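To make the idea of turning public fields into name-value pairs concrete, here is a minimal sketch based on the reflect package. It is not the library's actual encoder: toPairs is a hypothetical helper, it handles only exported string fields, and skipping empty values is an assumption made for this illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

type Note struct {
	Author          string
	Chapter         string
	ContentType     string
	ContentLanguage string
}

// toPairs (hypothetical helper) converts the exported string fields of a
// struct, or a pointer to one, into name-value pairs. Empty values are skipped.
func toPairs(v any) map[string]string {
	pairs := map[string]string{}
	rv := reflect.Indirect(reflect.ValueOf(v))
	if rv.Kind() != reflect.Struct {
		return pairs
	}
	rt := rv.Type()
	for i := 0; i < rt.NumField(); i++ {
		f := rt.Field(i)
		if !f.IsExported() || f.Type.Kind() != reflect.String {
			continue
		}
		if s := rv.Field(i).String(); s != "" {
			pairs[f.Name] = s
		}
	}
	return pairs
}

func main() {
	note := &Note{Author: "fogfish", ContentType: "text/plain"}
	fmt.Println(toPairs(note))
}
```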
The file system interface has been expanded to handle user-defined metadata in a type-safe manner. Firstly, stream.New() creates a type-annotated client. Secondly, the Create() function accepts a pointer to the metadata container, which is then written alongside the data. Lastly, the fs.FileInfo container retains an instance of the associated metadata, which is accessible through either a Sys() call or the StatSys() helper.
// create client and define metadata type
s3fs, err := stream.New[Note](/* name of S3 bucket */)
// create stream and annotate it with metadata
fd, err := s3fs.Create("/the/example/key",
	&Note{/* defined metadata values */},
)
// fs.FileInfo carries previously written metadata, use Sys() function to access.
fi, err := s3fs.Stat("/the/example/key")
note := s3fs.StatSys(fi)
AWS S3 defines a collection of well-known system attributes. This library supports only a subset of those: Cache-Control, Content-Encoding, Content-Language, Content-Type, Expires, ETag, Last-Modified and Storage-Class. Open a pull request or raise an issue if the subset needs to be extended.
The library defines the type stream.SystemMetadata, which incorporates all supported attributes. You can embed it into your own types.
type Note struct {
	stream.SystemMetadata
	Author  string
	Chapter string
}
The io.Reader and io.Writer interfaces are sufficient for the majority of cloud applications. However, there are instances where delegating read/write responsibilities to a mobile client becomes necessary. For example, directly uploading images or video files from a mobile client to an S3 bucket is both scalable and considerably more efficient than routing through a backend system. The library accommodates this scenario with a special case for streaming binary objects using pre-signed URLs. The file system returns the pre-signed URL for the stream within the metadata. It only requires the definition of an attribute PreSignedUrl of string type.
type PreSignedUrl struct {
	PreSignedUrl string
}
Use the fs.FileInfo container and the metadata API described above to obtain pre-signed URLs.
// Mount the S3 bucket with metadata containing the `PreSignedUrl` attribute
s3fs, err := stream.New[stream.PreSignedUrl](/* name of S3 bucket */)

// Open file for read or write
fd, err := s3fs.Create("/the/example/key", nil)
if err != nil {
	return err
}
defer fd.Close()

// read files metadata
fi, err := fd.Stat()
if err != nil {
	return err
}

if meta := s3fs.StatSys(fi); meta != nil {
	// Use meta.PreSignedUrl
}
Note: Utilizing a pre-signed URL necessitates passing all headers that were provided to the Create function.
fd, err := s3fs.Create("/the/example/key",
	&Note{
		Author:          "fogfish",
		ContentType:     "text/plain",
		ContentLanguage: "en",
	},
)
curl -XPUT https://pre-signed-url-goes-here \
-H 'Content-Type: text/plain' \
-H 'Content-Language: en' \
-H 'X-Amz-Meta-Author: fogfish' \
-d 'some content'
The library consistently returns fs.PathError, except in cases where the object is not found, in which case fs.ErrNotExist is returned. Additionally, it refrains from wrapping stream I/O errors.
The library implements a compatible wrapper of os.DirFS to enhance functionality and provide a more user-friendly interface for file system operations, allowing clients to seamlessly mount both S3 and local file systems.
import "github.com/fogfish/stream/lfs"
fs, err := lfs.New("/path/to/root")
The library is MIT licensed and accepts contributions via GitHub pull requests:
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Added some feature')
- Push to the branch (git push origin my-new-feature)
- Create new Pull Request
The build and testing process requires Go version 1.18 or later, because the library uses generics.
Build and test the library:
git clone https://github.com/fogfish/stream
cd stream
go test
The commit message helps us write good release notes and speeds up the review process. The message should address two questions: what changed and why. The project follows the template defined by the chapter Contributing to a Project of the Git book.
If you experience any issues with the library, please let us know via GitHub issues. We appreciate detailed and accurate reports that help us to identify and replicate the issue.