Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Contributing valkeyJSON module #1

Merged
merged 9 commits into from
Nov 29, 2024
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
31 changes: 31 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
# IDE files
.idea/
*.vscode
.vscode/*

# Build temp files
*build
cmake-build-*/

# Auto-generated files
**/__pycache__/*
test-data
*.pyc
*.bin
*.o
*.xo
*.so
*.d
*.a
*.log
*.out

# Others
.DS_Store
.attach_pid*
venv/
core.*
valkeytests
**/include
**/report.html
**/assets
167 changes: 167 additions & 0 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,167 @@
cmake_minimum_required(VERSION 3.17)

include(FetchContent)
include(ExternalProject)

# Detect the system architecture
EXECUTE_PROCESS(
COMMAND uname -m
COMMAND tr -d '\n'
OUTPUT_VARIABLE ARCHITECTURE
)

if("${ARCHITECTURE}" STREQUAL "x86_64")
message("Building JSON for x86_64")
elseif("${ARCHITECTURE}" STREQUAL "aarch64")
message("Building JSON for aarch64")
else()
message(FATAL_ERROR "Unsupported architecture. JSON is only supported on x86_64 and aarch64.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

print out the architecture?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, printing out ${ARCHITECTURE} will be helpful.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this is all about reusing the tests at https://github.com/valkey-io/valkey-bloom/tree/unstable/tests? or there is more?

Yes, it's for reusing the pytest infrastructure in valkey-bloom.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm this dependency is a bit odd - does it make sense to pull the pytest infra into the Valkey repo instead? We are discussing migrating to python based integration test infra for Valkey too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I would suggest we should pull it out in the valkey repo instead. However, we can change the line later when we have it separately.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to suggest we write an RFC or something similar for the python testing framework as well before we get too far along building it organically.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not move the testing framework to it's own repo to start with? It only needs to be pulled in for running the tests, not when just building the module.

I'd rather see people spend time on a README for the test framework than on an RFC. Or we can do both in parallel...?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather see people spend time on a README for the test framework than on an RFC. Or we can do both in parallel...?

I just want us to have some long term plan for testing and not just build it up organically. The reason is that Amazon internally has a similar testing framework that is extremely complex because nobody really maintains it and just bolt on random features. There are many cases where it's possible to do the same way 2 or 3 different ways. I think the same is true of our TCL framework, just to a smaller scale.

So whether it be an RFC or just some direction in a new repo, I don't care which.

endif()

# Project definition
project(ValkeyJSONModule VERSION 1.0 LANGUAGES C CXX)

# Set the name of the JSON shared library
set(JSON_MODULE_LIB json)

# Define the Valkey directories
set(VALKEY_DOWNLOAD_DIR "${CMAKE_BINARY_DIR}/_deps/valkey-src")
set(VALKEY_BIN_DIR "${CMAKE_BINARY_DIR}/_deps/valkey-src/src/valkey/src")

# Download and build Valkey
ExternalProject_Add(
valkey
GIT_REPOSITORY https://github.com/valkey-io/valkey.git # Replace with actual URL
GIT_TAG ${VALKEY_VERSION}
PREFIX ${VALKEY_DOWNLOAD_DIR}
CONFIGURE_COMMAND ""
BUILD_COMMAND make distclean && make -j
INSTALL_COMMAND ""
BUILD_IN_SOURCE 1
)

# Define the paths for the copied files
set(VALKEY_INCLUDE_DIR "${CMAKE_CURRENT_SOURCE_DIR}/src/include")
set(VALKEY_BINARY_DEST "${CMAKE_CURRENT_SOURCE_DIR}/tst/integration/.build/binaries/${VALKEY_VERSION}")

ExternalProject_Add_Step(
valkey
copy_header_files
COMMENT "Copying header files to include/ directory"
DEPENDEES download
DEPENDERS configure
COMMAND ${CMAKE_COMMAND} -E make_directory ${VALKEY_INCLUDE_DIR}
COMMAND ${CMAKE_COMMAND} -E copy ${VALKEY_DOWNLOAD_DIR}/src/valkey/src/valkeymodule.h ${VALKEY_INCLUDE_DIR}/valkeymodule.h
ALWAYS 1
)

# Copy header and binary after Valkey make
add_custom_command(TARGET valkey
POST_BUILD
COMMAND ${CMAKE_COMMAND} -E make_directory ${VALKEY_BINARY_DEST}
COMMAND ${CMAKE_COMMAND} -E copy ${VALKEY_BIN_DIR}/valkey-server ${VALKEY_BINARY_DEST}/valkey-server
COMMENT "Copied valkeymodule.h and valkey-server to destination directories"
)

# Define valkey-bloom branch
set(VALKEY_BLOOM_BRANCH "unstable" CACHE STRING "Valkey-bloom branch to use")

# Set the download directory for Valkey-bloom
set(VALKEY_BLOOM_DOWNLOAD_DIR "${CMAKE_CURRENT_BINARY_DIR}/_deps/valkey-bloom-src")

# Download valkey-bloom
ExternalProject_Add(
valkey-bloom
GIT_REPOSITORY https://github.com/valkey-io/valkey-bloom.git
madolson marked this conversation as resolved.
Show resolved Hide resolved
GIT_TAG ${VALKEY_BLOOM_BRANCH}
GIT_SHALLOW TRUE
PREFIX "${VALKEY_BLOOM_DOWNLOAD_DIR}"
CONFIGURE_COMMAND ""
BUILD_COMMAND ""
INSTALL_COMMAND ""
)

# Step to copy pytest files
ExternalProject_Add_Step(
valkey-bloom
copy_pytest_files
COMMENT "Copying pytest files to tst/integration directory"
DEPENDEES build
COMMAND ${CMAKE_COMMAND} -E make_directory ${CMAKE_CURRENT_SOURCE_DIR}/tst/integration
COMMAND ${CMAKE_COMMAND} -E copy_directory ${VALKEY_BLOOM_DOWNLOAD_DIR}/src/valkey-bloom/tests/valkeytests ${CMAKE_CURRENT_SOURCE_DIR}/tst/integration/valkeytests
)

# Enable instrumentation options if requested
if("$ENV{INSTRUMENT_V2PATH}" STREQUAL "yes")
add_compile_definitions(INSTRUMENT_V2PATH)
message("Enabled INSTRUMENT_V2PATH")
endif()

# Disable Doxygen documentation generation
set(BUILD_DOCUMENTATION OFF)
# When CODE_COVERAGE is ON, the package is built twice, once for debug and once for release.
# To fix the problem, disable the code coverage.
set(CODE_COVERAGE OFF)

# Fix for linking error when code coverage is enabled on ARM
if(CODE_COVERAGE AND CMAKE_BUILD_TYPE STREQUAL "Debug")
add_link_options("--coverage")
endif()

# Set C & C++ standard versions
set(CMAKE_C_STANDARD 11)
set(CMAKE_C_STANDARD_REQUIRED True)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED True)

# Always include debug symbols and optimize the code
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -O2 -g")
roshkhatri marked this conversation as resolved.
Show resolved Hide resolved
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O2 -g")

# RapidJSON SIMD optimization
if("${ARCHITECTURE}" STREQUAL "x86_64")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=nehalem")
elseif("${ARCHITECTURE}" STREQUAL "aarch64")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=armv8-a")
else()
message(FATAL_ERROR "Unsupported architecture. JSON is only supported on x86_64 and aarch64.")
roshkhatri marked this conversation as resolved.
Show resolved Hide resolved
endif()

# Additional flags for all architectures
set(ADDITIONAL_FLAGS "-fPIC")

# Compiler warning flags
set(C_WARNING "-Wall -Werror -Wextra")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${ADDITIONAL_FLAGS} ${C_WARNING}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${ADDITIONAL_FLAGS} ${C_WARNING}")

# Fetch RapidJSON
FetchContent_Declare(
rapidjson
GIT_REPOSITORY https://github.com/Tencent/rapidjson.git
GIT_TAG 0d4517f15a8d7167ba9ae67f3f22a559ca841e3b
)

# Disable RapidJSON tests and examples
set(RAPIDJSON_BUILD_TESTS OFF CACHE BOOL "Build rapidjson tests" FORCE)
set(RAPIDJSON_BUILD_EXAMPLES OFF CACHE BOOL "Build rapidjson examples" FORCE)
set(RAPIDJSON_BUILD_DOC OFF CACHE BOOL "Build rapidjson documentation" FORCE)

# Make Rapidjson available
FetchContent_MakeAvailable(rapidjson)

# Add the src subdirectory for building
add_subdirectory(src)

# Add the src subdirectory for building
roshkhatri marked this conversation as resolved.
Show resolved Hide resolved
add_subdirectory(tst)

add_custom_target(test
COMMENT "Run JSON integration tests."
USES_TERMINAL
COMMAND rm -rf ${CMAKE_BINARY_DIR}/tst/integration
COMMAND mkdir -p ${CMAKE_BINARY_DIR}/tst/integration
COMMAND cp -rp ${CMAKE_SOURCE_DIR}/tst/integration/. ${CMAKE_BINARY_DIR}/tst/integration/
COMMAND echo "[TARGET] begin integration tests"
COMMAND ${CMAKE_SOURCE_DIR}/tst/integration/run.sh "test" ${CMAKE_SOURCE_DIR}
COMMAND echo "[TARGET] end integration tests")
89 changes: 89 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
# ValkeyJSON

ValkeyJSON is a C++ Valkey-Module that provides native JSON (JavaScript Object Notation) support for Valkey. The implementation complies with RFC7159 and ECMA-404 JSON data interchange standards. Users can natively store, query, and modify JSON data structures using the JSONPath query language. The query expressions support advanced capabilities including wildcard selections, filter expressions, array slices, union operations, and recursive searches.

ValkeyJSON leverages [RapidJSON](https://rapidjson.org/), a high-performance JSON parser and generator for C++, chosen for its small footprint and exceptional performance and memory efficiency. As a header-only library with no external dependencies, RapidJSON provides robust Unicode support while maintaining a compact memory profile of just 16 bytes per JSON value on most 32/64-bit machines.

## Motivation
While Valkey core lacks native JSON support, there's significant community demand for JSON capabilities. ValkeyJSON provides a comprehensive open-source solution with extensive JSON manipulation features.
madolson marked this conversation as resolved.
Show resolved Hide resolved

## Building and Testing

#### To build the module and run tests
```text
# Builds the valkey-server (unstable) for integration testing.
export SERVER_VERSION=unstable
./build.sh

# Builds the valkey-server (8.0.0) for integration testing.
export SERVER_VERSION=8.0.0
./build.sh
```

#### To build just the module
```text
mdkir build
cd build
cmake .. -DVALKEY_VERSION=unstable
make
```

#### To run all unit tests:
```text
cd build
make -j unit
```

#### To run all integration tests:
```text
make -j test
```

## Load the Module
To test the module with a Valkey, you can load the module using any of the following ways:

#### Using valkey.conf:
```
1. Add the following to valkey.conf:
loadmodule /path/to/libjson.so
2. Start valkey-server:
valkey-server /path/to/valkey.conf
```

#### Starting valkey with --loadmodule option:
```text
valkey-server --loadmodule /path/to/libjson.so
```

#### Using Valkey command MODULE LOAD:
```
1. Connect to a running Valkey instance using valkey-cli
2. Execute Valkey command:
MODULE LOAD /path/to/libjson.so
```
## Supported Module Commands
```text
JSON.ARRAPPEND
JSON.ARRINDEX
JSON.ARRINSERT
JSON.ARRLEN
JSON.ARRPOP
JSON.ARRTRIM
JSON.CLEAR
JSON.DEBUG
JSON.DEL
JSON.FORGET
JSON.GET
JSON.MGET
JSON.MSET
JSON.NUMINCRBY
JSON.NUMMULTBY
JSON.OBJLEN
JSON.OBJKEYS
JSON.RESP
JSON.SET
JSON.STRAPPEND
JSON.STRLEN
JSON.TOGGLE
JSON.TYPE
```
63 changes: 63 additions & 0 deletions build.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
#!/bin/bash

# Script to build valkeyJSON module, build it and generate .so files, run unit and integration tests.

# # Exit the script if any command fails
set -e

SCRIPT_DIR=$(pwd)
echo "Script Directory: $SCRIPT_DIR"

# Ensure SERVER_VERSION environment variable is set
if [ -z "$SERVER_VERSION" ]; then
echo "WARNING: SERVER_VERSION environment variable is not set. Defaulting to unstable."
export SERVER_VERSION="unstable"
fi

if [ "$SERVER_VERSION" != "unstable" ] && [ "$SERVER_VERSION" != "8.0.0" ] ; then
roshkhatri marked this conversation as resolved.
Show resolved Hide resolved
echo "ERROR: Unsupported version - $SERVER_VERSION"
exit 1
fi

# Variables
BUILD_DIR="$SCRIPT_DIR/build"

# Build the Valkey JSON module using CMake
echo "Building ValkeyJSON module..."
if [ ! -d "$BUILD_DIR" ]; then
mkdir $BUILD_DIR
fi
cd $BUILD_DIR
cmake .. -DVALKEY_VERSION=$SERVER_VERSION
make

# Running the Valkey JSON unit tests.
echo "Running Unit Tests..."
make -j unit

cd $SCRIPT_DIR

REQUIREMENTS_FILE="requirements.txt"

# Check if pip is available
if command -v pip > /dev/null 2>&1; then
echo "Using pip to install packages..."
pip install -r "$SCRIPT_DIR/$REQUIREMENTS_FILE"
# Check if pip3 is available
elif command -v pip3 > /dev/null 2>&1; then
echo "Using pip3 to install packages..."
pip3 install -r "$SCRIPT_DIR/$REQUIREMENTS_FILE"

else
echo "Error: Neither pip nor pip3 is available. Please install Python package installer."
exit 1
fi

export MODULE_PATH="$SCRIPT_DIR/build/src/libjson.so"

# Running the Valkey JSON integration tests.
echo "Running the integration tests..."
cd $BUILD_DIR
make -j test

echo "Build and Integration Tests succeeded"
3 changes: 3 additions & 0 deletions requirements.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
valkey
pytest==4
pytest-html
Loading