JetStream is a fast library for LLM inference and serving on TPU and GPU.
Use the following commands to run a server locally:
# Start a server
python -m jetstream.core.implementations.mock.server
# Test local mock server
python -m jetstream.core.tools.requester
# Load test local mock server
python -m jetstream.core.tools.load_tester
# Test JetStream core orchestrator
python -m jetstream.core.orchestrator_test
# Test JetStream core server library
python -m jetstream.core.server_test
# Test mock JetStream engine implementation
python -m jetstream.engine.mock_engine_test
# Test mock JetStream token utils
python -m jetstream.engine.utils_test