GitHub - gmastrapas/jina: Cloud-native neural search framework for 𝙖𝙣𝙮 kind of data

Cloud-Native Neural Search Framework for Any Kind of Data

Jina is a neural search framework that empowers anyone to build SOTA and scalable deep learning search applications in minutes.

🌌 All data types - Scalable indexing, querying, understanding of any data: video, image, long/short text, music, source code, PDF, etc.

⏱️ Save time - Jina is the design pattern for neural search systems: go from zero to a production-ready system in minutes.

🌩️ Fast & cloud-native - Distributed architecture from day one, scalable & cloud-native by design: enjoy containerization, streaming, parallelization, sharding, async scheduling, and HTTP/gRPC/WebSocket protocols.

🍱 Own your stack - Keep end-to-end ownership of your solution and avoid the integration pitfalls of fragmented, multi-vendor, generic legacy tools.

Install

via PyPI: pip install jina
via Conda: conda install jina -c conda-forge
via Docker: docker run jinaai/jina:latest
More install options

Documentation

Run Quick Demo

👗 Fashion image search: jina hello fashion
🤖 QA chatbot: pip install "jina[demo]" && jina hello chatbot
📰 Multimodal search: pip install "jina[demo]" && jina hello multimodal
🍴 Fork the source of a demo to your folder: jina hello fork fashion ../my-proj/

Build Your First Jina App

Document, Executor, and Flow are three fundamental concepts in Jina.

📄 Document is the basic data type in Jina;
⚙️ Executor is how Jina processes Documents;
🔀 Flow is how Jina streamlines and distributes Executors.

Leveraging these three components, let's build an app that finds the lines of a code snippet that are most similar to a query.

💡 Preliminaries: character embedding, pooling, Euclidean distance. 📗 Read our docs for details.
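The three preliminaries can be illustrated without Jina or NumPy. The snippet below is a minimal, dependency-free sketch and not the code Jina itself runs: `embed` and `euclidean` are hypothetical helper names, the sample lines are made up for illustration, and the offset and dimension simply mirror the values used by the CharEmbed example in this section. It assumes non-empty input text.

```python
import math
from typing import List

OFFSET = 32             # ASCII code of the space character (first printable char)
DIM = 127 - OFFSET + 1  # 96 slots; the last one also serves as the "unknown" bucket


def embed(text: str) -> List[float]:
    """Mean-pool one-hot character vectors into one DIM-sized vector."""
    counts = [0.0] * DIM
    for ch in text:
        code = ord(ch)
        index = code - OFFSET if OFFSET <= code <= 127 else DIM - 1
        counts[index] += 1.0
    # Averaging one-hot vectors is the same as dividing counts by the length.
    return [c / len(text) for c in counts]


def euclidean(a: List[float], b: List[float]) -> float:
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Illustrative data: rank two candidate lines by distance to a query.
lines = ["@requests(on='/index')", "import numpy as np"]
query = "@requests(on=something)"
ranked = sorted(lines, key=lambda line: euclidean(embed(query), embed(line)))
print(ranked[0])
```

Because the embedding only records character frequencies, lines that share many characters with the query end up close to it, regardless of character order.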

1️⃣ Copy-paste the minimum example below and run it:

import numpy as np
from jina import Document, DocumentArray, Executor, Flow, requests

class CharEmbed(Executor):  # a simple character embedding with mean-pooling
    offset = 32  # ASCII code of the space character, the first printable char
    dim = 127 - offset + 1  # last pos reserved for `UNK`
    char_embd = np.eye(dim) * 1  # one-hot embedding for all chars

    @requests
    def foo(self, docs: DocumentArray, **kwargs):
        for d in docs:
            r_emb = [ord(c) - self.offset if self.offset <= ord(c) <= 127 else (self.dim - 1) for c in d.text]
            d.embedding = self.char_embd[r_emb, :].mean(axis=0)  # average pooling

class Indexer(Executor):
    _docs = DocumentArray()  # for storing all documents in memory

    @requests(on='/index')
    def foo(self, docs: DocumentArray, **kwargs):
        self._docs.extend(docs)  # extend stored `docs`

    @requests(on='/search')
    def bar(self, docs: DocumentArray, **kwargs):
        docs.match(self._docs, metric='euclidean')

# build a Flow; the 2 parallel CharEmbed replicas are not needed here and only demonstrate the option
f = Flow(port_expose=12345, protocol='http', cors=True).add(uses=CharEmbed, parallel=2).add(uses=Indexer)
with f:
    f.post('/index', (Document(text=t.strip()) for t in open(__file__) if t.strip()))  # index all lines of _this_ file
    f.block()  # block for listening request

2️⃣ Open http://localhost:12345/docs (an extended Swagger UI) in your browser, click the /search tab, and enter:

{"data": [{"text": "@requests(on=something)"}]}

This asks for the lines of the code snippet above that are most similar to @requests(on=something). Now click the Execute button!

3️⃣ Not a GUI fan? Let's do it in Python then! Keep the above server running and start a simple client:

from jina import Client, Document
from jina.types.request import Response


def print_matches(resp: Response):  # the callback function invoked when task is done
    for idx, d in enumerate(resp.docs[0].matches[:3]):  # print top-3 matches
        print(f'[{idx}]{d.scores["euclidean"].value:2f}: "{d.text}"')


c = Client(protocol='http', port=12345)  # connect to localhost:12345
c.post('/search', Document(text='request(on=something)'), on_done=print_matches)

This prints the following results:

         Client@1608[S]:connected to the gateway at localhost:12345!
[0]0.168526: "@requests(on='/index')"
[1]0.181676: "@requests(on='/search')"
[2]0.218218: "from jina import Document, DocumentArray, Executor, Flow, requests"

😔 Doesn't work? Our bad! Please report it here.
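The JSON body from step 2 can presumably also be sent by any plain HTTP client, without the jina package. The sketch below only builds such a request with the Python standard library. It assumes the server from step 1 is reachable at localhost:12345 and that the /search tab in the Swagger UI corresponds to a POST endpoint at that path; `build_search_request` and `SEARCH_URL` are hypothetical names introduced here, and the response format is not shown because the source does not describe it.

```python
import json
import urllib.request

# Assumed address: host and port taken from the Flow in step 1 (port_expose=12345).
SEARCH_URL = "http://localhost:12345/search"


def build_search_request(text: str) -> urllib.request.Request:
    """Wrap `text` in the JSON body shown in step 2 and prepare a POST request."""
    body = json.dumps({"data": [{"text": text}]}).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request("@requests(on=something)")
print(req.get_method(), req.full_url)

# Sending the request requires the server from step 1 to be running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Keeping the network call commented out lets the snippet run on its own; uncomment it once the Flow is listening.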

Support

Join our Slack community to chat with our engineers about your use cases, questions, and support queries.
Join our Engineering All Hands meet-up to discuss your use case and learn Jina's new features.
- When? The second Tuesday of every month
- Where? Zoom (see our public events calendar/.ical) and live stream on YouTube
Subscribe to the latest video tutorials on our YouTube channel

Join Us

Jina is backed by Jina AI and licensed under Apache-2.0. We are actively hiring AI engineers and solution engineers to build the next neural search ecosystem in open source.

Contributing

We welcome all kinds of contributions from the open-source community, individuals and partners. We owe our success to your active involvement.
