v2.1 web #365

Merged
merged 4 commits into from
Dec 3, 2024
6 changes: 6 additions & 0 deletions .github/workflows/web-demos.yml
@@ -38,5 +38,11 @@ jobs:
- name: Pre-build dependencies
run: npm install yarn

# ************** REMOVE AFTER RELEASE ********************
- name: Build Local Packages
run: yarn && yarn copywasm && yarn build
working-directory: binding/web
# ********************************************************

- name: Install dependencies
run: yarn install
1 change: 1 addition & 0 deletions binding/web/.gitignore
@@ -4,3 +4,4 @@ lib/pv_cheetah*.wasm
cypress/fixtures/audio_samples/*
test/cheetah_params*.js
test/cheetah_params*.pv
test/test_data.json
7 changes: 7 additions & 0 deletions binding/web/README.md
@@ -191,6 +191,13 @@ Terminate `CheetahWorker` instance:
await handle.terminate();
```

### Language Model

Default models for supported languages can be found in [lib/common](../../lib/common).

Create custom language models using the [Picovoice Console](https://console.picovoice.ai/). Here you can train
language models with custom vocabulary and boost words in the existing vocabulary.

## Demo

For example usage refer to our [Web demo application](https://github.com/Picovoice/cheetah/tree/master/demo/web).
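
Note on model naming: the tests in this PR resolve each language's model file as `cheetah_params.pv` for English and `cheetah_params_<language>.pv` for every other language. A minimal TypeScript sketch of that convention follows. `modelPublicPath` and its `baseDir` parameter are hypothetical names used only for illustration, and the file names actually shipped in `lib/common` should be checked before relying on it.

```typescript
// Hypothetical helper (not part of the Cheetah API): derives the public path
// of a model file from a language code, mirroring the suffix logic used in
// binding/web/test/cheetah.test.ts in this PR.
function modelPublicPath(language: string, baseDir: string = "/test"): string {
  // English uses the unsuffixed file; other languages append `_<code>`.
  const suffix = language === "en" ? "" : `_${language}`;
  return `${baseDir}/cheetah_params${suffix}.pv`;
}

// Assumed usage: build the model argument the same way the tests do.
const swedishModel = {
  publicPath: modelPublicPath("sv"),
  forceWrite: true,
};
```

The resulting object has the same `{ publicPath, forceWrite }` shape that the tests pass as the model argument.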
4 changes: 2 additions & 2 deletions binding/web/package.json
@@ -3,7 +3,7 @@
"description": "Cheetah Speech-to-Text engine for web browsers (via WebAssembly)",
"author": "Picovoice Inc",
"license": "Apache-2.0",
"version": "2.0.0",
"version": "2.1.0",
"keywords": [
"cheetah",
"web",
@@ -35,7 +35,7 @@
"test-perf": "cypress run --spec test/cheetah_perf.test.ts"
},
"dependencies": {
"@picovoice/web-utils": "=1.3.1"
"@picovoice/web-utils": "=1.4.3"
},
"devDependencies": {
"@babel/core": "^7.21.3",
12 changes: 12 additions & 0 deletions binding/web/scripts/setup_test.js
@@ -15,6 +15,16 @@ const paramsSourceDirectory = join(
'common',
);

const testDataSource = join(
__dirname,
'..',
'..',
'..',
'resources',
'.test',
'test_data.json'
);

const sourceDirectory = join(
__dirname,
"..",
@@ -30,6 +40,8 @@ try {
fs.copyFileSync(join(paramsSourceDirectory, file), join(testDirectory, file));
});

fs.copyFileSync(testDataSource, join(testDirectory, 'test_data.json'));

fs.mkdirSync(join(fixturesDirectory, 'audio_samples'), { recursive: true });
fs.readdirSync(join(sourceDirectory, 'audio_samples')).forEach(file => {
fs.copyFileSync(join(sourceDirectory, 'audio_samples', file), join(fixturesDirectory, 'audio_samples', file));
17 changes: 13 additions & 4 deletions binding/web/src/cheetah.ts
@@ -452,7 +452,7 @@ export class Cheetah {
// A WebAssembly page has a constant size of 64KiB. -> 1MiB ~= 16 pages
const memory = new WebAssembly.Memory({ initial: 3700 });

const memoryBufferUint8 = new Uint8Array(memory.buffer);
let memoryBufferUint8 = new Uint8Array(memory.buffer);

const pvError = new PvError();

@@ -551,14 +551,23 @@ export class Cheetah {
throw new CheetahErrors.CheetahOutOfMemoryError('malloc failed: Cannot allocate memory');
}

const memoryBufferView = new DataView(memory.buffer);
let memoryBufferView = new DataView(memory.buffer);

const status = await pv_cheetah_init(
accessKeyAddress,
modelPathAddress,
endpointDurationSec,
(enableAutomaticPunctuation) ? 1 : 0,
objectAddressAddress);

if (memoryBufferView.buffer.byteLength === 0) {
memoryBufferView = new DataView(memory.buffer);
}

if (memoryBufferUint8.buffer.byteLength === 0) {
memoryBufferUint8 = new Uint8Array(memory.buffer);
}

if (status !== PV_STATUS_SUCCESS) {
const messageStack = await Cheetah.getMessageStack(
pv_get_error_stack,
@@ -599,7 +608,7 @@ export class Cheetah {
frameLength: frameLength,
sampleRate: sampleRate,
version: version,

objectAddress: objectAddress,
inputBufferAddress: inputBufferAddress,
isEndpointAddress: isEndpointAddress,
@@ -625,7 +634,7 @@ export class Cheetah {
memoryBufferUint8: Uint8Array,
): Promise<string[]> {
const status = await pv_get_error_stack(messageStackAddressAddressAddress, messageStackDepthAddress);
if (status != PvStatus.SUCCESS) {
if (status !== PvStatus.SUCCESS) {
throw pvStatusToException(status, "Unable to get Cheetah error state");
}
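
Note on the buffer re-checks added after `pv_cheetah_init` in this file: when a non-shared `WebAssembly.Memory` grows, its previous `ArrayBuffer` is detached and reports `byteLength === 0`, so any `Uint8Array` or `DataView` created earlier must be recreated from `memory.buffer`. The sketch below reproduces that behavior in isolation. It assumes that a memory growth during initialization is the reason for the check, which the diff itself does not state, and `refreshViewsAfterGrow` is a hypothetical name.

```typescript
// Minimal sketch, independent of Cheetah: shows why views over WebAssembly
// memory must be refreshed after the memory grows.
function refreshViewsAfterGrow() {
  // One WebAssembly page is 64 KiB.
  const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });

  let bytes = new Uint8Array(memory.buffer);
  let view = new DataView(memory.buffer);
  bytes[0] = 42;

  // Growing a non-shared memory detaches the old ArrayBuffer.
  memory.grow(1);

  const detachedAfterGrow =
    bytes.buffer.byteLength === 0 && view.buffer.byteLength === 0;

  // Same pattern as the PR: recreate a view only if its buffer was detached.
  if (view.buffer.byteLength === 0) {
    view = new DataView(memory.buffer);
  }
  if (bytes.buffer.byteLength === 0) {
    bytes = new Uint8Array(memory.buffer);
  }

  return {
    detachedAfterGrow,
    firstByte: bytes[0], // contents survive the grow
    byteLength: view.buffer.byteLength,
  };
}
```

Checking `byteLength === 0` keeps the refresh conditional, so the views are only rebuilt when a grow actually happened.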

115 changes: 62 additions & 53 deletions binding/web/test/cheetah.test.ts
@@ -1,19 +1,12 @@
import { Cheetah, CheetahWorker } from "../";
import { CheetahError } from "../dist/types/cheetah_error";
import { CheetahError } from "../dist/types/cheetah_errors";
import testData from './test_data.json';

// @ts-ignore
import cheetahParams from "./cheetah_params";
import { PvModel } from '@picovoice/web-utils';

const ACCESS_KEY: string = Cypress.env("ACCESS_KEY");

const testParam = {
language: 'en',
audio_file: 'test.wav',
transcript: 'Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
punctuations: ['.'],
error_rate: 0.025,
};
const ACCESS_KEY: string = Cypress.env('ACCESS_KEY');

const levenshteinDistance = (words1: string[], words2: string[]) => {
const res = Array.from(Array(words1.length + 1), () => new Array(words2.length + 1));
@@ -37,8 +30,10 @@ const levenshteinDistance = (words1: string[], words2: string[]) => {

const wordErrorRate = (reference: string, hypothesis: string, useCER = false): number => {
const splitter = (useCER) ? '' : ' ';
const ed = levenshteinDistance(reference.split(splitter), hypothesis.split(splitter));
return ed / reference.length;
const refWords = reference.split(splitter);
const hypWords = hypothesis.split(splitter);
const ed = levenshteinDistance(refWords, hypWords);
return ed / refWords.length;
};

function delay(time: number) {
@@ -131,7 +126,7 @@ const runProcTest = async (
model,
{
enableAutomaticPunctuation: enablePunctuation,
processErrorCallback: (error: string) => {
processErrorCallback: (error: CheetahError) => {
reject(error);
}
}
@@ -166,7 +161,7 @@ const runProcTest = async (

describe("Cheetah Binding", function () {
it(`should return process and flush error message stack`, async () => {
let errors: [CheetahError] = [];
let errors: CheetahError[] = [];

const runProcess = () => new Promise<void>(async resolve => {
const cheetah = await Cheetah.create(
@@ -285,45 +280,59 @@ describe("Cheetah Binding", function () {
});
});

it(`should be able to process (${testParam.language}) (${instanceString})`, () => {
try {
cy.getFramesFromFile(`audio_samples/${testParam.audio_file}`).then( async pcm => {
const suffix = (testParam.language === 'en') ? '' : `_${testParam.language}`;
await runProcTest(
instance,
pcm,
testParam.punctuations,
testParam.transcript,
testParam.error_rate,
{
model: { publicPath: `/test/cheetah_params${suffix}.pv`, forceWrite: true },
enablePunctuation: false,
useCER: (testParam.language === 'ja')
});
});
} catch (e) {
expect(e).to.be.undefined;
}
});
for (const testParam of testData.tests.language_tests) {
it(`should be able to process (${testParam.language}) (${instanceString})`, () => {
try {
cy.getFramesFromFile(`audio_samples/${testParam.audio_file}`).then(
async pcm => {
const suffix =
testParam.language === 'en' ? '' : `_${testParam.language}`;
await runProcTest(
instance,
pcm,
testParam.punctuations,
testParam.transcript,
testParam.error_rate,
{
model: {
publicPath: `/test/cheetah_params${suffix}.pv`,
forceWrite: true,
},
}
);
}
);
} catch (e) {
expect(e).to.be.undefined;
}
});

it(`should be able to process with punctuation (${testParam.language}) (${instanceString})`, () => {
try {
cy.getFramesFromFile(`audio_samples/${testParam.audio_file}`).then( async pcm => {
const suffix = (testParam.language === 'en') ? '' : `_${testParam.language}`;
await runProcTest(
instance,
pcm,
testParam.punctuations,
testParam.transcript,
testParam.error_rate,
{
model: { publicPath: `/test/cheetah_params${suffix}.pv`, forceWrite: true },
useCER: (testParam.language === 'ja')
});
});
} catch (e) {
expect(e).to.be.undefined;
}
});
it(`should be able to process with punctuation (${testParam.language}) (${instanceString})`, () => {
try {
cy.getFramesFromFile(`audio_samples/${testParam.audio_file}`).then(
async pcm => {
const suffix =
testParam.language === 'en' ? '' : `_${testParam.language}`;
await runProcTest(
instance,
pcm,
testParam.punctuations,
testParam.transcript,
testParam.error_rate,
{
model: {
publicPath: `/test/cheetah_params${suffix}.pv`,
forceWrite: true,
},
enablePunctuation: true,
}
);
}
);
} catch (e) {
expect(e).to.be.undefined;
}
});
}
}
});
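
Note on the `wordErrorRate` change in this file: the old code divided the edit distance by `reference.length`, which is the character count of the reference string, so the word-level rate was understated. The new code divides by the number of split units. The sketch below illustrates the corrected calculation. The `levenshteinDistance` body is a standard dynamic-programming implementation written for this example, since the PR's own body is not shown in the diff.

```typescript
// Standard edit distance over token arrays (assumed implementation).
const levenshteinDistance = (a: string[], b: string[]): number => {
  const dp: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array<number>(b.length + 1).fill(0)
  );
  for (let i = 0; i <= a.length; i++) dp[i][0] = i;
  for (let j = 0; j <= b.length; j++) dp[0][j] = j;
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1, // deletion
        dp[i][j - 1] + 1, // insertion
        dp[i - 1][j - 1] + cost // substitution or match
      );
    }
  }
  return dp[a.length][b.length];
};

// Mirrors the corrected function: normalize by the number of units
// (words, or characters when useCER is true), not by string length.
const wordErrorRate = (
  reference: string,
  hypothesis: string,
  useCER = false
): number => {
  const splitter = useCER ? "" : " ";
  const refUnits = reference.split(splitter);
  const hypUnits = hypothesis.split(splitter);
  return levenshteinDistance(refUnits, hypUnits) / refUnits.length;
};

// One substituted word out of four reference words.
const rate = wordErrorRate("the quick brown fox", "the quick brown dog");
```

With the old denominator the same pair would have been scored against 19 characters instead of 4 words, so the result would have been much lower than the true word error rate.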
20 changes: 10 additions & 10 deletions binding/web/yarn.lock
@@ -1100,12 +1100,12 @@
"@nodelib/fs.scandir" "2.1.5"
fastq "^1.6.0"

"@picovoice/web-utils@=1.3.1":
version "1.3.1"
resolved "https://registry.yarnpkg.com/@picovoice/web-utils/-/web-utils-1.3.1.tgz#d417e98604a650b54a8e03669015ecf98c2383ec"
integrity sha512-jcDqdULtTm+yJrnHDjg64hARup+Z4wNkYuXHNx6EM8+qZkweBq9UA6XJrHAlUkPnlkso4JWjaIKhz3x8vZcd3g==
"@picovoice/web-utils@=1.4.3":
version "1.4.3"
resolved "https://registry.yarnpkg.com/@picovoice/web-utils/-/web-utils-1.4.3.tgz#1de0b20d6080c18d295c6df37c09d88bf7c4f555"
integrity sha512-7JN3YYsSD9Gtce6YKG3XqpX49dkeu7jTdbox7rHQA/X/Q3zxopXA9zlCKSq6EIjFbiX2iuzDKUx1XrFa3d8c0w==
dependencies:
commander "^9.2.0"
commander "^10.0.1"

"@rollup/plugin-babel@^6.0.3":
version "6.0.4"
@@ -1679,6 +1679,11 @@ combined-stream@^1.0.6, combined-stream@~1.0.6:
dependencies:
delayed-stream "~1.0.0"

commander@^10.0.1:
version "10.0.1"
resolved "https://registry.yarnpkg.com/commander/-/commander-10.0.1.tgz#881ee46b4f77d1c1dccc5823433aa39b022cbe06"
integrity sha512-y4Mg2tXshplEbSGzx7amzPwKKOCGuoSRP/CjEdwwk0FOGlUbq6lKuoyDZTNZkmxHdJtp54hdfY/JUrdL7Xfdug==

commander@^2.20.0:
version "2.20.3"
resolved "https://registry.yarnpkg.com/commander/-/commander-2.20.3.tgz#fd485e84c03eb4881c20722ba48035e8531aeb33"
@@ -1689,11 +1694,6 @@ commander@^5.1.0:
resolved "https://registry.yarnpkg.com/commander/-/commander-5.1.0.tgz#46abbd1652f8e059bddaef99bbdcb2ad9cf179ae"
integrity sha512-P0CysNDQ7rtVw4QIQtm+MRxV66vKFSvlsQvGYXZWR3qFU0jlMKHZZZgw8e+8DSah4UDKMqnknRDQz+xuQXQ/Zg==

commander@^9.2.0:
version "9.5.0"
resolved "https://registry.yarnpkg.com/commander/-/commander-9.5.0.tgz#bc08d1eb5cedf7ccb797a96199d41c7bc3e60d30"
integrity sha512-KRs7WVDKg86PWiuAqhDrAQnTXZKraVcCc6vFdL14qrZ/DcWwuRo7VoiYXalXO7S5GKpqYiVEwCbgFDfxNHKJBQ==

common-tags@^1.8.0:
version "1.8.2"
resolved "https://registry.yarnpkg.com/common-tags/-/common-tags-1.8.2.tgz#94ebb3c076d26032745fd54face7f688ef5ac9c6"
2 changes: 1 addition & 1 deletion demo/web/.gitignore
@@ -4,4 +4,4 @@ node_modules
dist/
*.log
.DS_Store
cheetah_params.js
models/*
10 changes: 6 additions & 4 deletions demo/web/README.md
@@ -10,18 +10,20 @@ Signup or Login to [Picovoice Console](https://console.picovoice.ai/) to get you

## Install & run

Use `yarn` or `npm` to install the dependencies, and the `start` script to start a local web server hosting the demo.
Use `yarn` or `npm` to install the dependencies, and the `start` script with a language code
to start a local web server hosting the demo in the language of your choice (e.g. `sv` -> Swedish, `zh` -> Mandarin).
To see a list of available languages, run `start` without a language code.

```console
yarn
yarn start
yarn start ${LANGUAGE}
```

(or)

```console
npm install
npm run start
npm run start ${LANGUAGE}
```

Open `localhost:5000` in your web browser, as hinted at in the output:
@@ -32,4 +34,4 @@ Available on:
Hit CTRL-C to stop the server
```

Wait until Cheetah and the WebVoiceProcessor have initialized. Say any phrase and Cheetah will start transcribing in real time.
Wait until Cheetah and the WebVoiceProcessor have initialized. Choose an audio file or record audio to transcribe.
4 changes: 2 additions & 2 deletions demo/web/index.html
@@ -3,7 +3,7 @@
<head>
<script src="node_modules/@picovoice/web-voice-processor/dist/iife/index.js"></script>
<script src="node_modules/@picovoice/cheetah-web/dist/iife/index.js"></script>
<script src="cheetah_params.js"></script>
<script src="models/cheetahModel.js"></script>
<script type="application/javascript">
let cheetah = null;

@@ -29,7 +29,7 @@
cheetah = await CheetahWeb.CheetahWorker.create(
accessKey,
cheetahTranscriptionCallback,
{ base64: modelParams },
cheetahModel,
{ enableAutomaticPunctuation: true }
);

5 changes: 2 additions & 3 deletions demo/web/package.json
@@ -5,8 +5,7 @@
"main": "index.js",
"private": true,
"scripts": {
"postinstall": "npx pvbase64 -i ../../lib/common/cheetah_params.pv -o ./cheetah_params.js",
"start": "yarn run http-server -a localhost -p 5000"
"start": "node scripts/run_demo.js"
},
"keywords": [
"Picovoice",
@@ -18,7 +17,7 @@
"author": "Picovoice Inc",
"license": "Apache-2.0",
"dependencies": {
"@picovoice/cheetah-web": "~2.0.0",
"@picovoice/cheetah-web": "../../binding/web",
"@picovoice/web-voice-processor": "~4.0.8"
},
"devDependencies": {