Merge pull request #24 from Picovoice/v0.2.2
V0.2.1+ - Improved text normalization
ErisMik authored May 24, 2024
2 parents 5413fb4 + 93a55bd commit 2116849
Showing 44 changed files with 38 additions and 35 deletions.
2 changes: 1 addition & 1 deletion binding/python/setup.py
@@ -49,7 +49,7 @@

setuptools.setup(
name="pvorca",
- version="0.2.1",
+ version="0.2.2",
author="Picovoice",
author_email="[email protected]",
description="Orca Streaming Text-to-Speech Engine",
2 changes: 1 addition & 1 deletion binding/python/test_orca.py
@@ -48,7 +48,7 @@ def _test_audio(self, pcm: Sequence[int], ground_truth: Sequence[int]) -> None:
pcm = pcm[:len(ground_truth)] # compensate for discrepancies due to wav header
self.assertEqual(len(pcm), len(ground_truth))
for i in range(len(pcm)):
- self.assertAlmostEqual(pcm[i], ground_truth[i], delta=500)
+ self.assertAlmostEqual(pcm[i], ground_truth[i], delta=8000)

def _test_equal_timestamp(self, timestamp: float, timestamp_truth: float) -> None:
self.assertAlmostEqual(timestamp, timestamp_truth, places=2)
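
The hunk above loosens the per-sample tolerance from 500 to 8000. As a rough illustration of what a `delta`-based comparison checks, the sketch below performs the same kind of test outside of `unittest`. The helper names `max_abs_diff` and `samples_within` are hypothetical and are not part of the Orca test suite.

```python
from typing import Sequence


def max_abs_diff(pcm: Sequence[int], ground_truth: Sequence[int]) -> int:
    """Return the largest absolute per-sample difference (hypothetical helper)."""
    if len(pcm) != len(ground_truth):
        raise ValueError("sequences must have the same length")
    return max((abs(a - b) for a, b in zip(pcm, ground_truth)), default=0)


def samples_within(pcm: Sequence[int], ground_truth: Sequence[int], delta: int) -> bool:
    """True if every sample differs from its reference by at most `delta`,
    which is the condition `assertAlmostEqual(..., delta=delta)` accepts."""
    return max_abs_diff(pcm, ground_truth) <= delta
```

Reporting the maximum difference, rather than only pass or fail, can help when choosing a tolerance for a new model or platform.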
1 change: 1 addition & 0 deletions binding/web/test/orca.test.ts
@@ -28,6 +28,7 @@ const EXPECTED_VALID_CHARACTERS = [
'Y', 'Z', '\'', '{', '}', '|', ' ',
'-', '1', '2', '3', '4', '5', '6',
'7', '8', '9', '0', '@', '%', '&',
+ '\n', '_', '(', ')',
];

const EXACT_ALIGNMENT_TEST_MODEL_IDENTIFIER = 'female';
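
This hunk adds `'\n'`, `'_'`, `'('` and `')'` to the expected valid characters. A minimal sketch of how input text could be checked against such a list is shown below. The function `unsupported_characters` and the set `EXAMPLE_VALID` are hypothetical; the real list comes from the engine, and `EXAMPLE_VALID` is only an illustrative subset.

```python
from typing import Iterable, List


def unsupported_characters(text: str, valid_characters: Iterable[str]) -> List[str]:
    """Return the distinct characters of `text` that are not in
    `valid_characters`, in order of first appearance (hypothetical helper)."""
    valid = set(valid_characters)
    seen = set()
    result = []
    for ch in text:
        if ch not in valid and ch not in seen:
            seen.add(ch)
            result.append(ch)
    return result


# Illustrative subset only (assumption): lowercase letters, space,
# plus the four characters added in this hunk.
EXAMPLE_VALID = set("abcdefghijklmnopqrstuvwxyz ") | {"\n", "_", "(", ")"}
```

With this subset, a newline passes the check while a carriage return is reported, which matches the change to the invalid-text test data in this commit.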
2 changes: 1 addition & 1 deletion demo/python/requirements.txt
@@ -1,4 +1,4 @@
numpy>=1.24.0
- pvorca==0.2.1
+ pvorca==0.2.2
sounddevice==0.4.6
tiktoken==0.6.0
4 changes: 2 additions & 2 deletions demo/python/setup.py
@@ -26,15 +26,15 @@

setuptools.setup(
name="pvorcademo",
- version="0.2.1",
+ version="0.2.2",
author="Picovoice",
author_email="[email protected]",
description="Orca Streaming Text-to-Speech Engine demos",
long_description=long_description,
long_description_content_type="text/markdown",
url="https://github.com/Picovoice/orca",
packages=["pvorcademo"],
- install_requires=["numpy>=1.24.0", "pvorca==0.2.1", "sounddevice==0.4.6", "tiktoken==0.6.0"],
+ install_requires=["numpy>=1.24.0", "pvorca==0.2.2", "sounddevice==0.4.6", "tiktoken==0.6.0"],
include_package_data=True,
classifiers=[
"Development Status :: 4 - Beta",
12 changes: 6 additions & 6 deletions include/pv_orca.h
@@ -29,11 +29,11 @@ extern "C" {
* 1) Single synthesis: converts a given text to audio. Function `pv_orca_synthesize()` returns the raw audio data,
* function `pv_orca_synthesize_to_file()` saves the audio to a file.
* 2) Streaming synthesis: Converts a stream of text to a stream of audio. An OrcaStream object can be opened with
- * `pv_orca_stream_open()` and text can be added with `pv_orca_stream_synthesize()`. The audio is
- * generated in chunks whenever enough text has been buffered. When the text stream is finalized,
- * the caller needs to use `pv_orca_stream_flush()` to generate the audio for the remaining text that has
- * not been synthesized. The stream can be closed with `pv_orca_stream_close()`.
- * Single synthesis functions cannot be called while a stream is open.
+ * `pv_orca_stream_open()` and text chunks can be added with `pv_orca_stream_synthesize()`.
+ * The incoming text is buffered internally and only when enough context is available will an audio chunk
+ * be generated. When the text stream has concluded, the caller needs to use `pv_orca_stream_flush()`
+ * to generate the audio for the remaining buffer that has yet to be synthesized. The stream can be closed
+ * with `pv_orca_stream_close()`. Single synthesis functions cannot be called while a stream is open.
*/
typedef struct pv_orca pv_orca_t;
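
The buffering behavior described in the comment above can be pictured with a toy model. The sketch below is not the Orca API and does not reproduce its algorithm: the class `ToyTextStream`, its `min_context` threshold, and the rule of emitting text up to the last space are all assumptions made purely for illustration. It returns text segments where the real engine would return audio.

```python
from typing import Optional


class ToyTextStream:
    """Toy model of buffer-then-emit streaming; not the Orca implementation."""

    def __init__(self, min_context: int = 12) -> None:
        self._min_context = min_context  # hypothetical threshold, in characters
        self._buffer = ""

    def synthesize(self, text: str) -> Optional[str]:
        """Buffer `text`; return a segment once enough context is available,
        otherwise return None and keep buffering."""
        self._buffer += text
        if len(self._buffer) < self._min_context:
            return None
        cut = self._buffer.rfind(" ")
        if cut <= 0:
            return None  # no word boundary yet, keep buffering
        segment = self._buffer[:cut]
        self._buffer = self._buffer[cut + 1:]
        return segment

    def flush(self) -> Optional[str]:
        """Return whatever is still buffered and clear the buffer."""
        segment, self._buffer = self._buffer, ""
        return segment or None
```

In this model, a caller feeds chunks to `synthesize()` as they arrive, handles each non-`None` result, and calls `flush()` once at the end so that the trailing text is not lost.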

@@ -273,7 +273,7 @@ PV_API pv_status_t pv_orca_stream_open(
* The caller is responsible for deleting the generated audio with `pv_orca_pcm_delete()`.
*
* @param object The OrcaStream object.
- * @param text A chunk of text from a text input stream, comprised of valid characters.
+ * @param text A chunk of text from a text input stream. Characters not supported by Orca will be ignored.
* Valid characters can be retrieved by calling `pv_orca_valid_characters()`.
* Custom pronunciations can be embedded in the text via the syntax `{word|pronunciation}`. They need to be
* added in a single call to this function. The pronunciation is expressed in ARPAbet format,
Binary file modified lib/android/arm64-v8a/libpv_orca.so
Binary file not shown.
Binary file modified lib/android/armeabi-v7a/libpv_orca.so
Binary file not shown.
Binary file modified lib/android/x86/libpv_orca.so
Binary file not shown.
Binary file modified lib/android/x86_64/libpv_orca.so
Binary file not shown.
10 changes: 5 additions & 5 deletions lib/ios/PvOrca.xcframework/Info.plist
@@ -6,30 +6,30 @@
<array>
<dict>
<key>LibraryIdentifier</key>
- <string>ios-arm64</string>
+ <string>ios-arm64_x86_64-simulator</string>
<key>LibraryPath</key>
<string>PvOrca.framework</string>
<key>SupportedArchitectures</key>
<array>
<string>arm64</string>
+ <string>x86_64</string>
</array>
<key>SupportedPlatform</key>
<string>ios</string>
+ <key>SupportedPlatformVariant</key>
+ <string>simulator</string>
</dict>
<dict>
<key>LibraryIdentifier</key>
- <string>ios-arm64_x86_64-simulator</string>
+ <string>ios-arm64</string>
<key>LibraryPath</key>
<string>PvOrca.framework</string>
<key>SupportedArchitectures</key>
<array>
<string>arm64</string>
- <string>x86_64</string>
</array>
<key>SupportedPlatform</key>
<string>ios</string>
- <key>SupportedPlatformVariant</key>
- <string>simulator</string>
</dict>
</array>
<key>CFBundlePackageType</key>
@@ -72,8 +72,6 @@ PV_API pv_status_t pv_get_error_stack(
*/
PV_API void pv_free_error_stack(char **message_stack);

- PV_API void pv_set_sdk(const char *sdk);

#ifdef __cplusplus
}

@@ -29,11 +29,11 @@ extern "C" {
* 1) Single synthesis: converts a given text to audio. Function `pv_orca_synthesize()` returns the raw audio data,
* function `pv_orca_synthesize_to_file()` saves the audio to a file.
* 2) Streaming synthesis: Converts a stream of text to a stream of audio. An OrcaStream object can be opened with
- * `pv_orca_stream_open()` and text can be added with `pv_orca_stream_synthesize()`. The audio is
- * generated in chunks whenever enough text has been buffered. When the text stream is finalized,
- * the caller needs to use `pv_orca_stream_flush()` to generate the audio for the remaining text that has
- * not been synthesized. The stream can be closed with `pv_orca_stream_close()`.
- * Single synthesis functions cannot be called while a stream is open.
+ * `pv_orca_stream_open()` and text chunks can be added with `pv_orca_stream_synthesize()`.
+ * The incoming text is buffered internally and only when enough context is available will an audio chunk
+ * be generated. When the text stream has concluded, the caller needs to use `pv_orca_stream_flush()`
+ * to generate the audio for the remaining buffer that has yet to be synthesized. The stream can be closed
+ * with `pv_orca_stream_close()`. Single synthesis functions cannot be called while a stream is open.
*/
typedef struct pv_orca pv_orca_t;

@@ -190,7 +190,8 @@ typedef struct {
/**
* Generates audio from text. The returned audio contains the speech representation of the text.
* This function returns `PV_STATUS_INVALID_STATE` if an OrcaStream object is open.
- * The memory of the returned audio is allocated by Orca and can be deleted with `pv_orca_pcm_delete()`
+ * The memory of the returned audio and the alignment metadata is allocated by Orca and can be deleted with
+ * `pv_orca_pcm_delete()` and `pv_orca_word_alignments_delete()`, respectively.
*
* @param object The Orca object.
* @param text Text to be converted to audio. The maximum length can be attained by calling
@@ -219,6 +220,8 @@ PV_API pv_status_t pv_orca_synthesize(
/**
* Generates audio from text and saves it to a file. The file contains the speech representation of the text.
* This function returns `PV_STATUS_INVALID_STATE` if an OrcaStream object is open.
+ * The memory of the returned alignment metadata is allocated by Orca and can be deleted with
+ * `pv_orca_word_alignments_delete()`.
*
* @param object The Orca object.
* @param text Text to be converted to audio. The maximum length can be attained by calling
@@ -264,7 +267,7 @@ PV_API pv_status_t pv_orca_stream_open(
/**
* Adds a chunk of text to the OrcaStream object and generates audio if enough text has been added.
* This function is expected to be called multiple times with consecutive chunks of text from a text stream.
- * The incoming text is buffered as it arrives until the length is long enough to convert a chunk of the buffered
+ * The incoming text is buffered as it arrives until there is enough context to convert a chunk of the buffered
* text into audio. The caller needs to use `pv_orca_stream_flush()` to generate the audio chunk for the remaining
* text that has not yet been synthesized.
* The caller is responsible for deleting the generated audio with `pv_orca_pcm_delete()`.
Binary file modified lib/ios/PvOrca.xcframework/ios-arm64/PvOrca.framework/PvOrca
Binary file not shown.
@@ -72,8 +72,6 @@ PV_API pv_status_t pv_get_error_stack(
*/
PV_API void pv_free_error_stack(char **message_stack);

- PV_API void pv_set_sdk(const char *sdk);

#ifdef __cplusplus
}

@@ -29,11 +29,11 @@ extern "C" {
* 1) Single synthesis: converts a given text to audio. Function `pv_orca_synthesize()` returns the raw audio data,
* function `pv_orca_synthesize_to_file()` saves the audio to a file.
* 2) Streaming synthesis: Converts a stream of text to a stream of audio. An OrcaStream object can be opened with
- * `pv_orca_stream_open()` and text can be added with `pv_orca_stream_synthesize()`. The audio is
- * generated in chunks whenever enough text has been buffered. When the text stream is finalized,
- * the caller needs to use `pv_orca_stream_flush()` to generate the audio for the remaining text that has
- * not been synthesized. The stream can be closed with `pv_orca_stream_close()`.
- * Single synthesis functions cannot be called while a stream is open.
+ * `pv_orca_stream_open()` and text chunks can be added with `pv_orca_stream_synthesize()`.
+ * The incoming text is buffered internally and only when enough context is available will an audio chunk
+ * be generated. When the text stream has concluded, the caller needs to use `pv_orca_stream_flush()`
+ * to generate the audio for the remaining buffer that has yet to be synthesized. The stream can be closed
+ * with `pv_orca_stream_close()`. Single synthesis functions cannot be called while a stream is open.
*/
typedef struct pv_orca pv_orca_t;

@@ -190,7 +190,8 @@ typedef struct {
/**
* Generates audio from text. The returned audio contains the speech representation of the text.
* This function returns `PV_STATUS_INVALID_STATE` if an OrcaStream object is open.
- * The memory of the returned audio is allocated by Orca and can be deleted with `pv_orca_pcm_delete()`
+ * The memory of the returned audio and the alignment metadata is allocated by Orca and can be deleted with
+ * `pv_orca_pcm_delete()` and `pv_orca_word_alignments_delete()`, respectively.
*
* @param object The Orca object.
* @param text Text to be converted to audio. The maximum length can be attained by calling
@@ -219,6 +220,8 @@ PV_API pv_status_t pv_orca_synthesize(
/**
* Generates audio from text and saves it to a file. The file contains the speech representation of the text.
* This function returns `PV_STATUS_INVALID_STATE` if an OrcaStream object is open.
+ * The memory of the returned alignment metadata is allocated by Orca and can be deleted with
+ * `pv_orca_word_alignments_delete()`.
*
* @param object The Orca object.
* @param text Text to be converted to audio. The maximum length can be attained by calling
@@ -264,7 +267,7 @@ PV_API pv_status_t pv_orca_stream_open(
/**
* Adds a chunk of text to the OrcaStream object and generates audio if enough text has been added.
* This function is expected to be called multiple times with consecutive chunks of text from a text stream.
- * The incoming text is buffered as it arrives until the length is long enough to convert a chunk of the buffered
+ * The incoming text is buffered as it arrives until there is enough context to convert a chunk of the buffered
* text into audio. The caller needs to use `pv_orca_stream_flush()` to generate the audio chunk for the remaining
* text that has not yet been synthesized.
* The caller is responsible for deleting the generated audio with `pv_orca_pcm_delete()`.
Binary file not shown.
Binary file modified lib/java/jetson/cortex-a57-aarch64/libpv_orca_jni.so
Binary file not shown.
Binary file modified lib/java/linux/x86_64/libpv_orca_jni.so
Binary file not shown.
Binary file modified lib/java/mac/arm64/libpv_orca_jni.dylib
Binary file not shown.
Binary file modified lib/java/mac/x86_64/libpv_orca_jni.dylib
Binary file not shown.
Binary file modified lib/java/raspberry-pi/cortex-a53-aarch64/libpv_orca_jni.so
Binary file not shown.
Binary file modified lib/java/raspberry-pi/cortex-a53/libpv_orca_jni.so
Binary file not shown.
Binary file modified lib/java/raspberry-pi/cortex-a72-aarch64/libpv_orca_jni.so
Binary file not shown.
Binary file modified lib/java/raspberry-pi/cortex-a72/libpv_orca_jni.so
Binary file not shown.
Binary file modified lib/java/raspberry-pi/cortex-a76-aarch64/libpv_orca_jni.so
Binary file not shown.
Binary file modified lib/java/raspberry-pi/cortex-a76/libpv_orca_jni.so
Binary file not shown.
Binary file modified lib/java/windows/amd64/pv_orca_jni.dll
Binary file not shown.
Binary file modified lib/jetson/cortex-a57-aarch64/libpv_orca.so
Binary file not shown.
Binary file modified lib/linux/x86_64/libpv_orca.so
Binary file not shown.
Binary file modified lib/mac/arm64/libpv_orca.dylib
Binary file not shown.
Binary file modified lib/mac/x86_64/libpv_orca.dylib
Binary file not shown.
Binary file modified lib/raspberry-pi/cortex-a53-aarch64/libpv_orca.so
Binary file not shown.
Binary file modified lib/raspberry-pi/cortex-a53/libpv_orca.so
Binary file not shown.
Binary file modified lib/raspberry-pi/cortex-a72-aarch64/libpv_orca.so
Binary file not shown.
Binary file modified lib/raspberry-pi/cortex-a72/libpv_orca.so
Binary file not shown.
Binary file modified lib/raspberry-pi/cortex-a76-aarch64/libpv_orca.so
Binary file not shown.
Binary file modified lib/raspberry-pi/cortex-a76/libpv_orca.so
Binary file not shown.
Binary file modified lib/wasm/pv_orca.wasm
Binary file not shown.
Binary file modified lib/wasm/pv_orca_simd.wasm
Binary file not shown.
Binary file modified lib/windows/amd64/libpv_orca.dll
Binary file not shown.
2 changes: 1 addition & 1 deletion resources/.test/test_data.json
@@ -6,7 +6,7 @@
"text_alignment": "Test alignment.",
"text_invalid": [
"Symbols *$",
- "Escape characters \n",
+ "Escape characters \r",
"\"ی\", \"ء\"",
"ॐÁ hindi and spanish",
"Б russian",
Binary file modified resources/.test/wav/orca_params_female_stream.wav
Binary file not shown.
Binary file modified resources/.test/wav/orca_params_male_stream.wav
Binary file not shown.
