Orca prepare v0.2 #17

Merged on May 10, 2024 (181 commits)
2b11ca0
update orca
bejager Apr 9, 2024
51390be
add libs android, ios, java, mac, rpi
bejager Apr 9, 2024
894bedc
update tests to compare against raw pcm
bejager Apr 9, 2024
09dd9d0
first try streaming orca demo
bejager Apr 9, 2024
adb4963
version with interactive text input
bejager Apr 10, 2024
c3a4637
refactor streaming demo
bejager Apr 10, 2024
ee6369d
updated binding
bejager Apr 10, 2024
29dea80
update tests
bejager Apr 10, 2024
3e338b1
let user choose t/s
bejager Apr 10, 2024
5aa2ab6
include open ai llm in demo
bejager Apr 11, 2024
db22da2
add open ai tts
bejager Apr 11, 2024
9126b61
update libs
bejager Apr 11, 2024
c7129a6
refactor streaming demo
bejager Apr 11, 2024
fa46e69
tweaks
bejager Apr 11, 2024
b8c3166
move to llm folder
bejager Apr 11, 2024
4bb83a7
review tests
bejager Apr 11, 2024
c7b27a3
pin pip versions
bejager Apr 11, 2024
288f31f
don't depend on numpy
bejager Apr 11, 2024
38a5abd
use local package for python demo
bejager Apr 11, 2024
2fb2167
fix c test
bejager Apr 11, 2024
a859b21
add to dict, fix requirements for python demo
bejager Apr 11, 2024
7d9e567
update libs and fix test
bejager Apr 13, 2024
87ac894
update progress printer
bejager Apr 15, 2024
e775341
tweaks
bejager Apr 15, 2024
a51c81c
add python streaming demo
bejager Apr 16, 2024
3643d9b
add C streaming demo
bejager Apr 16, 2024
cead4be
add C streaming demo file
bejager Apr 16, 2024
0db84b4
add streaming demo to actions, add to README
bejager Apr 16, 2024
e9cc059
fix python tests
bejager Apr 16, 2024
d1c2f26
update llm demo
bejager Apr 16, 2024
b99308d
tweaks
bejager Apr 16, 2024
3da8c43
wip
bejager Apr 16, 2024
82d993c
update llm demo
bejager Apr 16, 2024
8b56b02
update llm demo
bejager Apr 17, 2024
4ed95fe
clean-up progress printer
bejager Apr 17, 2024
36db3ac
tweaks
bejager Apr 17, 2024
fad2982
merge
bejager Apr 22, 2024
4ac5b6a
option to save metadata for animation
bejager Apr 22, 2024
59fa51f
wip
bejager Apr 22, 2024
fd4712c
Merge branch 'orca-prepare-v0.2' of github.com:Picovoice/orca into or…
bejager Apr 22, 2024
a46cea3
update models
bejager Apr 22, 2024
d5c96db
fix llm demo
bejager Apr 22, 2024
89cf2e2
demo styling
bejager Apr 23, 2024
a0ecd4e
tweak
bejager Apr 23, 2024
606f40b
update python demo
bejager Apr 30, 2024
ee2a927
smart wait chunk calculation
bejager May 1, 2024
dbff25b
refactor python demo and update readme
bejager May 1, 2024
4ed4d90
clean-up
bejager May 1, 2024
651dd6f
tweaks
bejager May 1, 2024
4860e1a
comment exact alignment test
bejager May 1, 2024
fefc482
make spell check happy
bejager May 1, 2024
0ed4fa7
update workflows
bejager May 1, 2024
9a571d3
no v3.7
bejager May 1, 2024
0caf1e8
update tests
bejager May 1, 2024
1478902
update requirements
bejager May 1, 2024
494ff27
install dependency
bejager May 2, 2024
5d57420
update dependencies
bejager May 2, 2024
976a9c1
add fallback when no audio device is connected to runner
bejager May 2, 2024
8e3e4fc
update readme
bejager May 2, 2024
dadd98d
install portaudio on ubuntu
bejager May 2, 2024
d10f896
update README
bejager May 2, 2024
2347cc2
update libs
bejager May 2, 2024
b97651b
update demos
bejager May 2, 2024
f86b6c3
update alignment test data
bejager May 2, 2024
18e948c
Merge branch 'main' into orca-prepare-v0.2
bejager May 2, 2024
e618874
max character limit
albho Apr 16, 2024
816a30b
set random state
albho Apr 16, 2024
201f814
alignments, cleanup types
albho Apr 16, 2024
891b5bc
minor
albho Apr 18, 2024
b321890
main good
albho Apr 19, 2024
e926e8d
works
albho Apr 20, 2024
9267d89
worker & demo
albho Apr 22, 2024
edc1e4f
readme
albho Apr 22, 2024
b6dedc3
spelling
albho Apr 22, 2024
e5429d6
use promises instead of callbacks
albho Apr 23, 2024
8e619d4
update demo
albho Apr 25, 2024
7e2a92b
update demo
albho Apr 29, 2024
7500ecf
update
albho Apr 30, 2024
6bd7d6e
revert
albho Apr 30, 2024
8683598
minor
albho Apr 30, 2024
0ae5dd8
wait for second pcm to start playing audio
albho May 1, 2024
edc52f2
cleanup
albho May 2, 2024
cd0beae
update tests
bejager May 2, 2024
89f99ee
fix test
albho May 2, 2024
f6de5ef
test install portaudio
bejager May 2, 2024
ed8b9f7
tweaks
bejager May 2, 2024
6e2fb0d
more tweaks
bejager May 2, 2024
10c65c1
bump python version and test 3.10+
bejager May 2, 2024
7e3cfca
open numpy version
bejager May 2, 2024
66305ed
readme fixes
bejager May 3, 2024
d1e71c6
update readmes and workflows
bejager May 3, 2024
2724681
review
bejager May 3, 2024
9a4ddcc
open numpy
bejager May 3, 2024
5917c54
update demo setup script
bejager May 3, 2024
d975bdc
printouts to c demo test
bejager May 3, 2024
66603ed
fix ci, review 2
bejager May 3, 2024
a8915ab
add to spell dict
bejager May 3, 2024
1d42539
cleanup
bejager May 3, 2024
14b7a0c
align readmes with python
bejager May 3, 2024
ce9ac68
Merge branch 'orca-prepare-v0.2' into web-v0.2-update
bejager May 3, 2024
96c088f
README updates
bejager May 3, 2024
4717268
after release
bejager May 3, 2024
96c72fe
update
albho May 5, 2024
9bd158d
init
albho May 2, 2024
ea37402
update
albho May 2, 2024
5028cd0
fix
albho May 2, 2024
04242a0
fix
albho May 2, 2024
f199010
update
albho May 2, 2024
0447ae9
update
albho May 2, 2024
98e7233
update libs
albho May 2, 2024
1298370
revert
albho May 2, 2024
cb8beaa
update
albho May 2, 2024
2a3ed75
version
albho May 3, 2024
aea6bbf
update
albho May 6, 2024
e0d821d
test
albho May 6, 2024
c1a7f75
streaming working
albho May 6, 2024
38a5341
working
albho May 7, 2024
8f02600
cleanup
albho May 7, 2024
caaf4cd
minor
albho May 7, 2024
fb76010
minor
albho May 7, 2024
4102949
python open_stream -> stream_open
bejager May 7, 2024
6ed3ecc
update
albho May 7, 2024
ef567a7
minor
albho May 7, 2024
cb3ff2c
update readmes and comments
albho May 7, 2024
7997362
binding
albho Apr 26, 2024
2cfae3d
demo
albho May 1, 2024
6d61d8c
cleanup
albho May 1, 2024
4d5e1b4
wait for second chunk to play audio
albho May 1, 2024
529e250
minor clean
albho May 2, 2024
a2c1c27
update readme
bejager May 3, 2024
62f314c
update
albho May 5, 2024
fdc52a4
version
albho May 6, 2024
a8a64b2
update
albho May 7, 2024
9288b22
update ui
albho May 7, 2024
56e6cb1
update comments
albho May 8, 2024
7019010
change folder name and small improvements
bejager May 8, 2024
7854069
change title
bejager May 8, 2024
41a4397
staged
albho May 8, 2024
2e31804
actions
albho May 8, 2024
0f0e08b
try actions
albho May 8, 2024
74620c9
double quotes
albho May 8, 2024
af3a022
rm branch for perf
albho May 8, 2024
7d87858
revert perf
albho May 8, 2024
6451d09
Update web.yml
laves May 8, 2024
c55254d
Update web.yml
laves May 8, 2024
8e6caa8
post-init constants set on init
albho May 8, 2024
e8b3c03
initialize post-init constants in init function
bejager May 8, 2024
a62e910
workflow
albho May 8, 2024
a8136f3
fix
albho May 8, 2024
5491b17
trigger wf
albho May 8, 2024
e054363
trigger workflows, allow streaming synth on invalid input
albho May 8, 2024
741b11c
try fix appcenter
albho May 8, 2024
7fed5cb
protected
albho May 8, 2024
aa55260
harmonize python demo arguments
bejager May 8, 2024
26c23d2
update
albho May 8, 2024
d08f707
update
albho May 8, 2024
e264960
Merge pull request #20 from Picovoice/web-v0.2-update
bejager May 8, 2024
5beec79
update demo
albho May 8, 2024
585c8fa
workflows
albho May 8, 2024
7ec64c7
fix test
albho May 8, 2024
805cdae
test
albho May 8, 2024
100f9e2
update web readme
bejager May 8, 2024
8c4d791
fix perf test
albho May 9, 2024
10864b7
try fix perf
albho May 9, 2024
0a648b2
orca -> orcastream
bejager May 9, 2024
579d2d2
Merge pull request #22 from Picovoice/android-v0.2-update
bejager May 9, 2024
fcae359
update main readmes with new web and android
bejager May 9, 2024
190890c
web make OrcaStream type available
albho May 9, 2024
f1f435e
revert header change
albho May 9, 2024
d5e52f9
harmonize readmes
bejager May 9, 2024
f16dff6
demo update
albho May 9, 2024
65f0f5a
change to tag
albho May 9, 2024
06d7fc6
python release 0.2.1
bejager May 9, 2024
ea2844e
minor
albho May 9, 2024
0db4fcc
add iOS streaming version to main readme
bejager May 9, 2024
dbe3c57
web post-release
albho May 9, 2024
335444f
android post-release
albho May 9, 2024
850a213
ios post-release
albho May 9, 2024
ffacaf3
Merge pull request #23 from Picovoice/ios-v0.2-update
bejager May 9, 2024
c1bf63c
cleanup web, ios, android workflows
albho May 9, 2024
26e3d53
ios lint
albho May 10, 2024
binding
albho committed May 7, 2024
commit 7997362ca9d6b1cc8675d4aeefc2dcae4440a025
106 changes: 99 additions & 7 deletions binding/android/Orca/orca/src/main/java/ai/picovoice/orca/Orca.java
@@ -34,6 +34,7 @@ public class Orca {
}

private long handle;
private long stream;

/**
* Constructor.
@@ -96,7 +97,7 @@ public void delete() {
* @return The output audio.
* @throws OrcaException if there is an error while synthesizing audio.
*/
public short[] synthesize(String text, OrcaSynthesizeParams params) throws OrcaException {
public OrcaAudio synthesize(String text, OrcaSynthesizeParams params) throws OrcaException {
if (handle == 0) {
throw new OrcaInvalidStateException(
"Attempted to call Orca synthesize after delete."
@@ -106,7 +107,8 @@ public short[] synthesize(String text, OrcaSynthesizeParams params) throws OrcaE
return OrcaNative.synthesize(
handle,
text,
params.getSpeechRate());
params.getSpeechRate(),
params.getRandomState());
}

/**
@@ -123,7 +125,7 @@ public short[] synthesize(String text, OrcaSynthesizeParams params) throws OrcaE
* @param params Global parameters for synthesized text. See 'OrcaSynthesizeParams' for details.
* @throws OrcaException if there is an error while synthesizing audio to file.
*/
public void synthesizeToFile(
public OrcaWord[] synthesizeToFile(
String text,
String outputPath,
OrcaSynthesizeParams params) throws OrcaException {
@@ -133,11 +135,95 @@ public void synthesizeToFile(
);
}

OrcaNative.synthesizeToFile(
OrcaAudio result = OrcaNative.synthesizeToFile(
handle,
text,
outputPath,
params.getSpeechRate());
params.getSpeechRate(),
params.getRandomState());

return result.getWordArray();
}

/**
* Opens a stream for synthesizing audio from a live stream of text.
*
* @param params Global parameters for synthesized text. See 'OrcaSynthesizeParams' for details.
* @return Handle of the opened Orca stream.
* @throws OrcaException if there is an error while opening the stream.
*/
public long streamOpen(OrcaSynthesizeParams params) throws OrcaException {
if (handle == 0) {
throw new OrcaInvalidStateException(
"Attempted to call Orca streamOpen after delete."
);
}

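// Store the handle so that streamSynthesize, streamFlush and streamClose can use it.
stream = OrcaNative.streamOpen(
handle,
params.getSpeechRate(),
params.getRandomState());
return stream;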
}

/**
* Generates audio from a live stream of text. The returned audio contains the speech representation of the text.
*
* @param text Text to be converted to audio. The maximum length can be attained by calling
* `getMaxCharacterLimit()`. Allowed characters can be retrieved by calling
* `getValidCharacters()`. Custom pronunciations can be embedded in the text via the
* syntax `{word|pronunciation}`. The pronunciation is expressed in ARPAbet format,
* e.g.: `I {liv|L IH V} in {Sevilla|S EH V IY Y AH}`.
* @throws OrcaException if there is an error while synthesizing audio.
*/
public short[] streamSynthesize(String text) throws OrcaException {
if (handle == 0) {
throw new OrcaInvalidStateException(
"Attempted to call Orca streamSynthesize after delete."
);
}
if (stream == 0) {
throw new OrcaInvalidStateException(
"Stream not initialized."
);
}

return OrcaNative.streamSynthesize(stream, text);
}

/**
* Flushes remaining text. The returned audio contains the speech representation of the text.
*
* @throws OrcaException if there is an error while synthesizing audio.
*/
public short[] streamFlush() throws OrcaException {
if (handle == 0) {
throw new OrcaInvalidStateException(
"Attempted to call Orca streamFlush after delete."
);
}
if (stream == 0) {
throw new OrcaInvalidStateException(
"Stream not initialized."
);
}

return OrcaNative.streamFlush(stream);
}

/**
* Closes the Orca stream and releases its resources.
*
* @throws OrcaException if there is an error while closing the stream.
*/
public void streamClose() throws OrcaException {
if (handle == 0) {
throw new OrcaInvalidStateException(
"Attempted to call Orca streamClose after delete."
);
}
if (stream != 0) {
OrcaNative.streamClose(stream);
// Reset the field so a closed stream cannot be reused or closed twice.
stream = 0;
}
}

/**
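The streaming methods added in this hunk imply a call order: open a stream, feed text incrementally, flush the remainder, then close. Below is a minimal sketch of that lifecycle. It uses a hypothetical in-memory stand-in (`FakeStream`) instead of the JNI-backed `Orca` class, and both the buffering rule (audio is emitted only up to the last complete sentence) and the "one sample per character" audio are assumptions made purely for illustration.

```java
// Hypothetical stand-in for the JNI-backed stream; not the real Orca API.
class StreamLifecycleSketch {

    static class FakeStream {
        private final StringBuilder pending = new StringBuilder();
        private boolean open = true;

        // Buffers text and returns one fake sample per character, but only
        // up to the last complete sentence (assumed buffering rule).
        short[] synthesize(String text) {
            requireOpen();
            pending.append(text);
            int end = pending.lastIndexOf(".");
            if (end < 0) {
                return new short[0];
            }
            short[] pcm = new short[end + 1];
            pending.delete(0, end + 1);
            return pcm;
        }

        // Returns fake samples for whatever text is still buffered.
        short[] flush() {
            requireOpen();
            short[] pcm = new short[pending.length()];
            pending.setLength(0);
            return pcm;
        }

        void close() {
            open = false;
        }

        private void requireOpen() {
            if (!open) {
                throw new IllegalStateException("Stream not initialized.");
            }
        }
    }

    // Feeds tokens one by one, flushes the tail, and always closes the stream.
    static int totalSamples(String[] tokens) {
        FakeStream stream = new FakeStream();
        int total = 0;
        try {
            for (String token : tokens) {
                total += stream.synthesize(token).length;
            }
            total += stream.flush().length;
        } finally {
            stream.close();
        }
        return total;
    }

    public static void main(String[] args) {
        String[] tokens = {"Hello ", "world. ", "How are", " you"};
        // Prints 24: one fake sample per input character.
        System.out.println(totalSamples(tokens));
    }
}
```

The `try`/`finally` is the part worth carrying over to real code: the stream is closed even if a synthesize call throws.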
@@ -154,8 +240,14 @@ public String getVersion() {
*
* @return The maximum number of characters that can be synthesized at once.
*/
public int getMaxCharacterLimit() {
return OrcaNative.getMaxCharacterLimit();
public int getMaxCharacterLimit() throws OrcaException {
if (handle == 0) {
throw new OrcaInvalidStateException(
"Attempted to call Orca getMaxCharacterLimit after delete."
);
}

return OrcaNative.getMaxCharacterLimit(handle);
}

/**
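`getMaxCharacterLimit()` bounds how much text a single synthesize call accepts. One way a caller might respect that bound is to split long input at whitespace before synthesizing. The sketch below is an illustration only: `splitByLimit` is a hypothetical helper, and the limit is passed in as a plain `int` rather than queried from an `Orca` instance.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkByLimitSketch {

    // Splits text on whitespace into chunks of at most maxChars characters.
    // A single word longer than maxChars is hard-split.
    static List<String> splitByLimit(String text, int maxChars) {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            // Hard-split words that can never fit into one chunk.
            while (word.length() > maxChars) {
                if (current.length() > 0) {
                    chunks.add(current.toString());
                    current.setLength(0);
                }
                chunks.add(word.substring(0, maxChars));
                word = word.substring(maxChars);
            }
            if (word.isEmpty()) {
                continue;
            }
            // Length of the current chunk if this word (plus a space) is appended.
            int needed = current.length() == 0
                    ? word.length()
                    : current.length() + 1 + word.length();
            if (needed > maxChars) {
                chunks.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(word);
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Prints [the quick, brown fox]
        System.out.println(splitByLimit("the quick brown fox", 10));
    }
}
```

Splitting at whitespace keeps words intact; splitting at sentence boundaries would likely give more natural prosody, but that is outside the scope of this sketch.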
@@ -0,0 +1,48 @@
/*
Copyright 2024 Picovoice Inc.
You may not use this file except in compliance with the license. A copy of the license is
located in the "LICENSE" file accompanying this source.
Unless required by applicable law or agreed to in writing, software distributed under the
License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
express or implied. See the License for the specific language governing permissions and
limitations under the License.
*/

package ai.picovoice.orca;

public class OrcaAudio {

private final short[] pcm;
private final OrcaWord[] wordArray;

/**
* Constructor.
*
* @param pcm Synthesized audio.
* @param wordArray Synthesized words and their associated metadata.
*/
public OrcaAudio(short[] pcm, OrcaWord[] wordArray) {
this.pcm = pcm;
this.wordArray = wordArray;
}

/**
* Getter for the synthesized audio.
*
* @return Synthesized audio.
*/
public short[] getPcm() {
return pcm;
}

/**
* Getter for synthesized words and their associated metadata.
*
* @return Synthesized words and their associated metadata.
*/
public OrcaWord[] getWordArray() {
return wordArray;
}
}
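`OrcaAudio.getPcm()` hands back samples as a `short[]`. Callers that write a file or feed an audio device usually need a byte view and a duration. The sketch below shows both under stated assumptions: the samples are 16-bit mono PCM, the output byte order is little-endian, and the sample rate is supplied by the caller (the value used in `main` is a placeholder, not taken from this PR). `PcmUtilSketch` and its methods are hypothetical helpers.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class PcmUtilSketch {

    // Converts 16-bit samples to little-endian bytes (two bytes per sample).
    static byte[] toLittleEndianBytes(short[] pcm) {
        ByteBuffer buffer = ByteBuffer.allocate(pcm.length * 2).order(ByteOrder.LITTLE_ENDIAN);
        for (short sample : pcm) {
            buffer.putShort(sample);
        }
        return buffer.array();
    }

    // Duration of mono audio in seconds for a caller-supplied sample rate.
    static double durationSec(short[] pcm, int sampleRate) {
        if (sampleRate <= 0) {
            throw new IllegalArgumentException("sampleRate must be positive");
        }
        return (double) pcm.length / sampleRate;
    }

    public static void main(String[] args) {
        short[] pcm = new short[22050]; // placeholder: silence
        System.out.println(toLittleEndianBytes(pcm).length); // 44100 bytes
        System.out.println(durationSec(pcm, 22050));         // 1.0
    }
}
```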
@@ -26,16 +26,31 @@ class OrcaNative {

static native String[] getValidCharacters(long object) throws OrcaException;

static native int getMaxCharacterLimit();
static native int getMaxCharacterLimit(long object) throws OrcaException;

static native short[] synthesize(
static native OrcaAudio synthesize(
long object,
String text,
float speechRate) throws OrcaException;
float speechRate,
long randomState) throws OrcaException;

static native void synthesizeToFile(
static native OrcaAudio synthesizeToFile(
long object,
String text,
String outputPath,
float speechRate) throws OrcaException;
float speechRate,
long randomState) throws OrcaException;

static native long streamOpen(
long object,
float speechRate,
long randomState) throws OrcaException;

static native short[] streamSynthesize(
long object,
String text) throws OrcaException;

static native short[] streamFlush(long object) throws OrcaException;

static native void streamClose(long object);
}
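Every native method above takes an opaque `long` handle, and the wrapper in `Orca.java` repeats the same zero check before each call. A possible way to factor that out is sketched below. `requireHandle` is a hypothetical helper, and it throws `IllegalStateException` instead of `OrcaInvalidStateException` only to keep the example self-contained.

```java
class HandleGuardSketch {

    // Returns the handle unchanged, or throws if it has already been released.
    static long requireHandle(long handle, String method) {
        if (handle == 0) {
            throw new IllegalStateException(
                    "Attempted to call Orca " + method + " after delete.");
        }
        return handle;
    }

    public static void main(String[] args) {
        System.out.println(requireHandle(42L, "synthesize")); // 42
        try {
            requireHandle(0L, "streamFlush");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With such a helper, a wrapper method could pass `requireHandle(handle, "streamFlush")` straight to the native call instead of repeating the `if` block.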
@@ -0,0 +1,60 @@
/*
Copyright 2024 Picovoice Inc.
You may not use this file except in compliance with the license. A copy of the license is
located in the "LICENSE" file accompanying this source.
Unless required by applicable law or agreed to in writing, software distributed under the
License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
express or implied. See the License for the specific language governing permissions and
limitations under the License.
*/

package ai.picovoice.orca;

public class OrcaPhoneme {

private final String phoneme;
private final float startSec;
private final float endSec;

/**
* Constructor.
*
* @param phoneme Synthesized phoneme.
* @param startSec Start time of the phoneme in seconds.
* @param endSec End time of the phoneme in seconds.
*/
public OrcaPhoneme(String phoneme, float startSec, float endSec) {
this.phoneme = phoneme;
this.startSec = startSec;
this.endSec = endSec;
}

/**
* Getter for the synthesized phoneme.
*
* @return Synthesized phoneme.
*/
public String getPhoneme() {
return phoneme;
}

/**
* Getter for the start time of the phoneme in seconds.
*
* @return Start time of the phoneme in seconds.
*/
public float getStartSec() {
return startSec;
}

/**
* Getter for the end time of the phoneme in seconds.
*
* @return End time of the phoneme in seconds.
*/
public float getEndSec() {
return endSec;
}
}
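The start and end times on `OrcaPhoneme` make it possible to look up which phoneme is being spoken at a given playback position, for example to drive an animation. A minimal sketch follows. The `Phoneme` class is a stand-in that mirrors the three fields of `OrcaPhoneme`; the half-open interval convention `[startSec, endSec)` and the sample timings are assumptions.

```java
class PhonemeLookupSketch {

    // Stand-in mirroring the fields of OrcaPhoneme.
    static class Phoneme {
        final String phoneme;
        final float startSec;
        final float endSec;

        Phoneme(String phoneme, float startSec, float endSec) {
            this.phoneme = phoneme;
            this.startSec = startSec;
            this.endSec = endSec;
        }
    }

    // Returns the phoneme whose [startSec, endSec) interval contains tSec,
    // or null if no phoneme is active at that time.
    static String phonemeAt(Phoneme[] phonemes, float tSec) {
        for (Phoneme p : phonemes) {
            if (tSec >= p.startSec && tSec < p.endSec) {
                return p.phoneme;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Phoneme[] phonemes = {
            new Phoneme("HH", 0.0f, 0.1f),
            new Phoneme("AH", 0.1f, 0.25f),
        };
        System.out.println(phonemeAt(phonemes, 0.05f)); // HH
        System.out.println(phonemeAt(phonemes, 0.1f));  // AH
        System.out.println(phonemeAt(phonemes, 0.3f));  // null
    }
}
```

A linear scan is enough for the phonemes of a single word; for long utterances a binary search over start times would be the obvious refinement.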
@@ -18,12 +18,14 @@
public class OrcaSynthesizeParams {

private final float speechRate;
private final long randomState;

/**
* Constructor.
*/
private OrcaSynthesizeParams(float speechRate) {
private OrcaSynthesizeParams(float speechRate, long randomState) {
this.speechRate = speechRate;
this.randomState = randomState;
}

/**
@@ -35,12 +37,22 @@ public float getSpeechRate() {
return this.speechRate;
}

/**
* Getter for the random state used for the synthesized speech.
*
* @return Random state.
*/
public long getRandomState() {
return this.randomState;
}

/**
* Builder for creating instance of OrcaSynthesizeParams.
*/
public static class Builder {

private float speechRate = 1.0f;
private long randomState = 1; // TODO: After Ben updates the JNI, update this to -1

/**
* Sets the speech rate.
@@ -53,6 +65,17 @@ public Builder setSpeechRate(float speechRate) {
return this;
}

/**
* Sets the random state.
*
* @param randomState The random state for the synthesized speech.
* @return Modified builder object.
*/
public Builder setRandomState(long randomState) {
this.randomState = randomState;
return this;
}

/**
* Validates properties and creates an instance of OrcaSynthesizeParams.
*
@@ -66,7 +89,7 @@ public OrcaSynthesizeParams build() throws OrcaInvalidArgumentException {
);
}

return new OrcaSynthesizeParams(speechRate);
return new OrcaSynthesizeParams(speechRate, randomState);
}
}
}
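The builder now carries two values, with validation deferred to `build()`. The sketch below reproduces that pattern in a self-contained form. `ParamsBuilderSketch.Params` is a stand-in for `OrcaSynthesizeParams`, `IllegalArgumentException` replaces `OrcaInvalidArgumentException`, and the speech-rate bounds are assumed values, since the real check is not visible in this diff.

```java
class ParamsBuilderSketch {

    // Assumed bounds for illustration; the real limits are not shown in the diff.
    static final float MIN_RATE = 0.7f;
    static final float MAX_RATE = 1.3f;

    // Stand-in for OrcaSynthesizeParams: immutable once built.
    static class Params {
        private final float speechRate;
        private final long randomState;

        private Params(float speechRate, long randomState) {
            this.speechRate = speechRate;
            this.randomState = randomState;
        }

        float getSpeechRate() {
            return speechRate;
        }

        long getRandomState() {
            return randomState;
        }

        static class Builder {
            private float speechRate = 1.0f;
            private long randomState = 1; // mirrors the default in the diff

            Builder setSpeechRate(float speechRate) {
                this.speechRate = speechRate;
                return this;
            }

            Builder setRandomState(long randomState) {
                this.randomState = randomState;
                return this;
            }

            // Validates once, at build time, so setters can be called in any order.
            Params build() {
                if (speechRate < MIN_RATE || speechRate > MAX_RATE) {
                    throw new IllegalArgumentException(
                            "speechRate must be within [" + MIN_RATE + ", " + MAX_RATE + "]");
                }
                return new Params(speechRate, randomState);
            }
        }
    }

    public static void main(String[] args) {
        Params params = new Params.Builder()
                .setSpeechRate(1.2f)
                .setRandomState(42L)
                .build();
        System.out.println(params.getSpeechRate() + " " + params.getRandomState());
    }
}
```

Passing the same random state on every call is what makes it possible to compare outputs across runs, which is presumably why the parameter is exposed.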