Audio Anonymization

Audio anonymization is the process of modifying audio recordings to remove or alter personally identifiable information, thereby protecting the privacy of the speakers. In the context of linguistic and speech analysis, this often involves replacing personal names, locations, dates, and other private information with neutral or unrecognizable sounds, such as beeps.

Anonymization Methods

The script supports two distinct methods of audio anonymization:

1. TextGrid-based Anonymization

This method uses word-level alignment between audio and transcription. It is suitable for systematic anonymization of specific words or of automatically detected personal information.

2. TRS-based Anonymization

This method uses manual annotations in Transcriber AG format (.trs files). It is suitable for selective anonymization of specific speech segments that contain sensitive information.

TextGrid-based Anonymization Process

Prerequisites

Before starting the TextGrid-based anonymization, you need:

Audio recording (.wav format)
Corresponding transcription (.txt format)
Montreal Forced Aligner (MFA) installed
Pronunciation dictionary
Acoustic model

Step 1: Forced Alignment with MFA

Audio files must be aligned with their transcriptions using the Montreal Forced Aligner. The alignment process generates TextGrid files with precise time stamps for each word:

mfa align </path/to/folder/with/wav/and/txt/files/> </path/to/pronunciation_dictionary.txt> </path/to/acoustic_model.zip> </path/to/aligned_output_files/>

This process creates TextGrid files containing:

Word-level segmentation
Time stamps for each word
Phone-level alignments

Detailed alignment instructions can be found in the Montreal Forced Aligner documentation.
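
To make the structure of the aligned output concrete, the following is a minimal sketch of how word intervals could be read from a long-format TextGrid using only the standard library. It is not the code used by audio_anonymizer.py. The function name read_word_intervals is hypothetical, and the tier name "words" is an assumption about the aligner output that should be checked against your own files.

```python
import re
from typing import List, Tuple

Interval = Tuple[float, float, str]


def read_word_intervals(textgrid_text: str, tier_name: str = "words") -> List[Interval]:
    """Return (start, end, word) for every non-empty interval in one tier.

    Assumes the long ("full text") TextGrid layout, where each interval is
    written as an xmin line, an xmax line, and a text line.
    """
    intervals: List[Interval] = []
    current_tier = None
    in_interval = False
    start = end = 0.0
    for raw in textgrid_text.splitlines():
        line = raw.strip()
        name_match = re.match(r'name = "(.*)"$', line)
        if name_match:
            # A new tier starts; remember its name.
            current_tier = name_match.group(1)
            in_interval = False
            continue
        if re.match(r"intervals \[\d+\]:?$", line):
            in_interval = True
            continue
        if current_tier != tier_name or not in_interval:
            continue
        if line.startswith("xmin = "):
            start = float(line.split("=", 1)[1])
        elif line.startswith("xmax = "):
            end = float(line.split("=", 1)[1])
        elif line.startswith("text = "):
            text = line.split("=", 1)[1].strip().strip('"')
            if text:  # skip silences, which have empty text
                intervals.append((start, end, text))
            in_interval = False
    return intervals
```

Each returned tuple gives the start and end time in seconds, which is the information the anonymizer needs to locate a word in the audio.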

Step 2: Audio Anonymization with TextGrid

The TextGrid mode of the anonymizer can be run in two ways:

Automatic Name Detection

python audio_anonymizer.py textgrid input.wav input.TextGrid output.wav

This mode:

Uses spaCy's Slovenian transformer model for named entity recognition
Automatically detects personal names, organizations, and locations
Anonymizes all detected entities
Provides a report of identified and anonymized content

Manual Word Specification

python audio_anonymizer.py textgrid input.wav input.TextGrid output.wav --keywords word1 "word2*" word3

This mode allows:

Explicit specification of words to anonymize
Wildcard matching using * (e.g., [* matches all text in square brackets)
Case-insensitive matching
Combination of multiple keywords
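
As an illustration of the matching rules above, here is a minimal sketch of case-insensitive wildcard matching. It is one possible approach, not necessarily the one the script uses. The helper names compile_keywords and matches_any are hypothetical, and the sketch assumes that * stands for any run of characters while every other character, including [, is matched literally.

```python
import re
from typing import Iterable, List


def compile_keywords(keywords: Iterable[str]) -> List:
    """Turn keywords with optional '*' wildcards into compiled regexes."""
    patterns = []
    for keyword in keywords:
        # Escape everything first so characters such as "[" are literal,
        # then turn the escaped "*" back into "match any characters".
        regex = re.escape(keyword).replace(r"\*", ".*")
        patterns.append(re.compile(regex, re.IGNORECASE))
    return patterns


def matches_any(word: str, patterns: List) -> bool:
    """True if the whole word matches at least one keyword pattern."""
    return any(pattern.fullmatch(word) for pattern in patterns)
```

With the keywords "Jan*" and "[*", this sketch would flag both Janez and [ime], while a keyword without a wildcard only matches the whole word.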

TRS-based Anonymization Process

Prerequisites

Audio recording (.wav format)
Corresponding TRS file with manual annotations
Background tags marking sensitive content

Running TRS Anonymization

python audio_anonymizer.py trs input.wav input.trs output.wav

The TRS mode looks for specially marked segments in the transcription:

<Background time="start_time" type="shh" level="high"/>
[sensitive content]
<Background time="end_time" level="off"/>
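
The pairing of opening and closing Background tags can be sketched as follows. This is an illustrative example built on the standard library's XML parser, not the script's own implementation. The function name background_intervals is hypothetical, and the sketch assumes that the time attributes are given in seconds and that the text to anonymize directly follows the opening tag.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def background_intervals(trs_xml: str) -> List[Tuple[int, int, str]]:
    """Return (start_ms, end_ms, text) for each marked segment.

    A segment opens at a Background tag with type="shh" and closes at the
    next Background tag with level="off".
    """
    root = ET.fromstring(trs_xml)
    intervals = []
    open_start = None
    open_text = ""
    # iter() walks the elements in document order, so tags arrive in time order.
    for elem in root.iter("Background"):
        time_ms = round(float(elem.get("time")) * 1000)
        if elem.get("level") == "off":
            if open_start is not None:
                intervals.append((open_start, time_ms, open_text))
                open_start = None
        elif elem.get("type") == "shh":
            open_start = time_ms
            # The text after the opening tag is stored as the element's tail.
            open_text = (elem.tail or "").strip()
    return intervals
```

An opening tag without a matching closing tag is ignored in this sketch; a stricter version might report it as an annotation error instead.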

Technical Details

Beep Generation

Both anonymization methods replace the sensitive speech with a generated beep that:

Matches the volume of surrounding speech
Applies fade in/out effects for smoother transitions
Maintains the duration of the original speech segment
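
The fade and duration properties listed above can be illustrated with a short, dependency-free sketch. All parameter values here (a 1000 Hz tone, a 10 ms fade, a 16 kHz sample rate) and the function name make_beep are assumptions chosen for the example, not settings taken from the script.

```python
import math
from typing import List


def make_beep(duration_s: float, sample_rate: int = 16000,
              freq_hz: float = 1000.0, amplitude: float = 0.5,
              fade_s: float = 0.01) -> List[float]:
    """Generate a sine beep with linear fade in and fade out.

    The beep has exactly as many samples as the segment it replaces,
    so the overall length of the recording does not change.
    """
    n = int(round(duration_s * sample_rate))
    # The fades never overlap: each is at most half of the beep.
    fade_n = min(int(fade_s * sample_rate), n // 2)
    samples = []
    for i in range(n):
        value = amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        if fade_n and i < fade_n:
            value *= i / fade_n                # ramp up from silence
        elif fade_n and i >= n - fade_n:
            value *= (n - 1 - i) / fade_n      # ramp down to silence
        samples.append(value)
    return samples
```

The returned samples are floats in the range from -amplitude to amplitude and would still need to be converted to the sample format of the target WAV file.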

Volume Matching

The anonymizer ensures natural-sounding output by:

Analyzing the volume of surrounding speech
Adjusting beep volume to match
Applying a slight reduction factor for comfort
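
One way to picture the volume matching steps is the sketch below, which measures the loudness of the speech around a segment as a root mean square (RMS) value and derives a beep amplitude from it. The function names, the context window of 1600 samples, and the reduction factor of 0.8 are all hypothetical values chosen for illustration; the script's actual measure and factor may differ.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root mean square level of a block of samples (0.0 if empty)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def matched_beep_amplitude(audio: Sequence[float], start: int, end: int,
                           context: int = 1600, reduction: float = 0.8) -> float:
    """Peak amplitude for a beep replacing audio[start:end].

    The level is taken from up to `context` samples on each side of the
    segment and then lowered slightly by `reduction`.
    """
    before = audio[max(0, start - context):start]
    after = audio[end:end + context]
    surrounding = list(before) + list(after)
    # A sine wave's RMS is amplitude / sqrt(2), so multiply by sqrt(2)
    # to get the peak amplitude that yields the target RMS.
    return reduction * rms(surrounding) * math.sqrt(2)
```

The result could be passed as the amplitude of the generated beep so that it sits slightly below the level of the neighbouring speech.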

Quality Control

For best results:

Always verify the quality of forced alignment before anonymization
Check the automatically detected entities when using automatic mode
Listen to the anonymized output to ensure all sensitive content is properly handled
Keep backups of original files

Usage Examples

Example 1: Automatic Name Detection

python audio_anonymizer.py textgrid recording.wav transcript.TextGrid anonymized.wav

Output:

Identified keywords containing personal information: ['Janez', 'Novak', 'Ljubljana']
Anonymizing part from 1.23s to 1.89s: Janez
Anonymizing part from 2.45s to 3.12s: Novak
...

Example 2: Manual Word List with Wildcards

python audio_anonymizer.py textgrid recording.wav transcript.TextGrid anonymized.wav --keywords "Jan*" "Nov*" "Ljubljana"

Output:

Anonymizing part from 1.23s to 1.89s: Janez
Anonymizing part from 2.45s to 3.12s: Novak
...

Example 3: TRS Mode

python audio_anonymizer.py trs recording.wav transcript.trs anonymized.wav

Output:

Found 3 background intervals to anonymize:
  1230ms - 1890ms (duration: 660ms)
  Text to anonymize: [ime in priimek]
...
