Commit a80004b

Merge branch 'main' of github.com:philterd/phileas
2 parents cb50db2 + 4ecf5f4 commit a80004b

219 files changed, +1051 -13453 lines changed

.github/workflows/build.yml

+1-1
@@ -1,5 +1,5 @@
 name: Build
-on: [pull_request, workflow_dispatch]
+on: [push, pull_request, workflow_dispatch]
 jobs:
   build:
     runs-on: ubuntu-latest

.mvn/wrapper/maven-wrapper.properties

+18
@@ -0,0 +1,18 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+distributionUrl=https://repo.maven.apache.org/maven2/org/apache/maven/apache-maven/3.6.3/apache-maven-3.6.3-bin.zip
+wrapperUrl=https://repo.maven.apache.org/maven2/org/apache/maven/wrapper/maven-wrapper/3.2.0/maven-wrapper-3.2.0.jar
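
Reviewer note: the `mvnw` script in this commit validates the downloaded jar only when a `wrapperSha256Sum` key is present in this properties file, and the file as committed does not set one. A minimal sketch of how the checksum might be pinned, assuming the jar has already been downloaded and that `sha256sum` or `shasum` is installed. The helper name `pin_wrapper_checksum` and its arguments are hypothetical and not part of the commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): record the SHA-256 of an
# already-downloaded wrapper jar as wrapperSha256Sum in a properties file.
pin_wrapper_checksum() {
  jar="$1"
  props="$2"
  [ -f "$jar" ] || { echo "missing jar: $jar" >&2; return 1; }
  [ -f "$props" ] || { echo "missing properties file: $props" >&2; return 1; }

  # Same tool preference as mvnw: sha256sum first, then shasum.
  if command -v sha256sum > /dev/null; then
    sum=$(sha256sum "$jar" | cut -d ' ' -f 1)
  elif command -v shasum > /dev/null; then
    sum=$(shasum -a 256 "$jar" | cut -d ' ' -f 1)
  else
    echo "neither sha256sum nor shasum is available" >&2
    return 1
  fi

  # Drop any earlier entry, then append the fresh one on its own line.
  grep -v '^wrapperSha256Sum=' "$props" > "$props.tmp" || true
  mv "$props.tmp" "$props"
  printf 'wrapperSha256Sum=%s\n' "$sum" >> "$props"
}
```

It could be called as `pin_wrapper_checksum .mvn/wrapper/maven-wrapper.jar .mvn/wrapper/maven-wrapper.properties`. The pinned value only adds protection if the jar being hashed was itself obtained from a trusted source.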

README.md

+12-3
@@ -1,17 +1,26 @@
 # Phileas

-Phileas is a Java library to redact PII, PHI, and other sensitive information from text. Given text or documents (PDF), Phileas analyzes the text searching for sensitive information such as persons' names, ages, addresses, and many other types of information. Phileas is highly configurable through its settings and policies.s
+Phileas is a Java library to deidentify text and redact PII, PHI, and other sensitive information from text. Given text or documents (PDF), Phileas analyzes the text searching for sensitive information such as persons' names, ages, addresses, and many other types of information. Phileas is highly configurable through its settings and policies.

 When sensitive information is identified, Phileas can manipulate the sensitive information in a variety of ways. The information can be replaced, encrypted, anonymized, and more. The user chooses how to manipulate each type of sensitive information. We refer to each of these methods in whole as "redaction."

 Information can be redacted based on the content of the information and other attributes. For example, only certain persons' names, only zip codes meeting some qualification, or IP addresses that match a given pattern.

-Phileas is the underlying core of [Philter](https://philterd.ai/philter/), a turnkey text redaction engine which is built on top of Phileas and provides an API for redacting text. Philter runs entirely within your cloud and never transmits data outside of your cloud. Custom AI models are available for domains like healthcare, legal, and news.
+## Powered by Phileas
+
+Phileas is the underlying core of [Philter](https://www.philterd.io/philter/), a turnkey text redaction engine which is built on top of Phileas and provides an API for redacting text. Philter runs entirely within your cloud and never transmits data outside of your cloud. Custom AI models are available for domains like healthcare, legal, and news.

 * [Philter on the AWS Marketplace](https://aws.amazon.com/marketplace/pp/B07YVB8FFT?ref=_ptnr_philterd)
 * [Philer on the Google Cloud Marketplace](https://console.cloud.google.com/marketplace/product/philterd-public/philter)
 * [Philter on the Azure Marketplace](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/philterdllc1687189098111.philter?tab=Overview)
-* On-prem deployments by contacting us at [https://www.philterd.ai/](https://www.philterd.ai).
+* On-prem deployments by contacting us at [https://www.philterd.io/](https://www.philterd.io).
+
+Phileas also powers [Airlock](https://www.philterd.io/airlock), an AI policy layer to prevent the disclosure of sensitive information, such as PII and PHI, in your AI applications.
+
+* [Airlock on the AWS Marketplace](https://aws.amazon.com/marketplace/pp/prodview-inkh5a3kbhtf2)
+* [Airlock on the Google Cloud Marketplace](https://console.cloud.google.com/marketplace/product/philterd-public/airlock)
+* [Airlock on the Azure Marketplace](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/philterdllc1687189098111.airlock?tab=Overview)
+* On-prem deployments by contacting us at [https://www.philterd.io/](https://www.philterd.io).

 ## What Phileas Can Do

mvnw

+308
@@ -0,0 +1,308 @@
+#!/bin/sh
+# ----------------------------------------------------------------------------
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# ----------------------------------------------------------------------------
+
+# ----------------------------------------------------------------------------
+# Apache Maven Wrapper startup batch script, version 3.2.0
+#
+# Required ENV vars:
+# ------------------
+# JAVA_HOME - location of a JDK home dir
+#
+# Optional ENV vars
+# -----------------
+# MAVEN_OPTS - parameters passed to the Java VM when running Maven
+# e.g. to debug Maven itself, use
+# set MAVEN_OPTS=-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8000
+# MAVEN_SKIP_RC - flag to disable loading of mavenrc files
+# ----------------------------------------------------------------------------
+
+if [ -z "$MAVEN_SKIP_RC" ] ; then
+
+  if [ -f /usr/local/etc/mavenrc ] ; then
+    . /usr/local/etc/mavenrc
+  fi
+
+  if [ -f /etc/mavenrc ] ; then
+    . /etc/mavenrc
+  fi
+
+  if [ -f "$HOME/.mavenrc" ] ; then
+    . "$HOME/.mavenrc"
+  fi
+
+fi
+
+# OS specific support. $var _must_ be set to either true or false.
+cygwin=false;
+darwin=false;
+mingw=false
+case "$(uname)" in
+  CYGWIN*) cygwin=true ;;
+  MINGW*) mingw=true;;
+  Darwin*) darwin=true
+    # Use /usr/libexec/java_home if available, otherwise fall back to /Library/Java/Home
+    # See https://developer.apple.com/library/mac/qa/qa1170/_index.html
+    if [ -z "$JAVA_HOME" ]; then
+      if [ -x "/usr/libexec/java_home" ]; then
+        JAVA_HOME="$(/usr/libexec/java_home)"; export JAVA_HOME
+      else
+        JAVA_HOME="/Library/Java/Home"; export JAVA_HOME
+      fi
+    fi
+    ;;
+esac
+
+if [ -z "$JAVA_HOME" ] ; then
+  if [ -r /etc/gentoo-release ] ; then
+    JAVA_HOME=$(java-config --jre-home)
+  fi
+fi
+
+# For Cygwin, ensure paths are in UNIX format before anything is touched
+if $cygwin ; then
+  [ -n "$JAVA_HOME" ] &&
+    JAVA_HOME=$(cygpath --unix "$JAVA_HOME")
+  [ -n "$CLASSPATH" ] &&
+    CLASSPATH=$(cygpath --path --unix "$CLASSPATH")
+fi
+
+# For Mingw, ensure paths are in UNIX format before anything is touched
+if $mingw ; then
+  [ -n "$JAVA_HOME" ] && [ -d "$JAVA_HOME" ] &&
+    JAVA_HOME="$(cd "$JAVA_HOME" || (echo "cannot cd into $JAVA_HOME."; exit 1); pwd)"
+fi
+
+if [ -z "$JAVA_HOME" ]; then
+  javaExecutable="$(which javac)"
+  if [ -n "$javaExecutable" ] && ! [ "$(expr "\"$javaExecutable\"" : '\([^ ]*\)')" = "no" ]; then
+    # readlink(1) is not available as standard on Solaris 10.
+    readLink=$(which readlink)
+    if [ ! "$(expr "$readLink" : '\([^ ]*\)')" = "no" ]; then
+      if $darwin ; then
+        javaHome="$(dirname "\"$javaExecutable\"")"
+        javaExecutable="$(cd "\"$javaHome\"" && pwd -P)/javac"
+      else
+        javaExecutable="$(readlink -f "\"$javaExecutable\"")"
+      fi
+      javaHome="$(dirname "\"$javaExecutable\"")"
+      javaHome=$(expr "$javaHome" : '\(.*\)/bin')
+      JAVA_HOME="$javaHome"
+      export JAVA_HOME
+    fi
+  fi
+fi
+
+if [ -z "$JAVACMD" ] ; then
+  if [ -n "$JAVA_HOME" ] ; then
+    if [ -x "$JAVA_HOME/jre/sh/java" ] ; then
+      # IBM's JDK on AIX uses strange locations for the executables
+      JAVACMD="$JAVA_HOME/jre/sh/java"
+    else
+      JAVACMD="$JAVA_HOME/bin/java"
+    fi
+  else
+    JAVACMD="$(\unset -f command 2>/dev/null; \command -v java)"
+  fi
+fi
+
+if [ ! -x "$JAVACMD" ] ; then
+  echo "Error: JAVA_HOME is not defined correctly." >&2
+  echo " We cannot execute $JAVACMD" >&2
+  exit 1
+fi
+
+if [ -z "$JAVA_HOME" ] ; then
+  echo "Warning: JAVA_HOME environment variable is not set."
+fi
+
+# traverses directory structure from process work directory to filesystem root
+# first directory with .mvn subdirectory is considered project base directory
+find_maven_basedir() {
+  if [ -z "$1" ]
+  then
+    echo "Path not specified to find_maven_basedir"
+    return 1
+  fi
+
+  basedir="$1"
+  wdir="$1"
+  while [ "$wdir" != '/' ] ; do
+    if [ -d "$wdir"/.mvn ] ; then
+      basedir=$wdir
+      break
+    fi
+    # workaround for JBEAP-8937 (on Solaris 10/Sparc)
+    if [ -d "${wdir}" ]; then
+      wdir=$(cd "$wdir/.." || exit 1; pwd)
+    fi
+    # end of workaround
+  done
+  printf '%s' "$(cd "$basedir" || exit 1; pwd)"
+}
+
+# concatenates all lines of a file
+concat_lines() {
+  if [ -f "$1" ]; then
+    # Remove \r in case we run on Windows within Git Bash
+    # and check out the repository with auto CRLF management
+    # enabled. Otherwise, we may read lines that are delimited with
+    # \r\n and produce $'-Xarg\r' rather than -Xarg due to word
+    # splitting rules.
+    tr -s '\r\n' ' ' < "$1"
+  fi
+}
+
+log() {
+  if [ "$MVNW_VERBOSE" = true ]; then
+    printf '%s\n' "$1"
+  fi
+}
+
+BASE_DIR=$(find_maven_basedir "$(dirname "$0")")
+if [ -z "$BASE_DIR" ]; then
+  exit 1;
+fi
+
+MAVEN_PROJECTBASEDIR=${MAVEN_BASEDIR:-"$BASE_DIR"}; export MAVEN_PROJECTBASEDIR
+log "$MAVEN_PROJECTBASEDIR"
+
+##########################################################################################
+# Extension to allow automatically downloading the maven-wrapper.jar from Maven-central
+# This allows using the maven wrapper in projects that prohibit checking in binary data.
+##########################################################################################
+wrapperJarPath="$MAVEN_PROJECTBASEDIR/.mvn/wrapper/maven-wrapper.jar"
+if [ -r "$wrapperJarPath" ]; then
+  log "Found $wrapperJarPath"
+else
+  log "Couldn't find $wrapperJarPath, downloading it ..."
+
+  if [ -n "$MVNW_REPOURL" ]; then
+    wrapperUrl="$MVNW_REPOURL/org/apache/maven/wrapper/maven-wrapper/3.2.0/maven-wrapper-3.2.0.jar"
+  else
+    wrapperUrl="https://repo.maven.apache.org/maven2/org/apache/maven/wrapper/maven-wrapper/3.2.0/maven-wrapper-3.2.0.jar"
+  fi
+  while IFS="=" read -r key value; do
+    # Remove '\r' from value to allow usage on windows as IFS does not consider '\r' as a separator ( considers space, tab, new line ('\n'), and custom '=' )
+    safeValue=$(echo "$value" | tr -d '\r')
+    case "$key" in (wrapperUrl) wrapperUrl="$safeValue"; break ;;
+    esac
+  done < "$MAVEN_PROJECTBASEDIR/.mvn/wrapper/maven-wrapper.properties"
+  log "Downloading from: $wrapperUrl"
+
+  if $cygwin; then
+    wrapperJarPath=$(cygpath --path --windows "$wrapperJarPath")
+  fi
+
+  if command -v wget > /dev/null; then
+    log "Found wget ... using wget"
+    [ "$MVNW_VERBOSE" = true ] && QUIET="" || QUIET="--quiet"
+    if [ -z "$MVNW_USERNAME" ] || [ -z "$MVNW_PASSWORD" ]; then
+      wget $QUIET "$wrapperUrl" -O "$wrapperJarPath" || rm -f "$wrapperJarPath"
+    else
+      wget $QUIET --http-user="$MVNW_USERNAME" --http-password="$MVNW_PASSWORD" "$wrapperUrl" -O "$wrapperJarPath" || rm -f "$wrapperJarPath"
+    fi
+  elif command -v curl > /dev/null; then
+    log "Found curl ... using curl"
+    [ "$MVNW_VERBOSE" = true ] && QUIET="" || QUIET="--silent"
+    if [ -z "$MVNW_USERNAME" ] || [ -z "$MVNW_PASSWORD" ]; then
+      curl $QUIET -o "$wrapperJarPath" "$wrapperUrl" -f -L || rm -f "$wrapperJarPath"
+    else
+      curl $QUIET --user "$MVNW_USERNAME:$MVNW_PASSWORD" -o "$wrapperJarPath" "$wrapperUrl" -f -L || rm -f "$wrapperJarPath"
+    fi
+  else
+    log "Falling back to using Java to download"
+    javaSource="$MAVEN_PROJECTBASEDIR/.mvn/wrapper/MavenWrapperDownloader.java"
+    javaClass="$MAVEN_PROJECTBASEDIR/.mvn/wrapper/MavenWrapperDownloader.class"
+    # For Cygwin, switch paths to Windows format before running javac
+    if $cygwin; then
+      javaSource=$(cygpath --path --windows "$javaSource")
+      javaClass=$(cygpath --path --windows "$javaClass")
+    fi
+    if [ -e "$javaSource" ]; then
+      if [ ! -e "$javaClass" ]; then
+        log " - Compiling MavenWrapperDownloader.java ..."
+        ("$JAVA_HOME/bin/javac" "$javaSource")
+      fi
+      if [ -e "$javaClass" ]; then
+        log " - Running MavenWrapperDownloader.java ..."
+        ("$JAVA_HOME/bin/java" -cp .mvn/wrapper MavenWrapperDownloader "$wrapperUrl" "$wrapperJarPath") || rm -f "$wrapperJarPath"
+      fi
+    fi
+  fi
+fi
+##########################################################################################
+# End of extension
+##########################################################################################
+
+# If specified, validate the SHA-256 sum of the Maven wrapper jar file
+wrapperSha256Sum=""
+while IFS="=" read -r key value; do
+  case "$key" in (wrapperSha256Sum) wrapperSha256Sum=$value; break ;;
+  esac
+done < "$MAVEN_PROJECTBASEDIR/.mvn/wrapper/maven-wrapper.properties"
+if [ -n "$wrapperSha256Sum" ]; then
+  wrapperSha256Result=false
+  if command -v sha256sum > /dev/null; then
+    if echo "$wrapperSha256Sum $wrapperJarPath" | sha256sum -c > /dev/null 2>&1; then
+      wrapperSha256Result=true
+    fi
+  elif command -v shasum > /dev/null; then
+    if echo "$wrapperSha256Sum $wrapperJarPath" | shasum -a 256 -c > /dev/null 2>&1; then
+      wrapperSha256Result=true
+    fi
+  else
+    echo "Checksum validation was requested but neither 'sha256sum' or 'shasum' are available."
+    echo "Please install either command, or disable validation by removing 'wrapperSha256Sum' from your maven-wrapper.properties."
+    exit 1
+  fi
+  if [ $wrapperSha256Result = false ]; then
+    echo "Error: Failed to validate Maven wrapper SHA-256, your Maven wrapper might be compromised." >&2
+    echo "Investigate or delete $wrapperJarPath to attempt a clean download." >&2
+    echo "If you updated your Maven version, you need to update the specified wrapperSha256Sum property." >&2
+    exit 1
+  fi
+fi
+
+MAVEN_OPTS="$(concat_lines "$MAVEN_PROJECTBASEDIR/.mvn/jvm.config") $MAVEN_OPTS"
+
+# For Cygwin, switch paths to Windows format before running java
+if $cygwin; then
+  [ -n "$JAVA_HOME" ] &&
+    JAVA_HOME=$(cygpath --path --windows "$JAVA_HOME")
+  [ -n "$CLASSPATH" ] &&
+    CLASSPATH=$(cygpath --path --windows "$CLASSPATH")
+  [ -n "$MAVEN_PROJECTBASEDIR" ] &&
+    MAVEN_PROJECTBASEDIR=$(cygpath --path --windows "$MAVEN_PROJECTBASEDIR")
+fi
+
+# Provide a "standardized" way to retrieve the CLI args that will
+# work with both Windows and non-Windows executions.
+MAVEN_CMD_LINE_ARGS="$MAVEN_CONFIG $*"
+export MAVEN_CMD_LINE_ARGS
+
+WRAPPER_LAUNCHER=org.apache.maven.wrapper.MavenWrapperMain
+
+# shellcheck disable=SC2086 # safe args
+exec "$JAVACMD" \
+  $MAVEN_OPTS \
+  $MAVEN_DEBUG_OPTS \
+  -classpath "$MAVEN_PROJECTBASEDIR/.mvn/wrapper/maven-wrapper.jar" \
+  "-Dmaven.multiModuleProjectDirectory=${MAVEN_PROJECTBASEDIR}" \
+  ${WRAPPER_LAUNCHER} $MAVEN_CONFIG "$@"
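
Reviewer note: `mvnw` prepends the contents of `.mvn/jvm.config` to `MAVEN_OPTS` through `concat_lines`, which also strips carriage returns so that a CRLF checkout does not leak `\r` into JVM arguments. A small sketch of that behavior, assuming a hypothetical two-line `jvm.config` with Windows line endings. The function body is copied from the script; the directory, option values, and `-Dexample.flag` are illustrative only.

```shell
#!/bin/sh
# concat_lines is copied from the mvnw script in this commit.
concat_lines() {
  if [ -f "$1" ]; then
    tr -s '\r\n' ' ' < "$1"
  fi
}

# Hypothetical project directory with a two-line jvm.config using CRLF endings.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/.mvn"
printf '%s\r\n' '-Xmx1g' '-Dfile.encoding=UTF-8' > "$demo_dir/.mvn/jvm.config"

# Same assignment mvnw performs: options from the file first, then any
# MAVEN_OPTS already present in the environment.
MAVEN_OPTS="-Dexample.flag=true"
MAVEN_OPTS="$(concat_lines "$demo_dir/.mvn/jvm.config") $MAVEN_OPTS"

# Brackets make stray whitespace or carriage returns visible.
printf '[%s]\n' "$MAVEN_OPTS"
```

Because the script later expands `$MAVEN_OPTS` unquoted in the `exec` line, each whitespace-separated token becomes its own JVM argument, so options in `jvm.config` cannot contain spaces.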
