Merge pull request #22 from AlessioLuciani/dev
v0.4.0
AlessioLuciani committed Dec 11, 2020
2 parents 3aea262 + 78011e6 commit 225e478
Showing 23 changed files with 852 additions and 280 deletions.
30 changes: 30 additions & 0 deletions .github/workflows/flutter.yml
@@ -0,0 +1,30 @@
name: Flutter CI

on:
  push:
    branches: [ dev, master ]
  pull_request:
    branches: [ dev, master ]

jobs:
  build:

    runs-on: macos-latest #ubuntu-latest
    env:
      working-directory: ./example

    steps:
    - uses: actions/checkout@v2
    - uses: actions/setup-java@v1
      with:
        java-version: '12.x'
    - uses: subosito/flutter-action@v1
      with:
        channel: 'beta' # or: 'dev' or 'stable'
    - run: flutter pub get
      working-directory: ${{env.working-directory}}
    # - run: flutter test
    - run: flutter build appbundle #apk
      working-directory: ${{env.working-directory}}
    - run: flutter build ios --release --no-codesign
      working-directory: ${{env.working-directory}}
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,9 @@
## 0.4.0

* The PDFBoxResourceLoader is now used on Android to load PDF documents much faster than before. The fast initialization (i.e. *fastInit*) option has therefore been removed.
* PDF documents are no longer kept alive in the platform-specific scope. Instead, they are opened and closed at each read with the respective library functions. This does not affect the caching mechanism utilized directly in Dart. This change prevents errors due to multiple document accesses at the same time.
* Tests have been implemented.

## 0.3.1

* The possibility to initialize a document faster (without immediately initializing the text stripper engine) on Android has been added.
28 changes: 11 additions & 17 deletions README.md
@@ -1,7 +1,7 @@
# PDF Text Plugin

[![Pub Version](https://img.shields.io/pub/v/pdf_text)](https://pub.dev/packages/pdf_text)
[![GitHub issues](https://img.shields.io/github/issues/AlessioLuciani/flutter-pdf-text)](https://github.com/AlessioLuciani/flutter-pdf-text/issues)
![Flutter CI](https://github.com/AlessioLuciani/flutter-pdf-text/workflows/Flutter%20CI/badge.svg?branch=master)
[![GitHub forks](https://img.shields.io/github/forks/AlessioLuciani/flutter-pdf-text)](https://github.com/AlessioLuciani/flutter-pdf-text/network)
[![GitHub stars](https://img.shields.io/github/stars/AlessioLuciani/flutter-pdf-text)](https://github.com/AlessioLuciani/flutter-pdf-text/stargazers)
[![GitHub license](https://img.shields.io/github/license/AlessioLuciani/flutter-pdf-text)](https://github.com/AlessioLuciani/flutter-pdf-text/blob/master/LICENSE)
@@ -18,7 +18,7 @@ Add this to your package's `pubspec.yaml` file:

```yaml
dependencies:
pdf_text: ^0.3.1
pdf_text: ^0.4.0
```
## Usage
@@ -29,7 +29,7 @@ Import the package with:
import 'package:pdf_text/pdf_text.dart';
```

*Create a PDF document instance using a File object:*
**Create a PDF document instance using a File object:**

```dart
PDFDoc doc = await PDFDoc.fromFile(file);
@@ -53,13 +53,7 @@ Pass a password for encrypted PDF documents:
PDFDoc doc = await PDFDoc.fromFile(file, password: password);
```

Use faster initialization on Android:

```dart
PDFDoc doc = await PDFDoc.fromFile(file, fastInit: true);
```

*Read the text of the entire document:*
**Read the text of the entire document:**

```dart
String docText = await doc.text;
@@ -71,19 +65,19 @@ Retrieve the number of pages of the document:
int numPages = doc.length;
```

*Access a page of the document:*
**Access a page of the document:**

```dart
PDFPage page = doc.pageAt(pageNumber);
```

*Read the text of a page of the document:*
**Read the text of a page of the document:**

```dart
String pageText = await page.text;
```

*Read the information of the document:*
**Read the information of the document:**

```dart
PDFDocInfo info = doc.info;
@@ -119,9 +113,9 @@ allows you not to waste time loading text that you will probably not use. When y
| Return | Description |
|---|---|
| PDFPage | **pageAt(int pageNumber)** <br> Gets the page of the document at the given page number. |
| static Future\<PDFDoc> | **fromFile(File file, {String password = "", bool fastInit = false})** <br> Creates a PDFDoc object with a File instance. Optionally, takes a password for encrypted PDF documents. If fastInit is true, the initialization of the document will be faster on Android. In that case, the text stripper engine will not be initialized with this call, but later when some text is read. This means that the first text read will take some time but the document data can be accessed immediately.|
| static Future\<PDFDoc> | **fromPath(String path, {String password = "", bool fastInit = false})** <br> Creates a PDFDoc object with a file path. Optionally, takes a password for encrypted PDF documents. If fastInit is true, the initialization of the document will be faster on Android. In that case, the text stripper engine will not be initialized with this call, but later when some text is read. This means that the first text read will take some time but the document data can be accessed immediately.|
| static Future\<PDFDoc> | **fromURL(String url, {String password = "", bool fastInit = false})** <br> Creates a PDFDoc object with a url. Optionally, takes a password for encrypted PDF documents. If fastInit is true, the initialization of the document will be faster on Android. In that case, the text stripper engine will not be initialized with this call, but later when some text is read. This means that the first text read will take some time but the document data can be accessed immediately. It downloads the PDF file located in the given URL and saves it in the app's temporary directory. |
| static Future\<PDFDoc> | **fromFile(File file, {String password = ""})** <br> Creates a PDFDoc object with a File instance. Optionally, takes a password for encrypted PDF documents.|
| static Future\<PDFDoc> | **fromPath(String path, {String password = ""})** <br> Creates a PDFDoc object with a file path. Optionally, takes a password for encrypted PDF documents.|
| static Future\<PDFDoc> | **fromURL(String url, {String password = ""})** <br> Creates a PDFDoc object with a url. Optionally, takes a password for encrypted PDF documents.|
| void | **deleteFile()** <br> Deletes the file related to this PDFDoc.<br>Throws an exception if the FileSystemEntity cannot be deleted. |
| static Future | **deleteAllExternalFiles()** <br> Deletes all the files of the documents that have been imported from outside the local file system (e.g. using fromURL). |

@@ -155,4 +149,4 @@ class PDFDocInfo {
## Contribute

If you have any suggestions, improvements or issues, feel free to contribute to this project.
You can either submit a new issue or propose a pull request.
You can either submit a new issue or propose a pull request. Direct your pull requests into the *dev* branch.
168 changes: 71 additions & 97 deletions android/src/main/kotlin/dev/aluc/pdf_text/PdfTextPlugin.kt
@@ -1,44 +1,30 @@
package dev.aluc.pdf_text


import android.os.Handler
import android.os.Looper
import androidx.annotation.NonNull
import com.tom_roush.pdfbox.pdmodel.PDDocument
import com.tom_roush.pdfbox.text.PDFTextStripper
import com.tom_roush.pdfbox.util.PDFBoxResourceLoader
import io.flutter.embedding.engine.plugins.FlutterPlugin
import io.flutter.plugin.common.MethodCall
import io.flutter.plugin.common.MethodChannel
import io.flutter.plugin.common.MethodChannel.MethodCallHandler
import io.flutter.plugin.common.MethodChannel.Result
import io.flutter.plugin.common.PluginRegistry.Registrar

import java.io.File


import com.tom_roush.pdfbox.pdmodel.PDDocument
import com.tom_roush.pdfbox.pdmodel.PDPage
import com.tom_roush.pdfbox.text.PDFTextStripper
import kotlin.concurrent.thread

/** PdfTextPlugin */
public class PdfTextPlugin: FlutterPlugin, MethodCallHandler {

/**
* PDF document cached from the previous use.
*/
private var cachedDoc: PDDocument? = null
private var cachedDocPath: String? = null

/**
* PDF text stripper.
*/
private var pdfTextStripper = PDFTextStripper()
class PdfTextPlugin: FlutterPlugin, MethodCallHandler {

override fun onAttachedToEngine(@NonNull flutterPluginBinding: FlutterPlugin.FlutterPluginBinding) {
val channel = MethodChannel(flutterPluginBinding.getFlutterEngine().getDartExecutor(), "pdf_text")
channel.setMethodCallHandler(PdfTextPlugin());
val channel = MethodChannel(flutterPluginBinding.binaryMessenger, "pdf_text")
channel.setMethodCallHandler(PdfTextPlugin())
PDFBoxResourceLoader.init(flutterPluginBinding.applicationContext)
}



// This static function is optional and equivalent to onAttachedToEngine. It supports the old
// pre-Flutter-1.12 Android projects. You are encouraged to continue supporting
// plugin registration via this function while apps migrate to use the new Android APIs
@@ -53,36 +39,38 @@ public class PdfTextPlugin: FlutterPlugin, MethodCallHandler {
fun registerWith(registrar: Registrar) {
val channel = MethodChannel(registrar.messenger(), "pdf_text")
channel.setMethodCallHandler(PdfTextPlugin())
PDFBoxResourceLoader.init(registrar.context())
}
}

override fun onMethodCall(@NonNull call: MethodCall, @NonNull result: Result) {
thread (start = true) {
when (call.method) {
"initDoc" -> {
val args = call.arguments as Map<String, Any>
val args = call.arguments as Map<*, *>
val path = args["path"] as String
val password = args["password"] as String
val fastInit = args["fastInit"] as Boolean
initDoc(result, path, password, fastInit)
initDoc(result, path, password)
}
"getDocPageText" -> {
val args = call.arguments as Map<String, Any>
val args = call.arguments as Map<*, *>
val path = args["path"] as String
val pageNumber = args["number"] as Int
getDocPageText(result, path, pageNumber)
val password = args["password"] as String
getDocPageText(result, path, pageNumber, password)
}
"getDocText" -> {
val args = call.arguments as Map<String, Any>
val args = call.arguments as Map<*, *>
val path = args["path"] as String
@Suppress("UNCHECKED_CAST")
val missingPagesNumbers = args["missingPagesNumbers"] as List<Int>
getDocText(result, path, missingPagesNumbers)
val password = args["password"] as String
getDocText(result, path, missingPagesNumbers, password)
}
else -> {
Handler(Looper.getMainLooper()).post {
result.notImplemented()
}

}
}
}
@@ -94,34 +82,35 @@ public class PdfTextPlugin: FlutterPlugin, MethodCallHandler {
/**
Initializes the PDF document and returns some information into the channel.
*/
private fun initDoc(result: Result, path: String, password: String, fastInit: Boolean) {
val doc = getDoc(result, path, password, !fastInit) ?: return
// Getting the length of the PDF document in pages.
val length = doc.numberOfPages
private fun initDoc(result: Result, path: String, password: String) {
getDoc(result, path, password)?.use { doc ->
// Getting the length of the PDF document in pages.
val length = doc.numberOfPages

val info = doc.documentInformation
val info = doc.documentInformation

var creationDate: String? = null
if (info.creationDate != null) {
creationDate = info.creationDate.time.toString()
}
var modificationDate: String? = null
if (info.modificationDate != null) {
modificationDate = info.modificationDate.time.toString()
}
val data = hashMapOf<String, Any>(
"length" to length,
"info" to hashMapOf("author" to info.author,
"creationDate" to creationDate,
"modificationDate" to modificationDate,
"creator" to info.creator, "producer" to info.producer,
"keywords" to splitKeywords(info.keywords),
"title" to info.title, "subject" to info.subject
)
)

Handler(Looper.getMainLooper()).post {
result.success(data)
var creationDate: String? = null
if (info.creationDate != null) {
creationDate = info.creationDate.time.toString()
}
var modificationDate: String? = null
if (info.modificationDate != null) {
modificationDate = info.modificationDate.time.toString()
}
val data = hashMapOf<String, Any>(
"length" to length,
"info" to hashMapOf("author" to info.author,
"creationDate" to creationDate,
"modificationDate" to modificationDate,
"creator" to info.creator, "producer" to info.producer,
"keywords" to splitKeywords(info.keywords),
"title" to info.title, "subject" to info.subject
)
)
doc.close()
Handler(Looper.getMainLooper()).post {
result.success(data)
}
}
}

@@ -132,7 +121,7 @@ public class PdfTextPlugin: FlutterPlugin, MethodCallHandler {
if (keywordsString == null) {
return null
}
var keywords = keywordsString.split(",").toMutableList()
val keywords = keywordsString.split(",").toMutableList()
for (i in keywords.indices) {
var keyword = keywords[i]
keyword = keyword.dropWhile { it == ' ' }
@@ -145,51 +134,45 @@ public class PdfTextPlugin: FlutterPlugin, MethodCallHandler {
/**
Gets the text of a document page, given its number.
*/
private fun getDocPageText(result: Result, path: String, pageNumber: Int) {
val doc = getDoc(result, path) ?: return
pdfTextStripper.startPage = pageNumber
pdfTextStripper.endPage = pageNumber
val text = pdfTextStripper.getText(doc)
Handler(Looper.getMainLooper()).post {
result.success(text)
private fun getDocPageText(result: Result, path: String, pageNumber: Int, password: String) {
getDoc(result, path, password)?.use { doc ->
val stripper = PDFTextStripper();
stripper.startPage = pageNumber
stripper.endPage = pageNumber
val text = stripper.getText(doc)
doc.close()
Handler(Looper.getMainLooper()).post {
result.success(text)
}
}
}

/**
Gets the text of the entire document.
In order to improve the performance, it only retrieves the pages that are currently missing.
*/
private fun getDocText(result: Result, path: String, missingPagesNumbers: List<Int>) {
val doc = getDoc(result, path) ?: return
var missingPagesTexts = arrayListOf<String>()
missingPagesNumbers.forEach {
pdfTextStripper.startPage = it
pdfTextStripper.endPage = it
missingPagesTexts.add(pdfTextStripper.getText(doc))
}
Handler(Looper.getMainLooper()).post {
result.success(missingPagesTexts)
private fun getDocText(result: Result, path: String, missingPagesNumbers: List<Int>, password: String) {
getDoc(result, path, password)?.use { doc ->
val missingPagesTexts = arrayListOf<String>()
val stripper = PDFTextStripper();
missingPagesNumbers.forEach {
stripper.startPage = it
stripper.endPage = it
missingPagesTexts.add(stripper.getText(doc))
}
doc.close()
Handler(Looper.getMainLooper()).post {
result.success(missingPagesTexts)
}
}
}

/**
Gets a PDF document, given its path.
Initializes the text stripper engine if initTextStripper is true.
*/
private fun getDoc(result: Result, path: String, password: String = "",
initTextStripper: Boolean = true): PDDocument? {
// Checking for cached document
if (cachedDoc != null && cachedDocPath == path) {
return cachedDoc
}
private fun getDoc(result: Result, path: String, password: String = ""): PDDocument? {
return try {
val doc = PDDocument.load(File(path), password)
cachedDoc = doc
cachedDocPath = path
if (initTextStripper) {
initTextStripperEngine(doc)
}
doc
PDDocument.load(File(path), password)
} catch (e: Exception) {
Handler(Looper.getMainLooper()).post {
result.error("INVALID_PATH",
@@ -199,13 +182,4 @@ public class PdfTextPlugin: FlutterPlugin, MethodCallHandler {
null
}
}

/**
* Initializes the text stripper engine. This can take some time.
*/
private fun initTextStripperEngine(doc: PDDocument) {
pdfTextStripper.startPage = 1
pdfTextStripper.endPage = 1
pdfTextStripper.getText(doc)
}
}
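
Reviewer note: the new `initDoc`, `getDocPageText` and `getDocText` bodies open the document, read from it inside `use { ... }`, and release it before returning, which matches the CHANGELOG entry about opening and closing the document at each read. The sketch below illustrates that pattern in isolation. `FakeDoc`, `openDoc` and `readPage` are hypothetical stand-ins invented for this note; they are not part of pdf_text or of the PDFBox port used by the plugin. It also suggests that, because `use` closes the receiver when the block finishes, the explicit `doc.close()` calls inside the `use` blocks in this commit are probably redundant.

```kotlin
import java.io.Closeable

// Hypothetical stand-in for a PDF document handle (assumption, not a real library class).
class FakeDoc(private val pages: List<String>) : Closeable {
    // Counts how many times close() has been invoked.
    var closeCalls = 0
        private set

    val isOpen: Boolean get() = closeCalls == 0

    // Returns the text of a 1-based page number; fails if the document was closed.
    fun pageText(number: Int): String {
        check(isOpen) { "document already closed" }
        return pages[number - 1]
    }

    override fun close() {
        closeCalls++
    }
}

// Hypothetical opener that returns null on failure, mirroring a nullable getDoc.
fun openDoc(pages: List<String>?): FakeDoc? = pages?.let { FakeDoc(it) }

// Opens the document, reads one page, and relies on `use` to close it,
// even if the block throws. A null document short-circuits to null.
fun readPage(pages: List<String>?, number: Int): String? =
    openDoc(pages)?.use { doc -> doc.pageText(number) }

fun main() {
    println(readPage(listOf("first page", "second page"), 2))
    println(readPage(null, 1))
}
```

Since each call opens its own handle, two concurrent reads never share a document object, which is the property the CHANGELOG attributes to this change. The cost is one load per read, so the Dart-side cache mentioned in the CHANGELOG remains the place where repeated work is avoided.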