Updated README.md. It would help your audience to be more attached to this project and wait for more updates. I loved this idea of making a module for extracting. And you have added various formats also, which is pretty impressive. Please keep up the good work. #210

Open
wants to merge 1 commit into
base: master
6 changes: 5 additions & 1 deletion README.md
@@ -27,6 +27,7 @@ A text extraction node module.
* DXF
* `application/javascript`
* All `text/*` mime-types.
* Support for more text formats will be added.

In almost all cases above, what textract cares about is the mime type. So `.html` and `.htm`, both possessing the same mime type, will be extracted. Other extensions that share mime types with those above should also extract successfully. For example, `application/vnd.ms-excel` is the mime type for `.xls`, but also for 5 other file types.
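
To illustrate the paragraph above, here is a minimal sketch of dispatching on mime type rather than on file extension. It is not textract's actual source: the `MIME_BY_EXTENSION` table, the `EXTRACTORS` list, and the `mimeForPath` and `extractorForPath` helpers are all hypothetical names invented for this example, and a real project would more likely rely on a mime lookup library than a hand-written table.

```javascript
// Hypothetical sketch only: these names are NOT part of textract's API.
const path = require('path');

// Assumed extension-to-mime table, kept tiny for illustration.
const MIME_BY_EXTENSION = {
  '.html': 'text/html',
  '.htm': 'text/html',
  '.xls': 'application/vnd.ms-excel',
  '.xlt': 'application/vnd.ms-excel',
  '.js': 'application/javascript',
  '.txt': 'text/plain',
};

// Hypothetical extractor registry keyed on mime type, not extension.
// Order matters: the specific html entry is checked before the text/* catch-all.
const EXTRACTORS = [
  { name: 'html', matches: (mime) => mime === 'text/html' },
  { name: 'xls', matches: (mime) => mime === 'application/vnd.ms-excel' },
  {
    name: 'text',
    matches: (mime) =>
      mime === 'application/javascript' || mime.startsWith('text/'),
  },
];

// Resolve a file path to a mime type, or null if the extension is unknown.
function mimeForPath(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return MIME_BY_EXTENSION[ext] || null;
}

// Pick the first extractor whose mime test passes, or null if none does.
function extractorForPath(filePath) {
  const mime = mimeForPath(filePath);
  if (mime === null) {
    return null;
  }
  const found = EXTRACTORS.find((extractor) => extractor.matches(mime));
  return found ? found.name : null;
}
```

In this sketch, `report.html` and `report.htm` resolve to the same extractor because they share `text/html`, and `.xlt` is handled by the same entry as `.xls` because both map to `application/vnd.ms-excel`.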

@@ -167,4 +168,7 @@ textract.fromUrl(url, config, function( error, text ) {})
- `sudo port install tesseract-chi-sim`
- `sudo port install tesseract-eng`
- You will also want to disable textract's usage of textutil as the tests are based on output from antiword.
- Go into `/lib/extractors/{doc|doc-osx|rtf}` and modify the code under `if ( os.platform() === 'darwin' ) {`. Uncomment the commented-out lines in these sections.
- Go into `/lib/extractors/{doc|doc-osx|rtf}` and modify the code under `if ( os.platform() === 'darwin' ) {`. Uncomment the commented-out lines in these sections.


* We are working continuously to make this project more efficient. Until then, keep extracting!