beta version of browser support

Moos 2019-02-18 16:50:51 -08:00
parent f8e6173062
commit b6efc1e506
6 changed files with 114 additions and 10 deletions


@ -4,11 +4,13 @@ wordpos
[![NPM version](https://img.shields.io/npm/v/wordpos.svg)](https://www.npmjs.com/package/wordpos)
[![Build Status](https://img.shields.io/travis/moos/wordpos/master.svg)](https://travis-ci.org/moos/wordpos)
- wordpos is a set of *fast* part-of-speech (POS) utilities for Node.js using fast lookup in the WordNet database.
+ wordpos is a set of *fast* part-of-speech (POS) utilities for Node.js **and** browser using fast lookup in the WordNet database.
Version 1.x is a major update with no direct dependence on [natural's](https://github.com/NaturalNode/natural#wordnet) WordNet module, with support for [Promises](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise), and roughly 5x speed improvement over previous version.
- **CAUTION** The WordNet database [wordnet-db](https://github.com/moos/wordnet-db) comprises [155,287 words](https://wordnet.princeton.edu/documentation/wnstats7wn) (3.0 numbers) which uncompress to over **30 MB** of data in several *un*[browserify](https://github.com/substack/node-browserify)-able files. It is *not* meant for the browser environment.
+ ~~**CAUTION** The WordNet database [wordnet-db](https://github.com/moos/wordnet-db) comprises [155,287 words](https://wordnet.princeton.edu/documentation/wnstats7wn) (3.0 numbers) which uncompress to over **30 MB** of data in several *un*[browserify](https://github.com/substack/node-browserify)-able files. It is *not* meant for the browser environment.~~
:zap: v2.x can work in browsers -- see below for example.
## Quick usage
@ -68,7 +70,30 @@ WordPOS.defaults = {
* if array, stopwords to exclude, eg, ['all','of','this',...]
* if false, do not filter any stopwords.
*/
- stopwords: true
+ stopwords: true,
/**
* preload files (in browser only)
* true - preload all POS
* false - do not preload any POS
* 'a' - preload adj
* ['a','v'] - preload adj & verb
* @type {boolean|string|Array}
*/
preload: false,
/**
* include data files in preload
* @type {boolean}
*/
includeData: false, // WIP
/**
* set to true to enable debug logging
* @type {boolean}
*/
debug: false
};
``` ```
To override, pass an options hash to the constructor. With the `profile` option, most callbacks receive a last argument that is the execution time in msec of the call.
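The `preload` values documented in the defaults above can be read as a small normalization rule. The sketch below is illustrative only: `normalizePreload` and `ALL_POS` are hypothetical names, not part of the wordpos API, and the single-letter codes `n`, `v`, `a`, `r` are assumed to follow WordNet's usual noun, verb, adjective, adverb convention.

```javascript
// Hypothetical helper (not part of wordpos): expands a `preload` option
// value into the list of POS codes it refers to.
// Assumed codes: n = noun, v = verb, a = adjective, r = adverb.
const ALL_POS = ['n', 'v', 'a', 'r'];

function normalizePreload(preload) {
  if (preload === true) return ALL_POS.slice(); // true: preload all POS
  if (!preload) return [];                      // false/undefined: preload nothing
  const list = Array.isArray(preload) ? preload : [preload];
  return list.filter(pos => ALL_POS.includes(pos)); // ignore unknown codes
}

console.log(normalizePreload(true));       // [ 'n', 'v', 'a', 'r' ]
console.log(normalizePreload('a'));        // [ 'a' ]
console.log(normalizePreload(['a', 'v'])); // [ 'a', 'v' ]
console.log(normalizePreload(false));      // []
```

Silently dropping unknown codes is a design choice of this sketch; a real implementation might prefer to throw instead.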
@ -224,7 +249,7 @@ wordpos.rand({startsWith: 'zzz'}, console.log)
// [] 'zzz'
``` ```
- **Note on performance**: random lookups could involve heavy disk reads. It is better to use the `count` option to get words in batches. This may benefit from the cached reads of similarly keyed entries as well as shared open/close of the index files.
+ **Note on performance**: (node only) random lookups could involve heavy disk reads. It is better to use the `count` option to get words in batches. This may benefit from the cached reads of similarly keyed entries as well as shared open/close of the index files.
Getting random POS (`randNoun()`, etc.) is generally faster than `rand()`, which may look at multiple POS files until `count` requirement is met.
@ -269,8 +294,31 @@ wordpos.isVerb('fish', console.log)
``` ```
Note that callback receives full arguments (including profile, if enabled), while the Promise receives only the result of the call. Also, beware that exceptions in the _callback_ will result in the Promise being _rejected_ and caught by `catch()`, if provided.
## Running inside the browser
- ## Fast Index
+ v2.0 introduces the capability of running wordpos in the browser. The dictionary files are optimized for fast access (lookup by lemma), but they must be fetched, parsed and loaded into browser memory. The files are loaded on-demand (unless the option `preload: true` is given).
The dict files can be served locally or from CDN (coming soon). Include the following scripts in your `index.html`:
```html
<script src="wordpos/dist/wordpos.min.js"></script>
<script>
let wordpos = new WordPOS({
// preload: true,
dictPath: '/wordpos/dict',
profile: true
});
wordpos.getAdverbs('this is is lately a likely tricky business this is')
.then(res => {
console.log(res); // ["lately", "likely"]
});
</script>
```
The above assumes wordpos is installed in the directory `./wordpos`. `./wordpos/dict` holds the WordNet index and data files generated for the web by a postinstall script.
See [samples/self-hosted](samples/self-hosted/main.js).
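The on-demand loading described in this section can be sketched as a memoized loader. This is an illustration under stated assumptions, not the library's implementation: `createLazyLoader` and `fakeFetchDict` are hypothetical names, and the stand-in fetcher returns a stub object instead of a real dictionary file.

```javascript
// Hypothetical sketch: load each POS dictionary at most once, on first use.
// `loadFile(pos)` is an assumed callback that fetches and parses one file
// and returns the parsed data (or a Promise of it).
function createLazyLoader(loadFile, preload = []) {
  const cache = new Map(); // POS code -> Promise of parsed data

  function load(pos) {
    if (!cache.has(pos)) {
      // The Promise constructor runs loadFile immediately and turns a
      // synchronous throw into a rejection.
      cache.set(pos, new Promise(resolve => resolve(loadFile(pos))));
    }
    return cache.get(pos);
  }

  preload.forEach(load); // eagerly load any POS codes listed for preload
  return load;
}

// Stand-in for a real fetch of a dictionary file (assumption for the demo).
const fakeFetchDict = pos => Promise.resolve({ pos, entries: {} });

const loadDict = createLazyLoader(fakeFetchDict, ['r']); // preload adverbs only
loadDict('n').then(dict => console.log('loaded', dict.pos)); // prints "loaded n"
```

Caching the Promise rather than the parsed data means concurrent lookups for the same POS share one request. A failed load stays cached in this sketch; evicting rejected entries would be a natural refinement.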
## Fast Index (node)
Version 0.1.4 introduces `fastIndex` option. This uses a secondary index on the index files and is much faster. It is on by default. Secondary index files are generated at install time and placed in the same directory as WNdb.path. Details can be found in tools/stat.js.
@ -287,8 +335,16 @@ For CLI usage and examples, see [bin/README](bin).
See [bench/README](bench).
## TODO
- implement `includeData` option for preload
## Changes
**2.0.0**
- Support for running wordpos in browser (no breaking change for node environment)
1.2.0
- Fix `new Buffer()` deprecation warning.
- Fix npm audit vulnerabilities
@ -347,4 +403,4 @@ License
(The MIT License)
- Copyright (c) 2012, 2014, 2016 mooster@42at.com
+ Copyright (c) 2012-2019 mooster@42at.com


@ -1,6 +1,6 @@
{
"name": "wordpos",
- "version": "2.0.0-alpha",
+ "version": "2.0.0-beta",
"description": "wordpos is a set of part-of-speech utilities for Node.js & browser using the WordNet database.",
"author": "Moos <mooster@42at.com>",
"keywords": [
@ -67,7 +67,8 @@
"postinstall": "npm run postinstall-web && npm run postinstall-node",
"postinstall-node": "node tools/stat.js --no-stats index.adv index.adj index.verb index.noun",
"postinstall-web": "node scripts/makeJsonDict.js index data",
- "build": "parcel build --detailed-report -o wordpos.min.js --global WordPOS -t browser src/browser/index.js",
+ "build": "parcel build --detailed-report -d dist -o wordpos.min.js --global WordPOS -t browser src/browser/index.js",
"postbuild": "sed -i 's/ES6_IMPORT/import/' dist/wordpos.min.js",
"test": "npm run test-node && npm run test-browser",
"test-node": "mocha test",
"test-browser": "mocha test/wordpos_test --require @babel/register",

43
samples/cdn/index.html Normal file

@ -0,0 +1,43 @@
<!doctype html>
<html>
<head>
<meta http-equiv="Content-Security-Policy" content="script-src https: http: 'unsafe-inline' 'unsafe-eval'">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.12.0/styles/github.min.css" />
<script defer src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.12.0/highlight.min.js"></script>
<script defer src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.12.0/languages/javascript.min.js"></script>
<script src="./main.js" name="main"></script>
<style>
pre {
padding: 2em;
display: block;
}
</style>
</head>
<body>
<h1>CDN WordPOS sample</h1>
Open console to see results.
<h1>Coming soon...</h1>
<pre><code> </code></pre>
<script>
var el = document.querySelector('code');
if (fetch) {
fetch('main.js')
.then(res => res.text())
.then(txt => {
el.innerText = txt;
window.hljs && hljs.initHighlightingOnLoad();
});
} else {
el.innerHTML = 'Open <a href=main.js>main.js</a>.';
}
</script>
</body>
</html>


@ -1,6 +1,9 @@
<!doctype html>
<html>
<head>
<meta http-equiv="Content-Security-Policy" content="script-src https: http: 'unsafe-inline' 'unsafe-eval'">
<title>Wordpos in the browser</title>
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.12.0/styles/github.min.css" />
<script defer src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.12.0/highlight.min.js"></script>
<script defer src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.12.0/languages/javascript.min.js"></script>
@ -28,7 +31,7 @@
.then(res => res.text())
.then(txt => {
el.innerText = txt;
- hljs.initHighlightingOnLoad();
+ window.hljs && hljs.initHighlightingOnLoad();
});
} else {
el.innerHTML = 'Open <a href=main.js>main.js</a>.';


@ -39,7 +39,7 @@ class BaseFile {
let promise = isTest
? Promise.resolve(require(this.filePath))
- : eval(`import('${this.filePath}')`); // prevent parcel from clobbering dynamic import
+ : ES6_IMPORT(`${this.filePath}`); // prevent parcel from clobbering dynamic import
this.options.debug && console.timeEnd('index load ' + this.posName)
return promise
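
The `ES6_IMPORT` placeholder in this hunk is turned back into a real dynamic `import` by the `postbuild` sed command in package.json. A rough Node equivalent of that substitution, which may be useful where `sed -i` is unavailable, might look like the following. `restoreDynamicImport` is a hypothetical helper name; like the sed expression (which has no `g` flag), it replaces only the first occurrence on each line.

```javascript
// Hypothetical helper: mirrors `sed 's/ES6_IMPORT/import/'` on a string.
// A string pattern in String.prototype.replace matches only the first
// occurrence, so mapping over lines reproduces sed's per-line behavior.
function restoreDynamicImport(source) {
  return source
    .split('\n')
    .map(line => line.replace('ES6_IMPORT', 'import'))
    .join('\n');
}

// Single-quoted on purpose: the backticks and ${...} are literal text here.
const bundled = 'let promise = ES6_IMPORT(`${this.filePath}`);';
console.log(restoreDynamicImport(bundled));
// prints: let promise = import(`${this.filePath}`);
```

To apply it to the built bundle, the file contents could be read with `fs.readFileSync`, passed through this function, and written back with `fs.writeFileSync`.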


@ -20,6 +20,7 @@ const POS = {
r: 'adv'
};
class WordPOS {
options = {};