diff --git a/README.md b/README.md index 305c21a..7591265 100644 --- a/README.md +++ b/README.md @@ -4,11 +4,13 @@ wordpos [![NPM version](https://img.shields.io/npm/v/wordpos.svg)](https://www.npmjs.com/package/wordpos) [![Build Status](https://img.shields.io/travis/moos/wordpos/master.svg)](https://travis-ci.org/moos/wordpos) -wordpos is a set of *fast* part-of-speech (POS) utilities for Node.js using fast lookup in the WordNet database. +wordpos is a set of *fast* part-of-speech (POS) utilities for Node.js **and** browser using fast lookup in the WordNet database. Version 1.x is a major update with no direct dependence on [natural's](https://github.com/NaturalNode/natural#wordnet) WordNet module, with support for [Promises](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise), and roughly 5x speed improvement over previous version. -**CAUTION** The WordNet database [wordnet-db](https://github.com/moos/wordnet-db) comprises [155,287 words](https://wordnet.princeton.edu/documentation/wnstats7wn) (3.0 numbers) which uncompress to over **30 MB** of data in several *un*[browserify](https://github.com/substack/node-browserify)-able files. It is *not* meant for the browser environment. +~~**CAUTION** The WordNet database [wordnet-db](https://github.com/moos/wordnet-db) comprises [155,287 words](https://wordnet.princeton.edu/documentation/wnstats7wn) (3.0 numbers) which uncompress to over **30 MB** of data in several *un*[browserify](https://github.com/substack/node-browserify)-able files. It is *not* meant for the browser environment.~~ + +:zap: v2.x can work in browsers -- see below for example. ## Quick usage @@ -68,7 +70,30 @@ WordPOS.defaults = { * if array, stopwords to exclude, eg, ['all','of','this',...] * if false, do not filter any stopwords. */ - stopwords: true + stopwords: true, + + /** + * preload files (in browser only) + * true - preload all POS + * false - do not preload any POS + * 'a' - preload adj + * ['a','v'] - preload adj & verb + * @type {boolean|string|Array} + */ + preload: false, + + /** + * include data files in preload + * @type {boolean} + */ + includeData: false, // WIP + + /** + * set to true to enable debug logging + * @type {boolean} + */ + debug: false + }; ``` To override, pass an options hash to the constructor. With the `profile` option, most callbacks receive a last argument that is the execution time in msec of the call. @@ -224,7 +249,7 @@ wordpos.rand({starsWith: 'zzz'}, console.log) // [] 'zzz' ``` -**Note on performance**: random lookups could involve heavy disk reads. It is better to use the `count` option to get words in batches. This may benefit from the cached reads of similarly keyed entries as well as shared open/close of the index files. +**Note on performance**: (node only) random lookups could involve heavy disk reads. It is better to use the `count` option to get words in batches. This may benefit from the cached reads of similarly keyed entries as well as shared open/close of the index files. Getting random POS (`randNoun()`, etc.) is generally faster than `rand()`, which may look at multiple POS files until `count` requirement is met. @@ -269,8 +294,31 @@ wordpos.isVerb('fish', console.log) ``` Note that callback receives full arguments (including profile, if enabled), while the Promise receives only the result of the call. Also, beware that exceptions in the _callback_ will result in the Promise being _rejected_ and caught by `catch()`, if provided. +## Running inside the browsers -## Fast Index +v2.0 introduces the capability of running wordpos in the browser. The dictionary files are optimized for fast access (lookup by lemma), but they must be fetched, parsed and loaded into browser memory. The files are loaded on-demand (unless the option `preload: true` is given). + +The dict files can be served locally or from CDN (coming soon). Include the following scripts in your `index.html`: +```html + + +``` +Above assumes wordpos is installed to the directory `./wordpos`. `./wordpos/dict` holds the index and data WordNet files generated for the web in a postinstall script. + +See [samples/self-hosted](samples/self-hosted/main.js). + +## Fast Index (node) Version 0.1.4 introduces `fastIndex` option. This uses a secondary index on the index files and is much faster. It is on by default. Secondary index files are generated at install time and placed in the same directory as WNdb.path. Details can be found in tools/stat.js. @@ -287,8 +335,16 @@ For CLI usage and examples, see [bin/README](bin). See [bench/README](bench). + +## TODO +- implement `includeData` option for preload + + ## Changes +**2.0.0** + - Support for running wordpos in browser (no breaking change for node environment) + 1.2.0 - Fix `new Buffer()` deprecation warning. - Fix npm audit vulnerabilities @@ -347,4 +403,4 @@ License (The MIT License) -Copyright (c) 2012, 2014, 2016 mooster@42at.com +Copyright (c) 2012-2019 mooster@42at.com diff --git a/package.json b/package.json index 2dcd975..a29b7c2 100755 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "wordpos", - "version": "2.0.0-alpha", + "version": "2.0.0-beta", "description": "wordpos is a set of part-of-speech utilities for Node.js & browser using the WordNet database.", "author": "Moos ", "keywords": [ @@ -67,7 +67,8 @@ "postinstall": "npm run postinstall-web && npm run postinstall-node", "postinstall-node": "node tools/stat.js --no-stats index.adv index.adj index.verb index.noun", "postinstall-web": "node scripts/makeJsonDict.js index data", - "build": "parcel build --detailed-report -o wordpos.min.js --global WordPOS -t browser src/browser/index.js", + "build": "parcel build --detailed-report -d dist -o wordpos.min.js --global WordPOS -t browser src/browser/index.js", + "postbuild": "sed -i 's/ES6_IMPORT/import/' dist/wordpos.min.js", "test": "npm run test-node && npm run test-browser", "test-node": "mocha test", "test-browser": "mocha test/wordpos_test --require @babel/register", diff --git a/samples/cdn/index.html b/samples/cdn/index.html new file mode 100644 index 0000000..a3ce54b --- /dev/null +++ b/samples/cdn/index.html @@ -0,0 +1,43 @@ + + + + + + + + + + + + + + +

CDN WordPOS sample

+ Open console to see results. + +

Coming soon...

+ +
 
+ + + + + + diff --git a/samples/self-hosted/index.html b/samples/self-hosted/index.html index d35bea8..1c1b8d3 100644 --- a/samples/self-hosted/index.html +++ b/samples/self-hosted/index.html @@ -1,6 +1,9 @@ + + Wordpos in the browser + @@ -28,7 +31,7 @@ .then(res => res.text()) .then(txt => { el.innerText = txt; - hljs.initHighlightingOnLoad(); + window.hljs && hljs.initHighlightingOnLoad(); }); } else { el.innerHTML = 'Open main.js.'; diff --git a/src/browser/baseFile.js b/src/browser/baseFile.js index 3903f32..abce7c6 100644 --- a/src/browser/baseFile.js +++ b/src/browser/baseFile.js @@ -39,7 +39,7 @@ class BaseFile { let promise = isTest ? Promise.resolve(require(this.filePath)) - : eval(`import('${this.filePath}')`); // prevent parcel from clobbering dynamic import + : ES6_IMPORT(`${this.filePath}`); // prevent parcel from clobbering dynamic import this.options.debug && console.timeEnd('index load ' + this.posName) return promise diff --git a/src/browser/index.js b/src/browser/index.js index 9eaa59f..131a539 100644 --- a/src/browser/index.js +++ b/src/browser/index.js @@ -20,6 +20,7 @@ const POS = { r: 'adv' }; + class WordPOS { options = {};