Natural language parser, for the English language, that produces nlcst.
This package exposes a parser that takes English natural language and produces a syntax tree.
If you want to handle English natural language as syntax trees manually, use this.
Alternatively, you can use the retext plugin retext-english,
which wraps this project to also parse natural language at a higher-level
(easier) abstraction.
For Dutch or most Latin-script languages, you can instead use
parse-dutch or parse-latin.
This package is ESM only. In Node.js (version 16+), install with npm:
npm install parse-english
In Deno with esm.sh:
import {ParseEnglish} from 'https://esm.sh/parse-english@7'
In browsers with esm.sh:
<script type="module">
import {ParseEnglish} from 'https://esm.sh/parse-english@7?bundle'
</script>
import {ParseEnglish} from 'parse-english'
import {inspect} from 'unist-util-inspect'
const tree = new ParseEnglish().parse(
'Mr. Henry Brown: A hapless but friendly City of London worker.'
)
console.log(inspect(tree))
Yields:
RootNode[1] (1:1-1:63, 0-62)
└─0 ParagraphNode[1] (1:1-1:63, 0-62)
    └─0 SentenceNode[23] (1:1-1:63, 0-62)
        ├─0  WordNode[2] (1:1-1:4, 0-3)
        │    ├─0 TextNode "Mr" (1:1-1:3, 0-2)
        │    └─1 PunctuationNode "." (1:3-1:4, 2-3)
        ├─1  WhiteSpaceNode " " (1:4-1:5, 3-4)
        ├─2  WordNode[1] (1:5-1:10, 4-9)
        │    └─0 TextNode "Henry" (1:5-1:10, 4-9)
        ├─3  WhiteSpaceNode " " (1:10-1:11, 9-10)
        ├─4  WordNode[1] (1:11-1:16, 10-15)
        │    └─0 TextNode "Brown" (1:11-1:16, 10-15)
        ├─5  PunctuationNode ":" (1:16-1:17, 15-16)
        ├─6  WhiteSpaceNode " " (1:17-1:18, 16-17)
        ├─7  WordNode[1] (1:18-1:19, 17-18)
        │    └─0 TextNode "A" (1:18-1:19, 17-18)
        ├─8  WhiteSpaceNode " " (1:19-1:20, 18-19)
        ├─9  WordNode[1] (1:20-1:27, 19-26)
        │    └─0 TextNode "hapless" (1:20-1:27, 19-26)
        ├─10 WhiteSpaceNode " " (1:27-1:28, 26-27)
        ├─11 WordNode[1] (1:28-1:31, 27-30)
        │    └─0 TextNode "but" (1:28-1:31, 27-30)
        ├─12 WhiteSpaceNode " " (1:31-1:32, 30-31)
        ├─13 WordNode[1] (1:32-1:40, 31-39)
        │    └─0 TextNode "friendly" (1:32-1:40, 31-39)
        ├─14 WhiteSpaceNode " " (1:40-1:41, 39-40)
        ├─15 WordNode[1] (1:41-1:45, 40-44)
        │    └─0 TextNode "City" (1:41-1:45, 40-44)
        ├─16 WhiteSpaceNode " " (1:45-1:46, 44-45)
        ├─17 WordNode[1] (1:46-1:48, 45-47)
        │    └─0 TextNode "of" (1:46-1:48, 45-47)
        ├─18 WhiteSpaceNode " " (1:48-1:49, 47-48)
        ├─19 WordNode[1] (1:49-1:55, 48-54)
        │    └─0 TextNode "London" (1:49-1:55, 48-54)
        ├─20 WhiteSpaceNode " " (1:55-1:56, 54-55)
        ├─21 WordNode[1] (1:56-1:62, 55-61)
        │    └─0 TextNode "worker" (1:56-1:62, 55-61)
        └─22 PunctuationNode "." (1:62-1:63, 61-62)
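Once you have such a tree, a common next step is to walk it and pull out the parts you care about. The following is a minimal sketch of that, not part of this package: `toText` and `collectWords` are hypothetical helpers, and the `tree` literal is hand-written to mirror the first few nodes of the output above so that the snippet runs without installing anything. In real use you would pass the result of `new ParseEnglish().parse(...)` instead.

```javascript
// Hand-written stand-in for the first nodes of the tree shown above.
// Positional info is left out to keep the sketch short.
const tree = {
  type: 'RootNode',
  children: [
    {
      type: 'ParagraphNode',
      children: [
        {
          type: 'SentenceNode',
          children: [
            {
              type: 'WordNode',
              children: [
                {type: 'TextNode', value: 'Mr'},
                {type: 'PunctuationNode', value: '.'}
              ]
            },
            {type: 'WhiteSpaceNode', value: ' '},
            {type: 'WordNode', children: [{type: 'TextNode', value: 'Henry'}]},
            {type: 'WhiteSpaceNode', value: ' '},
            {type: 'WordNode', children: [{type: 'TextNode', value: 'Brown'}]},
            {type: 'PunctuationNode', value: ':'}
          ]
        }
      ]
    }
  ]
}

// Concatenate the values of all leaves under a node (hypothetical helper).
function toText(node) {
  if (typeof node.value === 'string') return node.value
  return (node.children || []).map(toText).join('')
}

// Collect the text of every WordNode, depth first (hypothetical helper).
function collectWords(node, words = []) {
  if (node.type === 'WordNode') {
    words.push(toText(node))
  } else if (node.children) {
    for (const child of node.children) collectWords(child, words)
  }
  return words
}

console.log(collectWords(tree)) // [ 'Mr.', 'Henry', 'Brown' ]
```

Note that `Mr.` comes back as one word, full stop included, because the abbreviation is kept inside a single `WordNode`. For anything beyond a quick script, existing utilities such as `unist-util-visit` and `nlcst-to-string` are likely a better fit than hand-rolled helpers like these.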
This package exports the identifier ParseEnglish.
There is no default export.
ParseEnglish()
Create a new parser.
ParseEnglish extends ParseLatin.
See parse-latin for API docs.
All of parse-latin is included, and the following support for
the English natural language:
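unit abbreviations (tsp., tbsp., oz., ft., and more)
time references (sec., min., tues., thu., feb., and more)
business abbreviations (Inc. and Ltd.)
social titles (Mr., Mmes., Sr., and more)
rank and academic titles (Dr., Rep., Gen., Prof., Pres., and more)
geographical abbreviations (Ave., Blvd., Ft., Hwy., and more)
American state abbreviations (Ala., Minn., La., Tex., and more)
Canadian province abbreviations (Alta., Qué., Yuk., and more)
English county abbreviations (Beds., Leics., Shrops., and more)
common elision, that is, omission of letters (’n’, ’o, ’em, ’twas, ’80s, and more)
This package is fully typed with TypeScript. It exports no additional types.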
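To see why the abbreviation support listed above matters, consider sentence splitting: a full stop after `Mr.` or `Dr.` must not end the sentence. The snippet below is only an illustration of that idea and is not how this package is implemented; `splitSentences` and the tiny `abbreviations` set are hypothetical and far less complete than the real thing.

```javascript
// Small, illustrative subset of abbreviations (NOT the package's real list).
const abbreviations = new Set(['mr', 'dr', 'prof', 'inc', 'ltd', 'ave'])

// Split after "<word>. " unless <word> is a known abbreviation.
function splitSentences(text, known = abbreviations) {
  const sentences = []
  const pattern = /([A-Za-z]+)\.\s+/g
  let start = 0
  let match

  while ((match = pattern.exec(text))) {
    // A known abbreviation does not close the sentence: keep scanning.
    if (known.has(match[1].toLowerCase())) continue

    const end = match.index + match[1].length + 1 // include the full stop
    sentences.push(text.slice(start, end))
    start = pattern.lastIndex // skip the whitespace after the full stop
  }

  // Whatever is left after the last split is the final sentence.
  if (start < text.length) sentences.push(text.slice(start))
  return sentences
}

const text = 'Mr. Brown met Dr. Green. They talked.'

// Without any abbreviation knowledge, every full stop splits:
console.log(splitSentences(text, new Set()))
// [ 'Mr.', 'Brown met Dr.', 'Green.', 'They talked.' ]

// With the abbreviation set, only real sentence ends split:
console.log(splitSentences(text))
// [ 'Mr. Brown met Dr. Green.', 'They talked.' ]
```

The package itself produces a full syntax tree rather than a list of strings, so treat this purely as motivation for the list above.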
Projects maintained by me are compatible with maintained versions of Node.js.
When I cut a new major release, I drop support for unmaintained versions of
Node.
This means I try to keep the current release line, parse-english@^7,
compatible with Node.js 16.
This package is safe.
parse-latin: Latin-script natural language parser
parse-dutch: Dutch natural language parser
Yes please! See How to Contribute to Open Source.