And the final bit of coding today. I started working on the XPath-based splitter. This is what is going to break apart a chapter into paragraphs, lines, and tokens/words.
That way, an echo plugin can search for `//token[length() > 3]` to get all tokens over three characters long.
Or `//para//token[index() = 1]` to get the first word of every paragraph. I want to use that to make sure paragraphs don't keep starting with the same word.
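As a rough illustration of the idea, here is a minimal Python sketch of splitting a chapter into paragraphs, lines, and tokens, then running the two queries above. Everything in it is an assumption made for the example: the element names `para`, `line`, and `token`, the blank-line paragraph rule, the token regex, and the helper names `split_chapter`, `long_tokens`, and `first_tokens`. Note that `length()` and `index()` are not standard XPath 1.0 functions (the closest built-ins are `string-length()` and `position()`), so the sketch emulates them in plain Python rather than in a real XPath engine.

```python
import re
import xml.etree.ElementTree as ET


def split_chapter(text: str) -> ET.Element:
    """Break chapter text into <para>, <line>, and <token> elements."""
    chapter = ET.Element("file", {"class": "content"})
    # Assumption: paragraphs are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        para = ET.SubElement(chapter, "para")
        for raw_line in block.splitlines():
            line = ET.SubElement(para, "line")
            # Assumption: a token is a run of word characters or apostrophes.
            for word in re.findall(r"[\w']+", raw_line):
                ET.SubElement(line, "token").text = word
    return chapter


def long_tokens(root: ET.Element, min_length: int = 3) -> list:
    """Rough equivalent of //token[length() > 3]."""
    return [t.text for t in root.iter("token") if len(t.text) > min_length]


def first_tokens(root: ET.Element) -> list:
    """Rough equivalent of //para//token[index() = 1]."""
    firsts = []
    for para in root.iter("para"):
        # First token in document order inside this paragraph, if any.
        token = next(para.iter("token"), None)
        if token is not None:
            firsts.append(token.text)
    return firsts


chapter = split_chapter("The rain fell.\nIt was cold.\n\nThe wind rose.")
print(long_tokens(chapter))   # ['rain', 'fell', 'cold', 'wind', 'rose']
print(first_tokens(chapter))  # ['The', 'The']
```

In this toy chapter both paragraphs open with "The", which is exactly the kind of repetition the first-word query is meant to surface.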
I figure I'm 2-4 days from seeing if this POC works.
Author Intrusion
Most of these plugins will have a "scope" which says how to break up the code. The default will be "content", which is actually `//file[has-class('content')]`, to only analyze chapters.
So, the echo plugins would only process per chapter.
If one used `/project` for the scope, then the echo plugin would also consider the last paragraphs of the previous chapter instead of checking only within each file.
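To make the difference between the two scopes concrete, here is a small Python sketch. It is only an illustration under stated assumptions: the function `find_echoes`, the `window` parameter, and the definition of an echo as "the same word appearing again within the last few tokens" are all made up for the example, not the plugin's actual design.

```python
def find_echoes(files, scope="content", window=5):
    """Return (file, word) pairs where a word repeats within `window` tokens."""
    echoes = []
    recent = []
    for name, tokens in files:
        if scope == "content":
            # Per-file scope: forget everything from the previous chapter.
            recent = []
        for token in tokens:
            word = token.lower()
            if word in recent:
                echoes.append((name, word))
            recent.append(word)
            # Keep only the most recent `window` words.
            recent = recent[-window:]
    return echoes


# Hypothetical two-chapter project, already split into tokens.
files = [
    ("chapter-01", ["She", "closed", "the", "door", "quietly"]),
    ("chapter-02", ["Quietly", "he", "waited"]),
]
print(find_echoes(files))                   # []
print(find_echoes(files, scope="project"))  # [('chapter-02', 'quietly')]
```

With the per-file scope the repeated "quietly" goes unnoticed because the window resets at the chapter boundary. With the project scope the end of the previous chapter is still in view.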
This will also let the frequency plugins look for overuse of a word per file or per project.
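A frequency check could switch scopes in much the same way. The sketch below is a guess at the shape of it, not the real plugin: the `frequencies` function, the scope names as plain strings, and the sample project data are all hypothetical.

```python
from collections import Counter

# Hypothetical project: file name -> tokens that have already been split.
project = {
    "chapter-01": ["the", "storm", "broke", "suddenly"],
    "chapter-02": ["suddenly", "the", "storm", "ended", "suddenly"],
}


def frequencies(project, scope="content"):
    """Count tokens per file ("content") or across all files ("project")."""
    if scope == "project":
        merged = Counter()
        for tokens in project.values():
            merged.update(tokens)
        return {"project": merged}
    return {name: Counter(tokens) for name, tokens in project.items()}


per_file = frequencies(project)
whole = frequencies(project, scope="project")
print(per_file["chapter-02"]["suddenly"])  # 2
print(whole["project"]["suddenly"])        # 3
```

A word that looks fine within each chapter can still add up across the whole project, which is the case the wider scope is meant to catch.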