To broaden the scope of Freelinking, we need a method by which plugins can specify the syntax used to activate them. To smoothly scale up to supporting many different syntax structures, and allow them to vary by plugin, there's some groundwork to put in place.
- The basic architecture of freelinking_filter() needs to build a complete regular expression to match all instances of a given plugin with a single preg_match_all(). The indicator will not come along in a secondary stage. The indicator will no longer include pattern modifiers or expression delimiters. This has the minor effect of dropping the order of links in the text.
- Support for a single default plugin when multiple "bracket matching" schemes are possible is unnecessarily limiting. Instead, Plugin Weights should be used, giving precedence to the lowest-weight plugin using a given bracket-matching scheme. This creates implicit default plugins for every type of syntax, and also results in automatic failover to "higher weight" plugins in the event a given plugin fails with a no-effect result.
- The current "Default Plugin" setting should be used to create a magical weight override that pushes the selected plugin to the top of the stack. It should be optional and play nice with #634348: Configuration by Input Format.
- The current "Syntax Selection" setting should be used to create a default in the event a plugin does not care to define a syntax. This should also play nice with #634348: Configuration by Input Format.
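The weight-based precedence and failover described in the list above could look roughly like the following. This is a minimal sketch, not Freelinking's actual code: the registry layout, the plugin names, the callbacks, and `example_resolve()` are all hypothetical, and a "no-effect" result is assumed to be represented by `NULL`.

```php
<?php
// Hypothetical plugin registry; names, weights, and callbacks are illustrative only.
$plugins = [
  'search' => [
    'weight' => 10,
    'syntax' => 'double bracket',
    // Always produces a link, so it acts as the fallback.
    'callback' => function (string $target): ?string {
      return '/search/node/' . rawurlencode($target);
    },
  ],
  'nodetitle' => [
    'weight' => 0,
    'syntax' => 'double bracket',
    // Returns NULL (assumed "no-effect" result) when the title is unknown.
    'callback' => function (string $target): ?string {
      return $target === 'About' ? '/node/1' : NULL;
    },
  ],
];

function example_resolve(array $plugins, string $syntax, string $target): ?string {
  // Keep only the plugins registered for this bracket-matching scheme.
  $candidates = array_filter($plugins, function (array $plugin) use ($syntax) {
    return $plugin['syntax'] === $syntax;
  });
  // Lowest weight first: that plugin is the implicit default for the syntax.
  uasort($candidates, function (array $a, array $b) {
    return $a['weight'] <=> $b['weight'];
  });
  // Fail over to the next-heavier plugin whenever one has no effect.
  foreach ($candidates as $plugin) {
    $result = $plugin['callback']($target);
    if ($result !== NULL) {
      return $result;
    }
  }
  return NULL;
}
```

With this registry, `example_resolve($plugins, 'double bracket', 'About')` is answered by the lower-weight plugin, while an unknown title falls through to the heavier one.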
What is bracket-matching syntax?
[[link]], [link], #link, @link, link@plugin, [Link Text](link), and so on.
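To make a few of these schemes concrete, here is a hedged sketch of one regular expression per syntax, each applied with a single `preg_match_all()`. The scheme names and the expressions themselves are assumptions for illustration, not the patterns Freelinking ships.

```php
<?php
// Illustrative patterns only; scheme names and expressions are assumptions.
$patterns = [
  // [[link]]
  'double bracket' => '/\[\[(.+?)\]\]/',
  // [Link Text](link): group 1 is the text, group 2 is the target.
  'markdown' => '/\[([^\[\]]+)\]\(([^()]+)\)/',
  // #link, only when the # starts a whitespace-delimited word.
  'hash prefix' => '/(?<!\S)#(\w+)/',
];

$text = 'See [[About]], [the docs](Documentation) and #news.';

$found = [];
foreach ($patterns as $name => $pattern) {
  // One pass per scheme collects every occurrence of that syntax.
  preg_match_all($pattern, $text, $matches);
  $found[$name] = $matches[1];
}
```

Each entry of `$found` then holds the first capture group for its scheme, for example the string `About` under `double bracket`.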
Comments
Comment #1
arhak commented
welcome delimiters!!
but be careful, IMO you'll have (should prefer) to deal with open/closing delimiters
otherwise having just an opening delimiter (@link) would match against what ending?
EOL (end of line), non-word chars? spaces? tabs?
if you start with couple of delimiters you'd support many wiki-style syntaxes
then you might want to move on against opening-EOL or SOL-closing (Start Of Line)
but you shouldn't try to automatically recognize what is word or non-word, because it might become a tricky path
Comment #2
arhak commented
also, a case like [Link Text](link) is even wider than the others
what would that be? an opening square bracket against a closing parenthesis?
here you're hitting the edge of a more powerful wiki-syntax tool than the current scope of what FL3 is aiming for
maybe for FL4?
Comment #3
Grayside commented
[Link Text](link) is Markdown-style, and basic support for that was folded into Alpha3 as a global option. If testing proves it's too complex, it may be dropped before a full 3.0 release.
I had some vague thought that prefix-only delimiters would terminate with the first non-escaped whitespace character (So #This\ is\ fine), but I am still contemplating how to build a flexible system, rather than the particulars of how to implement a given regular expression.
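The "first non-escaped whitespace" idea can be sketched as a pattern that accepts either a backslash-escaped character or any character that is neither whitespace nor a backslash. This is only one possible reading of the idea; the pattern and the unescaping step are assumptions, not an agreed design.

```php
<?php
// Assumed rule: a prefix-only link starts at # and runs until the first
// whitespace character that is not preceded by a backslash.
$pattern = '/#((?:\\\\.|[^\s\\\\])+)/';

$text = 'Tagged #This\ is\ fine and #plain here.';

preg_match_all($pattern, $text, $matches);

// Strip the escaping backslashes to recover the intended targets.
$targets = array_map(function (string $raw): string {
  return preg_replace('/\\\\(.)/', '$1', $raw);
}, $matches[1]);
```

Here `$targets` contains `This is fine` and `plain`; the word `here.` is left alone because nothing marks it as part of a link.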
Comment #4
arhak commented
"prefix-only", "whitespace character"
what would a prefix-only be in a right-to-left language?
is every language ok with the definition of "whitespace character"?
Comment #5
Grayside commented
Absolutely, there are definitely considerations in doing this. Before I try to address your points I want to continue working through the basic approach of how plugins will specify their matching scheme, as that will inform how we approach the complexities of multiple languages.
There are two basic approaches to granting plugins the ability to define their own matching scheme. They are not mutually exclusive:
'match' => 'double bracket')
Comment #6
arhak commented
ungreedy
\b#(.+?)@(.+?):(.+)\b
or reserved delimiter
\b#([^@]+)@([^:]+):(.+)\b
Comment #7
Grayside commented
Thanks for the tweaks, still has the problems of #4. The purpose of my post was to explore the match array structure. Is that complete enough?
In Option #1, we have the problem of pushing the plugin's indicator into the expression. We no longer want wildcard indicators unless we want a universal fallback syntax ([[plugin:target]] always works, but the plugin specifies #target). That means we need to define indicators in general-use match specifications in a way that allows them to be stripped out or replaced with something plugin-specific.
"\b#(.+?)INDICATOR:(.+)\b" becomes "\b#(.+?)@plugin:(.+)\b"
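As a rough sketch of that substitution, a general-use specification could carry a literal `INDICATOR` token that is swapped for the plugin's own indicator before the delimiters are added. The helper name `example_build_pattern()` is hypothetical. The `\b` anchors from the expression above are left out here, because in PCRE a `\b` directly before `#` only matches when a word character immediately precedes the `#`.

```php
<?php
// General-use match specification with an INDICATOR placeholder.
// No delimiters or modifiers: those are added when the pattern is built.
$spec = '#(.+?)INDICATOR:(.+)';

// Hypothetical helper: swap the token for a plugin-specific indicator.
function example_build_pattern(string $spec, string $indicator): string {
  // Quote the indicator so any regex metacharacters in it are taken literally.
  $quoted = preg_quote($indicator, '/');
  return '/' . str_replace('INDICATOR', $quoted, $spec) . '/';
}

$pattern = example_build_pattern($spec, '@plugin');

preg_match($pattern, 'see #section@plugin:Target', $match);
```

After the call, `$match[1]` holds the text between `#` and the indicator and `$match[2]` holds the target.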
Comment #8
arhak commented
the array structure of #5 looks good, legible enough, wide enough
(nevertheless there will be always cases out of its scope)
#7 becomes pretty awful/unreadable
expression looks almost like a regex but INDICATOR will be replaced by indicator prefix/suffix ...
NO please, it will be madness (IMO)
Comment #9
arhak commented
arguments & argument separator are a good idea/approach
but, for instance, image/video filters might use arguments in more than one position
and have complex separators like
size=640x480
size=80%
I mention this, just to point out that some image/video filters can provide use cases to test whether you're being flexible enough
BUT I don't think you should aim to cover them all
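A simple argument parser for `key=value` pairs such as `size=80%` might look like this. It is a sketch under stated assumptions: the `|` separator, the positional handling of bare values, and the function name `example_parse_arguments()` are all hypothetical.

```php
<?php
// Hypothetical parser: splits a raw link body into positional and named arguments.
function example_parse_arguments(string $raw, string $separator = '|'): array {
  $target = [];
  foreach (explode($separator, $raw) as $part) {
    $part = trim($part);
    if ($part === '') {
      continue;
    }
    if (strpos($part, '=') !== FALSE) {
      // Named argument: split on the first "=" only, so values may contain "=".
      list($key, $value) = explode('=', $part, 2);
      $target[trim($key)] = trim($value);
    }
    else {
      // Bare value: keep it as a positional argument.
      $target[] = $part;
    }
  }
  return $target;
}

$target = example_parse_arguments('photo.png|size=80%|align=left');
```

The result keeps `photo.png` at index 0 and exposes the named values as `$target['size']` and `$target['align']`.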
Comment #10
Grayside commented
There is a sort of unified structure for arguments. A routine will parse out the arguments into something like $target['size'] = '80%'.
I agree that the INDICATOR token thing is sloppy. Here is what we are discussing now:
Option 1
Much like it currently works, you may specify an indicator and it will use that to match against the global "default" in bracket matching. This will function as it currently does.
'indicator' => 'nt|nodetitle|title'
Option 2
The array structure from #5, and you may make it as specific or vague as you want.
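Option 1 can be illustrated by dropping the plugin's indicator alternation into an assumed global default of double brackets with an `indicator:target` body. The pattern and the surrounding array layout are assumptions made for this sketch, not the module's real implementation.

```php
<?php
// Hypothetical plugin definition that uses only the 'indicator' key (Option 1).
$plugin = ['indicator' => 'nt|nodetitle|title'];

// Assumed global default syntax: [[indicator:target]].
$pattern = '/\[\[(' . $plugin['indicator'] . '):(.+?)\]\]/';

$text = 'Read [[nt:About us]] or [[title:Contact]], but not [[nid:1]].';

preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

$links = [];
foreach ($matches as $match) {
  // $match[1] is the indicator that was used, $match[2] is the link target.
  $links[] = ['indicator' => $match[1], 'target' => $match[2]];
}
```

Only the two links whose indicator belongs to this plugin are collected; `[[nid:1]]` is left for whichever plugin claims `nid`.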
Comment #11
arhak commented
right now I'm on the fence...
it should be a developers' call, what would the majority of developers prefer
option 1 seems very straightforward
while option 2 seems more powerful/flexible
maybe start with the straightforward one until the plugin system becomes so popular that a wider API is really required
Comment #12
Grayside commented
It seems to me the options are not mutually exclusive. In fact, if you define both an 'indicator' and a 'match' both could be used.
Comment #13
gisle commented
Five years ago, Grayside wrote:
While this provides excellent flexibility, and markdown and single bracket are already partially implemented for version 3, this approach is not without disadvantages:
- The preg_match_all() approach breaks when replacing markup where the same target is linked with two different title attributes. For example, given: [[nid:1|foo]] [[nid:1|bar]], the "match all" will result in anchor text (title) of the second link being set to "foo" instead of "bar" as the user would expect.
Because of these disadvantages, I am planning to pull back from this approach. The upcoming Freelinking 7.x-3.4 release will only support the double square bracket syntax.
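One plausible way such a collision can arise is when rendered links are stored per target rather than per occurrence. The sketch below is an assumption for illustration and not a reading of the Freelinking source; it contrasts that with handling each match individually through `preg_replace_callback()`.

```php
<?php
$text = '[[nid:1|foo]] [[nid:1|bar]]';
$pattern = '/\[\[nid:(\d+)\|(.+?)\]\]/';

// Collision-prone variant (assumed): one rendered link is kept per target,
// so the first title seen for nid 1 is reused for every later occurrence.
preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);
$by_target = [];
foreach ($matches as $m) {
  $by_target[$m[1]] = $by_target[$m[1]]
    ?? '<a href="/node/' . $m[1] . '">' . $m[2] . '</a>';
}
$collided = preg_replace_callback($pattern, function (array $m) use ($by_target): string {
  return $by_target[$m[1]];
}, $text);

// Per-occurrence variant: each match is rendered from its own captures.
$per_match = preg_replace_callback($pattern, function (array $m): string {
  return '<a href="/node/' . $m[1] . '">' . htmlspecialchars($m[2]) . '</a>';
}, $text);
```

In the first variant both anchors end up with the text "foo"; in the second, the anchors keep "foo" and "bar" respectively.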