PHP functions to convert markup efficiently [message #183811] Thu, 21 November 2013 13:30
James Harris

I am looking for a way to mark up text in a way that PHP would be able to
efficiently and quickly convert to HTML.

I could either use an existing markup language or design a new one, but I
want to know which PHP functions would process it most efficiently. To a
large extent that will guide the choice of markup tags if I have to design
it myself.

For example, it looks like I could choose between PHP's expand(), fgets()
and regular expression handling.

Of course, implementations may differ slightly but, on average, are there
certain PHP approaches that could be expected to be faster than others? What
is the accepted wisdom?

James
Re: PHP functions to convert markup efficiently [message #183812 is a reply to message #183811] Thu, 21 November 2013 13:39
The Natural Philosopher

On 21/11/13 18:30, James Harris wrote:
> I am looking for a way to mark up text in a way that PHP would be able to
> efficiently and quickly convert to HTML.
>

Mark the text up with HTML!


> I could either use an existing markup language or design a new one but I
> wanted to know of which PHP functions would be ideal to use to process it
> most efficiently. To a large extent that will guide the choice of markup
> tags if I have to design it myself.
>
> For example, it looks like I could choose between PHP's expand(), fgets()
> and regular expression handling.
>
> Of course, implementations may differ slightly but, on average, are there
> certain PHP approaches that could be expected to be faster than others? What
> is the accepted wisdom?
>
> James
>
>


--
Ineptocracy

(in-ep-toc’-ra-cy) – a system of government where the least capable to
lead are elected by the least capable of producing, and where the
members of society least likely to sustain themselves or succeed, are
rewarded with goods and services paid for by the confiscated wealth of a
diminishing number of producers.
Re: PHP functions to convert markup efficiently [message #183813 is a reply to message #183812] Thu, 21 November 2013 13:57
James Harris

"The Natural Philosopher" <tnp(at)invalid(dot)invalid> wrote in message
news:l6lk04$e54$1(at)news(dot)albasani(dot)net...
> On 21/11/13 18:30, James Harris wrote:
>> I am looking for a way to mark up text in a way that PHP would be able to
>> efficiently and quickly convert to HTML.
>>
>
> mark the text up with HTML!

No good. Too complex to vet, not secure (people other than me could add
markup) and would not allow enhanced functions that I anticipate needing to
add. It needs to be markup I can control.

>> I could either use an existing markup language or design a new one but I
>> wanted to know of which PHP functions would be ideal to use to process it
>> most efficiently. To a large extent that will guide the choice of markup
>> tags if I have to design it myself.
>>
>> For example, it looks like I could choose between PHP's expand(), fgets()
>> and regular expression handling.
>>
>> Of course, implementations may differ slightly but, on average, are there
>> certain PHP approaches that could be expected to be faster than others?
>> What
>> is the accepted wisdom?

James
Re: PHP functions to convert markup efficiently [message #183814 is a reply to message #183811] Thu, 21 November 2013 14:03
Salvatore

On 2013-11-21, James Harris <james(dot)harris(dot)1(at)gmail(dot)com> wrote:
> I am looking for a way to mark up text in a way that PHP would be able to
> efficiently and quickly convert to HTML.

Have you tried Markdown? There's a PHP class that you can drop right
into any project. It doesn't support images, I don't believe, but it
does allow hyperlinks and basic text formatting.

--
Blah blah bleh...
GCS/CM d(-)@>-- s+:- !a C++$ UBL++++$ L+$ W+++$ w M++ Y++ b++
Re: PHP functions to convert markup efficiently [message #183815 is a reply to message #183813] Thu, 21 November 2013 14:10
Christoph Michael Becker

James Harris wrote:

> "The Natural Philosopher" <tnp(at)invalid(dot)invalid> wrote in message
> news:l6lk04$e54$1(at)news(dot)albasani(dot)net...
>> On 21/11/13 18:30, James Harris wrote:
>>> I am looking for a way to mark up text in a way that PHP would be able to
>>> efficiently and quickly convert to HTML.
>>>
>>
>> mark the text up with HTML!
>
> No good. Too complex to vet, not secure (people other than me could add
> markup)

There are tools which help with this, e.g. <http://htmlpurifier.org/>.

> and would not allow enhanced functions that I anticipate needing to
> add. It needs to be markup I can control.

Then you may consider XML. There are several libraries dealing with XML
that come bundled with PHP[1].

>>> I could either use an existing markup language or design a new one but I
>>> wanted to know of which PHP functions would be ideal to use to process it
>>> most efficiently. To a large extent that will guide the choice of markup
>>> tags if I have to design it myself.
>>>
>>> For example, it looks like I could choose between PHP's expand(), fgets()
>>> and regular expression handling.

I have never heard of a PHP function called expand(). Anyway, if you want to
use your own markup, regular expressions are most likely the way to go,
as scanning character by character might be too slow in pure PHP.

>>> Of course, implementations may differ slightly but, on average, are there
>>> certain PHP approaches that could be expected to be faster than others?

Most likely those which are programmed in C.

[1] <http://php.net/manual/en/refs.xml.php>

--
Christoph M. Becker
Re: PHP functions to convert markup efficiently [message #183816 is a reply to message #183815] Thu, 21 November 2013 14:29
James Harris

"Christoph Michael Becker" <cmbecker69(at)arcor(dot)de> wrote in message
news:528e5aa6$0$6627$9b4e6d93(at)newsspool2(dot)arcor-online(dot)net...

....

>>>> For example, it looks like I could choose between PHP's expand(),
>>>> fgets()
>>>> and regular expression handling.
>
> I never heard of a PHP function called expand().

Oops - that should have been explode().

>>>> Of course, implementations may differ slightly but, on average, are
>>>> there
>>>> certain PHP approaches that could be expected to be faster than others?
>
> Most likely those which are programmed in C.

Is there a way to identify those?

That may not be the only consideration. ISTM that even if the regular
expression handler is programmed in C simple string handling (hopefully also
programmed in C) should be much faster as long as it is written sensibly.

James
Re: PHP functions to convert markup efficiently [message #183817 is a reply to message #183816] Thu, 21 November 2013 15:18
Christoph Michael Becker

James Harris wrote:

>>> Of course, implementations may differ slightly but, on average, are there
>>> certain PHP approaches that could be expected to be faster than others?
>>
>> Most likely those which are programmed in C.
>
> Is there a way to identify those?

The core of PHP and all bundled extensions are written in C, as well as
all PECL packages. PEAR packages and many other libraries are
programmed in pure PHP.

> That may not be the only consideration. ISTM that even if the regular
> expression handler is programmed in C simple string handling (hopefully also
> programmed in C) should be much faster as long as it is written sensibly.

Of course, a strpos() call is faster than the corresponding preg_match(),
but simple string functions are not as powerful as regular expressions, so
they have to be called more often and those calls add up. Scanning
character by character, on the other hand, might involve a lot of
conditional statements. That said, the performance might not matter at all
for relatively small texts.
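
To make the trade-off concrete, here is a minimal sketch. The sample text and the tag syntax in it are assumptions made only for illustration. strpos() locates one fixed substring, while a single preg_match() call can recognise a whole family of tags and capture the tag name at the same time.

```php
<?php
// Hypothetical sample text; the "@(...)" tag syntax is only an illustration.
$text = 'Please @(b)STOP@(/b) here';

// Plain string search: returns the byte offset of one fixed substring.
$plainOffset = strpos($text, '@(b)');

// Regular expression: one call matches any tag of the form @(name) or @(/name)
// and captures the name; PREG_OFFSET_CAPTURE also reports where it was found.
preg_match('/@\((\/?[a-z]+)\)/', $text, $match, PREG_OFFSET_CAPTURE);
$regexOffset = $match[0][1];
$tagName     = $match[1][0];

echo $plainOffset, ' ', $regexOffset, ' ', $tagName, "\n";
```

Which of the two is faster overall depends on how many separate strpos() calls the pattern replaces, so it is worth timing both on realistic input before deciding.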

--
Christoph M. Becker
Re: PHP functions to convert markup efficiently [message #183818 is a reply to message #183811] Thu, 21 November 2013 15:27
Thomas 'PointedEars'

James Harris wrote:

> I am looking for a way to mark up text in a way that PHP would be able to
> efficiently and quickly convert to HTML.

You are probably looking for a template engine like <http://www.smarty.net/>
then.

> […]
> For example, it looks like I could choose between PHP's expand(), fgets()
> and regular expression handling.

There is no built-in expand() function in PHP. fgets() reads files line by
line; HTML converts whitespace outside of attribute values to one space –
not what you want. Regular expressions are used for pattern matching; I do
not see how they would be applicable here.

> Of course, implementations may differ slightly but, on average, are there
> certain PHP approaches that could be expected to be faster than others?
> What is the accepted wisdom?

Depends on what you *really* want to do.


PointedEars
--
realism: HTML 4.01 Strict
evangelism: XHTML 1.0 Strict
madness: XHTML 1.1 as application/xhtml+xml
-- Bjoern Hoehrmann
Re: PHP functions to convert markup efficiently [message #183839 is a reply to message #183811] Fri, 22 November 2013 05:33
Arno Welzel

Am 21.11.2013 19:30, schrieb James Harris:

> I am looking for a way to mark up text in a way that PHP would be able to
> efficiently and quickly convert to HTML.

What kind of text?

What kind of markup is needed?


--
Arno Welzel
http://arnowelzel.de
http://de-rec-fahrrad.de
Re: PHP functions to convert markup efficiently [message #183844 is a reply to message #183839] Fri, 22 November 2013 07:48
James Harris

Having looked at Markdown and Smarty, which people suggested, I think it
might be better if I bite the bullet and develop my own markup language. It
would then be easier both to limit what it allows and to add features as
needed. I have been running some syntax tests and it seems quite easy to do
and fast to process, even though the markup is currently a little cumbersome
to write.

"Arno Welzel" <usenet(at)arnowelzel(dot)de> wrote in message
news:528F32EE(dot)5030800(at)arnowelzel(dot)de...
> Am 21.11.2013 19:30, schrieb James Harris:
>
>> I am looking for a way to mark up text in a way that PHP would be able to
>> efficiently and quickly convert to HTML.
>
> What kind of text?

What are the options? I was just thinking about normal text of the kind that
would appear on a web page.

> What kind of markup is needed?

Quite a lot. Things like these:
* bold, italic
* links to local pages and remote URLs
* images, code
* headers, lists, line breaks
* tables
* other things will likely be required but are not defined yet

For completeness, I should add that the markup would also need to pull in
similarly marked up text from other files and to populate tables from
databases.

James
Re: PHP functions to convert markup efficiently [message #183849 is a reply to message #183844] Fri, 22 November 2013 08:53
Jerry Stuckle

On 11/22/2013 7:48 AM, James Harris wrote:
> Having looked at Markdown and Smarty that people suggested it might be
> better if I bite the bullet and develop my own markup language. It would
> then be easier both to limit what it allows and also to add features as
> needed. I have been running some syntax tests and it seems quite easy to do
> and fast to process even though the markup is currently a little cumbersome
> to write.
>
> "Arno Welzel" <usenet(at)arnowelzel(dot)de> wrote in message
> news:528F32EE(dot)5030800(at)arnowelzel(dot)de...
>> Am 21.11.2013 19:30, schrieb James Harris:
>>
>>> I am looking for a way to mark up text in a way that PHP would be able to
>>> efficiently and quickly convert to HTML.
>>
>> What kind of text?
>
> What are the options? I was just thinking about normal text of the kind that
> would appear on a web page.
>
>> What kind of markup is needed?
>
> Quite a lot. Things like these:
> * bold, italic
> * links to local pages and remote URLs
> * images, code
> * headers, lists, line breaks
> * tables
> * other things will likely be required but are not defined yet
>
> For completeness, I should add that the markup would also need to pull in
> similarly marked up text from other files and to populate tables from
> databases.
>
> James
>
>

So you want to make people learn another markup language which will end
up being just another version of HTML?

HTML is not all that complicated. It just takes a little time. Using a
good validator is also helpful.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex(at)attglobal(dot)net
==================
Re: PHP functions to convert markup efficiently [message #183855 is a reply to message #183844] Fri, 22 November 2013 10:42
Arno Welzel

Am 22.11.2013 13:48, schrieb James Harris:

> Having looked at Markdown and Smarty that people suggested it might be
> better if I bite the bullet and develop my own markup language. It would
> then be easier both to limit what it allows and also to add features as
> needed. I have been running some syntax tests and it seems quite easy to do
> and fast to process even though the markup is currently a little cumbersome
> to write.

I think this is not a good idea.

Existing markup languages are well documented and there are existing
implementations to parse the markup and convert it to HTML (or other
formats) which are not only used by one person - so it is likely that
bugs will be fixed as well within a reasonable time.

Besides Markdown and Smarty you can also try DokuWiki - its markup can
be extended using syntax plugins (and you can create your own plugins as
well to handle block level or inline elements etc.).

[...]
>> What kind of markup is needed?
>
> Quite a lot. Things like these:
> * bold, italic
> * links to local pages and remote URLs
> * images, code
> * headers, lists, line breaks
> * tables
> * other things will likely be required but are not defined yet

So - if HTML is too complicated you should really try DokuWiki first.

And if even this syntax is too complicated you should not use a markup
language at all but a WYSIWYG editor which produces valid HTML (for
example TinyMCE or CKEditor).

> For completeness, I should add that the markup would also need to pull in
> similarly marked up text from other files and to populate tables from
> databases.

This is out of scope for "markup" - but DokuWiki can handle that as well
using includes - there are several plugins to include other files within
a document and you can extend these plugins to use only "marked" content
from a file as well.


Re: PHP functions to convert markup efficiently [message #183871 is a reply to message #183849] Sat, 23 November 2013 01:21
James Harris

"Jerry Stuckle" <jstucklex(at)attglobal(dot)net> wrote in message
news:l6nnl9$g03$2(at)dont-email(dot)me...

....

>>> What kind of markup is needed?
>>
>> Quite a lot. Things like these:
>> * bold, italic
>> * links to local pages and remote URLs
>> * images, code
>> * headers, lists, line breaks
>> * tables
>> * other things will likely be required but are not defined yet

....

> So you want to make people learn another markup language which will end up
> being just another version of HTML?

No, but wouldn't it be unsafe to let people enter raw HTML?

James
Re: PHP functions to convert markup efficiently [message #183873 is a reply to message #183871] Sat, 23 November 2013 06:59
Jerry Stuckle

On 11/23/2013 1:21 AM, James Harris wrote:
> "Jerry Stuckle" <jstucklex(at)attglobal(dot)net> wrote in message
> news:l6nnl9$g03$2(at)dont-email(dot)me...
>
> ...
>
>>>> What kind of markup is needed?
>>>
>>> Quite a lot. Things like these:
>>> * bold, italic
>>> * links to local pages and remote URLs
>>> * images, code
>>> * headers, lists, line breaks
>>> * tables
>>> * other things will likely be required but are not defined yet
>
> ...
>
>> So you want to make people learn another markup language which will end up
>> being just another version of HTML?
>
> No, but wouldn't it be unsafe to let people enter raw HTML?
>
> James
>
>

What is unsafe about it?


Re: PHP functions to convert markup efficiently [message #183875 is a reply to message #183871] Sat, 23 November 2013 07:12
Christoph Michael Becker

James Harris wrote:

> No, but wouldn't it be unsafe to let people enter raw HTML?

At the very least, if you're going to let untrusted users (e.g. visitors of
a publicly available site) enter HTML, you will have to filter the HTML
to prevent injection of scripts and maybe other undesired markup.
HTMLPurifier does a good job here, as may other solutions.

--
Christoph M. Becker
Re: PHP functions to convert markup efficiently [message #183876 is a reply to message #183855] Sat, 23 November 2013 11:38
James Harris

"Arno Welzel" <usenet(at)arnowelzel(dot)de> wrote in message
news:528F7B7D(dot)4030701(at)arnowelzel(dot)de...
> Am 22.11.2013 13:48, schrieb James Harris:
>
>> Having looked at Markdown and Smarty that people suggested it might be
>> better if I bite the bullet and develop my own markup language. It would
>> then be easier both to limit what it allows and also to add features as
>> needed. I have been running some syntax tests and it seems quite easy to
>> do
>> and fast to process even though the markup is currently a little
>> cumbersome
>> to write.
>
> I think this is not a good idea.
>
> Existing markup languages are well documented and there are existing
> implementations to parse the markup and convert it to HTML (or other
> formats) which are not only used by one person - so it is likely that
> bugs will be fixed as well within a reasonable time.

Noted.

> Besides Markdown and Smarty you can also try DokuWiki - its markup can
> be extended using syntax plugins (and you can create your own plugins as
> well to handle block level or inline elements etc.).

I took a look at DokuWiki (having previously looked at the others). It's
good but does things I don't want. At first glance I couldn't see a way to
take them away and to restrict the formatting it accepts. I'm not sure I
want to invest the time needed to learn any of those packages when, at least
for now, it seems that native PHP is easy enough to use and is far more
flexible.

>>> What kind of markup is needed?
>>
>> Quite a lot. Things like these:
>> * bold, italic
>> * links to local pages and remote URLs
>> * images, code
>> * headers, lists, line breaks
>> * tables
>> * other things will likely be required but are not defined yet
>
> So - if HTML is too complicated you should really try DokuWiki first.
>
> And if even this syntax is too complicated you should not use a markup
> language at all but a WYSIWYG editor which produces valid HTML (for
> example TinyMCE or CKEditor).

I am not trying to avoid complexity. Using PHP to convert markup to HTML
allows me to do things like these:
* restrict the elements that can be used (for security)
* add features such as a server-side TOC
* pull in data from various sources
* choose where to place elements such as footnotes
* give each page of the site a consistent structure

Basically, the combination of HTML, CSS, PHP and my own markup codes seems
ideal. Aside from having to devise the coding the rest is completely
standard and incredibly lightweight. As such, there will be no packages and
associated bugfixes to install and it should be very fast.

I'll keep in mind that there are prebuilt options, though, in case I run
into difficulties as I work on this.

James
Re: PHP functions to convert markup efficiently [message #183877 is a reply to message #183811] Sat, 23 November 2013 14:21
James Harris

"James Harris" <james(dot)harris(dot)1(at)gmail(dot)com> wrote in message
news:l6ljfj$fue$1(at)dont-email(dot)me...

> I am looking for a way to mark up text in a way that PHP would be able to
> efficiently and quickly convert to HTML.

In case anyone is interested, here is what I have come up with so far.

The markup is designed to be fast to parse rather than to be beautiful.
However, it doesn't look too bad, IMO. I'll explain the markup first and
then the PHP which carries out the conversion.

This is very much experimental at this stage. I may well have to change any
of this including the tag formats. But it is working code as it stands.

There are simple tags which have a one-to-one translation. Here are some
examples. The markup is on the left and what it translates to on the right.

@(hr) --> <hr>
@(b) --> <b>
@(/b) --> </b>
@(nl) --> <br>
@(at) --> @

For example, "Please @(b)STOP@(/b) here" will print STOP in bold and the
rest non-bold.

There are markup tags with simple parameters such as these.

@(h,2) --> <h2>
@(/h,2) --> </h2>

And there are tags which are more inclusive such as these.

@(sect,2,Section X) --> <h2>Section X</h2>
@(link,Local Page) --> <a href="Local Page">Local Page</a>
@(link,http://xe.com,XE) --> <a href="http://xe.com">XE</a>

As you can see, a markup tag is identified by an @ sign followed by an
opening delimiter. The opening delimiter is "(" in all the above cases but
could be a different character. Each opening delimiter character has a
corresponding closing delimiter. For most punctuation characters the closing
delimiter is the same as the opening delimiter but for pairable bracket
characters the logical closing bracket is used. Therefore the following all
mean the same.

@(i)
@[i]
@|i|
@*i*

The point is that the person writing the code can and must choose a closing
delimiter that does not appear in the text between the delimiters. This is
to help recognition speed; the complete tag can be isolated without needing
to consider context such as quoted strings.
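
The delimiter rule above could be sketched roughly as follows. The function name closing_delimiter() and the exact set of bracket pairs are assumptions for illustration; the post does not say which brackets are treated as pairable.

```php
<?php
// Return the closing delimiter for a given opening delimiter.
// Bracket characters pair with their counterpart; any other character
// closes itself. The list of bracket pairs here is an assumption.
function closing_delimiter(string $open): string
{
    $pairs = ['(' => ')', '[' => ']', '{' => '}'];
    return $pairs[$open] ?? $open;
}

// The four equivalent spellings of the italic tag from the text above.
foreach (['(', '[', '|', '*'] as $open) {
    echo '@', $open, 'i', closing_delimiter($open), "\n";
}
```

A lookup table keeps the rule to a single array access, so no scanning or pattern matching is needed to decide where a tag ends.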

I haven't performed timing comparisons but I took Christoph's advice for
speed and chose to use PHP's built-in functions, which are likely written in
C. I try to avoid calling them repeatedly so as to avoid call overhead. As a
result, the markup parsing works as follows. Feel free to criticise.

First, the page of marked-up text has htmlspecialchars() applied and then is
split on @ symbols using a single call to PHP's explode(). This creates an
array of strings which, for the sake of something to name them, I call
Sections. The PHP code is, in essence, as follows.

$contents = file_get_contents($target_page);
$contents = htmlspecialchars($contents, ENT_NOQUOTES);
$sects = explode("@", $contents);
$contents = ""; //Original text no longer needed

The first section, $sects[0], is what preceded the first @ sign. It is not
marked up so it is written verbatim and then split off using the following
code.

echo $sects[0];
$sects = array_slice($sects, 1);

Second, for each remaining section the initial character (which followed an
@ sign) is taken as the opening delimiter and the matching closing delimiter
is chosen. Then explode() is called with a limit of 2 to split the section
into just two parts: before and after the closing delimiter. The most
important part of that is

$sectparts = explode($delimiter, substr($sect, 1), 2);

This converts each section into two parts: a tag and some text.
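
As a small illustration of that split, here is a sketch using one hand-written section. The sample string is an assumption; the variable names follow the ones used above.

```php
<?php
// One section as produced by splitting the page on "@":
// the tag comes first, followed by ordinary text.
$sect = '(link,http://xe.com,XE) is a currency site';

$delimiter = ')';  // closing delimiter that matches the leading "("

// Drop the opening delimiter, then split once at the first closing delimiter.
$sectparts = explode($delimiter, substr($sect, 1), 2);

// $sectparts[0] is the tag body, $sectparts[1] is the text after the tag.
print_r($sectparts);
```

If the closing delimiter is missing, explode() returns a single element, so it is probably worth checking count($sectparts) before reading $sectparts[1].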

Third, so that tag parameters can include whatever is necessary, especially
where they include commas in quoted strings, I use PHP's CSV parsing
function as follows.

$tagparts = str_getcsv($sectparts[0]);

That divides the complete tag into manageable parts. All that's left is to
deal with each part as in

switch ($tagparts[0]) {
case "at": echo "@"; break;
case "b": echo "<b>"; break;
// ... and so on, one case per tag ...
}

Finally, once the tag has been written the following non-tag text is written
with

echo $sectparts[1];

That's it so far. I may have missed something fundamental but so far it
seems to work well. It is simple and flexible and the code is very short. No
need for a complex package. There are a few functions I would rather have
not had to use but PHP seems to require them. In any case, the code avoids
things which might slow it down such as large packages, char-by-char
processing (except, presumably, in the CSV module) and regular expressions.
So it should be fast as it stands.

James
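
For readers who want to try the approach, the steps described in this post can be pulled together roughly as below. This is only a sketch under stated assumptions, not the poster's actual code: the function names (closing_delimiter, render_tag, convert_markup), the choice to drop unknown tags, the handling of a missing closing delimiter, and the extra quote escaping in link targets are all illustrative additions. It returns a string instead of echoing so that it is easy to test.

```php
<?php
// Map an opening delimiter to its closing delimiter (bracket list assumed).
function closing_delimiter(string $open): string
{
    $pairs = ['(' => ')', '[' => ']', '{' => '}'];
    return $pairs[$open] ?? $open;
}

// Translate one parsed tag into HTML. Only a handful of tags are handled.
function render_tag(array $t): string
{
    // Heading level, clamped to the range HTML allows.
    $level = max(1, min(6, (int) ($t[1] ?? 1)));
    switch ($t[0]) {
        case 'at':   return '@';
        case 'hr':   return '<hr>';
        case 'nl':   return '<br>';
        case 'b':    return '<b>';
        case '/b':   return '</b>';
        case 'h':    return "<h$level>";
        case '/h':   return "</h$level>";
        case 'sect': return "<h$level>" . ($t[2] ?? '') . "</h$level>";
        case 'link':
            $href  = $t[1] ?? '';
            $label = $t[2] ?? $href;
            // ENT_NOQUOTES left double quotes alone, so escape them here.
            $href  = str_replace('"', '&quot;', $href);
            return '<a href="' . $href . '">' . $label . '</a>';
        default:     return '';   // unknown tags are dropped (an assumption)
    }
}

// Convert a page of marked-up text to HTML.
function convert_markup(string $source): string
{
    $source = htmlspecialchars($source, ENT_NOQUOTES);
    $sects  = explode('@', $source);
    $html   = array_shift($sects);        // text before the first tag
    foreach ($sects as $sect) {
        if ($sect === '') {               // "@" with nothing after it
            continue;
        }
        $close     = closing_delimiter($sect[0]);
        $sectparts = explode($close, substr($sect, 1), 2);
        if (count($sectparts) < 2) {      // no closing delimiter: keep as text
            $html .= '@' . $sect;
            continue;
        }
        $tagparts = str_getcsv($sectparts[0], ',', '"', '\\');
        $html    .= render_tag($tagparts) . $sectparts[1];
    }
    return $html;
}

echo convert_markup('Please @(b)STOP@(/b) here'), "\n";
```

Note that link targets are not checked here, so a real version would probably also want to restrict the URL schemes it accepts.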
Re: PHP functions to convert markup efficiently [message #183879 is a reply to message #183876] Sat, 23 November 2013 16:24
Richard Yates

On Sat, 23 Nov 2013 16:38:45 -0000, "James Harris"
<james(dot)harris(dot)1(at)gmail(dot)com> wrote:

> "Arno Welzel" <usenet(at)arnowelzel(dot)de> wrote in message
> news:528F7B7D(dot)4030701(at)arnowelzel(dot)de...
>> Am 22.11.2013 13:48, schrieb James Harris:
>>
>>> Having looked at Markdown and Smarty that people suggested it might be
>>> better if I bite the bullet and develop my own markup language. It would
>>> then be easier both to limit what it allows and also to add features as
>>> needed. I have been running some syntax tests and it seems quite easy to
>>> do
>>> and fast to process even though the markup is currently a little
>>> cumbersome
>>> to write.
>>
>> I think this is not a good idea.
>>
>> Existing markup languages are well documented and there are existing
>> implementations to parse the markup and convert it to HTML (or other
>> formats) which are not only used by one person - so it is likely that
>> bugs will be fixed as well within a reasonable time.
>
> Noted.
>
>> Besides Markdown and Smarty you can also try DokuWiki - its markup can
>> be extended using syntax plugins (and you can create your own plugins as
>> well to handle block level or inline elements etc.).
>
> I took a look at DokuWiki (having previously looked at the others). It's
> good but does things I don't want. At first glance I couldn't see a way to
> take them away and to restrict the formatting it accepts. I'm not sure I
> want to invest the time needed to learn any of those packages when, at least
> for now, it seems that native PHP is easy enough to use and is far more
> flexible.
>
>>>> What kind of markup is needed?
>>>
>>> Quite a lot. Things like these:
>>> * bold, italic
>>> * links to local pages and remote URLs
>>> * images, code
>>> * headers, lists, line breaks
>>> * tables
>>> * other things will likely be required but are not defined yet
>>
>> So - if HTML is too complicated you should really try DokuWiki first.
>>
>> And if even this syntax is too complicated you should not use a markup
>> language at all but a WYSIWYG editor which produces valid HTML (for
>> example TinyMCE or CKEditor).
>
> I am not trying to avoid complexity. Using PHP to convert markup to HTML
> allows me to do things like these:
> * restrict the elements that can be used (for security)
> * add features such as a server-side TOC
> * pull in data from various sources
> * choose where to place elements such as footnotes
> * make each page of the site a consistent structure
>
> Basically, the combination of HTML, CSS, PHP and my own markup codes seems
> ideal. Aside from having to devise the coding the rest is completely
> standard and incredibly lightweight. As such, there will be no packages and
> associated bugfixes to install and it should be very fast.
>
> I'll keep in mind that there are prebuilt options, though, in case I run
> into difficulties as I work on this.

Can you use HTML codes, plus any markup you invent, but sanitize the
input by stripping any HTML or other tags that you do not want or that
could be a risk?

I have a page where users can enter raw MySQL queries to generate
reports. The first thing that happens to the input is a check that only
SELECT queries are processed (plus a lot of other safeguards). I also
devised a 'COPY from table where index=x' command that allows copying
one record easily. So, the page uses a limited form of a standard
markup, supplemented with extras, and is completely safe.

Seems you could do the same with HTML.
Re: PHP functions to convert markup efficiently [message #183880 is a reply to message #183879] Sat, 23 November 2013 17:07
James Harris

"Richard Yates" <richard(at)yatesguitar(dot)com> wrote in message
news:co6299tplsjmmmr0uovsldejl3431363l5(at)4ax(dot)com...

....

>> I am not trying to avoid complexity. Using PHP to convert markup to HTML
>> allows me to do things like these:
>> * restrict the elements that can be used (for security)
>> * add features such as a server-side TOC
>> * pull in data from various sources
>> * choose where to place elements such as footnotes
>> * make each page of the site a consistent structure
>>
>> Basically, the combination of HTML, CSS, PHP and my own markup codes
>> seems
>> ideal. Aside from having to devise the coding the rest is completely
>> standard and incredibly lightweight. As such, there will be no packages
>> and
>> associated bugfixes to install and it should be very fast.
>>
>> I'll keep in mind that there are prebuilt options, though, in case I run
>> into difficulties as I work on this.
>
> Can you use HTML codes, plus any markup you invent, but sanitize the
> input by stripping any HTML or other tags that you do not want or that
> could be a risk?

Theoretically yes but that would be hard to do and much slower. Consider
that if you see <p> on a page you don't know whether it is an HTML paragraph
tag or not unless you know its context. It might be part of a C++ program,
for example, as in

f<p>();

or it could be just an insignificant piece of text that should appear as
written. The only way to tell for sure is to parse the file from the top and
recognise every element that precedes it. That would be a lot of work.
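
By contrast, if the source is treated as plain text plus separate markup codes, every < can be escaped unconditionally and no context is needed. A minimal sketch using htmlspecialchars(); the sample string is made up:

```php
<?php
$source = 'Call f<p>(); and the <p> stays as text.';

// htmlspecialchars() turns <, >, & and quotes into entities, so the
// browser displays them literally instead of treating them as tags.
$escaped = htmlspecialchars($source, ENT_QUOTES, 'UTF-8');

echo $escaped, "\n";
// Call f&lt;p&gt;(); and the &lt;p&gt; stays as text.
```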

> I have a page where users can enter raw MySQL queries to generate
> reports. The first thing that happens to input is to check that only
> SELECT queries are processed (plus a lot of other safeguards). I also
> devised a 'COPY from table where index=x' command that allows copying
> one record easily. So, the page uses a limited form of a standard
> markup, supplemented with extras, and is completely safe.
>
> Seems you could do the same with HTML.

It is possible but would require lots of parsing code. By contrast,
converting markup to HTML can be made much easier. FWIW, I found I could do
something that works much more simply and wrote it up in a post made just a
few hours ago.
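
To illustrate the general shape (this is not the code from that post, and the markup codes are invented for the example), a line-by-line converter can escape everything first and then apply a few simple rules:

```php
<?php
// Hypothetical markup, invented for this sketch:
//   "= Title" at the start of a line -> <h2>Title</h2>
//   "*word*"                         -> <em>word</em>
//   any other non-empty line         -> <p>line</p>
function markup_to_html(string $text): string
{
    $out = [];
    foreach (explode("\n", $text) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines
        }
        // Escape first, so any HTML in the source is shown as plain text.
        $line = htmlspecialchars($line, ENT_QUOTES, 'UTF-8');
        // Inline emphasis: text between a pair of asterisks.
        $line = preg_replace('/\*([^*]+)\*/', '<em>$1</em>', $line);
        if (strncmp($line, '= ', 2) === 0) {
            $out[] = '<h2>' . substr($line, 2) . '</h2>';
        } else {
            $out[] = '<p>' . $line . '</p>';
        }
    }
    return implode("\n", $out);
}

echo markup_to_html("= Intro\nThis is *fast* and <b>safe</b>."), "\n";
// <h2>Intro</h2>
// <p>This is <em>fast</em> and &lt;b&gt;safe&lt;/b&gt;.</p>
```

Because only the converter ever emits tags, the set of HTML elements that can reach the page is limited to the ones written in the function.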

James
Re: PHP functions to convert markup efficiently [message #183884 is a reply to message #183876] Sun, 24 November 2013 11:09
Arno Welzel
James Harris, 2013-11-23 17:38:

> "Arno Welzel" <usenet(at)arnowelzel(dot)de> wrote in message
[...]
>> Besides Markdown and Smarty you can also try DokuWiki - its markup can
>> be extended using syntax plugins (and you can create your own plugins as
>> well to handle block level or inline elements etc.).
>
> I took a look at DokuWiki (having previously looked at the others). It's
> good but does things I don't want. At first glance I couldn't see a way to

What exactly?

> take them away and to restrict the formatting it accepts. I'm not sure I

You can. But you have to write a plugin to filter the markup of course.

> want to invest the time needed to learn any of those packages when, at least
> for now, it seems that native PHP is easy enough to use and is far more
> flexible.

Your mileage may vary.

Using just PHP itself and creating all the scripts on your own is of
course the most flexible way - but I doubt if it is really "easy" if you
want to create code which you still understand 2 years later ;-)


--
Arno Welzel
http://arnowelzel.de
http://de-rec-fahrrad.de
Re: PHP functions to convert markup efficiently [message #183894 is a reply to message #183884] Sun, 24 November 2013 13:11
James Harris
"Arno Welzel" <usenet(at)arnowelzel(dot)de> wrote in message
news:529224C1(dot)4060400(at)arnowelzel(dot)de...
> James Harris, 2013-11-23 17:38:
>
>> "Arno Welzel" <usenet(at)arnowelzel(dot)de> wrote in message
> [...]
>>> Besides Markdown and Smarty you can also try DokuWiki - its markup can
>>> be extended using syntax plugins (and you can create your own plugins as
>>> well to handle block level or inline elements etc.).
>>
>> I took a look at DokuWiki (having previously looked at the others). It's
>> good but does things I don't want. At first glance I couldn't see a way
>> to
>
> What exactly?

What exactly would I want to prevent it doing? Unfortunately, most of it!
Things like:
* automatic conversion of URLs which are not marked up
* automatic lowering of the case of page names
* interwiki links
* windows shares links
* ALL of its smileys
* its Usenet-style quoting
* most table cell merging
* embeddable HTML and PHP

Sorry, but you did ask! It's not all bad. FWIW there are some things I like
about it.

>> take them away and to restrict the formatting it accepts. I'm not sure I
>
> You can. But you have to write a plugin to filter the markup of course.
>
>> want to invest the time needed to learn any of those packages when, at
>> least
>> for now, it seems that native PHP is easy enough to use and is far more
>> flexible.
>
> Your mileage may vary.
>
> Using just PHP itself and creating all the scripts on your own is of
> course the most flexible way - but I doubt if it is really "easy" if you
> want to create code which you still understand 2 years later ;-)

That is possible but the code so far is tiny. Partly because it has been
easy so far I am more concerned that I may have failed to consider something
fundamental!

James
Re: PHP functions to convert markup efficiently [message #183900 is a reply to message #183894] Sun, 24 November 2013 17:18
Arno Welzel
James Harris, 2013-11-24 19:11:

> "Arno Welzel" <usenet(at)arnowelzel(dot)de> wrote in message
> news:529224C1(dot)4060400(at)arnowelzel(dot)de...
>> James Harris, 2013-11-23 17:38:
>>
>>> "Arno Welzel" <usenet(at)arnowelzel(dot)de> wrote in message
>> [...]
>>>> Besides Markdown and Smarty you can also try DokuWiki - its markup can
>>>> be extended using syntax plugins (and you can create your own plugins as
>>>> well to handle block level or inline elements etc.).
>>>
>>> I took a look at DokuWiki (having previously looked at the others). It's
>>> good but does things I don't want. At first glance I couldn't see a way
>>> to
>>
>> What exactly?
>
> What exactly would I want to prevent it doing? Unfortunately, most of it!
> Things like:
> * automatic conversion of URLs which are not marked up

This can at least be avoided "manually" by embedding a URL within %% -
e.g. %%http://something.example%% would not be converted to a link.

> * automatic lowering of the case of page names

This can be changed in the code - according to
<https://forum.dokuwiki.org/thread/5417>:

In function cleanID(), located in inc/pageutils.php just comment out the
line with

$id = utf8_strtolower($id);

> * interwiki links

If you don't need them, remove all of them in conf/interwiki.conf.

> * windows shares links
> * ALL of its smileys

This is also a configuration option and can be turned off.

> * its Usenet-style quoting

I'm not sure where to turn this off - but it should be possible.

> * most table cell merging

You don't like the way it works? Or do you dislike the possibility of
merging table cells at all?

> * embeddable HTML and PHP

Which is also a configuration option and turned off by default anyway.



--
Arno Welzel
http://arnowelzel.de
http://de-rec-fahrrad.de
Re: PHP functions to convert markup efficiently [message #183901 is a reply to message #183900] Sun, 24 November 2013 18:44
James Harris
"Arno Welzel" <usenet(at)arnowelzel(dot)de> wrote in message
news:52927B39(dot)3020407(at)arnowelzel(dot)de...

....

>>>> > Besides Markdown and Smarty you can also try DokuWiki - its markup
>>>> > can

....

>> What exactly would I want to prevent it doing? Unfortunately, most of
>> it!
>> Things like:

....

>> * most table cell merging
>
> You don't like the way it works? Or do you dislike the possibility of
> merging table cells at all?

There may be a need to make merging impossible. I have in mind making tables
sortable (by means of client-side JavaScript). That would not work with
merged cells.
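
One way to make merging impossible is to have the converter reject any row whose cell count differs from the header row. A minimal sketch, assuming a made-up table markup with one row per line and cells separated by "|":

```php
<?php
// Hypothetical table markup: one row per line, cells separated by "|".
// Every row must have as many cells as the first, so no cell can span.
function table_to_html(string $text): string
{
    $rows = [];
    foreach (explode("\n", trim($text)) as $line) {
        $rows[] = array_map('trim', explode('|', $line));
    }
    $width = count($rows[0]);
    $html = "<table>\n";
    foreach ($rows as $i => $cells) {
        if (count($cells) !== $width) {
            throw new InvalidArgumentException(
                'Row ' . ($i + 1) . ' has the wrong number of cells'
            );
        }
        $tag = $i === 0 ? 'th' : 'td'; // the first row is the header
        $html .= '<tr>';
        foreach ($cells as $cell) {
            $html .= "<$tag>" . htmlspecialchars($cell, ENT_QUOTES, 'UTF-8') . "</$tag>";
        }
        $html .= "</tr>\n";
    }
    return $html . '</table>';
}

echo table_to_html("Name | Age\nAnn | 31\nBob | 27"), "\n";
```

Since the function never emits colspan or rowspan, every generated table is a plain grid that a client-side sorting script can reorder row by row.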

Thank you for going to the trouble of making suggestions (snipped) for the
points I raised.

James
Re: PHP functions to convert markup efficiently [message #183912 is a reply to message #183901] Mon, 25 November 2013 02:11
Arno Welzel
James Harris, 2013-11-25 00:44:

> "Arno Welzel" <usenet(at)arnowelzel(dot)de> wrote in message
> news:52927B39(dot)3020407(at)arnowelzel(dot)de...
>
> ...
>
>>>> >> Besides Markdown and Smarty you can also try DokuWiki - its markup
>>>> >> can
>
> ...
>
>>> What exactly would I want to prevent it doing? Unfortunately, most of
>>> it!
>>> Things like:
>
> ...
>
>>> * most table cell merging
>>
>> You don't like the way it works? Or do you dislike the possibility of
>> merging table cells at all?
>
> There may be a need to make merging impossible. I have in mind making tables
> sortable (by means of client-side JavaScript). That would not work with
> merged cells.

JFTR: This plugin allows sortable tables in DokuWiki using JavaScript:

<https://www.dokuwiki.org/plugin:sortablejs>

More detailed explanation with live samples:

<http://docs.oseems.com/general/web/dokuwiki/sort-table>


--
Arno Welzel
http://arnowelzel.de
http://de-rec-fahrrad.de