Rich-text formatting in PHP: HTML, Markdown, rich-text editors like TinyMCE and doing it securely

Captions
In an HTML form, a textarea element is used to collect a sizeable amount of text. This could be the content of a blog post, a product description, a comment and so on. You can enter more text than in a regular text input, on several lines, but it is still just plain text. By default there's no way to add any formats or styles. So how do we allow the user of this form to add formatting to this content? Let's look at a few options.

First, let's look at the code for the page we just saw. This is very simple code: the start of a regular HTML document, and inside that a form that uses the POST method. As there is no action attribute, the form will be submitted to itself. Inside the form we have a textarea control and a submit button. Below the form we check for the form being submitted, and if so we just print out the contents of the textarea. So any text we put in the textarea will be displayed back to us when we submit the form.

So one way to allow formatting would be to simply allow the user to insert some HTML. For example, I can surround a word with b tags and by default this will show the word in bold. There is a problem with this though, as this makes the site vulnerable to code injection. At best, someone could accidentally insert some formatting code that interferes with the design of the rest of the site. At worst, though, someone could intentionally insert some code and do all sorts of damage. There are ways to prevent this, which we'll look at shortly. The big problem with this from a user interface point of view is that it requires the user to know HTML. Plus, even if they do, anything other than simple formatting makes this difficult to read and edit. So simply allowing HTML isn't a good solution.

Another option is to use a plain-text markup language like Markdown. With Markdown you add simple formatting instructions to the text. For example, to add emphasis to a word you surround it with asterisks. To display this with actual emphasis in the browser, a Markdown parser processes
this text and converts these instructions into HTML. Let's have a look at an example. First we need a Markdown parser, of which there are many. Let's install this one, called Parsedown. We'll do this on the command line using Composer.

To use this, first let's add some PHP code before the start of the HTML. Then we'll require Composer's autoloader so that the Parsedown class will be loaded automatically. Then we'll create a new object of the Parsedown class. In addition to the original content, let's display the output from the Markdown parser. So let's add another div element, and in here, to convert the Markdown content into HTML, we call the text method on the parser object, passing in the content from the form.

Let's give that a try. If I enter some text and surround some with double asterisks (this is Markdown syntax for strong emphasis), when I submit the form that word is shown in bold. If we view the source, we can see that the double asterisks have been converted to strong tags. So if you want to keep the form simple and only want to allow a few basic formatting tags, using Markdown like this could be a good solution. Note this also works without requiring JavaScript in the browser. It does require the user to know Markdown, although it is very simple and the syntax guide could easily be shown on the help page. Also note that this solution as shown still does allow HTML to be inserted, but we'll cover how to remove HTML tags in a moment.

A third option is to use a rich-text editor, sometimes known as a WYSIWYG editor, which stands for "what you see is what you get". Using an editor like this allows you to apply styles and see what they look like straight away, just like using a word processor. There are many of these available, some free, some paid. All of them require JavaScript to be enabled in the browser. TinyMCE is one of the most popular; it is open source and the basic version is totally free, so we'll use this one. To use it we simply need to include these two lines in our HTML. Most
HTML rich-text editors work like this. So let's copy these and paste them in the head section. Before we see how this looks, let's just remove the line where we're displaying the output from the Markdown parser. In the browser the editor looks like this, just like a word processor.

In addition to specifying the selector setting, let's add a few more settings to make it simpler, just to demonstrate how this works. We'll turn the menu bar off, load the code plugin, which allows us to view the source code, and specify the toolbar as just having the bold, italic and code buttons. Now in the browser it's much simpler. We can add text and format it, and what is shown in the editor is what will be submitted. The source code button shows us the HTML that's been generated by the editor. We can see that the bold button adds strong tags around the selected text. If I submit this, the submitted code looks just like what we saw in the editor content window. If we view the source, there's the exact same code we just saw in the editor.

The problem with this, as we saw earlier, is that if we allow any HTML to be submitted then someone could inject some unwanted code into our site. For example, let's say you only want users to be able to format text as bold and italic, like you might do in a comment section of an article. Let's enter some text and make this word bold and this one italic. However, if we edit the source we can add any markup we like, for example h1 tags around this word. If I submit this, this is what is sent to our script and what we're displaying under the editor.

And just in case you're thinking we could simply remove the source code button: I included this just to keep it simple. The fact is, the code that's processing the form when it's submitted simply takes the value of the textarea and displays it. Even if the source code button wasn't there, someone could in theory submit whatever HTML they like to our script and potentially inject any HTML they want. To fix this we need to process the
submitted HTML to remove any unwanted code before we display it. One way to do this is to use the PHP strip_tags function. This will remove all HTML tags from a string, but we can pass it a list of tags we want to allow. So in the body of the HTML where we're displaying the submitted content, let's wrap this in a call to strip_tags, and for the second argument pass in an array containing strong and em as the tags we'll allow.

In the browser let's try that again. We'll make this word bold, this one italic, and edit the source to add an h1 tag around this one. Now when we submit it, the strong and the em tags have been allowed but the h1 tag has been removed.

However, there is a limitation with this. The strip_tags function can only be used to remove unwanted tags. So, continuing with the example of a comment section, if we add some text and make it bold, then edit the source, we can add an attribute to this tag, for example a style attribute. When we submit this, the attribute is part of a tag that we allow, so it isn't removed. So the strip_tags function has its limitations.

What we need is something that will also strip unwanted attributes. One such package that does this is HTML Purifier. This is free and open source. The easiest way to install this is using Composer, using this command on the command line. We'll paste and run that command.

To use HTML Purifier, first we need to create a configuration object, which we'll do with the createDefault method on the config class. Then we set configuration values using the set method on that object. First let's turn the cache off, which makes it slightly slower but simpler for the purposes of this demo. Then we'll set the allowed HTML elements to strong and em. Then we'll set the allowed HTML attributes to an empty array, which means none are allowed. Then we create an object of the HTML Purifier class, passing in the config object. Then in the HTML let's remove the call to strip_tags and instead call the purify method on the purifier object. Let's give that a
try. We'll enter some text as before, make it bold and then edit it to add a style attribute. Let's also add an h1 element too, so we can check this tag is removed. When we submit this, now the style attribute has been removed from the strong element and the h1 element has been removed too.

So if you want to allow a user to submit formatted text, you have various options, from allowing HTML, to Markdown, to a rich-text editor like this. If you do allow HTML, just make sure that when you display it you sanitize it to remove any unwanted tags and attributes.
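
As a rough illustration of the strip_tags limitation described in the captions, here is a minimal sketch. It is not the code shown in the video: the $submitted string is a made-up stand-in for the posted textarea value, and the array form of the second argument assumes PHP 7.4 or later.

```php
<?php
// Hypothetical submitted content, standing in for the posted textarea value.
$submitted = '<h1>Big</h1> <strong style="color:red">bold</strong> and <em>italic</em>';

// Allow only strong and em tags (array form of the second argument: PHP 7.4+).
$filtered = strip_tags($submitted, ['strong', 'em']);

// The h1 tags are removed, but the style attribute on the allowed
// strong tag is left in place.
echo $filtered, PHP_EOL;
// Prints: Big <strong style="color:red">bold</strong> and <em>italic</em>
```

The surviving style attribute is the reason the captions move on to a sanitizer that can filter attributes as well as tags.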
Info
Channel: Dave Hollingworth
Views: 10,153
Id: Udgi43MG0a4
Length: 16min 55sec (1015 seconds)
Published: Thu Dec 03 2020