Posted By

pitje on 09/02/07


Tagged

tags close



close tags in a html-snippet


 / Published in: PHP
 

Suppose you have some HTML-formatted text and want to show only its first 45 characters. This function closes any tags that are left unclosed by the truncation.

Note that the characters of the tags themselves also count toward the 45-character limit!
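To see how much of the 45-character budget the markup itself uses up, you can compare the raw cut with its visible text. A minimal sketch, reusing the sample string from the snippet; the variable names $cut and $visible are only for illustration:

```php
<?php
// Sample markup taken from the snippet's own example.
$str = "<div>This is some interesting <strong><em>content!</em> And this</strong> line is <em>"
     . "abundantly</em> formatted</div>";

// Cut at 45 raw characters: the tags are counted too.
$cut = substr($str, 0, 45);

// strip_tags() leaves only the text a reader would actually see.
$visible = strip_tags($cut);

echo $cut, "\n";              // <div>This is some interesting <strong><em>con
echo strlen($visible), "\n";  // 28
```

So of the 45 characters, only 28 are visible text; the rest went to the three opening tags.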

<?php

function closetags ( $html )
{
    # put all opened tags into an array
    preg_match_all ( "#<([a-z]+)( .*)?(?!/)>#iU", $html, $result );
    $openedtags = $result[1];

    # put all closed tags into an array
    preg_match_all ( "#</([a-z]+)>#iU", $html, $result );
    $closedtags = $result[1];
    $len_opened = count ( $openedtags );
    # all tags are closed
    if ( count ( $closedtags ) == $len_opened )
    {
        return $html;
    }
    $openedtags = array_reverse ( $openedtags );
    # close tags
    for ( $i = 0; $i < $len_opened; $i++ )
    {
        if ( !in_array ( $openedtags[$i], $closedtags ) )
        {
            $html .= "</" . $openedtags[$i] . ">";
        }
        else
        {
            unset ( $closedtags[array_search ( $openedtags[$i], $closedtags )] );
        }
    }
    return $html;
}

$str = "<div>This is some interesting <strong><em>content!</em> And this</strong> line is <em>";
$str .= "abundantly</em> formatted</div>";

$snippet = substr ( $str, 0, 45 );

# if the cut landed inside a tag, drop that partial tag before adding the dots
$snippet = strrpos ( $snippet, "<" ) > strrpos ( $snippet, ">" )
    ? rtrim ( substr ( $str, 0, strrpos ( $snippet, "<" ) ) ) . "....."
    : rtrim ( $snippet ) . ".....";

$x = closetags ( $snippet );

?>


Comments

Posted By: drb292 on December 4, 2008

This is a very handy script, though one issue is that it closes br tags and img tags. I've modded it with str_replace and it works very well.

Posted By: pitje on March 2, 2009

Yeah, it doesn't take into account any self-closing tags (like br, img and hr).

Posted By: ellisgl on May 6, 2009

Here's a fix for the "H" tags.

# put all opened tags into an array

preg_match_all('##iU', $html, $result);

# put all closed tags into an array

preg_match_all('##iU', $html, $result);
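The original pattern [a-z]+ stops at the first digit, so a tag like h1 is never captured. One way to cover heading tags, offered as an assumption rather than as ellisgl's exact fix, is to allow digits after the first letter of the tag name. The variable names $open_pattern and $close_pattern are only for illustration:

```php
<?php
// Assumed patterns: same shape as the original ones, but the tag name
// may contain digits after its first letter (h1 ... h6).
$open_pattern  = '#<([a-z][a-z0-9]*)( .*)?(?!/)>#iU';
$close_pattern = '#</([a-z][a-z0-9]*)>#iU';

$html = '<h2>Title <em>text';

// put all opened tags into an array
preg_match_all($open_pattern, $html, $result);
$openedtags = $result[1];

// put all closed tags into an array
preg_match_all($close_pattern, $html, $result);
$closedtags = $result[1];

print_r($openedtags);   // h2, em
print_r($closedtags);   // (empty)
```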

Posted By: endiku on July 22, 2009

Fantastic bit of work. I was halfway through my own project when I found this gem, and it's perfect.

To fix the BR, HR, IMG issue you can just use this extra line after the array_reverse

$openedtags = array_diff($openedtags, array("img", "hr", "br"));

That will remove those tags from the close creation.

Posted By: endiku on July 22, 2009

Correction. Add these lines

$openedtags = array_diff($openedtags, array("img", "hr", "br"));
$openedtags = array_values($openedtags);

after

$openedtags = $result[1];

That will re-index your array. The first way I mentioned will leave you with empty closing tags.
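The re-indexing matters because array_diff() keeps the original keys, while the closing loop counts from 0 upward and would hit the gaps. A small sketch with made-up tag names; $filtered and $reindexed are illustrative variables, not part of the original function:

```php
<?php
// Tag names as preg_match_all() would capture them, in document order.
$openedtags = array("div", "img", "p", "br", "em");

// array_diff() drops the single tags but keeps the original keys (0, 2, 4).
$filtered = array_diff($openedtags, array("img", "hr", "br"));

// array_values() re-indexes to 0, 1, 2 so a counting for-loop finds every entry.
$reindexed = array_values($filtered);

print_r(array_keys($filtered));   // 0, 2, 4
print_r($reindexed);              // div, p, em
```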

Posted By: webanol on September 14, 2009

I found another shortcoming on this script - if you truncated the html before sending it to this function, there's a decent chance you'll end up with a half-tag at the end, like:

<span

at the end instead of: <span>

To make this function account for that, put the following lines at the very beginning:

    # Strip any mangled tags off the end
    $html = preg_replace("#<[^>]*$#", " ", $html);

This goes before: preg_match_all ( "#<([a-z]+)( .*)?(?!/)>#iU", $html, $result );

This new code strips out the broken tag completely - there would be no intelligent way to restore it, as we wouldn't know what important info was lost with the truncation.

Great function, though! It was just what I needed for a project I'm working on currently.
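A small sketch of that cleanup step on a sample string. The pattern here is an assumption reconstructed from the description (a "<" with no closing ">" before the end of the string), and $clean is just an illustrative name:

```php
<?php
// Truncated markup that ends in the middle of a tag.
$html = '<p>Some <strong>bold</strong> text <span class="no';

// Assumed pattern: the last "<" that never reaches a ">" gets replaced by a space.
$clean = preg_replace('#<[^>]*$#', ' ', $html);

echo $clean;  // <p>Some <strong>bold</strong> text  (followed by two spaces)
```

Complete tags earlier in the string are left alone, because each of them reaches its ">" before the end of the string.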

Posted By: pitje on December 11, 2009

great, thanks! :)

Posted By: berto on March 19, 2010

Here's an amendment to the regular expression to make the thing work really smoothly:

function closetags($html) {
    preg_match_all('##iU', $html, $result);
    $closedtags = $result[1];
    $len_opened = count($openedtags);
    if (count($closedtags) == $len_opened) {
        return $html;
    }
    $openedtags = array_reverse($openedtags);
    for ($i=0; $i < $len_opened; $i++) {
        if (!in_array($openedtags[$i], $closedtags)) {
            $html .= '';
        } else {
            unset($closedtags[array_search($openedtags[$i], $closedtags)]);
        }
    }
    return $html;
}

You can see that the single tags (img, br, meta...) are excluded from the expression. You could add more by putting them in (separated by a vertical bar).

I've tested this version and it works like a charm. :)
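The expression itself is not readable in the post, so the following is only a guess at the idea: a negative lookahead that keeps the listed single tags out of the opened-tag matches. The tag list, the \b boundary and the name $pattern are all assumptions:

```php
<?php
// Assumed pattern: skip single tags via a negative lookahead; add more names
// to the group, separated by "|". The \b keeps "br" from also excluding
// longer names that merely start with "br".
$pattern = '#<(?!(?:img|br|hr|meta|input|link)\b)([a-z][a-z0-9]*)( .*)?(?!/)>#iU';

$html = '<div><img src="a.png"><br><abbr title="x">text';

preg_match_all($pattern, $html, $result);
$openedtags = $result[1];

print_r($openedtags);  // div, abbr
```

With this approach the single tags never enter $openedtags, so no array_diff() step is needed afterwards.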


Posted By: berto on March 19, 2010

Okay, I'm going to try this again.

Here's the code:


function closetags($html) {
    preg_match_all('##iU', $html, $result);
    $closedtags = $result[1];
    $len_opened = count($openedtags);
    if (count($closedtags) == $len_opened) {
        return $html;
    }
    $openedtags = array_reverse($openedtags);
    for ($i=0; $i < $len_opened; $i++) {
        if (!in_array($openedtags[$i], $closedtags)) {
            $html .= '</' . $openedtags[$i] . '>';
        } else {
            unset($closedtags[array_search($openedtags[$i], $closedtags)]);
        }
    }
    return $html;
} 

(Don't use what I posted earlier; the blog stripped out some essential characters.)

Posted By: berto on March 19, 2010
*sigh*

Sorry, gang, I don't know how to include this code on this blog without it getting hammered.

Let's try this:

Posted By: berto on March 19, 2010

Gulp. I'm having massive trouble with this blog format.

See: http://codesnippets.joyent.com/posts/show/959

Posted By: pijulius on July 24, 2010

Thanks Pitje, I looked at a lot of code, but this one was the closest to what I had in mind. I noticed a few things though. For example, if you open an "a" tag and close a "b" tag, the number of opened and closed tags will be equal, and you end up without closing the "a" tag.

Also, if you close a tag before opening it, it will be skipped too, as the function thinks it's already closed. I have modified your function a bit and included it in my CMS (jCore) to close the tags when posting comments. The modified version is:

static function closeTags($html) {
    preg_match_all("##iU", $html, $result, PREG_OFFSET_CAPTURE);

    if (!isset($result[1]))
        return $html;

    $openedtags = $result[1];
    $len_opened = count($openedtags);

    if (!$len_opened)
        return $html;

    preg_match_all("##iU", $html, $result, PREG_OFFSET_CAPTURE);
    $closedtags = array();

    foreach($result[1] as $tag)
        $closedtags[$tag[1]] = $tag[0];

    $openedtags = array_reverse($openedtags);

    for($i = 0; $i < $len_opened; $i++) {
        // anchored so that e.g. "abbr" is not mistaken for "br"
        if (preg_match('/^(img|br|hr)$/i', $openedtags[$i][0]))
            continue;

        $found = array_search($openedtags[$i][0], $closedtags);

        if (!$found || $found < $openedtags[$i][1])
            $html .= "</" . $openedtags[$i][0] . ">";

        if ($found)
            unset($closedtags[$found]);
    }

    return $html;
}

As you can see, I dropped the length check entirely: if there are any open tags, it checks each one against the closed ones and fixes it if needed. I did some testing and it works just fine. I don't care about half tags, as I won't crop the messages. There may be a better or faster way to do it, but since it only runs when a comment is added to the db, it's OK for me.

Thanks again and if you can't see the above code it will be in jCore (0.6) source at: lib/sources/security.class.php line 575: static function closeTags($html)
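Both problems mentioned in this comment (equal counts hiding a mismatched pair, and a closing tag that comes before its opening tag) come from comparing tag names without regard to order. A different technique, which is not the jCore code, is to walk the tags in document order and keep a stack of the ones still open. A minimal sketch; the function name closeTagsWithStack, the pattern and the single-tag list are all assumptions for illustration:

```php
<?php
// Hypothetical helper, not part of jCore or the original snippet.
function closeTagsWithStack($html)
{
    // Single tags never get a closing tag (assumed, non-exhaustive list).
    $void = array('img', 'br', 'hr', 'meta', 'input', 'link');

    // Capture an optional leading "/" and the tag name, in document order.
    preg_match_all('#<(/?)([a-z][a-z0-9]*)\b[^>]*>#i', $html, $matches, PREG_SET_ORDER);

    $stack = array();
    foreach ($matches as $m) {
        $name = strtolower($m[2]);
        if (in_array($name, $void)) {
            continue;
        }
        if ($m[1] === '') {
            // Opening tag: remember it.
            $stack[] = $name;
        } else {
            // Closing tag: find the innermost open tag with that name and
            // drop it, together with anything opened inside it.
            // A stray closing tag (nothing open with that name) is ignored.
            $pos = array_search($name, array_reverse($stack, true), true);
            if ($pos !== false) {
                array_splice($stack, $pos);
            }
        }
    }

    // Close whatever is still open, innermost first.
    foreach (array_reverse($stack) as $name) {
        $html .= '</' . $name . '>';
    }
    return $html;
}

echo closeTagsWithStack('<a href="#">link</b>'), "\n";  // <a href="#">link</b></a>
echo closeTagsWithStack('</em><div><em>text'), "\n";     // </em><div><em>text</em></div>
```

Like the versions above, this does not repair half tags at the end of the string or remove stray closing tags; it only appends the closing tags that are missing.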
