Posted By

math89 on 08/13/10


Tagged

html text



Convert html source to full text


 / Published in: PHP
 

URL: http://www.phpsnippets.info/convert-html-source-to-full-text

Turn an HTML source into a plain-text document by removing all HTML tags, scripts, style blocks, PHP blocks, and comments.

    function html2txt($document) {
        // preg_replace() applies these patterns in order, so comments and
        // CDATA sections are removed before the generic tag pattern runs.
        $search = array(
            '@<script[^>]*?>.*?</script>@si', // Strip out JavaScript
            '@<style[^>]*?>.*?</style>@si',   // Strip style blocks (no U flag: with U, ".*?" turns greedy)
            '@<[?]php[^>].*?[?]>@si',         // Strip PHP blocks
            '@<[?][^>].*?[?]>@si',            // Strip short PHP tags and processing instructions
            '@<!--[\s\S]*?--[ \t\n\r]*>@',    // Strip multi-line comments
            '@<!\[CDATA\[[\s\S]*?\]\]>@',     // Strip CDATA sections
            '@<[\/\!]*?[^<>]*?>@si'           // Strip out remaining HTML tags
        );
        return preg_replace($search, '', $document);
    }

    // Usage
    $html_source = file_get_contents('http://www.phpsnippets.info');
    $txt = html2txt($html_source);
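
The text left after stripping tags usually still contains HTML entities and stray whitespace. The following is a minimal sketch of one way to tidy that up, not part of the original snippet: `html_to_plain_text()` and `$sample` are hypothetical names, the pattern list is a shortened variant of the one in `html2txt()`, and markup is replaced with a space instead of an empty string (an assumption made here so that words in adjacent elements do not run together).

```php
<?php
// Hypothetical helper, not part of the original snippet.
function html_to_plain_text(string $html): string
{
    $patterns = array(
        '@<script[^>]*?>.*?</script>@si', // script blocks
        '@<style[^>]*?>.*?</style>@si',   // style blocks
        '@<!--.*?-->@s',                  // comments
        '@<[^<>]*?>@s',                   // any remaining tag
    );

    // Replace markup with a space so neighbouring words stay separated.
    $text = preg_replace($patterns, ' ', $html);

    // Decode entities such as &amp; into the characters they stand for.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of whitespace (including non-breaking spaces) into one space.
    $text = preg_replace('/[\s\x{00A0}]+/u', ' ', $text);

    return trim($text);
}

$sample = '<html><head><style>p { color: red; }</style></head>'
        . '<body><!-- note --><p>Fish &amp; chips</p>'
        . '<script>alert("x");</script><p>Done</p></body></html>';

echo html_to_plain_text($sample), "\n"; // prints: Fish & chips Done
```

Regular expressions cope with typical pages but can misfire on malformed or unusual markup, so treat the result as a best-effort text extraction.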
