Posted By

bingjian on 09/24/09


Tagged

RegularExpression dblp


Most Prolific DBLP Authors


Published in: Python
 

URL: http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/prolific/index.html

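from urllib.request import urlopen  # Python 3; the original used `from urllib import urlopen` (Python 2)


def isNumber(text):
    # Not defined in the original snippet; assumed here to detect
    # link text that is a publication count rather than a name.
    try:
        float(text)
        return True
    except ValueError:
        return False


def most_prolific_authors():
    """
    Return the list of most prolific DBLP authors.
    http://www.informatik.uni-trier.de/~ley/db/about/prolific.html
    The number of publications listed in DBLP for an author is no indication of the quality or importance of her/his work.
    """
    url = 'http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/prolific/index.html'
    # The page encoding is an assumption; undecodable bytes are replaced.
    doc = urlopen(url).read().decode('utf-8', 'replace')
    # The author table sits between these two marker strings.
    start = doc.find('number of publications / names')
    end_of_cloud = doc.find('DBLP lists', start)
    authors = []
    while True:
        # Every table entry is a link whose opening tag ends with 'html">'.
        start = doc.find('html">', start)
        if start == -1 or start > end_of_cloud:
            return authors
        end = doc.find('</a>', start)
        if end == -1:  # unterminated link: stop rather than loop forever
            return authors
        name = doc[start + 6:end]  # 6 == len('html">')
        if not isNumber(name):  # skip the publication-count links
            authors.append(name)
        start = end + 14  # skip past '</a></td></tr>'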
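
The snippet is tagged RegularExpression, but the scan above uses only str.find. As a rough sketch of how the same extraction might look with the re module, the example below runs against a made-up HTML fragment rather than the live page. The names SAMPLE_HTML, LINK_TEXT and extract_authors are hypothetical, and the fragment only imitates the row layout the original code appears to assume (link text following `html">` and ending at `</a>`); the real DBLP markup may differ.

```python
import re

# Hypothetical stand-in for the downloaded page; the author names and
# counts are invented for illustration only.
SAMPLE_HTML = (
    'number of publications / names'
    '<tr><td><a href="p1.html">612</a></td></tr>'
    '<tr><td><a href="a1.html">Ada Example</a></td></tr>'
    '<tr><td><a href="p2.html">598</a></td></tr>'
    '<tr><td><a href="a2.html">Bob Placeholder</a></td></tr>'
    'DBLP lists 1000 authors'
    '<a href="other.html">Not An Author</a>'
)

# Capture the text between 'html">' and the next '</a>' (non-greedy).
LINK_TEXT = re.compile(r'html">(.*?)</a>', re.DOTALL)


def extract_authors(doc):
    """Return link texts between the two marker strings that are not counts."""
    start = doc.find('number of publications / names')
    end = doc.find('DBLP lists', start)
    if start == -1 or end == -1:
        return []  # markers missing: the page layout is not what we assumed
    cloud = doc[start:end]
    return [text for text in LINK_TEXT.findall(cloud)
            if not text.strip().isdigit()]


print(extract_authors(SAMPLE_HTML))
```

Slicing the document down to the region between the two markers before matching keeps links outside the table (such as the trailing one in the sample) out of the result, which mirrors the end_of_cloud check in the original loop.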
