Posted By

damarev on 04/23/07


UTF-8 functions: IsValidUTF8, DecodeUTF8, EncodeUTF8


Published in: ASP 


' Simple functions to convert the first 256 characters
' of the Windows character set from and to UTF-8.

' Written by Hans Kalle for Fisz
' http://www.fisz.nl

' IsValidUTF8
'   Tells whether the string is valid UTF-8.
'   Only the byte structure is checked (a lead byte followed by the
'   right number of continuation bytes); overlong forms are not rejected.
' Returns:
'   true  (valid UTF-8)
'   false (invalid UTF-8 or not a UTF-8 encoded string)

function IsValidUTF8(s)
    dim i   ' position of the current byte
    dim c   ' value of the current byte
    dim n   ' length of the current sequence in bytes

    IsValidUTF8 = false
    i = 1
    do while i <= len(s)
        c = asc(mid(s, i, 1))
        if c and &H80 then
            ' Count the lead byte plus the continuation bytes (10xxxxxx)
            ' that follow it. The test is <= so that a continuation byte
            ' in the last position of the string is counted as well.
            n = 1
            do while i + n <= len(s)
                if (asc(mid(s, i + n, 1)) and &HC0) <> &H80 then
                    exit do
                end if
                n = n + 1
            loop
            select case n
                case 1
                    ' Lead byte without continuation, or a stray continuation byte
                    exit function
                case 2
                    if (c and &HE0) <> &HC0 then   ' expect 110xxxxx
                        exit function
                    end if
                case 3
                    if (c and &HF0) <> &HE0 then   ' expect 1110xxxx
                        exit function
                    end if
                case 4
                    if (c and &HF8) <> &HF0 then   ' expect 11110xxx
                        exit function
                    end if
                case else
                    exit function
            end select
            i = i + n
        else
            i = i + 1
        end if
    loop
    IsValidUTF8 = true
end function

' DecodeUTF8
'   Decodes a UTF-8 string to the Windows character set.
'   Non-convertible characters are replaced by an upside-down
'   question mark.
' Returns:
'   A Windows string

' ByVal: work on a copy so the caller's variable is not modified
' (VBScript passes arguments by reference by default).
function DecodeUTF8(ByVal s)
    dim i
    dim c
    dim n

    i = 1
    do while i <= len(s)
        c = asc(mid(s, i, 1))
        if c and &H80 then
            ' Count the lead byte plus its continuation bytes (10xxxxxx).
            n = 1
            do while i + n <= len(s)
                if (asc(mid(s, i + n, 1)) and &HC0) <> &H80 then
                    exit do
                end if
                n = n + 1
            loop
            if n = 2 and (c = &HC2 or c = &HC3) then
                ' Lead bytes &HC2 and &HC3 cover U+0080 to U+00FF.
                c = asc(mid(s, i + 1, 1)) + &H40 * (c and &H01)
            else
                c = 191   ' upside-down question mark
            end if
            s = left(s, i - 1) + chr(c) + mid(s, i + n)
        end if
        i = i + 1
    loop
    DecodeUTF8 = s
end function

' EncodeUTF8
'   Encodes a Windows string in UTF-8.
' Returns:
'   A UTF-8 encoded string

' ByVal: work on a copy so the caller's variable is not modified
' (VBScript passes arguments by reference by default).
function EncodeUTF8(ByVal s)
    dim i
    dim c

    i = 1
    do while i <= len(s)
        c = asc(mid(s, i, 1))
        if c >= &H80 then
            ' Two bytes: lead byte &HC2 or &HC3, then 10xxxxxx
            ' carrying the low six bits of the character code.
            s = left(s, i - 1) + chr(&HC2 + ((c and &H40) \ &H40)) + chr(c and &HBF) + mid(s, i + 1)
            i = i + 1   ' skip the continuation byte just inserted
        end if
        i = i + 1
    loop
    EncodeUTF8 = s
end function
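
A short usage sketch, not part of the original snippet, showing how the three functions might be combined in a round trip. It assumes the functions above are included in the same ASP page, that the system ANSI code page is Windows-1252, and that output goes through Response.Write; the variable names are made up for illustration.

```vbscript
' Hypothetical round trip: Windows string -> UTF-8 -> Windows string.
Dim original, encoded, decoded

' Chr(233) is e-acute in Windows-1252; building the string this way
' avoids depending on the encoding of the .asp file itself.
original = "Caf" & Chr(233) & " au lait"

' The extra parentheses force the argument to be passed by value,
' so the call cannot change the caller's variable.
encoded = EncodeUTF8((original))   ' e-acute becomes the bytes &HC3 &HA9
decoded = DecodeUTF8((encoded))

If IsValidUTF8(encoded) Then
    Response.Write "Encoded string is valid UTF-8<br>"
End If

If decoded = original Then
    Response.Write "Round trip preserved the string<br>"
End If
```

Characters outside the first 256 code points cannot be represented after decoding; DecodeUTF8 replaces each such sequence with Chr(191).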
