Truncating a String by Byte Length (UTF-8)
2011/03/04 10:03:29

A reference snippet; I no longer remember where I found it.


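  /**
   * Truncates a string so that its UTF-8 encoding is at most maxBytes bytes long,
   * without cutting a character in half.
   *
   * @param str      the string to truncate
   * @param maxBytes the maximum length of the result, in UTF-8 bytes
   * @return the longest prefix of str whose UTF-8 encoding fits in maxBytes
   */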
  public static String truncateWhenUTF8(String str, int maxBytes) {    
   int b = 0;    
   System.out.println("str = " + str);
   System.out.println("maxBytes = " + maxBytes);
   for (int i = 0; i < str.length(); i++) {        
    char c = str.charAt(i);        
    // ranges from http://en.wikipedia.org/wiki/UTF-8
    int skip = 0;        
    int more;        
    if (c <= 0x007f) {            
     more = 1;        
    }        
    else if (c <= 0x07FF) {            
     more = 2;        
    } else if (c <= 0xd7ff) {            
     more = 3;        
    } else if (c <= 0xDFFF) {            
     // surrogate area: assumes a well-formed pair (4 bytes in UTF-8), so consume the next char as well
     more = 4;            
     skip = 1;        
    } else {            
     more = 3;        
    }        
    if (b + more > maxBytes) {            
     return str.substring(0, i);        
    }        
    b += more;        
    i += skip;    
   }    
   return str;
  }
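
To check the idea, here is a minimal sketch of the same technique written against code points rather than individual chars. The class name Utf8TruncateDemo and the method name truncateUtf8 are my own, hypothetical names and are not part of the snippet above. It assumes well-formed input; for an unpaired surrogate it counts 3 bytes, which may overestimate what Java's encoder actually emits, so the result still never exceeds maxBytes.

```java
import java.nio.charset.StandardCharsets;

public class Utf8TruncateDemo {

    // Hypothetical variant of truncateWhenUTF8 that walks code points.
    public static String truncateUtf8(String str, int maxBytes) {
        int bytes = 0; // UTF-8 bytes counted so far
        int i = 0;     // index into str, in chars
        while (i < str.length()) {
            int cp = str.codePointAt(i);
            int size;  // UTF-8 bytes needed for this code point
            if (cp <= 0x7F) {
                size = 1;
            } else if (cp <= 0x7FF) {
                size = 2;
            } else if (cp <= 0xFFFF) {
                size = 3;
            } else {
                size = 4; // supplementary plane, stored as a surrogate pair
            }
            if (bytes + size > maxBytes) {
                return str.substring(0, i); // cut before this code point
            }
            bytes += size;
            i += Character.charCount(cp);   // 1 or 2 chars
        }
        return str;
    }

    public static void main(String[] args) {
        // 'a' (1 byte), two CJK characters (3 bytes each),
        // one emoji (4 bytes), 'b' (1 byte): 12 bytes in total.
        String s = "a\u4E2D\u6587\uD83D\uDE00b";
        for (int max = 0; max <= 12; max++) {
            String t = truncateUtf8(s, max);
            int len = t.getBytes(StandardCharsets.UTF_8).length;
            System.out.println(max + " -> " + len + " bytes");
        }
    }
}
```

Running main prints, for each limit, the actual encoded length of the truncated string as reported by getBytes(StandardCharsets.UTF_8), which makes it easy to confirm that the result never exceeds the limit and never ends in half a character.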
