How to use Hadoop's split method: using the wrong one can slow your program down

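Today I ran a simple Hadoop test program and found that it was running slowly.

It turned out that the cause was how I was splitting strings: I was using Java's built-in String.split method.

That method works, but in a real big-data job, where it may be called for every input record, it turned out to be quite inefficient.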
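Whether String.split is really the bottleneck depends on the JVM and on the separator, so it is worth measuring before changing anything. Below is a rough timing sketch, not a rigorous benchmark. The class name SplitTiming and the helper indexOfSplit are made up for this illustration and are not part of Hadoop. As far as I know, recent JDKs take a fast path in String.split when the separator is a single character that is not a regex metacharacter, so the gap may be smaller than expected.

```java
import java.util.ArrayList;
import java.util.List;

class SplitTiming {
    // Hypothetical hand-written splitter built on indexOf (not Hadoop's class).
    // Unlike String.split, it keeps trailing empty fields.
    static List<String> indexOfSplit(String line, char sep) {
        List<String> out = new ArrayList<>();
        int start = 0;
        int next;
        while ((next = line.indexOf(sep, start)) != -1) {
            out.add(line.substring(start, next));
            start = next + 1;
        }
        out.add(line.substring(start));
        return out;
    }

    public static void main(String[] args) {
        String line = "alpha beta gamma delta epsilon zeta eta theta";
        int rounds = 2_000_000;
        long sink = 0; // printed at the end so the JIT cannot discard the loops

        // Warm up both code paths before timing.
        for (int i = 0; i < 200_000; i++) {
            sink += line.split(" ").length + indexOfSplit(line, ' ').size();
        }

        long t0 = System.nanoTime();
        for (int i = 0; i < rounds; i++) sink += line.split(" ").length;
        long t1 = System.nanoTime();
        for (int i = 0; i < rounds; i++) sink += indexOfSplit(line, ' ').size();
        long t2 = System.nanoTime();

        System.out.printf("String.split: %d ms%n", (t1 - t0) / 1_000_000);
        System.out.printf("indexOf loop: %d ms%n", (t2 - t1) / 1_000_000);
        System.out.println("fields counted: " + sink);
    }
}
```

The numbers will differ from machine to machine; for anything more serious, a proper harness such as JMH is the safer choice.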

For this reason, Hadoop provides its own method for splitting strings.

It is used like this:

StringUtils.split(str, ' ');

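StringUtils here is Hadoop's utility class, org.apache.hadoop.util.StringUtils.

The first argument to split is the string to process, and the second is the separator, passed as a char.

It returns an array of String.

The source is shown below for anyone interested in studying it: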

public static String[] split(
    String str, char separator) {
  // String.split returns a single empty result for splitting the empty
  // string.
  if (str.isEmpty()) {
    return new String[]{""};
  }
  ArrayList<String> strList = new ArrayList<String>();
  int startIndex = 0;
  int nextIndex = 0;
  while ((nextIndex = str.indexOf(separator, startIndex)) != -1) {
    strList.add(str.substring(startIndex, nextIndex));
    startIndex = nextIndex + 1;
  }
  strList.add(str.substring(startIndex));
  // remove trailing empty split(s)
  int last = strList.size(); // last split
  while (--last >= 0 && "".equals(strList.get(last))) {
    strList.remove(last);
  }
  return strList.toArray(new String[strList.size()]);
}
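To see what this logic does on concrete input, here is a small sketch. It copies the method into a hypothetical local class, SplitDemo, only so that it runs without Hadoop on the classpath; in a real job you would call org.apache.hadoop.util.StringUtils.split directly. It shows three things: empty fields in the middle are kept, trailing empty fields are dropped, and the separator is a literal character rather than a regular expression.

```java
import java.util.ArrayList;
import java.util.Arrays;

class SplitDemo {
    // Local copy of the split logic shown above (assumption: same behavior
    // as the Hadoop method, reproduced here only to keep the demo standalone).
    static String[] split(String str, char separator) {
        if (str.isEmpty()) {
            return new String[]{""};
        }
        ArrayList<String> parts = new ArrayList<>();
        int start = 0;
        int next;
        while ((next = str.indexOf(separator, start)) != -1) {
            parts.add(str.substring(start, next));
            start = next + 1;
        }
        parts.add(str.substring(start));
        // Drop trailing empty pieces, as the original does.
        int last = parts.size();
        while (--last >= 0 && parts.get(last).isEmpty()) {
            parts.remove(last);
        }
        return parts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Two spaces in a row yield an empty field; the trailing one is dropped.
        System.out.println(Arrays.toString(split("a b  c ", ' '))); // [a, b, , c]

        // '|' is matched literally here.
        System.out.println(Arrays.toString(split("x|y|z", '|')));   // [x, y, z]

        // String.split treats "|" as a regex alternation, not a literal pipe,
        // so it does not produce the three fields above.
        System.out.println(Arrays.toString("x|y|z".split("|")));
    }
}
```

The last line is the main practical difference besides speed: with String.split, separators such as | or . have to be escaped, while the char-based version needs no escaping.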

Hadoop's utility classes live in the org.apache.hadoop.util package; it is worth browsing the other utilities there to see what else is available.


Do not repost without permission: 【java爱好者】 blog