2017-08-29

Unit testing with MRUnit

Installing MRUnit

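First, download MRUnit from the official site; I have also uploaded a copy to my server [click here to download]. The version I installed is mrunit-1.1.0, the build for hadoop2.x.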

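Installing MRUnit is not the end of it, because MRUnit depends on two other packages: mockito-all.jar and powermock-mockito.jar. My Hadoop version is 2.6.5, and mockito-all-1.8.5.jar is already among the libraries bundled with Hadoop, so the only thing left to install is the powermock package. Searching online for powermock turns up two main download sources. One is the powermock GitHub page, which is where I first downloaded version 1.7.1. That version fails with a NoSuchMethodError (setMockName) exception; based on this article, my guess is that the versions do not match. According to the article, powermock and mockito versions need to be paired as follows: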

Mockito	PowerMock
2.0.0-beta - 2.0.42-beta	1.6.5+
1.10.19	1.6.4
1.10.8 - 1.10.x	1.6.2+
1.9.5-rc1 - 1.9.5	1.5.0 - 1.5.6
1.9.0-rc1 & 1.9.0	1.4.10 - 1.4.12
1.8.5	1.3.9 - 1.4.9
1.8.4	1.3.7 & 1.3.8
1.8.3	1.3.6
1.8.1 & 1.8.2	1.3.5
1.8	1.3
1.7	1.2.5

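However, the GitHub page only offers version 1.7.1 for download, plus a powermock-api-mockito-common-1.6.6 package, and neither works with the Mockito 1.8.5 that ships with Hadoop. Downloads are also available from a Maven repository, which is where I got version 1.4.8.
There is one more download link that I did not use; I am keeping it here in case it is needed later.
After installation, add the library paths under Project Structure.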
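
Since the NoSuchMethodError above comes from mismatched jars, it can help to check which jar actually supplies each class once the library paths are added. The following is a minimal sketch, not part of the original setup: `ClasspathProbe` and `locate` are names made up for illustration, and the three class names in `main` are simply the entry points of Mockito, PowerMock's Mockito API, and MRUnit.

```java
import java.security.CodeSource;

/** Prints where a few classes are loaded from, to help spot mismatched jar versions. */
final class ClasspathProbe {

    /** Returns the jar or directory that provides className, or a short note if it is unknown. */
    static String locate(String className) {
        try {
            // initialize=false: only look the class up, do not run its static initializers
            Class<?> cls = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Classes that ship with the JDK usually have no code source
            if (source == null || source.getLocation() == null) {
                return "(no code source, e.g. a JDK class)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.mockito.Mockito",
            "org.powermock.api.mockito.PowerMockito",
            "org.apache.hadoop.mrunit.mapreduce.MapDriver"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Run with the same classpath as the tests, the printed jar names carry the version numbers to compare against the table above.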

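Maven is actually a very good project management tool, but I have kept putting off learning it because of the learning cost. I am leaving a note to myself here and will try to learn it when I have time.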

Usage

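The simple Hadoop program I wrote this time is meant to find the random part of the domain names in a DNS log. The domain name is the ninth field of each record. The program is simple, but its output was always wrong, so I wanted to use MRUnit to test the Mapper and the Reducer separately and see where the problem was.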

Mapper class:

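public static class MyMapClass extends Mapper<LongWritable, Text, IntWritable, AverAndCount> {
    @Override
    public void map(LongWritable k1, Text v1, Context context) throws IOException, InterruptedException {
        // Fields are tab-separated; the domain name is the ninth field (index 8).
        String[] record = v1.toString().split("\t");
        // split takes a regular expression, so the dot has to be escaped.
        String[] domain = record[8].split("\\.");

        // Find the longest label of the domain; on ties the first one wins.
        int max_index = 0;
        int max_length = 0;
        for (int i = 0; i < domain.length; i++) {
            if (domain[i].length() > max_length) {
                max_index = i;
                max_length = domain[i].length();
            }
        }
        // Emit (index of the longest label, (its length, count 1)).
        AverAndCount a2c = new AverAndCount((double) max_length, 1);
        context.write(new IntWritable(max_index), a2c);
    }
}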
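
To check the mapper's core step without Hadoop or MRUnit, the same loop can be pulled into a plain method. This is only a sketch for experimenting; `LongestLabel` and `find` are made-up names, and the domain is the one from the test record used later in this post.

```java
/** Plain-Java version of the mapper's core step, runnable without Hadoop. */
final class LongestLabel {

    /** Returns {index, length} of the longest dot-separated label; the first one wins ties. */
    static int[] find(String domain) {
        String[] labels = domain.split("\\.");
        int maxIndex = 0;
        int maxLength = 0;
        for (int i = 0; i < labels.length; i++) {
            if (labels[i].length() > maxLength) {
                maxIndex = i;
                maxLength = labels[i].length();
            }
        }
        return new int[] {maxIndex, maxLength};
    }

    public static void main(String[] args) {
        int[] r = find("pc-info-collect-2075296290.cn-north-1.elb.amazonaws.com.cn");
        System.out.println("index=" + r[0] + ", length=" + r[1]); // index=0, length=26
    }
}
```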

Reducer class:

public static class MyReduceClass extends Reducer<IntWritable, AverAndCount, IntWritable, AverAndCount> {
    @Override
    public void reduce(IntWritable k2, Iterable<AverAndCount> v2, Context context) throws IOException, InterruptedException {
        int count = 0;
        double sum = 0;
        // Each value is a partial (average, count); weight the average by its count.
        for (AverAndCount a2c : v2) {
            count += a2c.count;
            sum += a2c.average * a2c.count;
        }
        double average = sum / count;
        context.write(k2, new AverAndCount(average, count));
    }
}
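
Both classes rely on AverAndCount, which is not listed in this post. The sketch below is a hypothetical stand-in, not the original class: it assumes two public fields, `average` and `count`, and the `(double, int)` constructor that the code above uses. The real class would additionally declare `implements Writable`; that is left out here so the sketch runs without Hadoop on the classpath. As far as I understand, MRUnit's `runTest()` compares expected and actual output with `equals`, so `equals` and `hashCode` are included.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Objects;

/** Hypothetical stand-in for AverAndCount; the real class would declare "implements Writable". */
final class AverAndCountSketch {
    double average;
    int count;

    AverAndCountSketch() { } // Hadoop creates Writables reflectively through a no-arg constructor

    AverAndCountSketch(double average, int count) {
        this.average = average;
        this.count = count;
    }

    // Same signatures as the two methods of org.apache.hadoop.io.Writable.
    public void write(DataOutput out) throws IOException {
        out.writeDouble(average);
        out.writeInt(count);
    }

    public void readFields(DataInput in) throws IOException {
        average = in.readDouble();
        count = in.readInt();
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof AverAndCountSketch)) return false;
        AverAndCountSketch other = (AverAndCountSketch) o;
        return Double.compare(average, other.average) == 0 && count == other.count;
    }

    @Override
    public int hashCode() {
        return Objects.hash(average, count);
    }

    @Override
    public String toString() {
        return average + "\t" + count;
    }

    /** Writes a value to bytes and reads it back into a fresh object. */
    static AverAndCountSketch roundTrip(AverAndCountSketch value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            value.write(new DataOutputStream(bytes));
            AverAndCountSketch copy = new AverAndCountSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        AverAndCountSketch original = new AverAndCountSketch(26, 1);
        System.out.println(original.equals(roundTrip(original))); // true
    }
}
```

If a class like this lacked `equals`, two objects holding the same numbers would still compare as different, which is worth ruling out when a test fails even though the printed values look identical.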

Next comes MRUnit itself. Here is my test code for the Mapper class.

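import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Before;
import org.junit.Test;

public class MRTest {
    private MapDriver<LongWritable, Text, IntWritable, FindRandomFieldByLength.AverAndCount> driver;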


    @Before
    public void setup() {
        FindRandomFieldByLength.MyMapClass mapper = new FindRandomFieldByLength.MyMapClass();
        driver = MapDriver.newMapDriver(mapper);

    }

    @Test
    public void testMapper() throws IOException {
        String test= "1480550672.524545\tCNtFvIee1UblLy6b2\t159.226.238.20\t31394\t54.222.4.28\t53\tudp\t11806\tpc-info-collect-2075296290.cn-north-1.elb.amazonaws.com.cn\t1\tC_INTERNET\t1\tA\t0\tNOERROR\tT\tF\tF\tF\t0\t54.223.169.167,54.223.105.82,54.223.94.199,54.223.104.153,54.223.92.69,54.223.181.89,54.223.53.219,54.222.221.127\t60.000000,60.000000,60.000000,60.000000,60.000000,60.000000,60.000000,60.000000\tF\t1\n";
        driver.withInput(new LongWritable(1), new Text(test));
        driver.withOutput(new IntWritable(0),new FindRandomFieldByLength.AverAndCount(26,1));
        driver.runTest();
    }
}

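That completes the test of the mapper. It showed where the problem was: the split function used in the mapper takes a regular expression as its argument, so changing "." to "\\." (an escaped dot) fixes it.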
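
The difference is easy to reproduce in isolation. In the sketch below (`SplitDemo` and `labelCount` are made-up names), an unescaped "." matches every character, so every piece is an empty string, and `String.split` drops trailing empty strings, leaving an empty array. With an empty array the mapper's loop never runs and it always emits index 0 with length 0.

```java
import java.util.regex.Pattern;

/** Shows why splitting on an unescaped dot yields no labels at all. */
final class SplitDemo {

    /** Number of pieces that split(regex) produces for the given text. */
    static int labelCount(String text, String regex) {
        return text.split(regex).length;
    }

    public static void main(String[] args) {
        String domain = "elb.amazonaws.com.cn";
        System.out.println(labelCount(domain, "."));                // 0: "." matches any character
        System.out.println(labelCount(domain, "\\."));              // 4: a literal dot
        System.out.println(labelCount(domain, Pattern.quote("."))); // 4: same, without manual escaping
    }
}
```

`Pattern.quote` is an alternative to escaping by hand when the separator is meant literally.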

Follow-up

The above only tests the Mapper. After that I also tried testing the Mapper and Reducer together.

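import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;
import org.junit.Before;
import org.junit.Test;

public class MRTest2 {
    private MapReduceDriver<LongWritable, Text,
                IntWritable, FindRandomFieldByLength.AverAndCount,
                IntWritable, FindRandomFieldByLength.AverAndCount> mapReduceDriver;
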
    @Before
    public void setup() {
        FindRandomFieldByLength.MyMapClass mapper = new FindRandomFieldByLength.MyMapClass();
        FindRandomFieldByLength.MyReduceClass reducer = new FindRandomFieldByLength.MyReduceClass();
        mapReduceDriver = MapReduceDriver.newMapReduceDriver(mapper, reducer);
    }

    @Test
    public void test() throws IOException {
        mapReduceDriver.withInput(new LongWritable(1), new Text("1480550672.524545\tCNtFvIee1UblLy6b2\t159.226.238.20\t31394\t54.222.4.28\t53\tudp\t11806\tpc-info-collect-2075296290.cn-north-1.elb.amazonaws.com.cn\t1\tC_INTERNET\t1\tA\t0\tNOERROR\tT\tF\tF\tF\t0\t54.223.169.167,54.223.105.82,54.223.94.199,54.223.104.153,54.223.92.69,54.223.181.89,54.223.53.219,54.222.221.127\t60.000000,60.000000,60.000000,60.000000,60.000000,60.000000,60.000000,60.000000\tF\t1\n"));
        mapReduceDriver.withInput(new LongWritable(1), new Text("1480550672.524545\tCNtFvIee1UblLy6b2\t159.226.238.20\t31394\t54.222.4.28\t53\tudp\t11806\tpc-info-collect-20752962.cn-north-1.elb.amazonaws.com.cn\t1\tC_INTERNET\t1\tA\t0\tNOERROR\tT\tF\tF\tF\t0\t54.223.169.167,54.223.105.82,54.223.94.199,54.223.104.153,54.223.92.69,54.223.181.89,54.223.53.219,54.222.221.127\t60.000000,60.000000,60.000000,60.000000,60.000000,60.000000,60.000000,60.000000\tF\t1\n"));
        mapReduceDriver.withOutput(new IntWritable(0), new FindRandomFieldByLength.AverAndCount(25,2));
        mapReduceDriver.runTest();
    }
}

Supplement

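JUnit4 uses the annotations introduced in Java 5. These are the JUnit4 annotations used most often:

@Before: initialization method, run once before every test method (unlike @BeforeClass, which runs once for all methods)
@After: resource cleanup, run once after every test method (unlike @AfterClass, which runs once for all methods)
@Test: a test method; expected exceptions and timeouts can be tested here
@Test(expected=ArithmeticException.class): checks that the method under test throws ArithmeticException
@Ignore: a test method to be skipped
@BeforeClass: runs only once for all tests, and must be static void
@AfterClass: runs only once for all tests, and must be static void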

The execution order of a JUnit4 unit test case is:

@BeforeClass -> @Before -> @Test -> @After -> @AfterClass;

The call order for each test method is:

@Before -> @Test -> @After;
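
With more than one test method, the @Before/@After pair repeats around each of them while the class-level methods still run once. The sketch below is a hand-written simulation of that order in plain Java, not the JUnit runner itself; `LifecycleOrder` and `simulate` are made-up names. In real JUnit4 the order of the test methods themselves should not be relied on.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written simulation of the JUnit4 call order; it does not use JUnit. */
final class LifecycleOrder {

    /** Returns the sequence of lifecycle calls for a class with the given test methods. */
    static List<String> simulate(String... testNames) {
        List<String> calls = new ArrayList<>();
        calls.add("@BeforeClass");          // once per class
        for (String name : testNames) {
            calls.add("@Before");           // before every test method
            calls.add("@Test " + name);
            calls.add("@After");            // after every test method
        }
        calls.add("@AfterClass");           // once per class
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(simulate("testMapper", "testReducer"));
    }
}
```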

proTao的大脑具现
