
## Implementation

The main task of Lab 1 is to implement MapReduce in Go; the full requirements are in the official Lab 1 handout.

One way to get started is to modify mr/worker.go's Worker() to send an RPC to the master asking for a task. Then modify the master to respond with the file name of an as-yet-unstarted map task. Then modify the worker to read that file and call the application Map function, as in mrsequential.go.

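In Worker(), I send an RPC request modeled on the example call in the skeleton code. The reply falls into one of three categories, and the worker runs the corresponding task: Map, Reduce, or Wait.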
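The dispatch described above could look roughly like the sketch below. It is not my exact code: `TaskKind`, `TaskReply`, and `runWorker` are illustrative names, the RPC call is replaced by a plain function argument `ask` so the loop can be run on its own, and the extra `ExitTask` kind is an assumption that gives the worker a way to stop.

```go
package main

import "time"

// TaskKind is a hypothetical tag the master puts in its RPC reply.
type TaskKind int

const (
	MapTask TaskKind = iota
	ReduceTask
	WaitTask
	ExitTask // assumption: lets the worker stop once the whole job is done
)

// TaskReply is a hypothetical reply type; the field names are illustrative.
type TaskReply struct {
	Kind   TaskKind
	TaskID int
	File   string // input file name, only meaningful for a map task
}

// runWorker keeps asking for work and dispatches on the kind of task.
// ask stands in for the RPC to the master; doMap and doReduce stand in
// for the code that runs the application's Map and Reduce functions.
func runWorker(ask func() TaskReply, doMap, doReduce func(TaskReply), wait time.Duration) {
	for {
		reply := ask()
		switch reply.Kind {
		case MapTask:
			doMap(reply)
		case ReduceTask:
			doReduce(reply)
		case WaitTask:
			// Nothing to hand out right now, so back off and ask again.
			time.Sleep(wait)
		case ExitTask:
			return
		}
	}
}
```

In the real worker, `ask` would wrap the RPC call and `wait` would be something like one second.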

The application Map and Reduce functions are loaded at run-time using the Go plugin package, from files whose names end in .so.

If you change anything in the mr/ directory, you will probably have to re-build any MapReduce plugins you use, with something like go build -buildmode=plugin ../mrapps/wc.go

**The plugin has to be rebuilt every time!** Otherwise it fails outright. I did not notice this at first and spent quite a while debugging.

This lab relies on the workers sharing a file system. That’s straightforward when all workers run on the same machine, but would require a global filesystem like GFS if the workers ran on different machines.

A reasonable naming convention for intermediate files is mr-X-Y, where X is the Map task number, and Y is the reduce task number.

The worker’s map task code will need a way to store intermediate key/value pairs in files in a way that can be correctly read back during reduce tasks. One possibility is to use Go’s encoding/json package: write the key/value pairs to a file with a json.Encoder, and read the file back with a json.Decoder.
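A minimal sketch of that round trip with encoding/json follows. The helper names `writePairs`, `readPairs`, and `roundTrip` are my own, and `KeyValue` is redeclared here only to keep the example self-contained.

```go
package main

import (
	"encoding/json"
	"io"
	"os"
	"path/filepath"
)

// KeyValue mirrors the pair type that the lab's Map functions return.
type KeyValue struct {
	Key   string
	Value string
}

// writePairs stores each pair in path as one JSON object per line.
func writePairs(path string, kva []KeyValue) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	enc := json.NewEncoder(f)
	for _, kv := range kva {
		if err := enc.Encode(&kv); err != nil {
			f.Close()
			return err
		}
	}
	return f.Close()
}

// readPairs decodes pairs from path until the end of the file.
func readPairs(path string) ([]KeyValue, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var kva []KeyValue
	dec := json.NewDecoder(f)
	for {
		var kv KeyValue
		if err := dec.Decode(&kv); err == io.EOF {
			return kva, nil
		} else if err != nil {
			return nil, err
		}
		kva = append(kva, kv)
	}
}

// roundTrip is a usage demo: write pairs to a scratch file, read them back.
func roundTrip(kva []KeyValue) ([]KeyValue, error) {
	dir, err := os.MkdirTemp("", "mr-json")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "mr-0-0")
	if err := writePairs(path, kva); err != nil {
		return nil, err
	}
	return readPairs(path)
}
```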

The map part of your worker can use the ihash(key) function (in worker.go) to pick the reduce task for a given key.
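To make the partitioning concrete, here is a small sketch that splits map output into one bucket per reduce task and builds file names in the mr-X-Y convention. `ihash` is reproduced as it appears in the lab's worker.go; `partition` and `intermediateName` are hypothetical helpers of my own.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// KeyValue mirrors the pair type that the lab's Map functions return.
type KeyValue struct {
	Key   string
	Value string
}

// ihash is the hash helper shipped in the lab's worker.go.
func ihash(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() & 0x7fffffff)
}

// intermediateName builds a file name following the mr-X-Y convention,
// where X is the map task number and Y is the reduce task number.
func intermediateName(mapID, reduceID int) string {
	return fmt.Sprintf("mr-%d-%d", mapID, reduceID)
}

// partition groups pairs into nReduce buckets so that every occurrence
// of a key lands in the same bucket, and hence in the same reduce task.
func partition(kva []KeyValue, nReduce int) [][]KeyValue {
	buckets := make([][]KeyValue, nReduce)
	for _, kv := range kva {
		r := ihash(kv.Key) % nReduce
		buckets[r] = append(buckets[r], kv)
	}
	return buckets
}
```

A map task with number X would then write bucket Y to `intermediateName(X, Y)`.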

You can steal some code from mrsequential.go for reading Map input files, for sorting intermediate key/value pairs between the Map and Reduce, and for storing Reduce output in files.

The master, as an RPC server, will be concurrent; don’t forget to lock shared data.
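As an illustration of the locking, the sketch below guards a task table with a sync.Mutex so that two concurrent handlers can never hand out the same task. The `Master` fields and the `assign` method are assumptions made for this example, not the skeleton's actual definitions.

```go
package main

import "sync"

type taskState int

const (
	idle taskState = iota
	inProgress
	completed
)

// Master is a hypothetical coordinator; the field names are illustrative.
type Master struct {
	mu    sync.Mutex // protects tasks
	tasks []taskState
}

// assign hands out the first idle task, or returns -1 if none is idle.
// Holding the lock for the whole scan makes "find an idle task" and
// "mark it in progress" a single atomic step.
func (m *Master) assign() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	for id, s := range m.tasks {
		if s == idle {
			m.tasks[id] = inProgress
			return id
		}
	}
	return -1
}
```

Without the lock, two handlers could both see the same task as idle and assign it twice, which is the kind of bug the race detector reports.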

Use Go’s race detector, with go build -race and go run -race. test-mr.sh has a comment that shows you how to enable the race detector for the tests.

Workers will sometimes need to wait, e.g. reduces can’t start until the last map has finished. One possibility is for workers to periodically ask the master for work, sleeping with time.Sleep() between each request. Another possibility is for the relevant RPC handler in the master to have a loop that waits, either with time.Sleep() or sync.Cond. Go runs the handler for each RPC in its own thread, so the fact that one handler is waiting won’t prevent the master from processing other RPCs.
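The second option, waiting inside the handler with sync.Cond, might be sketched as follows. `mapPhase` and its methods are hypothetical names, and the sketch only tracks how many map tasks are still outstanding.

```go
package main

import "sync"

// mapPhase is a hypothetical piece of master state that counts the
// map tasks that have not finished yet.
type mapPhase struct {
	mu       sync.Mutex
	cond     *sync.Cond
	mapsLeft int
}

func newMapPhase(nMap int) *mapPhase {
	p := &mapPhase{mapsLeft: nMap}
	p.cond = sync.NewCond(&p.mu)
	return p
}

// finishMap records one finished map task and wakes every waiter
// once the last map is done.
func (p *mapPhase) finishMap() {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.mapsLeft--
	if p.mapsLeft == 0 {
		p.cond.Broadcast()
	}
}

// waitForMaps blocks the calling handler until all maps have finished.
// The condition is re-checked in a loop after every wakeup.
func (p *mapPhase) waitForMaps() {
	p.mu.Lock()
	defer p.mu.Unlock()
	for p.mapsLeft > 0 {
		p.cond.Wait()
	}
}
```

A handler serving a reduce request would call `waitForMaps()` before picking a reduce task, and only that handler's goroutine blocks while it waits.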

The master can’t reliably distinguish between crashed workers, workers that are alive but have stalled for some reason, and workers that are executing but too slowly to be useful. The best you can do is have the master wait for some amount of time, and then give up and re-issue the task to a different worker. For this lab, have the master wait for ten seconds; after that the master should assume the worker has died (of course, it might not have).

To test crash recovery, you can use the mrapps/crash.go application plugin. It randomly exits in the Map and Reduce functions.

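The last part of the test suite is the crash test, which uses crash.so. The timeout check described above runs in a goroutine: if a worker crashes partway through a task, it never sends the final Finish RPC back to the master. That same timeout mechanism is therefore enough to reassign the tasks of crashed workers and keep the MapReduce job running to completion.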
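The timeout check can be sketched as one watchdog goroutine per assigned task. This is a simplified illustration rather than my exact code: the `Master` fields, `assign`, `watch`, and `finish` are hypothetical names, and the timeout is a field so that it can be set to ten seconds for the lab or to something much shorter when experimenting.

```go
package main

import (
	"sync"
	"time"
)

type taskState int

const (
	idle taskState = iota
	inProgress
	completed
)

// Master is a hypothetical coordinator; the field names are illustrative.
type Master struct {
	mu      sync.Mutex // protects tasks
	tasks   []taskState
	timeout time.Duration // ten seconds in the lab
}

// assign hands out an idle task and starts a watchdog for it.
// It returns -1 when no task is idle.
func (m *Master) assign() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	for id, s := range m.tasks {
		if s == idle {
			m.tasks[id] = inProgress
			go m.watch(id)
			return id
		}
	}
	return -1
}

// watch puts the task back in the idle pool if no Finish arrived in time,
// so the next worker that asks for work receives it again.
func (m *Master) watch(id int) {
	time.Sleep(m.timeout)
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.tasks[id] == inProgress {
		m.tasks[id] = idle
	}
}

// finish is what the handler for the worker's Finish RPC would call.
func (m *Master) finish(id int) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.tasks[id] = completed
}
```

A crashed worker and a merely slow one look the same to the watchdog; if the slow worker finishes later, `finish` still marks the task completed and the duplicate run is harmless.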

To ensure that nobody observes partially written files in the presence of crashes, the MapReduce paper mentions the trick of using a temporary file and atomically renaming it once it is completely written. You can use ioutil.TempFile to create a temporary file and os.Rename to atomically rename it.
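A minimal sketch of the temp-file-then-rename trick is shown below. It uses os.CreateTemp, which replaced ioutil.TempFile in Go 1.16; `writeAtomically` and `writeThenRead` are hypothetical helper names. The temporary file is created in the same directory as the target on the assumption that both then sit on the same file system, which is what makes the rename atomic on POSIX systems.

```go
package main

import (
	"os"
	"path/filepath"
)

// writeAtomically writes data to a temporary file next to path and then
// renames it into place, so a reader sees either no file or a complete one.
func writeAtomically(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), "mr-tmp-*")
	if err != nil {
		return err
	}
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		os.Remove(tmp.Name())
		return err
	}
	if err := tmp.Close(); err != nil {
		os.Remove(tmp.Name())
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// writeThenRead is a usage demo: it writes into a scratch directory and
// returns what a reader would see afterwards.
func writeThenRead(content string) (string, error) {
	dir, err := os.MkdirTemp("", "mr-atomic")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "mr-out-0")
	if err := writeAtomically(path, []byte(content)); err != nil {
		return "", err
	}
	got, err := os.ReadFile(path)
	return string(got), err
}
```

If a worker crashes before the rename, only a stray temporary file is left behind and the final name never points at partial output.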

test-mr.sh runs all the processes in the sub-directory mr-tmp, so if something goes wrong and you want to look at intermediate or output files, look there.

## Impressions

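• Go was fairly easy to pick up. It feels close to C, so with a C background you mostly just need to get familiar with its language features.
• The difficulty is moderate. Perhaps because this is only Lab 1, the implementation as a whole was not very hard.
• Judging from the handout and the code, the lab's structure feels less ingenious than that of the 6.828 labs, but it is already very complete and far more practical than course assignments at universities in China.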