Skip to content

IndaPlus22/vprytz-palinda-3

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 

Repository files navigation

Deadline:

This work should be completed before the exercise on Friday 24th March.

Instructions

For instructions on how to do and submit the assignment, please see the assignments section of the course instructions.

Homework

Study the following course literature:

Task 1 - Matching Behaviour

Take a look at the program matching.go. Explain what happens and why it happens if you make the following changes. Try first to reason about it, and then test your hypothesis by changing and running the program.

  • What happens if you remove the go-command from the Seek call in the main function?
  • What happens if you switch the declaration wg := new(sync.WaitGroup) to var wg sync.WaitGroup and the parameter wg *sync.WaitGroup to wg sync.WaitGroup?
  • What happens if you remove the buffer on the channel match?
  • What happens if you remove the default-case from the case-statement in the main function?

Hint: Think about the order of the instructions and what happens with arrays of different lengths.

Task 2 - Fractal Images

The file julia.go contains a program that creates Julia Set fractal images and writes them to file. You can watch a quick explainer on the Julia Set and its relationship to the Mandelbrot Set here, to get an idea of the type of images that are produced.

To start with read through the code to get an idea of how it works. Some things to note:

  • var Funcs []ComplexFunc = []ComplexFunc{ on line 18 is an array of functions. Note how Go lets us iterate through this array and pass each function as an argument to the CreatePng function in the Main function (see: https://en.wikipedia.org/wiki/First-class_function).

  • In the CreatePng function it is the Julia function that creates the image and it does this by iterating through each pixel of the image to paint the final image. The actual calculation whether to paint a pixel occurs in the Iterate function in the blink and you miss it line 73: z = f(z), using the function from the Funcs array that has been passed through several functions.

The program works, but is pretty slow. In order to get a sense of how long the program takes, you can time it using a stop watch, or more conveniently using the time keyword in Bash:

$ time go run julia.go
go run julia.go  11.10s user 0.60s system 97% cpu 11.974 total

Your own system will have its own runtime, but in the example above we find that it completed in 11.10s with the CPU running at 97% load.

Your assignment is to find different ways to divide the computations so that they run in parallel on all available CPUs. Use the ideas from the example in the efficient parallel computation section of the course literature.

Please note that simply making everything concurrent with more and more Gorountines does not necessarily get you faster performance after a certain point (see https://en.wikipedia.org/wiki/Amdahl%27s_law), so think carefully how you split the problem up (hint: each individual pixel does not needs its own Goroutine).

Assistant's note: In more recent versions of Golang (since 1.5), the runtime will default to use as many operating system threads as it is allowed. To see differences in behaviour, refer to the GOMAXPROCS function and vary the value.

Make sure you add a comment in julia.go with your original runtime and your improved runtime.

Task 3 - MapReduce

In the final task, you will be applying the MapReduce model for improving a word frequency program.

Assistant's note: We actually struggled to find a nice introductory explanation beyond our own example below, as the full model has several more layers that we are skipping. If you are curious, check out the original Google research article and see how you get on! https://ai.google/research/pubs/pub62

A word frequency analysis of a document will return a summary of the word counts for all unique words in the document. Whilst this can be solved efficiently using a map data structure in a sequential program, the performance can be improved by parallelising the program.

Assistant's note: Typically we ignore case when counting the frequency so 'Hello', 'HELLO' and 'hello' are all counted as 'hello', so remember to convert all strings to lower case.

By splitting the document into sub-documents and conducting a partial count in parallel (Map task), we can arrive at the solution by combining all the partial results into a final result (Reduce task).

Read the code in singleworker/words.go and complete the missing parts:

  • Implementation for the WordCount function
  • Reading a text file into a string in the main function
  • Check that the unittest passes
  • Log the runtime performance in the table below

Once you are satisfied with the singleworker, move into mapreduce/words.go and parallelise the program in order to improve the performance.

  • Update the WordCount function with the Map and Reduce tasks, using goroutines to parallelise and a channel to gather partial results
  • Check that the unittest passes
  • Find the optimal amount of gorountines before you encounter diminishing returns in performance improvements
  • Log the runtime performance in the table below
Variant Runtime (ms)
singleworker 713
mapreduce 225

And with that, you are on your way to Google-scale problems ;-)

Releases

No releases published

Packages

No packages published

Languages