Brief Introduction (not original)
Count the frequency of letters stored in a list of &str, i.e.:
pub fn frequency(input: &[&str], worker_count: usize) -> HashMap<char, usize>
Moreover, we need to deploy workers (threads), the number of which is specified by worker_count. This requirement makes the task exciting and challenging, as I have little experience in concurrent programming.
The original introduction also suggests:
Learn more about concurrency in Rust here: Concurrency
which covers enough general knowledge about concurrency to understand and solve the task!
Working on concurrency
Briefly, we use concurrent programming to get:
- Parallelism - do the same (or a similar) job at the same time to save time.
- Concurrency - separate the job into different parts.
The task is about the former case. The latter case shows up in almost every program that does dynamic I/O to interact with a user or another service.
Now we should think about how to:
- Communicate between threads
- Divide the task properly so the threads work smoothly and seamlessly
The first point is the most important here, because if the threads don't know how to communicate with each other, they cannot achieve the common goal of counting the whole string. So how do they talk to each other?
How To Make Threads and How To Reap Them
Oh, wait... I found that we first need to talk about how to create thread(s) and harvest (join) them. This is the first example in the Rust documentation:
use std::thread;
use std::time::Duration;

fn main() {
    let _handle = thread::spawn(|| {
        for i in 1..10 {
            println!("hi number {} from the spawned thread!", i);
            thread::sleep(Duration::from_millis(1));
        }
        // note: there is no return value here: a return value is not essential.
    });

    for i in 1..5 {
        println!("hi number {} from the main thread!", i);
        thread::sleep(Duration::from_millis(1));
    }
}
And the result:
hi number 1 from the main thread!
hi number 1 from the spawned thread!
hi number 2 from the main thread!
hi number 2 from the spawned thread!
hi number 3 from the main thread!
hi number 3 from the spawned thread!
hi number 4 from the main thread!
hi number 4 from the spawned thread!
hi number 5 from the spawned thread!
oh no! we should wait for our friends!
Does it look good? Doesn't it? But if we look carefully, we can see that some output is missing: our spawned thread is supposed to print up to 9, but we only got 5 here.
hi number 5 from the spawned thread!
This is because the main thread reached the end of the block before the child thread finished its job.
We should have waited for our friend threads to see all the output.
// .. snip ..
    for i in 1..5 {
        println!("hi number {} from the main thread!", i);
        thread::sleep(Duration::from_millis(1));
    }
    _handle.join().unwrap(); // wait for the spawned thread to join the current thread
}
What join() returns
- the value of the closure's last expression (wrapped in a Result)
- we can use it to return the result of the task as well.
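For example, here is a small sketch of my own (not from the exercise) showing a subtotal coming back through join():
use std::collections::HashMap;
use std::thread;

fn main() {
    let handle = thread::spawn(|| {
        let mut subtotal = HashMap::new();
        for ch in "abca".chars() {
            *subtotal.entry(ch).or_insert(0usize) += 1;
        }
        subtotal // the last expression becomes the thread's return value
    });

    // join() hands the value back, wrapped in a std::thread::Result
    let subtotal: HashMap<char, usize> = handle.join().unwrap();
    println!("{:?}", subtotal); // e.g. {'a': 2, 'b': 1, 'c': 1}
}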
Communication Between Threads in Rust
Return value from a thread
After submitting my first solution, I could see that many other solutions on exercism use the return value from each thread. I found the major pro and (maybe) con of using the return value:
- pro: similar approach to a normal function call (no extra setup)
- ??: the main thread has to choose which thread it talks to (joins) first
The latter point could be a con. Let's think about what the worst scenario might look like:
let sample = &["very long string. abcdefghijklmnopqrstuvxyz...",
               "",
               "some more text", ];
// snip ..
let mut total = HashMap::new();
for handle in handles {
    // joined in spawn order: the longest job is waited on first
    total = our_counting_fold_fn(total, handle.join().unwrap());
}
// snip ..
So, clearly, the first worker needs to work harder than the others. And the bigger the difference between the string lengths, the more time is wasted: the last two results and the main thread are stuck waiting only for the first thread to finish its job. i.e.: we lose the chance to summarize the other results while waiting only for the first one.
So, if the main thread cannot figure out which thread to pick first, the overall performance isn't great: no benefit from the threads at all.
Shared memory
I found the sentence shown below in the original documentation:
... access the same shared data. Consider this part of the slogan from the Go language documentation again: "do not communicate by sharing memory."
However, the Rust book does not put it with quite the same nuance, because Rust's ownership rules seem to handle shared memory very well:
Management of mutexes can be incredibly tricky to get right, which is why so many people are enthusiastic about channels. However, thanks to Rust's type system and ownership rules, you can't get locking and unlocking wrong.
Sounds promising; I hope I can go further on this subject some time later.
By the way, Mutex is an abbreviation for mutual exclusion.
Nevertheless, my first impressions of shared memory in concurrency are:
- it is optional
- less efficient
- more complicated than the other methods
- but still useful when a requirement for multiple ownership comes in.
And the excuses why I'd like to skip this method for this task are:
- less overhead is preferred in this simple task.
- one thread is enough to summarize the results from the others.
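That said, for completeness, here is a minimal sketch of what the shared-memory variant could look like, using Arc for multiple ownership and Mutex for exclusive access (my own illustration, not the approach used in the rest of this article):
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // the accumulator is shared by every worker: Arc gives shared ownership,
    // Mutex makes sure only one thread updates it at a time
    let total = Arc::new(Mutex::new(HashMap::<char, usize>::new()));

    let handles: Vec<_> = ["abc", "aBc"].into_iter().map(|chunk| {
        let total = Arc::clone(&total);
        thread::spawn(move || {
            let mut guard = total.lock().unwrap();
            for ch in chunk.chars().filter(|c| c.is_alphabetic()) {
                *guard.entry(ch.to_ascii_lowercase()).or_insert(0) += 1;
            }
        })
    }).collect();

    for handle in handles {
        handle.join().unwrap();
    }
    println!("{:?}", total.lock().unwrap()); // {'a': 2, 'b': 2, 'c': 2}
}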
Message Passing via Channels
Let's check out why many people are enthusiastic about message passing.
The documentation on message passing (channels) starts with a sentence like this:
One increasingly popular approach to ensuring safe concurrency is message passing
We can safely assume that message passing is a recommended method for communicating, can't we?
There is one more thing we need to think about before going further with the task.
Ownership matters
The key difference between message passing and shared memory is the number of owners. Message passing allows only one owner at a time, which guarantees that the receiver gets a consistent result, because the sender is not allowed to access the value after sending it. It may sound tedious at first. What if the value could still be touched after sending it? The answer is simple:
We cannot guarantee what will happen.
Unpredictability is the most dangerous thing in the programming world; nobody wants it unless we are throwing dice.
I felt that the Rust documentation is quite logical here. Please read more in the original book.
How to make a channel
One to one
use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let val = String::from("hi");
        // sending from the friend thread
        tx.send(val).unwrap();
    }); // the return value of spawn() is not very useful here, so it is used
        // in a "void" context: no `let` binding and no method call after it.

    // receiving from the main thread
    let received = rx.recv().unwrap();
    println!("Got: {}", received);
}
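To see the ownership rule from the previous section in action, here is a small variation of the example (my own, not from the book): if the spawned thread tries to use val after sending it, the compiler refuses.
use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let val = String::from("hi");
        tx.send(val).unwrap();
        // println!("val is {}", val);
        // ^ uncommenting this fails to compile:
        //   error[E0382]: borrow of moved value: `val`
        //   the sender may not touch the value after sending it
    });
    println!("Got: {}", rx.recv().unwrap());
}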
Many speakers and only one listener
Now I'd like to remind you that we actually have many workers who want to talk about what they have accomplished.
But if the tx variable is moved into one thread, we cannot use it in another thread. This comes down to the simple principle in Rust, after all: ownership.
So we need to clone the sender as many times as we need. This process is relatively simple:
// .. snip ..
let (tx_main, rx_main) = mpsc::channel();
let mut handles = vec![];
// .. snip ..
(0..worker_count).for_each(|_| {
    let tx = tx_main.clone(); // this is it: simple and easy
    let handle = thread::spawn(move || {
        // tx is now available only in this thread,
        // as its ownership has been moved here.
    });
    handles.push(handle);
});
// .. snip ..
My first attempt at collecting the partial results did not work
The example code in the Rust book shows how to read from the receiver (rx_main) by iterating over it. But, for a reason unknown to me, the following code runs forever:
rx_main.into_iter()
    .fold(HashMap::new(), |mut acc, partial_acc| {
        partial_acc.iter()
            .for_each(|(&ch, &count)| match acc.entry(ch) {
                Occupied(o) => *o.into_mut() += count,
                Vacant(v)   => { v.insert(count); },
            });
        acc
    })
}
So I had to drop the code that iterates directly over the receiver, and went with the following approach instead:
// note: the following code runs in the main thread only.
(0..worker_count) // used here only to count the iterations
    .for_each(|_id| {
        rx_main.recv() // just like the single read earlier
            .unwrap()
            // .somefunc() ... and the summarizing (folding) goes around here
            ;
    });
I know this is not a perfect read loop, because it only takes one value per worker; what if a thread sends more than one message? It works so far, and I apologize if the explanation is not sufficient. My current guess is that iterating directly over the receiver blocks until every Sender has been dropped, and the tx_main kept by the main thread never is; I will come back to this once I have confirmed the reason and a workaround.
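If my guess is right, this sketch is what I would try (unverified; it assumes the tx_main / rx_main and worker setup from the snippets above):
// .. snip ..
// drop the main thread's own Sender, so the channel closes as soon as
// every worker's clone (`tx`) is dropped and the iteration below can end
drop(tx_main);
let total = rx_main.into_iter()
    .fold(HashMap::new(), |mut acc, partial: HashMap<char, usize>| {
        partial.into_iter()
            .for_each(|(ch, count)| *acc.entry(ch).or_insert(0) += count);
        acc
    });
// .. snip ..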
Dividing the tasks
It is time to give the threads something to work with. We need to divide our job into a certain number of smaller pieces so that we can hand them over to our workers (threads).
We get a fixed amount of input data, so I decided to:
- Divide the list into as many groups as there are workers, e.g.:
// example: &[&str] -> Vec<String> with two workers
&[ "abc", "def", "ghi", "jkl" ]
// ->
vec![
    "abcghi", // note: stored as String
    "defjkl",
]
- Move each String to the appropriate thread.
The partial code so far
pub fn frequency(input: &[&str], worker_count: usize) -> HashMap<char, usize> {
    let num_workers = if worker_count == 0 { 1 } else { worker_count };
    let mut handles = vec![];
    // make a channel to collect all the results
    let (tx_main, rx_main) = mpsc::channel();

    let mut input_per_worker = vec![String::from(""); num_workers];
    input.iter()
        .zip((0..num_workers).cycle())   // .cycle(): an easy way to rotate the worker ids
        .for_each(|(&s, i)| {            // ( string as &str, index (or id) number )
            *input_per_worker.get_mut(i) // access the value at index `i`
                .unwrap()                // safe to access: the slot is already allocated
                += s;                    // a very efficient way to append a &str to a String
        });

    input_per_worker.into_iter().for_each(|input_| {
        // prepare for sending
        let tx = tx_main.clone();
        // spawn a worker thread
        let handle = thread::spawn(move || {
            // collect the counts
            let mut subtotal = HashMap::<char, usize>::new();
            input_.chars().for_each(|ch| {
                // snip ..
Benefits from memory access
input_per_worker.get_mut(i)
is a nice benefit of direct, indexed memory access. In Haskell, generating the groups with lazy evaluation is perhaps a better approach for this kind of problem.
Counting the Alphabets
This is quite an easy task and I didn't make a separate function for it:
// .. snip ..
input_per_worker.into_iter().for_each(|input_| {
    // note: .into_iter() is used because I want to move the ownership
    //       into the for_each()
    // prepare for sending
    let tx = tx_main.clone();
    // spawn a worker thread
    let handle = thread::spawn(move || {
        // collect the counts
        let mut subtotal = HashMap::<char, usize>::new();
        input_.chars().for_each(|ch| {
            if ch.is_alphabetic() {
                // note: this task is actually case-insensitive only for ascii alphabets;
                // using the simple to_ascii_lowercase() is about 2x faster.
                // another option is to change s.chars() -> s.to_lowercase().chars()
                // and use the chars as they are.
                let lc = ch.to_ascii_lowercase();
                /*ch.to_lowercase().for_each(|lc|*/
                // the following pattern was posted in
                // https://jeongoon.github.io/posts/2022-05-20-exercism-org-raindrops.html#leave-it-as-basic
                match subtotal.entry(lc) {
                    Occupied(o) => *o.into_mut() += 1,
                    Vacant(v)   => { v.insert(1); },
                } /*);*/
            };
        });
        // finally, send the subtotal back
        tx.send(subtotal).unwrap();
    });
    handles.push(handle);
});
to_lowercase() or to_ascii_lowercase()
As I mentioned in a comment above, there are some subtleties around lower-casing. In short:
- If you need to lower-case a whole unicode string: go for str::to_lowercase
- otherwise use char::to_ascii_lowercase or str::to_ascii_lowercase.
I used char::to_ascii_lowercase() because I'd like to filter first (is_alphabetic()) and then apply the conversion.
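A quick illustration of the difference (my own example, not related to the exercise input):
fn main() {
    // to_ascii_lowercase() returns a single char and only touches ASCII letters;
    // to_lowercase() handles full Unicode and returns an iterator of chars,
    // because one uppercase character may map to several lowercase ones.
    assert_eq!('A'.to_ascii_lowercase(), 'a');

    let dotted_i = 'İ'; // Latin capital letter I with dot above (U+0130)
    assert_eq!(dotted_i.to_ascii_lowercase(), 'İ');       // unchanged: not ASCII
    assert_eq!(dotted_i.to_lowercase().collect::<String>(),
               "i\u{0307}");                              // 'i' + combining dot above
}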
Finally, Summarizing
fold() can be handy when you are only asked for the final result. And, thanks to the nature of functional programming, we don't need to worry about ownership and lifetimes for the accumulator. So I reached for fold() again here.
// .. snip
    handles.iter().fold(HashMap::new(), |mut acc, _handle| {
        rx_main.recv()
            .unwrap()
            .iter()
            .for_each(|(&ch, &count)| match acc.entry(ch) {
                Occupied(o) => *o.into_mut() += count,
                Vacant(v)   => { v.insert(count); },
            });
        acc
    })
} // end of pub fn frequency()
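And a quick sanity check of how the finished function can be called (a hypothetical test of my own, not part of the exercise's test suite):
#[test]
fn counts_are_case_insensitive_and_skip_non_letters() {
    let counts = frequency(&["abc", "aBc", "?!"], 2);
    assert_eq!(counts.get(&'a'), Some(&2));
    assert_eq!(counts.get(&'b'), Some(&2));
    assert_eq!(counts.get(&'c'), Some(&2));
    assert_eq!(counts.get(&'?'), None); // non-alphabetic characters are skipped
}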
Brief Benchmark
This task comes with benchmark code as well, and the following is one of my benchmark runs on a 9-year-old XPS.
Compiling parallel-letter-frequency v0.0.0 (/home/myoungjin/exercism/rust/parallel-letter-frequency)
Finished bench [optimized] target(s) in 1.48s
Running unittests src/lib.rs (target/release/deps/parallel_letter_frequency-016160b8c250f033)
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Running benches/benchmark.rs (target/release/deps/benchmark-15a648eb6d4b6f2b)
running 6 tests
test bench_large_parallel ... bench: 279,074 ns/iter (+/- 37,084)
test bench_large_sequential ... bench: 661,265 ns/iter (+/- 14,291)
test bench_small_parallel ... bench: 68,663 ns/iter (+/- 3,665)
test bench_small_sequential ... bench: 22,776 ns/iter (+/- 489)
test bench_tiny_parallel ... bench: 60,142 ns/iter (+/- 3,124)
test bench_tiny_sequential ... bench: 78 ns/iter (+/- 3)
test result: ok. 0 passed; 0 failed; 0 ignored; 6 measured; 0 filtered out; finished in 21.73s
Parallel is not always helpful
We could gain some time efficiency from concurrency in the first bench_large_parallel test case, but... the other benchmarks do not show any benefit at all.
Because there is some overhead in creating thread(s), we should be careful about applying parallelism or concurrency.
Benchmarking is a pretty wise way to learn: we see objectively what we gain or lose by using a given method.
Wrapping Up
This is quite a long article, due to the subject, but I'd like to share my thoughts on concurrency:
Once you make threads, think about how to reap them as well.
Communication between the threads is an important factor in concurrency.
- Message passing - not expensive at all compared to using the return value from a thread
- Shared memory - could be useful; requires more resources; not dangerous in rust.
Use concurrency wisely.
- benchmarking is important to see the effect objectively
- fewer threads and fewer owners mean less overhead
- but don't be afraid: rust won't compile thread-unsafe code.