Queue
Syntax
- q = Queue.new
- q.push object
- q << object # same as #push
- q.pop #=> object
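As a small illustration of the calls listed above, the following sketch shows that a Queue returns items in the order they were pushed (first in, first out). The variable names are only illustrative.

```ruby
q = Queue.new
q.push :first
q << :second          # same as #push

first  = q.pop        # items come back in insertion order
second = q.pop

puts first.inspect    # :first
puts second.inspect   # :second
puts q.empty?         # true
```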
Multiple Workers One Sink
We want to gather data created by multiple Workers.
First we create a Queue:
sink = Queue.new
Then start 16 workers, each generating a random number and pushing it into sink:
(1..16).to_a.map do
  Thread.new do
    sink << rand(1..100)
  end
end.map(&:join)
And to get the data, convert a Queue to an Array:
data = [].tap { |a| a << sink.pop until sink.empty? }
One Source Multiple Workers
We want to process data in parallel.
Let’s populate source with some data:
source = Queue.new
data = (1..100)
data.each { |e| source << e }
Then create some workers to process data:
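(1..16).to_a.map do
  Thread.new do
    loop do
      # Non-blocking pop raises ThreadError once the queue is empty, so a
      # worker never blocks on an item another worker has already taken.
      item = begin
        source.pop(true)
      rescue ThreadError
        break
      end
      sleep 0.5
      puts "Processed: #{item}"
    end
  end
end.map(&:join)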
One Source - Pipeline of Work - One Sink
We want to process data in parallel and push it down the line to be processed by other workers.
Since workers both consume and produce data, we have to create two queues:
first_input_source = Queue.new
first_output_sink = Queue.new
100.times { |i| first_input_source << i }
The first wave of workers reads an item from first_input_source, processes it, and writes the results to first_output_sink:
(1..16).to_a.map do
  Thread.new do
    loop do
      item = first_input_source.pop
      # Write to the queue defined above (first_output_sink).
      first_output_sink << item ** 2
      first_output_sink << item ** 3
    end
  end
end
The second wave of workers uses first_output_sink as its input source: it reads an item, processes it, then writes to another output sink:
second_input_source = first_output_sink
second_output_sink = Queue.new
(1..32).to_a.map do
  Thread.new do
    loop do
      item = second_input_source.pop
      second_output_sink << item * 2
      second_output_sink << item * 3
    end
  end
end
Now second_output_sink is the sink; let’s convert it to an array:
sleep 5 # workaround in place of synchronization
sink = second_output_sink
[].tap { |a| a << sink.pop until sink.empty? }
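The sleep above only guesses when the workers have finished. As a hedged alternative, assuming Ruby 2.3 or later (where Queue#close is available), each stage can close its output queue once all of its workers are done. #pop on a closed, empty queue returns nil instead of blocking, which lets every worker loop end by itself. The queue names, worker counts, and the single-output-per-item processing below are simplifications chosen for this sketch, not the setup used above.

```ruby
input  = Queue.new
middle = Queue.new
output = Queue.new

10.times { |i| input << i }
input.close                      # no more items will be pushed

# Stage one: square each item. pop returns nil once input is closed and empty.
stage_one = Array.new(4) do
  Thread.new do
    while (item = input.pop)
      middle << item**2
    end
  end
end

# Stage two: double each squared value.
stage_two = Array.new(4) do
  Thread.new do
    while (item = middle.pop)
      output << item * 2
    end
  end
end

stage_one.each(&:join)           # every stage-one worker has finished...
middle.close                     # ...so nothing else will reach middle
stage_two.each(&:join)
output.close

results = []
while (value = output.pop)
  results << value
end

puts results.sort.inspect        # [0, 2, 8, 18, 32, 50, 72, 98, 128, 162]
```

Because the loops stop on nil, this pattern only works when nil and false are never pushed as data.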
Pushing Data into a Queue - #push
q = Queue.new
q << "any object including another queue"
# or
q.push :data
- There is no high-water mark; a queue can grow without bound.
- #push never blocks.
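A quick way to see this, as a sketch: push many items while nothing is consuming them. Every call returns immediately, and Queue#size reports how many items are waiting.

```ruby
q = Queue.new

# Nothing is popping from q, yet every push returns immediately.
10_000.times { |i| q << i }

puts q.size   # 10000
```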
Pulling Data from a Queue - #pop
q = Queue.new
q << :data
q.pop #=> :data
- #pop blocks until data is available.
- #pop can be used for synchronization.
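The blocking behaviour can be observed with a small sketch. A consumer thread calls #pop on an empty queue and stays parked until the main thread pushes something. The short sleep is only there to give the consumer time to reach #pop, and the reported thread status is timing-dependent (it is normally "sleep" while the thread is blocked). The last part shows the non-blocking form, #pop(true), which raises ThreadError instead of waiting.

```ruby
q = Queue.new
received = nil

consumer = Thread.new do
  received = q.pop          # blocks here until something is pushed
end

sleep 0.1                   # give the consumer time to reach #pop
waiting = consumer.status   # normally "sleep" while blocked in #pop

q << :data                  # wakes the consumer
consumer.join

puts "status while waiting: #{waiting}"
puts received.inspect       # :data

# Passing true makes #pop non-blocking: it raises ThreadError on an empty queue.
raised = false
begin
  q.pop(true)
rescue ThreadError
  raised = true
end
puts "non-blocking pop raised: #{raised}"
```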
Synchronization - After a Point in Time
syncer = Queue.new
a = Thread.new do
  syncer.pop
  puts "this happens at end"
end
b = Thread.new do
  puts "this happens first"
  STDOUT.flush
  syncer << :ok
end
[a, b].map(&:join)
Converting a Queue into an Array
q = Queue.new
q << 1
q << 2
a = Array.new
a << q.pop until q.empty?
Or as a one-liner:
[].tap { |array| array << q.pop until q.empty? }
Merging Two Queues
- To avoid blocking forever, the queues shouldn’t be read on the thread where the merge happens.
- To avoid extra synchronization, or waiting forever on one queue while the other has data, the two queues shouldn’t be read on the same thread.
Let’s start by defining and populating two queues:
q1 = Queue.new
q2 = Queue.new
(1..100).each { |e| q1 << e }
(101..200).each { |e| q2 << e }
We should create another queue and push data from other threads into it:
merged = Queue.new
[q1, q2].map do |q|
  Thread.new do
    loop do
      merged << q.pop
    end
  end
end
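The forwarding threads above loop forever, so whoever reads from merged has to know when to stop. A minimal sketch, assuming the total number of items (200 here) is known in advance: pop exactly that many times, then stop the forwarders explicitly. The forwarders variable and the use of Thread#kill are choices made for this sketch.

```ruby
q1 = Queue.new
q2 = Queue.new
(1..100).each { |e| q1 << e }
(101..200).each { |e| q2 << e }

merged = Queue.new
forwarders = [q1, q2].map do |q|
  Thread.new do
    loop { merged << q.pop }
  end
end

# Exactly 200 items were produced, so pop exactly 200 times.
items = Array.new(200) { merged.pop }

# The forwarders are now blocked in #pop with nothing left to read; stop them.
forwarders.each(&:kill)

puts items.size                     # 200
puts items.sort == (1..200).to_a    # true
```

The order of items depends on thread scheduling, which is why the check sorts them first.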
If you know you can completely consume both queues (consumption is faster than production, and you won’t run out of RAM), there is a simpler approach:
merged = Queue.new
merged << q1.pop until q1.empty?
merged << q2.pop until q2.empty?