@diogoos
Last active August 19, 2025 04:12
from queue import Queue
from concurrent.futures import ThreadPoolExecutor
import threading

# Shared queue that carries worker results to the single writer thread
queue = Queue()
# Placeholder input: replace with the iterable of items to process
data = []

def worker(datum):
    # Pre-process text, create embeddings, or do other compute-intensive tasks
    result = datum  # placeholder: replace with the real computation
    queue.put(result)

def writer():
    while True:
        item = queue.get()
        try:
            if item is None:
                break  # sentinel
            # Write the item to an output file as needed, or otherwise consume the output
        finally:
            queue.task_done()  # without this, queue.join() below never returns

# Start writer thread
writer_thread = threading.Thread(target=writer, daemon=True)
writer_thread.start()

# Loop through each piece of data, and submit it to be asynchronously processed
with ThreadPoolExecutor() as executor:
    for datum in data:
        executor.submit(worker, datum)

# Wait for queue to empty and stop writer
queue.join()
queue.put(None)  # sentinel
writer_thread.join()
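
The worker-pool-plus-single-writer pattern above can be tried end to end with a small self-contained sketch. The names `run_pipeline`, `process`, `out_queue`, and `results` are hypothetical and not part of the original snippet, and the `results` list merely stands in for an output file. This is an illustration under those assumptions, not a definitive implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue
import threading


def run_pipeline(items, process):
    """Apply process(item) in a thread pool; one writer thread collects the results."""
    out_queue = Queue()
    results = []  # assumption: stands in for an output file

    def worker(item):
        # Compute in a pool thread and hand the result to the writer
        out_queue.put(process(item))

    def writer():
        while True:
            item = out_queue.get()
            try:
                if item is None:
                    break  # sentinel: stop the writer
                results.append(item)
            finally:
                # Mark every dequeued item as handled so out_queue.join() can return
                out_queue.task_done()

    writer_thread = threading.Thread(target=writer, daemon=True)
    writer_thread.start()

    # Leaving the with-block waits for all submitted workers to finish
    with ThreadPoolExecutor() as executor:
        for item in items:
            executor.submit(worker, item)

    out_queue.join()     # wait until the writer has consumed every result
    out_queue.put(None)  # sentinel
    writer_thread.join()
    return results


if __name__ == "__main__":
    squares = run_pipeline(range(5), lambda x: x * x)
    print(sorted(squares))
```

Results reach the writer in completion order, not submission order, so the caller sorts them here. Because `None` is the sentinel, a `process` function that legitimately returns `None` would stop the writer early; a dedicated sentinel object would avoid that.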