Theophilus Ngaribvume

This is where I share my experiences, ideas, and thoughts. One post a day.


Python Background Asynchronous Tasks with Django (Hacker News)

02 October, 2021 - 4 min read


Running periodic tasks in the background can be challenging for developers. After completing this post, you will be able to create your own custom Django background worker/scheduler.

Are periodic tasks useful?

Imagine you're using Django as a content aggregator, and you want to automatically fetch blog posts from multiple websites every 10 minutes. This is where asynchronous tasks come into play. You can write a function that runs periodically without any human intervention.

In this post you'll learn about three technologies/tools:

  1. Celery
  2. Django
  3. Redis

What is Celery

Celery is an asynchronous task queue based on distributed message passing. It can be used for scheduling tasks, and that's what we'll use it for in this post.

What is Redis

Redis is an in-memory data structure store. It can be used as a caching engine. Apart from being excellent for caching, it is also very simple to configure with Django.

In this project, however, we're going to use Redis as a message broker. Celery uses brokers to pass messages between Django and the workers we're going to create.

A message broker is a computer program that translates a message from the sender's formal messaging protocol to the receiver's formal messaging protocol.
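As a rough illustration of what the broker does (a sketch using Python's in-process `queue` module standing in for Redis, with a made-up `send_task`/`handle_next` pair — not Celery's actual API):

```python
import json
import queue

# In-process stand-in for the broker; in the real setup, Redis plays
# this role across separate Django and Celery worker processes.
broker = queue.Queue()

def send_task(task_name, args):
    # the "sender" serializes a task message and puts it on the broker
    broker.put(json.dumps({"task": task_name, "args": args}))

def handle_next():
    # the "worker" pulls the next message off the broker and acts on it
    message = json.loads(broker.get())
    return "ran {} with args {}".format(message["task"], message["args"])

send_task("update_hackernews", [])
print(handle_next())  # ran update_hackernews with args []
```

Celery and Redis do the same dance, except the producer and the worker live in different processes (often on different machines), which is why a serialization format like JSON matters.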

The Project Setup

I'm going to assume you already have Django installed and running; this section covers only the Redis and Celery setup. First, install Redis from the official Redis page.

After downloading Redis, start the Redis server by running $ redis-server in your terminal. To test that Redis is working properly, run $ redis-cli ping and it should reply with PONG.

Open your terminal and run the following commands to install Celery, Redis, and the other Python dependencies we'll need.

```shell
pip install celery
pip install redis
pip install django-celery-beat
pip install django-redis
```

Add Celery and Redis to your settings.py

After you have installed Redis and confirmed it is working, add the following code to your settings.py file:

```python
# django_celery_beat lets you view and manage your tasks from the django admin panel
INSTALLED_APPS = [
    # --------------- all the other apps -----------
    'django_celery_beat',
]

# CELERY STUFF
BROKER_URL = 'redis://localhost:6379'
CELERY_RESULT_BACKEND = 'redis://localhost:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'UTC'

# Redis caching (not related to this post, but you can go ahead and configure your caching as well)
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://localhost:6379/1",
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient"
        },
        "KEY_PREFIX": "napi"
    }
}
CACHE_TTL = 60 * 15
```

After you have edited your settings.py, run $ python manage.py migrate

Create a Task in your Django app (tasks.py)

```python
from celery import shared_task
import requests


# get the list of hacker news top story IDs
def get_hackernews_json():
    url = "https://hacker-news.firebaseio.com/v0/topstories.json"
    try:
        r = requests.get(url)
        r.raise_for_status()
        return r.json()
    # catch the specific errors first; the base RequestException goes last,
    # otherwise it would swallow all of them
    except requests.exceptions.HTTPError as errh:
        print("Http Error:", errh)
    except requests.exceptions.ConnectionError as errc:
        print("Error Connecting:", errc)
    except requests.exceptions.Timeout as errt:
        print("Timeout Error:", errt)
    except requests.exceptions.RequestException as err:
        print("Oops: Something Else", err)


# get hacker news top stories in detail
def update_hackernews_news():
    json = get_hackernews_json()
    if json is not None:
        for story in json:
            url = 'https://hacker-news.firebaseio.com/v0/item/{}.json'.format(story)
            try:
                r = requests.get(url)
                r.raise_for_status()
                news_obj = r.json()
                print(str(news_obj))
            except requests.exceptions.RequestException as err:
                print("Oops: Something Else", err)


# create your celery task
@shared_task(name="update_hackernews")
def update_hackernews():
    update_hackernews_news()
```

This is the task that will be scheduled to run.
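The loop in the task builds one detail URL per story ID. As a small, runnable sketch of that URL pattern (the `item_urls` helper and its `limit` parameter are my own additions for illustration; the story IDs are made up):

```python
ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{}.json"

def item_urls(story_ids, limit=3):
    """Build the detail URL for the first `limit` top-story IDs."""
    return [ITEM_URL.format(sid) for sid in story_ids[:limit]]

print(item_urls([101, 102], limit=2))
```

A `limit` like this can be handy in practice, since the top-stories endpoint returns up to 500 IDs and you may not want 500 follow-up requests every run.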

Configuring the Scheduler

Now, create a celery.py file inside the root project directory (the same directory as settings.py) and paste the following code.

```python
from __future__ import absolute_import
import os

from celery import Celery
from celery.schedules import crontab

# set the default Django settings module for the 'celery' program
# (replace api.settings with the location of your project settings)
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'api.settings')

app = Celery('api')

# Using a string here means the worker will not have to
# pickle the object when using Windows.
app.config_from_object('django.conf:settings')
app.autodiscover_tasks()


@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))


# create a 5-minute schedule for your hacker news task
app.conf.beat_schedule = {
    'update_hackernews_task': {
        'task': 'update_hackernews',
        'schedule': crontab(minute="*/5"),
    },
}
```
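If five minutes doesn't suit you, here are a few other patterns you could drop into the `schedule` key above — a sketch using standard `celery.schedules.crontab` arguments, not tied to this project:

```python
from celery.schedules import crontab

crontab(minute="*/10")                        # every 10 minutes
crontab(hour=7, minute=30)                    # every day at 07:30
crontab(hour=0, minute=0, day_of_week="mon")  # every Monday at midnight
```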

Run the Project

Now, the moment of truth: open your terminal and run the following commands, each in its own terminal window (replace the api app name with yours, as used in celery.py).

```shell
celery -A api worker -l info
celery -A api beat -l info
```

To run the project remotely, on a live server, you need a process manager like supervisor; for that, follow the instructions provided here: run celery tasks remotely with supervisor.

Happy coding!

Creator of Workly and Nerdlify, Currently working at Pindula.

© 2021, Theophilus Ngaribvume. All Rights Reserved.