Thursday, December 14, 2023

Set up 4CAT (Capture and Analysis Toolkit) on a Linux server using Docker

4CAT is a research tool that can be used to analyse and process data from online social platforms. Its goal is to make the capture and analysis of data from these platforms accessible to people through a web interface, without requiring any programming or web scraping skills. 

There is some great documentation already, but there is no harm in adding a bit more detail.
Tested on Ubuntu 22.04.3 LTS & 4CAT 1.38

Ref:

Install Docker on Server

  1. Set up the Docker repository
    # Add Docker's official GPG key:
    $ sudo apt-get update
    $ sudo apt-get install ca-certificates curl gnupg
    $ sudo install -m 0755 -d /etc/apt/keyrings
    $ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
    $ sudo chmod a+r /etc/apt/keyrings/docker.gpg
    
    # Add the repository to Apt sources:
    $ echo \
      "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
      $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
      sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
    $ sudo apt-get update
    
  2. Install docker
    $ sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
  3. Install the standalone docker-compose command (the method used for 4CAT here)
    $ sudo apt install docker-compose
    
  4. Test docker
    $ sudo docker run hello-world
    Unable to find image 'hello-world:latest' locally
    latest: Pulling from library/hello-world
    719385e32844: Pull complete
    Digest: sha256:c79d06dfdfd3d3eb04cafd0dc2bacab0992ebc243e083cabe208bac4dd7759e0
    Status: Downloaded newer image for hello-world:latest
    
    Hello from Docker!
    This message shows that your installation appears to be working correctly.
      .....
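
Before moving on to 4CAT, it can help to confirm that the commands installed above are actually on the PATH. The following is only a minimal sketch: have_cmd and report_tools are hypothetical helper names made up for this post, not part of Docker or 4CAT.

```shell
#!/bin/sh
# have_cmd NAME: succeed if NAME resolves to a command on this system.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# report_tools NAME...: print one "found" or "missing" line per tool.
report_tools() {
    for tool in "$@"; do
        if have_cmd "$tool"; then
            echo "$tool: found"
        else
            echo "$tool: missing"
        fi
    done
}

# Check the two commands used in the rest of this post.
report_tools docker docker-compose
```

If either line says "missing", repeat the matching install step before continuing.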

Install 4CAT

  1. Download docker-compose.yml and .env 
  2. There is a good chance that .env was downloaded as env.txt; if so, rename it, i.e. mv env.txt .env
  3. Edit .env so that it uses the full hostname of the server and a new port number.
    # Server information
    # SERVER_NAME is only used on first run; afterwards it can be set in the frontend
    SERVER_NAME=linux01.dcs.bbk.ac.uk
    PUBLIC_PORT=8080
    
  4. (optional) If you do not want to run docker as root, add an existing user (or create a new one) to the docker group, for example with sudo usermod -aG docker 4catuser or by editing /etc/group so that the docker line lists the user:
      docker:x:998:4catuser
    
  5. Make sure docker-compose.yml and .env are in the directory from which you are going to install and run 4CAT, then start it:
    $ docker-compose up -d
    Creating network "4cat_default" with the default driver
    Creating volume "4cat_4cat_db" with default driver
    Creating volume "4cat_4cat_data" with default driver
    Creating volume "4cat_4cat_share" with default driver
    Creating volume "4cat_4cat_logs" with default driver
    Pulling db (postgres:latest)...
    latest: Pulling from library/postgres
    1f7ce2fa46ab: Pull complete
    8a0c088137b8: Pull complete
    11be68f68a2e: Pull complete
    19f13c4e1d96: Pull complete
    43187fdc5ebc: Pull complete
    a84cb0803492: Pull complete
    b50a897e2632: Pull complete
    7bc6d5552c52: Pull complete
    c8161286a3f1: Pull complete
    e36f0ab546af: Pull complete
    c2a71678092b: Pull complete
    7c23bdcac538: Pull complete
    16648961c661: Pull complete
    Digest: sha256:a2282ad0db623c27f03bab803975c9e3942a24e974f07142d5d69b6b8eaaf9e2
    Status: Downloaded newer image for postgres:latest
    Pulling backend (digitalmethodsinitiative/4cat:stable)...
    stable: Pulling from digitalmethodsinitiative/4cat
    a378f10b3218: Pull complete
    c11bdfacfd25: Pull complete
    b8cc7de3de04: Pull complete
    93b0b6f266cf: Pull complete
    5951bacf5eee: Pull complete
    bbad587f60cb: Pull complete
    d7621de0c96b: Pull complete
    4f9d94b6505d: Pull complete
    dcdff001d5c5: Pull complete
    5f3040f09c72: Pull complete
    Digest: sha256:3afef45f8690770e6a0fe71914bddbf20f21193f99274dac36a9d2ef9b03082f
    Status: Downloaded newer image for digitalmethodsinitiative/4cat:stable
    Creating 4cat_db ... done
    Creating 4cat_backend ... done
    Creating 4cat_frontend ... done
    
  6. You now have a running 4CAT system. Visit it at the server name and port you set in .env, e.g. http://myhost.dcs.bbk.ac.uk:8080
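
The .env edit in step 3 can also be scripted, which is handy if you rebuild the server often. This is a sketch under assumptions: set_env_var is a hypothetical helper written for this post, and it assumes the values contain no | or & characters (true for a hostname and a port number).

```shell
#!/bin/sh
# set_env_var FILE KEY VALUE: replace an existing KEY=... line in FILE,
# or append KEY=VALUE if the key is not present yet.
# Assumption: VALUE contains no '|' or '&' (they are special to sed here).
set_env_var() {
    file=$1
    key=$2
    value=$3
    if grep -q "^${key}=" "$file"; then
        # Write to a temporary file and move it into place (avoids sed -i,
        # which differs between platforms).
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        echo "${key}=${value}" >> "$file"
    fi
}

# Only touch .env if it exists in the current directory.
if [ -f .env ]; then
    set_env_var .env SERVER_NAME linux01.dcs.bbk.ac.uk
    set_env_var .env PUBLIC_PORT 8080
fi
```

Run it before the first docker-compose up -d, since SERVER_NAME is only read on first run.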

4CAT configuration for email users (optional)

  • Email Setup: Control Panel -> Settings ->  Mail settings & credentials
    • Need to set Admin e-mail
    • SMTP server
  • Fix the URL issue in emails: Control Panel -> Settings -> Flask Settings, then update "Host name" to your desired server name (e.g. myhost.dcs.bbk.ac.uk:8080)

Where is my data?

You need to think about backing up your data and making sure you have enough disk space.

All data is stored in /var/lib/docker, and your volumes (database, etc.) are in /var/lib/docker/volumes.
$ df -h /var/lib/docker
Filesystem                         Size  Used Avail Use% Mounted on
/dev/mapper/ubuntu--vg-ubuntu--lv   49G   13G   34G  28% /

$ ls -l /var/lib/docker/volumes
total 40
drwx-----x 3 root root   4096 Dec 14 10:46 4cat_4cat_data
drwx-----x 3 root root   4096 Dec 14 10:46 4cat_4cat_db
drwx-----x 3 root root   4096 Dec 14 10:46 4cat_4cat_logs
drwx-----x 3 root root   4096 Dec 14 10:46 4cat_4cat_share
brw------- 1 root root 253, 0 Dec 12 14:38 backingFsBlockDev
-rw------- 1 root root  32768 Dec 14 10:46 metadata.db
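
The disk space check and a simple volume backup can be sketched as shell functions. Treat this as a starting point rather than a tested backup procedure: avail_kb, has_space and backup_volume are hypothetical names, the alpine image is just one small image that ships tar, and /srv/backups is a made-up destination. Archiving the database volume while PostgreSQL is running may give an inconsistent copy, so stop the containers first or use a proper database dump for that one.

```shell
#!/bin/sh
# avail_kb PATH: print the free space, in KiB, on the filesystem holding PATH.
avail_kb() {
    # -P forces one line per filesystem; -k reports sizes in KiB.
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# has_space PATH MIN_KB: succeed if at least MIN_KB KiB are free at PATH.
has_space() {
    [ "$(avail_kb "$1")" -ge "$2" ]
}

# backup_volume VOLUME DEST_DIR: archive a named Docker volume as
# DEST_DIR/VOLUME.tar.gz. DEST_DIR must be an absolute path.
# Assumption: a throwaway alpine container is acceptable for running tar.
backup_volume() {
    docker run --rm \
        -v "$1":/source:ro \
        -v "$2":/backup \
        alpine tar czf "/backup/$1.tar.gz" -C /source .
}

# Example usage (not run here):
#   has_space /var/lib/docker 10485760 || echo "less than 10 GiB free"
#   backup_volume 4cat_4cat_data /srv/backups
```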
