Alar

This repository contains the ./site dictpress theme and other govarnam data required by the Alar dictionary (Nomad) deployment.

Govarnam suggestions

IMPORTANT: Loading words directly into varnam for Kannada does not, for some reason, produce good results for many word variants, for instance "kannada". However, the existing learnings/kn.vst.learnings file (88+ MB) contains all the variations needed to produce correct words. These would have been learned via user selections from the frontend. So unless the existing learnings file is lost, do not retrain from scratch.
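Since the existing learnings are hard to recreate, it may be worth copying them aside before running any wipe step. The following is a minimal sketch, not part of this repository: the function name `backup_learnings` and the backup location are hypothetical, and it assumes the current user can read the learnings directory.

```shell
# Hypothetical helper: copy a learnings directory into a timestamped backup.
# Usage: backup_learnings SRC_DIR DEST_ROOT
backup_learnings() {
  src="$1"
  dest="$2/learnings-$(date +%Y%m%d-%H%M%S)"
  # Refuse to continue if there is nothing to back up.
  [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
  # Copy the directory contents (including dotfiles), preserving attributes,
  # then print the backup path so the caller can record it.
  mkdir -p "$dest" && cp -a "$src"/. "$dest"/ && echo "$dest"
}
```

For example, `backup_learnings ./govarnam/learnings ./backups` would print the path of the new copy, where `./backups` is an arbitrary location chosen for illustration.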

# Start govarnam in the background
docker compose up -d govarnam

# If there is existing training to be wiped, stop the govarnam container and:
rm -rf ./govarnam/input/* && rm -rf ./govarnam/learnings/*

# Dump words from the DB to disk.
docker compose exec -T db psql -U alar -d alar -c "COPY (SELECT DISTINCT(content) FROM entries WHERE initial != '' AND lang='kannada') TO STDOUT WITH CSV DELIMITER ','" > ./govarnam/input/words.csv

# Load words into govarnam
docker compose exec -T govarnam varnamcli -s kn -learn-from-file /govarnam/input/words.csv

Cron jobs

CSV dump cron

For the periodic public CSV data dump, set a cron job to run ./scripts/dump-csv.sh, which dumps the CSV to the files directory.

Sitemap generation cron

This ideally only needs to run once a month or even once a year.

dictpress --config=config/config.toml sitemap --from-lang=kannada --to-lang=english --output-dir=/home/alar/www/sitemaps --url="https://alar.ink/sitemaps" --output-prefix=sitemap_ --robots=true
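The two jobs above could be scheduled with crontab entries along these lines. This is only a sketch: `/path/to/alar.ink` is a placeholder for wherever the repository is checked out, the weekly and monthly schedules are assumptions, and it presumes `dictpress` is on cron's PATH and that both commands are meant to run from the repository root.

```
# m  h  dom mon dow  command
# Weekly CSV dump (the schedule is an assumption; pick any period).
0 3 * * 0  cd /path/to/alar.ink && ./scripts/dump-csv.sh
# Monthly sitemap regeneration, on the 1st of each month.
0 4 1 * *  cd /path/to/alar.ink && dictpress --config=config/config.toml sitemap --from-lang=kannada --to-lang=english --output-dir=/home/alar/www/sitemaps --url="https://alar.ink/sitemaps" --output-prefix=sitemap_ --robots=true
```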