# Alar
This repository contains the `./site` dictpress theme and the govarnam data required by the Alar dictionary (Nomad) deployment.
### Govarnam suggestions
IMPORTANT: Loading words directly into varnam for Kannada does not, for reasons that are unclear, produce good results for many word variants (for instance, "kannada"). The existing `learnings/kn.vst.learnings` file (88+ MB), however, contains all the variations needed to produce correct words, presumably accumulated from user selections on the frontend. Unless the existing learnings file is lost, do not retrain from scratch.
```sh
# Start govarnam in the background
docker compose up -d govarnam
# If there is existing training to be wiped, stop the govarnam container and:
rm -rf ./govarnam/input/* && rm -rf ./govarnam/learnings/*
# Dump words from the DB to disk.
docker compose exec -T db psql -U alar -d alar -c "COPY (SELECT DISTINCT(content) FROM entries WHERE initial != '' AND lang='kannada') TO STDOUT WITH CSV DELIMITER ','" > ./govarnam/input/words.csv
# Load words into govarnam
docker compose exec -T govarnam varnamcli -s kn -learn-from-file /govarnam/input/words.csv
```
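
Because the wipe step above is destructive and the learnings file is hard to regenerate, it may be worth taking a timestamped copy first. The following is a minimal sketch, not part of this repository: `backup_learnings` is a hypothetical helper, and both the `./govarnam/learnings/kn.vst.learnings` path and the `./backups` directory in the usage line are assumptions.

```sh
# Hypothetical helper: copy a learnings file into a backup directory
# with a timestamp suffix, and print the path of the copy.
backup_learnings() {
    src="$1"
    dest_dir="$2"

    # Refuse to continue if there is nothing to back up.
    if [ ! -f "$src" ]; then
        echo "no learnings file at $src" >&2
        return 1
    fi

    mkdir -p "$dest_dir"
    dest="$dest_dir/$(basename "$src").$(date +%Y%m%d-%H%M%S).bak"

    # -p preserves mode and timestamps of the original file.
    cp -p "$src" "$dest" && echo "$dest"
}
```

Usage, with assumed paths: `backup_learnings ./govarnam/learnings/kn.vst.learnings ./backups`.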
### Cron jobs
### CSV dump cron
For the periodic public CSV data dump, set up a cron job to run `./scripts/dump-csv.sh`, which writes the CSV to the `files` directory.
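
As an illustration, the cron entry for the dump could be built and reviewed like this. This is a sketch under assumptions: the checkout path `/home/alar/alar.ink`, the monthly schedule (02:30 on the 1st), and the log file location are all hypothetical and should be adjusted to the actual deployment.

```sh
# Assumed checkout location of this repository; adjust as needed.
REPO_DIR="/home/alar/alar.ink"

# Assumed schedule: 02:30 on the 1st of every month.
# The script is run from the repository root so its relative paths resolve.
CRON_ENTRY="30 2 1 * * cd $REPO_DIR && ./scripts/dump-csv.sh >> $REPO_DIR/dump-csv.log 2>&1"

# Print the entry for review. To install it, append it to the current crontab:
#   (crontab -l 2>/dev/null; echo "$CRON_ENTRY") | crontab -
echo "$CRON_ENTRY"
```

Printing the entry rather than installing it directly keeps the sketch side-effect free; the commented pipeline appends the line to whatever crontab the user already has.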
### Sitemap generate cron
This ideally only needs to run once a month or even once a year.
```sh
dictpress --config=config/config.toml sitemap \
    --from-lang=kannada --to-lang=english \
    --output-dir=/home/alar/www/sitemaps \
    --url="https://alar.ink/sitemaps" \
    --output-prefix=sitemap_ --robots=true
```
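
Since this job runs rarely, a quick check that it actually wrote something can be useful. The sketch below assumes that dictpress names the generated files with the `--output-prefix` value (`sitemap_` here); `count_sitemaps` is a hypothetical helper, not part of dictpress or this repository.

```sh
# Hypothetical helper: count regular files in a directory whose names
# start with the given prefix (non-recursive).
count_sitemaps() {
    dir="$1"
    prefix="$2"
    find "$dir" -maxdepth 1 -type f -name "${prefix}*" | wc -l | tr -d ' '
}
```

Usage: `count_sitemaps /home/alar/www/sitemaps sitemap_` prints the number of matching files; a result of `0` after a run suggests the output directory or prefix is not what was expected.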