# Alar

This repository contains the `./site` dictpress theme and other govarnam data required by the Alar dictionary (Nomad) deployment.

### Govarnam suggestions

IMPORTANT: For some reason, loading words directly into varnam for Kannada doesn't produce good results for many word variants, for instance "kannada". However, `learnings/kn.vst.learnings` (88+ MB) contains all the variations and produces correct words. These would have come from user selections in the frontend. So unless the existing learnings file is lost, don't do this afresh.

```sh
# Start govarnam in the background
docker compose up -d govarnam

# If there is existing training to be wiped, stop the govarnam container and:
rm -rf ./govarnam/input/* && rm -rf ./govarnam/learnings/*

# Dump words from the DB to disk.
docker compose exec -T db psql -U alar -d alar -c "COPY (SELECT DISTINCT(content) FROM entries WHERE initial != '' AND lang='kannada') TO STDOUT WITH CSV DELIMITER ','" > ./govarnam/input/words.csv

# Load words into govarnam
docker compose exec -T govarnam varnamcli -s kn -learn-from-file /govarnam/input/words.csv
```
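
Given the warning above, it may be worth guarding the wipe-and-retrain steps behind a check for an existing learnings file. The following is a minimal sketch and not part of this repository: the `LEARNINGS` variable, the `should_retrain` helper, and the assumption that the file lives at `./govarnam/learnings/kn.vst.learnings` on the host are all illustrative.

```sh
#!/bin/sh
# Guard sketch: only suggest retraining when no learnings file is present.
# LEARNINGS is a hypothetical variable; its default path is an assumption
# derived from the paths mentioned in this section.
LEARNINGS="${LEARNINGS:-./govarnam/learnings/kn.vst.learnings}"

should_retrain() {
    # Succeeds (exit 0) only when the file is missing or empty.
    [ ! -s "$1" ]
}

if should_retrain "$LEARNINGS"; then
    echo "No learnings found at $LEARNINGS; retraining from the DB dump is reasonable."
else
    echo "Existing learnings found at $LEARNINGS; skipping retraining."
fi
```

Running this before the `rm -rf` step gives a quick confirmation that nothing valuable is about to be deleted.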

### Cron jobs

### CSV dump cron

For the periodic public CSV data dump, set up a cron job to run `./scripts/dump-csv.sh`, which dumps the CSV to the `files` directory.
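
As a rough illustration, the snippet below prints a crontab entry for the dump script. The schedule (every day at 03:00) and the checkout path `/home/alar/alar` are assumptions rather than values from this repository, and `ALAR_DIR`, `CSV_SCHEDULE`, and `csv_cron_line` are hypothetical names; adjust them to the actual deployment.

```sh
#!/bin/sh
# Sketch: print a crontab entry for the CSV dump.
# ALAR_DIR and CSV_SCHEDULE are hypothetical names with assumed defaults.
ALAR_DIR="${ALAR_DIR:-/home/alar/alar}"       # assumed checkout path
CSV_SCHEDULE="${CSV_SCHEDULE:-0 3 * * *}"     # assumed: every day at 03:00

csv_cron_line() {
    # cd into the checkout first, in case the script relies on relative paths.
    printf '%s cd %s && ./scripts/dump-csv.sh\n' "$CSV_SCHEDULE" "$ALAR_DIR"
}

csv_cron_line
```

The printed line can then be added to the relevant user's crontab with `crontab -e`.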

### Sitemap generate cron

This ideally only needs to run once a month or even once a year.

```sh
dictpress --config=config/config.toml sitemap --from-lang=kannada --to-lang=english --output-dir=/home/alar/www/sitemaps --url="https://alar.ink/sitemaps" --output-prefix=sitemap_ --robots=true
```
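
In crontab syntax, "once a month" can be written as `0 4 1 * *` and "once a year" as `0 4 1 1 *`. The sketch below prints either entry. The time of day (04:00) is an assumption, and `SITEMAP_CMD` points at a hypothetical wrapper script that would contain the `dictpress` command above; no such script is confirmed to exist in this repository.

```sh
#!/bin/sh
# Sketch: print a monthly or yearly crontab entry for sitemap generation.
# SITEMAP_CMD is a hypothetical wrapper around the dictpress sitemap command.
SITEMAP_CMD="${SITEMAP_CMD:-/home/alar/bin/generate-sitemap.sh}"

sitemap_cron_line() {
    # $1 must be "monthly" or "yearly"; anything else is rejected.
    case "$1" in
        monthly) printf '0 4 1 * * %s\n' "$SITEMAP_CMD" ;;   # 04:00 on day 1 of each month
        yearly)  printf '0 4 1 1 * %s\n' "$SITEMAP_CMD" ;;   # 04:00 on 1 January
        *) echo "usage: sitemap_cron_line monthly|yearly" >&2; return 1 ;;
    esac
}

sitemap_cron_line monthly
```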