[script] creating a nice unit list for the wiki

lynx
Posts: 188
Joined: March 22nd, 2004, 2:12 pm
Location: Slovenija

Post by lynx »

I talked with Viliam, the Slovak translation manager, and he allowed me to use his cool unit list layout (http://wesnoth.slack.it/?SlovakTranslation).
He is right, it's hard to edit by hand, so I wrote a bash script to transfer my previous (and future) translations into it automagically.

Code:

#!/bin/bash

SKWIKI=filename        # Slovak wiki page used as the base; the unit list part only
TRUNITLIST=filename    # unit translations file, '=' separated
TRFACTLIST=filename    # faction translations file, '=' separated; its last 2 lines are special (see below)
OUTPUT=filename        # file the translated unit list is written to

#until female units are added
sed -i 's/female^//' "$TRUNITLIST"

rm -f "$OUTPUT"    # -f: no error if the file does not exist yet
while read -r line # -r: do not treat backslashes as escapes
do
    if [ "${line:0:6}" == "<li><a" ]; #faction list
    then
        FACTION=`echo "$line" | cut -d"_" -f2 | cut -d'"' -f1`
        FACTION=${FACTION:-NONmATCHINGsTRING}
        TRFACT=`grep -i -w --color=never "$FACTION" "$TRFACTLIST" | head -n 1 | cut -d"=" -f2`
        TRFACT=${TRFACT:-?}
        [ "$FACTION" != "monster" ] && [ "$FACTION" != "special" ] && echo '<li><a href="#unit_'"$FACTION"'">'"$TRFACT"'</a> <span style="color:silver;">('"$FACTION"')</span></li>' >> "$OUTPUT" && continue
        #next line is dodgy, but works. Requires $TRFACTLIST to have the last 2 lines as orig.
        #the substitutions are quoted so that translations containing spaces stay inside one sed expression
        echo "$line" | sed -e 's/Ostatné/'"$(tail -n2 "$TRFACTLIST" | cut -d"=" -f2 | head -n1)"'/' -e 's/Špeciálne/'"$(tail -n1 "$TRFACTLIST" | cut -d"=" -f2)"'/' >> "$OUTPUT"
    elif [ "${line:0:5}" == "<p><a" ]; #faction anchors
    then
        LINK=`echo "$line" | cut -d"_" -f2 | cut -d'"' -f1`
        TRFACT=`grep -i -w --color=never "$LINK" "$TRFACTLIST" | head -n 1 | cut -d"=" -f2`
        TRFACT=${TRFACT:-$LINK}
        #test the English key ($LINK), not its translation, for the two special groups
        [ "$LINK" != "monster" ] && [ "$LINK" != "special" ] && echo '<p><a name="unit_'"$LINK"'">'"$TRFACT"'</a></p>' >> "$OUTPUT" && continue
        echo "$line" | sed -e 's/Ostatné/'"$(tail -n2 "$TRFACTLIST" | cut -d"=" -f2 | head -n1)"'/' -e 's/Špeciálne/'"$(tail -n1 "$TRFACTLIST" | cut -d"=" -f2)"'/' >> "$OUTPUT"
    elif [ "${line:0:3}" == "<li" ]; #units
    then
        UNIT=`echo "$line" | cut -d"(" -f2| cut -d")" -f1 | grep -v "<"`
        UNIT=${UNIT:-NONmATCHINGsTRING}
        TRUNIT=`grep -i -w --color=never "^$UNIT" "$TRUNITLIST" | head -n 1 | cut -d"=" -f2`
        TRUNIT=${TRUNIT:-?}
        END=`echo "$line" | sed 's,.*</span>\(.*\),\1,'`
        echo '<li>'"$TRUNIT"' <span style="color:silver;">('"$UNIT"')</span>'"$END" >> "$OUTPUT"
    else #misc
        echo "$line" >> "$OUTPUT"
    fi
done < "$SKWIKI"
It looks like binary garble, I know ><. You can make it work for you if you have the files mentioned in the header and a bash console. :D
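
To make the unit branch of the loop less cryptic, here is a minimal sketch of what it does to a single line. The input line is an assumption about what the Slovak wiki markup looks like (`SlovakName` is a placeholder), and the translation is hard-coded instead of being looked up in `$TRUNITLIST`.

```shell
#!/bin/bash
# Hypothetical input line; the real Slovak wiki markup may differ in detail.
line='<li>SlovakName <span style="color:silver;">(Spearman)</span></li>'
TRUNIT='Suličar'   # the script would fetch this from $TRUNITLIST

# The English unit name is the text between the first "(" and the next ")".
UNIT=$(echo "$line" | cut -d"(" -f2 | cut -d")" -f1)
# Everything after the last </span> is carried over unchanged.
END=$(echo "$line" | sed 's,.*</span>\(.*\),\1,')

# Rebuild the line with the translated name in front.
RESULT='<li>'"$TRUNIT"' <span style="color:silver;">('"$UNIT"')</span>'"$END"
echo "$RESULT"
```

The English name in parentheses is the only thing taken from the Slovak page, which is why the Slovak text in front of it does not matter.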

The TRFACTLIST logic is quick and dirty, so it requires the last two lines to be:
Ostatné=yourTranslationOf("monster")
Špeciálne=yourTranslationOf("special")
in my case:
/.../
Troll=Troli
Dwarf=Škratje
Ostatné=Pošasti
Špeciálne=Posebne
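
As a rough sketch of why those two lines have to come last: the script never looks at their keys, it simply takes the last two lines by position. The temporary file below is only for illustration and is built from the sample lines above; `MONSTER_TR` and `SPECIAL_TR` are made-up variable names.

```shell
#!/bin/bash
# Throwaway faction file built from the sample lines in the post.
TRFACTLIST=$(mktemp)
printf '%s\n' 'Troll=Troli' 'Dwarf=Škratje' 'Ostatné=Pošasti' 'Špeciálne=Posebne' > "$TRFACTLIST"

# The second-to-last line holds the translation of "monster" ...
MONSTER_TR=$(tail -n 2 "$TRFACTLIST" | head -n 1 | cut -d"=" -f2)
# ... and the last line holds the translation of "special".
SPECIAL_TR=$(tail -n 1 "$TRFACTLIST" | cut -d"=" -f2)

echo "monster -> $MONSTER_TR"
echo "special -> $SPECIAL_TR"
rm -f "$TRFACTLIST"
```

Because the lookup is purely positional, an extra blank line at the end of the file would shift both values.
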
My TRUNITLIST looks the same:
/.../
Elder Mage=Čarovnik starešina čarovniška?
female^White Mage=Bela čarovnica
White Mage=Bel čarovnik

LOJALISTI
Spearman=Suličar
Swordsman=Mečevalec
Royal Guard=Kraljevi stražar
Bowman=Lokostrelec
Cavalier=Kavalir
/.../
(It is only ever read with grep, so it can contain junk such as the blank line and the LOJALISTI heading above.) Both files are basically just one-entry-per-line, '='-delimited lists.
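
A minimal sketch of that lookup, assuming a file in the format shown above. `lookup` is a hypothetical helper that does not exist in the script; it just wraps the same grep/head/cut chain so the effect of junk lines and missing entries is easy to see.

```shell
#!/bin/bash
# Throwaway unit file: real entries mixed with a blank line and a heading.
TRUNITLIST=$(mktemp)
printf '%s\n' 'White Mage=Bel čarovnik' '' 'LOJALISTI' \
    'Spearman=Suličar' 'Royal Guard=Kraljevi stražar' > "$TRUNITLIST"

# lookup NAME: print the translation of NAME, or "?" if there is none.
lookup() {
    local tr
    tr=$(grep -i -w "^$1" "$TRUNITLIST" | head -n 1 | cut -d"=" -f2)
    echo "${tr:-?}"
}

SPEARMAN=$(lookup "Spearman")
GUARD=$(lookup "Royal Guard")
MISSING=$(lookup "Bowman")   # not in the file

echo "$SPEARMAN / $GUARD / $MISSING"
rm -f "$TRUNITLIST"
```

Since only the first match is used, once the `female^` prefix has been stripped a unit with two entries (like White Mage above) gets whichever line comes first in the file.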

My results (compare with original):
http://wesnoth.slack.it/?SlovenianTranslation

Notes:
- the list is only as up to date as the Slovak unit list ^^
- when you translate more unit names, just rerun the second script (below) and then the first one, and everything should be there.


This next script grabs the unit names from wesnoth.po(t) and builds a list with their translations where they are found. You can use it to generate the TRUNITLIST file for the previous script.
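
As a rough illustration of the first step, this sketch turns one source-reference comment into unit names using the same tr/sed/sort chain as the script. The reference line and the two file names are assumptions about how the .po file points at unit files, not copied from a real wesnoth.po.

```shell
#!/bin/bash
# Hypothetical "#:" reference line as it might appear in the .po file.
ref='#: data/units/Royal_Guard.cfg:3 data/units/Elvish_Fighter.cfg:3'

NAMES=$(echo "$ref" \
    | tr " " "\n" \
    | sed -e '/#:/d' -e 's/:.*//' \
    | sort | uniq | grep unit \
    | sed -e 's,data/units/,,' -e 's/\.cfg//' -e 's/_/ /g' -e 's/-/ /')
# tr puts one reference per line; the first sed drops the "#:" marker and
# the ":line" suffix; the last sed strips directory and extension and turns
# separators into spaces.

echo "$NAMES"
```

Names derived this way only match the in-game names when the file name follows the unit name, which is what the special-case substitutions in the script patch up.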

Code:

#!/bin/bash
TRFILEPATH=~/dlTemping/wesnoth/main-sl.po
OUTPUT=wiki2
TMPFILE=tempOrary

grep "data/units/" "$TRFILEPATH" | tr " " "\n" | sed -e '/#:/d' -e 's/:.*//' | sort | uniq | grep unit | sed -e 's,data/units/,,' -e 's/\.cfg//' -e 's/_/ /g' -e 's/-/ /' > "$TMPFILE"

#dealing with some special cases 
sed -i -e 's/Mermaid Siren/Siren/' -e 's/Tentacle/& of the Deep/' -e 's/Halbardier/Halberdier/' -e 's/Drake Fire/Fire Drake/' -e 's/Drake Sky/Sky Drake/' -e 's/Drake Inferno/Inferno Drake/' -e 's/Cave Spider/Giant Spider/' "$TMPFILE"
echo "Nagini Warrior" >> "$TMPFILE" # other m!=f (male != female) names are nonexistent
sort "$TMPFILE" -o "$TMPFILE"

rm -f "$OUTPUT"    # -f: no error if the file does not exist yet

while read -r one;
do  
    grep -E --color=never -A1 "msgid \"(|female\^)$one\"" "$TRFILEPATH" | sed -e 's/--//' -e '/^$/d' -e 's/msgid "/+/' -e 's/msgstr "/=/' -e 's/\(=.*\)"/\1/' | tr '\n' ' ' | sed -e 's/"//g' -e 's/+/\n/g' | sed -e 's/ *= */=/' -e 's/  */ /' >> "$OUTPUT"
    #echo >> "$OUTPUT"
done < "$TMPFILE"

#wc -l "$OUTPUT"
#wc -l "$TMPFILE" #the difference is the units that have both a female and a male name
rm "$TMPFILE"
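
The msgid/msgstr pairing in the loop is the hard part to read, so here is a minimal sketch of the same idea on a made-up two-entry .po fragment. It deliberately swaps the tr/sed chain for `sed -n` plus `paste`, ignores the `female^` variants, and uses a hypothetical helper name (`pair_for`), so treat it as an illustration rather than a drop-in replacement.

```shell
#!/bin/bash
# Made-up .po fragment: one translated entry, one untranslated.
PO=$(mktemp)
cat > "$PO" <<'EOF'
#: data/units/Spearman.cfg:3
msgid "Spearman"
msgstr "Suličar"

#: data/units/Bowman.cfg:3
msgid "Bowman"
msgstr ""
EOF

# pair_for NAME: print "NAME=translation" for an exact msgid match.
pair_for() {
    grep -A1 "^msgid \"$1\"\$" "$PO" \
        | sed -n -e 's/^msgid "\(.*\)"$/\1/p' -e 's/^msgstr "\(.*\)"$/\1/p' \
        | paste -s -d= -
}

TRANSLATED=$(pair_for "Spearman")
UNTRANSLATED=$(pair_for "Bowman")   # msgstr is empty, so nothing follows "="
echo "$TRANSLATED"
echo "$UNTRANSLATED"
rm -f "$PO"
```

An untranslated unit ends up as `Name=` with nothing after the `=`, so the first script's lookup finds no translation for it and prints `?`.
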
Thanks to Viliam for letting me use his layout! :)

I hope someone besides me finds this helpful. ;)