timApp.document.translation package#

Submodules#

timApp.document.translation.deepl module#

Contains implementation of the TranslationService-interface for the DeepL machine translator: https://www.deepl.com/translator.

Covers both the DeepL API Free and DeepL API Pro versions.

class timApp.document.translation.deepl.DeeplProTranslationService(values)[source]#

Bases: timApp.document.translation.deepl.DeeplTranslationService

Translation service using the DeepL API Pro.

id#

Translation service identifier.

ignore_tag#

The XML-tag name to use for ignoring pieces of text when XML-handling is used. Should be chosen to be some uncommon string not found in many texts.

service_name#

Human-readable name of the machine translator. Also used as an identifier.

service_url#

The url base for the API calls.

class timApp.document.translation.deepl.DeeplTranslationService(values)[source]#

Bases: timApp.document.translation.translator.RegisteredTranslationService

Translation service using the DeepL API Free.

get_languages(source_langs: bool) list[timApp.document.translation.language.Language][source]#

Fetches the source or target languages from DeepL.

Parameters

source_langs – Whether source languages must be fetched

Returns

The list of source or target languages from DeepL.

headers: dict[str, str]#

Request-headers needed for authentication with the API-key.

id#

Translation service identifier.

ignore_tag#

The XML-tag name to use for ignoring pieces of text when XML-handling is used. Should be chosen to be some uncommon string not found in many texts.

languages() timApp.document.translation.translator.LanguagePairing[source]#

Asks the DeepL API for the list of supported languages and turns the returned language codes to Languages found in the database.

Returns

Dictionary of source langs to lists of target langs, that are supported by the API and also found in database.

postprocess(text: str) str[source]#

Remove unnecessary protection tags from the text and change defined aliases back to Markdown syntax.

Parameters

text – The text returned from DeepL API after translation.

Returns

Text with the needed operations performed to more closely match the text before passing it to DeepL API.

preprocess(elem: timApp.document.translation.translationparser.TranslateApproval) None[source]#

Protect the text inside element from mangling in translation by adding XML-tags.

Parameters

elem – The element to add XML-protection-tags to.

Returns

None. The tag is added to the input object.
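The protect-then-clean-up idea behind preprocess and postprocess can be sketched as follows. This is only an illustration, not TIM's implementation: the tag name `_`, the stand-in classes, and the helper names `protect` and `unprotect` are assumptions made for the example, and the alias handling mentioned for postprocess is left out.

```python
import re
from dataclasses import dataclass


@dataclass
class TranslateApproval:
    """Stand-in for the class of the same name in translationparser."""
    text: str = ""


class Translate(TranslateApproval):
    """Text that may be sent to the machine translator."""


class NoTranslate(TranslateApproval):
    """Text that must survive translation unchanged."""


IGNORE_TAG = "_"  # hypothetical value; the real ignore_tag may differ


def protect(elem: TranslateApproval) -> None:
    """Wrap non-translatable text in the ignore tag, modifying elem in place."""
    if isinstance(elem, NoTranslate):
        elem.text = f"<{IGNORE_TAG}>{elem.text}</{IGNORE_TAG}>"


def unprotect(text: str) -> str:
    """Strip the opening and closing ignore tags from translated text."""
    return re.sub(rf"</?{re.escape(IGNORE_TAG)}>", "", text)


parts = [Translate("Click "), NoTranslate("[here](www.example.com)")]
for part in parts:
    protect(part)
sent = "".join(p.text for p in parts)
print(sent)
print(unprotect(sent))
```

The in-place mutation mirrors the documented behaviour of preprocess, which returns None and adds the tag to the input object.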

register(user_group: timApp.user.usergroup.UserGroup) None[source]#

Set headers to use the user group’s API-key ready for translation calls.

Parameters

user_group – The user group whose API key will be used.

Raises
service_name#

Human-readable name of the machine translator. Also used as an identifier.

service_url#

The url base for the API calls.

source_Language_code: str#

The source language’s code (helps handling regional variants that DeepL doesn’t differentiate).

supports(source_lang: timApp.document.translation.language.Language, target_lang: timApp.document.translation.language.Language) bool[source]#

Check that the source language can be translated into target language by the translation API.

Parameters
  • source_lang – Language to check the translation capability from.

  • target_lang – Language to check the translation capability into.

Returns

True, if the pairing is supported.
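Since languages() is described as returning a dictionary of source languages to lists of target languages, a support check can be thought of as a lookup in that mapping. The sketch below assumes plain language-code strings and made-up pairing data; the real method works with Language objects and data fetched from DeepL.

```python
# Hypothetical pairing data; the real mapping is built from the DeepL API
# response and uses Language objects rather than plain code strings.
PAIRINGS: dict[str, list[str]] = {
    "en": ["fi", "sv", "de"],
    "fi": ["en", "sv"],
}


def supports_sketch(pairings: dict[str, list[str]], source: str, target: str) -> bool:
    """Return True if target is listed among the targets of source."""
    return target in pairings.get(source, [])


print(supports_sketch(PAIRINGS, "en", "fi"))
print(supports_sketch(PAIRINGS, "fi", "de"))
```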

supports_tag_handling(tag_type: str) bool[source]#

Check whether DeeplTranslationService supports the given type of tag handling.

Parameters

tag_type – The tag-type to check handling for.

Returns

True if the tag-type is supported.

translate(texts: list[list[timApp.document.translation.translationparser.TranslateApproval]], source_lang: timApp.document.translation.language.Language | None, target_lang: timApp.document.translation.language.Language, tag_handling: str = 'xml') list[str][source]#

Use the DeepL API to translate text between languages.

Parameters
  • texts – Some set of texts to be translated.

  • source_lang – Language of input text. None value makes DeepL guess it from the text.

  • target_lang – Language to translate the text into.

  • tag_handling – See comment in superclass.

Returns

List of strings in target language with the non-translatable parts intact.
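For orientation, the kind of request such a translate call might send can be sketched without touching the network. The endpoint URLs, the `DeepL-Auth-Key` header, and the `tag_handling` and `ignore_tags` parameters follow DeepL's public API documentation as understood here; the function name `build_translate_request` and the tag name `_` are assumptions for the example and are not taken from TIM.

```python
from typing import Optional
from urllib.parse import urlencode

# Base URLs of the two DeepL API variants.
DEEPL_FREE_URL = "https://api-free.deepl.com/v2"
DEEPL_PRO_URL = "https://api.deepl.com/v2"


def build_translate_request(
    api_key: str,
    texts: list[str],
    target_lang: str,
    source_lang: Optional[str] = None,
    ignore_tag: str = "_",  # hypothetical tag name
    base_url: str = DEEPL_FREE_URL,
) -> tuple[str, dict[str, str], str]:
    """Build the URL, headers and form-encoded body without sending anything."""
    # The text parameter is repeated once per text to translate.
    params = [("text", t) for t in texts]
    params.append(("target_lang", target_lang))
    # Leaving source_lang out lets DeepL guess the language from the text.
    if source_lang is not None:
        params.append(("source_lang", source_lang))
    params.append(("tag_handling", "xml"))
    params.append(("ignore_tags", ignore_tag))
    headers = {"Authorization": f"DeepL-Auth-Key {api_key}"}
    return f"{base_url}/translate", headers, urlencode(params)


url, headers, body = build_translate_request("my-key", ["Hello", "World"], "FI")
print(url)
print(body)
```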

usage() timApp.document.translation.translator.Usage[source]#

Fetch current API usage of the registered key from DeepL.

Returns

Usage returned from DeepL.

timApp.document.translation.language module#

Contains implementation of the Language-database model, which is used to unify TIM’s translation-documents’ languages.

class timApp.document.translation.language.Language(lang_code, lang_name, autonym, flag_uri=None)[source]#

Bases: sqlalchemy.ext.declarative.api.Model

Represents a standardized language code used for example with translation documents.

NOTE: You should always use the provided class-methods for creating new instances!

autonym#

Native name for the language.

classmethod create_from_name(name: str) timApp.document.translation.language.Language[source]#

Create an instance of Language that follows a standard. Note that this should always be used when creating a new Language, especially when adding it to the database.

Parameters

name – Natural name of the language

Returns

A corresponding Language-object newly created.

Raises

LookupError – if the language is not found from langcodes’ database.

flag_uri#

Path to a picture representing the language.

lang_code#

Standardized code of the language.

lang_name#

IANA’s name for the language.

classmethod query_all() list['Language'][source]#

Query the database for all the languages.

Returns

All the languages found from database.

classmethod query_by_code(code: str) Optional[timApp.document.translation.language.Language][source]#

Query the database to find a single match for a language tag.

Parameters

code – The IETF tag for the language.

Returns

The corresponding Language-object in database or None if not found.

to_json() dict[source]#

Create a JSON representation of the Language instance.

Returns

The Language instance’s fields in a dict.

timApp.document.translation.reversingtranslator module#

Contains the implementation of ReversingTranslationService and its target language, which are used in (NOTE:) unit-tests for translation routes.

timApp.document.translation.reversingtranslator.REVERSE_LANG = {'autonym': 'esreveR', 'lang_code': 'rev-Erse', 'lang_name': 'Reverse'}#

Language that the ReversingTranslationService translates text into. To use in tests.

class timApp.document.translation.reversingtranslator.ReversingTranslationService(**kwargs)[source]#

Bases: timApp.document.translation.translator.TranslationService

Translator to test if the list[list[TranslateApproval]]-structure is generic enough to (easily) use for integrating new machine translators into TIM.

get_languages(source_langs: bool) list[timApp.document.translation.language.Language][source]#

Reverse-language is supported as the only target language.

Parameters

source_langs – See documentation on TranslationService.

Returns

See documentation on TranslationService.

id#

Translation service identifier.

languages() timApp.document.translation.translator.LanguagePairing[source]#
Returns

Mapping from all languages in database into the reversed language.

service_name#

Human-readable name of the machine translator. Also used as an identifier.

supports(source_lang: timApp.document.translation.language.Language, target_lang: timApp.document.translation.language.Language) bool[source]#

Check if language pairing is supported.

Parameters
  • source_lang – Language to translate from.

  • target_lang – Only the REVERSE_LANG -language-code is supported.

Returns

True, if target_lang is rev-Erse.

supports_tag_handling(tag_type: str) bool[source]#

Check if the service supports tag handling in translations. For example using XML-tags, some services offer controlling parts of the text, that should be kept as-is and not be affected by the machine translation: “My name is Dr. <protect>Oak</protect>.”

NOTE this is related to the kinda HACKY way of handling Markdown-tables in DeepL-translation.

Parameters

tag_type – Type of the tag. Some services for example support “xml” or “html”.

Returns

True, if the tag type is supported.

translate(texts: list[list[timApp.document.translation.translationparser.TranslateApproval]], src_lang: timApp.document.translation.language.Language, target_lang: timApp.document.translation.language.Language, *, tag_handling: str = '') list[str][source]#

Reverse the translatable text given.

NOTE: The algorithm here for combining translation results back to original structure might be integrated into the actual TranslationService-implementation.

NOTE: This implementation does not fully follow the needed interface.

Parameters
  • texts – Texts to reverse

  • src_lang – Any.

  • target_lang – Only REVERSE_LANG[“lang_code”] is supported.

  • tag_handling – tags to intelligently handle during translation TODO XML-handling.

Returns

Texts where translatable ones have been reversed.
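The behaviour described above (reverse the translatable parts, keep the rest, and join each block back into one string) can be sketched as follows. The stand-in classes and the function name `reverse_translate` are assumptions for the example; the real service also takes language and tag-handling arguments.

```python
from dataclasses import dataclass


@dataclass
class TranslateApproval:
    """Stand-in for the class of the same name in translationparser."""
    text: str = ""


class Translate(TranslateApproval):
    """Text that would be translated (here: reversed)."""


class NoTranslate(TranslateApproval):
    """Text that is passed through unchanged."""


def reverse_translate(texts: list[list[TranslateApproval]]) -> list[str]:
    """Reverse every Translate part and join each block into one string."""
    results = []
    for block in texts:
        pieces = [
            part.text[::-1] if isinstance(part, Translate) else part.text
            for part in block
        ]
        results.append("".join(pieces))
    return results


blocks = [
    [
        Translate("foo bar"),
        NoTranslate("\n["),
        Translate("click"),
        NoTranslate("](www.example.com)"),
    ]
]
reversed_blocks = reverse_translate(blocks)
print(reversed_blocks)
```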

usage() timApp.document.translation.translator.Usage[source]#

Infinite quota

timApp.document.translation.routes module#

Contains routes for making operations on translation documents. Mainly translations on whole documents, paragraphs and raw text.

Also contains routes for getting available languages, names of machine translators and queries related to API-keys of these machine translators.

timApp.document.translation.routes.add_api_key() flask.wrappers.Response[source]#

Add API key to the database for current user.

Returns

OK response if adding the key was successful.

timApp.document.translation.routes.create_translation_route(tr_doc_id: int, language: str, translator: str) flask.wrappers.Response[source]#

Create and add a translation version of a whole document. Machine-translate it if this is requested and the user is authorized to do so.

Parameters
  • tr_doc_id – ID of a document that the translation can be based on, i.e. a document that is, or is linked to, the original source document.

  • language – Language that will be set to the translation document and used in potential machine translation.

  • translator – Identifying name of the translator to use (machine or manual).

Returns

The created translation document’s information as JSON.

timApp.document.translation.routes.get_all_languages() flask.wrappers.Response[source]#

Query the database for all the available languages to be used for documents.

Returns

JSON response containing all the available languages.

timApp.document.translation.routes.get_keys() flask.wrappers.Response[source]#

Gets the user’s API keys.

Returns

The user’s API keys as JSON.

timApp.document.translation.routes.get_languages(source_languages: bool) flask.wrappers.Response[source]#

Get list of supported languages by machine translator.

Parameters

source_languages – Flag for getting source-language (True) list instead of target-language (False).

Returns

List of the supported languages by type (source or target).

timApp.document.translation.routes.get_my_translators() flask.wrappers.Response[source]#

Gets the names of the translators the user has the API keys for.

Returns

The JSON-list of the names of the translators the user has the API keys for.

timApp.document.translation.routes.get_quota()[source]#

Gets the quota info for the user’s API key.

Returns

The used and available quota for the user’s API key as JSON.

timApp.document.translation.routes.get_source_languages() flask.wrappers.Response[source]#

Query the database for the possible source languages.

Returns

JSON response containing the languages.

timApp.document.translation.routes.get_target_languages() flask.wrappers.Response[source]#

Query the database for the possible target languages.

Returns

JSON response containing the languages.

timApp.document.translation.routes.get_translations(doc_id: int) flask.wrappers.Response[source]#
timApp.document.translation.routes.get_translators() flask.wrappers.Response[source]#

Query the database for the possible machine translators.

Returns

JSON response containing the translators.

timApp.document.translation.routes.get_valid_status() flask.wrappers.Response[source]#

Check the validity of a given api-key with the chosen translator engine.

Returns

OK-response if the key is valid, or an Exception.

timApp.document.translation.routes.is_valid_language_id(lang_id: str) bool[source]#

Check that the ID is recognized by the langcodes library and found in database.

Parameters

lang_id – Language id (or “tag”) to check for validity.

Returns

True, if the standardized ID is found in database.

timApp.document.translation.routes.paragraph_translation_route(tr_doc_id: int, tr_par_id: str, language: str, transl: str) flask.wrappers.Response[source]#

Replace the content of paragraph with requested translation.

Parameters
  • tr_doc_id – ID of the document that the paragraph is in.

  • tr_par_id – ID of the paragraph in the Translation. NOTE: NOT the original paragraph!

  • language – Language to translate into.

  • transl – Identifying code of the translator to use.

Returns

OK-response if translation and modification were successful.

timApp.document.translation.routes.remove_api_key() flask.wrappers.Response[source]#

Remove the current user’s API key from the database.

Returns

OK-response if removing the key was successful.

timApp.document.translation.routes.text_translation_route(tr_doc_id: int, language: str, transl: str) flask.wrappers.Response[source]#

Translate raw text between the source document’s language and the one requested.

Parameters
  • tr_doc_id – ID of the document that the text is from.

  • language – Language to translate the text into.

  • transl – Identifying code of the translator to use.

Returns

The translated text.

timApp.document.translation.routes.translate_full_document(tr: timApp.document.translation.translation.Translation, src_doc: timApp.document.document.Document, target_language: timApp.document.translation.language.Language, translator_code: str) None[source]#

Translate matching paragraphs of document based on an original source document.

Parameters
  • tr – The metadata of the translation target.

  • src_doc – The original source document with translatable text.

  • target_language – The language to translate the document into.

  • translator_code – Identifier of the translator to use (machine or “Manual” if empty).

Returns

None. The translation is applied to document based on the tr-parameter.

timApp.document.translation.routes.update_translation(doc_id)[source]#

timApp.document.translation.synchronize_translations module#

timApp.document.translation.synchronize_translations.synchronize_translations(doc: timApp.document.docinfo.DocInfo, edit_result: timApp.document.editing.documenteditresult.DocumentEditResult)[source]#

Synchronizes the translations of a document by adding missing paragraphs to the translations and deleting non-existing paragraphs.

Parameters
  • edit_result – The changes that were made to the document.

  • doc – The document that was edited and whose translations need to be synchronized.

timApp.document.translation.translation module#

class timApp.document.translation.translation.Translation(**kwargs)[source]#

Bases: sqlalchemy.ext.declarative.api.Model, timApp.document.docinfo.DocInfo

A translated document.

Translation objects may be created in two scenarios:

  • An existing non-translated document is assigned a language.

  • A new translated document is created (via manage view).

doc_id#
docentry#
property id#

Returns the item id.

lang_id#
property path#

Returns the Document path, including the language part in case of a translation.

property path_without_lang#

Returns the Document path without the language part in case of a translation.

property public#
src_docid#
to_json(**kwargs)[source]#
property translations: list['Translation']#

Returns the translations of the document. NOTE: The list includes the document itself.

timApp.document.translation.translation.add_tr_entry(doc_id: int, item: timApp.document.docinfo.DocInfo, tr: timApp.document.translation.translation.Translation) timApp.document.translation.translation.Translation[source]#

timApp.document.translation.translationparser module#

This module contains the main functions needed for marking parts of the Markdown used in TIM into translatable text (human-spoken language) and non-translatable text (syntax of Markdown and TIM-plugins for example).

Basically only the get_translate_approvals -function should be called directly by users.

timApp.document.translation.translationparser.NOTRANSLATE_STYLE_LONG = 'notranslate'#

Longer string used for marking non-translatable text in TIM’s Markdown

timApp.document.translation.translationparser.NOTRANSLATE_STYLE_SHORT = 'nt'#

Shorter string used for marking non-translatable text in TIM’s Markdown

class timApp.document.translation.translationparser.NoTranslate(text: str = '')[source]#

Bases: timApp.document.translation.translationparser.TranslateApproval

Subclass of TranslateApproval, which indicates that the string value of the class will not be translated.

timApp.document.translation.translationparser.PLUGIN_MD_PREFIX = 'md:'#

Prefix in plugin’s values that can be parsed into Markdown. The prefix does not contain delimiters and is not preceded by spaces.

class timApp.document.translation.translationparser.Table(text: str = '')[source]#

Bases: timApp.document.translation.translationparser.TranslateApproval

Hacky way to translate tables by identifying them at translation and setting html-tag handling on.

class timApp.document.translation.translationparser.Translate(text: str = '')[source]#

Bases: timApp.document.translation.translationparser.TranslateApproval

Subclass of TranslateApproval, which indicates that the string value of the class will be translated.

class timApp.document.translation.translationparser.TranslateApproval(text: str = '')[source]#

Bases: object

Superclass for text that should or should not be passed to a machine translator.

text: str = ''#
class timApp.document.translation.translationparser.TranslationParser(quote: str = '"')[source]#

Bases: object

add_value_with_prefix(text: str, arr: list[timApp.document.translation.translationparser.TranslateApproval], plugin_quote: str = '"') None[source]#

Separates the contents of a YAML string-prefix and value found in plugins and adds to the list.

The text can possibly start with the “md:” prefix (NoTranslate) for content that is Markdown, and the rest after that is the value (Translate).

Parameters
  • text – The text that can be contained with (plugin).

  • arr – The list that the results will be added to.

  • plugin_quote – The quote to use inside the potential Markdown.

Returns

None, the result is inserted into the arr-parameter.
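A simplified sketch of the prefix split described above: the "md:" prefix goes in as NoTranslate and the remainder as Translate. The stand-in classes and the name `split_md_prefix` are assumptions for the example. Unlike the real method, this sketch does not parse the remainder as Markdown and ignores the plugin_quote argument.

```python
from dataclasses import dataclass

PLUGIN_MD_PREFIX = "md:"


@dataclass
class TranslateApproval:
    """Stand-in for the class of the same name in translationparser."""
    text: str = ""


class Translate(TranslateApproval):
    """Text that will be translated."""


class NoTranslate(TranslateApproval):
    """Text that will not be translated."""


def split_md_prefix(text: str, arr: list[TranslateApproval]) -> None:
    """Append the prefix (if present) and the value to arr in place."""
    if text.startswith(PLUGIN_MD_PREFIX):
        arr.append(NoTranslate(PLUGIN_MD_PREFIX))
        arr.append(Translate(text[len(PLUGIN_MD_PREFIX):]))
    else:
        arr.append(Translate(text))


result: list[TranslateApproval] = []
split_md_prefix("md:Write your answer", result)
split_md_prefix("Plain value", result)
print(result)
```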

attr_collect(content: list) Tuple[list[timApp.document.translation.translationparser.TranslateApproval], bool][source]#

Collect the parts of Attr into Markdown.

Pandoc: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Inline

Parameters

content – Pandoc-ASTs JSON form of Attr (attributes): [ str, [str], [(str, str)] ].

Returns

List of translatable and non-translatable parts, and a boolean indicating whether the .notranslate style was found in the element.
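To make the Attr structure concrete, the sketch below turns the JSON form [ str, [str], [(str, str)] ] into a Markdown attribute string and reports whether a no-translate class is present. It is a simplification: the function name `attr_to_markdown` is hypothetical, and it returns a plain string where the real method returns a list of TranslateApproval objects.

```python
NOTRANSLATE_STYLE_LONG = "notranslate"
NOTRANSLATE_STYLE_SHORT = "nt"


def attr_to_markdown(attr: list) -> tuple[str, bool]:
    """Render a Pandoc Attr as '{#id .class key="value"}' and flag notranslate."""
    identifier, classes, key_values = attr
    parts = []
    if identifier:
        parts.append(f"#{identifier}")
    parts.extend(f".{cls}" for cls in classes)
    parts.extend(f'{key}="{value}"' for key, value in key_values)
    is_notranslate = any(
        cls in (NOTRANSLATE_STYLE_LONG, NOTRANSLATE_STYLE_SHORT) for cls in classes
    )
    # An Attr with no identifier, classes or key-value pairs renders as nothing.
    markdown = "{" + " ".join(parts) + "}" if parts else ""
    return markdown, is_notranslate


print(attr_to_markdown(["intro", ["nt"], [["lang", "fi"]]]))
print(attr_to_markdown(["", [], []]))
```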

block_collect(top_block: dict, depth: int = 0) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Walks the whole block and appends each translatable and non-translatable string-part into a list in order. Adds newlines to the start of each block and end of some specific blocks, for Markdown syntax. These newlines are required due to Pandoc removing the newlines in formatting.

Based on the pandoc AST-spec at: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Block

Parameters
  • top_block – The block to collect strings from.

  • depth – The depth of the recursion if it is needed for example with list-indentation.

Returns

List of strings inside the correct approval-type.

bulletlist_collect(content: dict, depth: int) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Collect and separate translatable and untranslatable areas within a bullet list element through recursion. Calls to list_collect to handle recursion through block_collect.

Pandoc: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Block

Parameters
  • content – Bullet list (attributes and a list of items, each a list of blocks): [ ListAttributes, [[Block]] ].

  • depth – The current depth of the list, used for indentation.

Returns

List of translatable and untranslatable areas within a bullet list element.

cite_collect(content: dict) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Collect and separate translatable and untranslatable areas within a citation element. Citation element is delimited by citation marks.

Pandoc: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Inline

Parameters

content – Citation (list of inlines) from Inline element: [ [Citation], [Inline] ].

Returns

List containing the parsed collection of Citation content.

code_collect(content: dict) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Collect everything within an Inline code element as untranslatable, because there is no clear context for deciding whether the text should remain in the original language. Inline Code element is defined through spacing before the string.

Pandoc: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Inline

Parameters

content – Inline code (literal) from Inline element: [ Attr, Text ].

Returns

List containing the collection of Inline code content.

codeblock_collect(content: dict) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Pick translatable and non-translatable parts off of a codeblock.

NOTE/WARNING In regard to plugins:

It is critical that the attributes do not include the TIM-identifier eg. id=”SAs3EK96oQtL” from {plugin=”csPlugin” id=”SAs3EK96oQtL”}, because Pandoc deletes extra identifiers contained in attributes like #btn-tex2 and id=”SAs3EK96oQtL” in {plugin=”csPlugin” #btn-tex2 id=”SAs3EK96oQtL”}. Here, the attributes of a plugin-codeblock are DISCARDED and will not be included in the result when markdown is reconstructed i.e. caller should save the attributes if needed.

Parameters

content – List with the attributes and text-content of the codeblock.

Returns

List marking the Markdown representation of the element into translatable and non-translatable parts.

collect_tim_plugin(attrs: dict, content: str) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Special case to collect translatable and non-translatable parts of a TIM-plugin based on its (YAML) contents.

Parameters
  • attrs – Pandoc-AST defined Attr -attributes of the plugin-block for example plugin=”csPlugin”. TODO Add handling for this if necessary.

  • content – The raw markdown content of the plugin-defined paragraph.

Returns

List of the translatable and non-translatable parts.

definitionlist_collect(content: dict) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Collect definition list areas as untranslatable. Each list item is a pair consisting of a term (a list of inlines) and one or more definitions (each a list of blocks).

Pandoc: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Block

Parameters

content – Definition list. : [([Inline], [[Block]])].

Returns

List of single NoTranslate -element containing Markdown representation of definition list.

div_collect(content: dict) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Collects generic block container with attributes as untranslatable.

Pandoc: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Block

Parameters

content – Generic block container with attributes: [ Attr, [Block] ].

Returns

List of single NoTranslate -element containing Markdown representation of div element.

get_translate_approvals(md: str) list[timApp.document.translation.translationparser.TranslateApproval][source]#

By parsing the input text, identify parts that should and should not be passed to a machine translator.

TODO Does this need to return list of lists, when the function of this is to split markdown into parts that can be translated or not?

Parameters

md – The input text to eventually translate.

Returns

Lists containing the translatable parts of each block in a list.

header_collect(content: dict) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Collect and separate translatable and untranslatable areas within a header from a block.

Pandoc: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Block

Parameters

content – Header’s level (integer) and text (inlines): [ int, Attr, [Inline] ].

Returns

List of translatable and untranslatable areas within a header element.

image_collect(content: dict) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Collect and separate translatable and untranslatable areas within an image element.

Pandoc: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Inline

Parameters

content – Attr, alt text (list of inlines), target: [ Attr, [Inline], Target ].

Returns

List containing the parsed collection of image content.

inline_collect(top_inline: dict) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Collect and separate translatable and untranslatable areas within an Inline element.

Pandoc: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Inline

Types are listed as emphasized text in the list, and the values after each type are its content.

Parameters

top_inline – Made out of type and content. Type defines the case and content is the value of that type.

Returns

List of translatable and untranslatable areas within an Inline element.

Collect and separate translatable and untranslatable areas within a link element.

Pandoc: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Inline

Parameters

content – Attr, alt text (list of inlines), target: [ Attr, [Inline], Target ].

Returns

List containing the parsed collection of link content.

Collect and separate translatable and untranslatable areas within a link or image element. Universal collector for both link and image collect due to them having the same outline in markdown, except for “[” or “![” prepend.

Parameters
  • content – Attr, alt text (list of inlines), target: [ Attr, [Inline], Target ].

  • islink – True-state if content is link-element (true=link, false=image).

Returns

List containing the parsed collection of link or image content.

list_collect(blocks: list[list[dict]], depth: int, attrs: Optional[Tuple[int, str, str]]) list[timApp.document.translation.translationparser.TranslateApproval][source]#

General method for handling both bullet- and ordered lists.

Parameters
  • blocks – The [[Block]] found in Pandoc definition for the lists.

  • depth – The depth of recursion with lists (can contain lists of lists of lists …).

  • attrs – The information related to the style of the OrderedList items.

Returns

List containing the translatable parts of the list.

math_collect(content: dict) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Collect and separate translatable and untranslatable areas within a math element.

Pandoc: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Inline

Parameters

content – TeX math (literal) from Inline: [ MathType, Text ].

Returns

List containing the parsed collection of math content.

merge_consecutive(arr: Iterable[timApp.document.translation.translationparser.TranslateApproval]) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Merge consecutive elements of the same type into each other to reduce length of the list.

The merging is as follows (T = Translate, NT = NoTranslate):

[T("foo"), T(" "), T("bar"), NT("\n"), NT("["), T("click"), NT("](www.example.com)")]

==>

[T("foo bar"), NT("\n["), T("click"), NT("](www.example.com)")]

Parameters

arr – The list of objects to merge.

Returns

Merged list.
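One way to express this merging is with itertools.groupby keyed on the element type. This is a sketch under assumptions, not necessarily how TIM implements it: the classes are stand-ins and the name `merge_consecutive_sketch` is hypothetical.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Iterable


@dataclass
class TranslateApproval:
    """Stand-in for the class of the same name in translationparser."""
    text: str = ""


class Translate(TranslateApproval):
    """Text that will be translated."""


class NoTranslate(TranslateApproval):
    """Text that will not be translated."""


def merge_consecutive_sketch(arr: Iterable[TranslateApproval]) -> list[TranslateApproval]:
    """Join runs of neighbouring elements that share the same class."""
    merged = []
    # groupby yields one group per run of consecutive items with equal keys.
    for cls, group in groupby(arr, key=type):
        merged.append(cls("".join(item.text for item in group)))
    return merged


T, NT = Translate, NoTranslate
before = [T("foo"), T(" "), T("bar"), NT("\n"), NT("["), T("click"), NT("](www.example.com)")]
after = merge_consecutive_sketch(before)
print(after)
```

Because groupby only groups neighbouring items, the two Translate runs separated by NoTranslate elements stay separate, which matches the example above.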

notranslate_all(type_: str, content: dict) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Mark the whole element as non-translatable.

TODO NOTE This function does not seem to produce Markdown consistent with TIM’s practices, and using this should eventually be replaced with the specific *_collect -functions!

Parameters
  • type_ – Pandoc AST-type of the content.

  • content – Pandoc AST-content of the type.

Returns

List of single NoTranslate -element containing Markdown representation of content.

ordered_list_styling(start_num: int, num_style: str, num_delim: str) str[source]#

Makes the style for the ordered lists.

Different styles for ordered lists:

  • num_styles – Decimal (1,2,3), LowerRoman (i,ii,iii), LowerAlpha (a,b,c), UpperRoman (I,II,III), UpperAlpha (A,B,C), DefaultStyle (#)

  • num_delims – Period ( . ), OneParen ( ) ), DefaultDelim ( . ), TwoParens ( (#) )

Parameters
  • start_num – The number that starts the list.

  • num_style – The numbering style.

  • num_delim – The punctuation for list.

Returns

The list style that needs to be used.
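The delimiter part of the styling can be illustrated in isolation. The sketch below assumes the item label (for example "1", "a" or "iv") has already been produced from start_num and num_style; the function name `apply_delimiter` is hypothetical and the exact string the real method returns may differ.

```python
def apply_delimiter(label: str, num_delim: str) -> str:
    """Wrap an already formatted list label in the given Pandoc delimiter."""
    if num_delim == "OneParen":
        return f"{label})"
    if num_delim == "TwoParens":
        return f"({label})"
    # Period and DefaultDelim are both shown as a trailing period above.
    return f"{label}."


print(apply_delimiter("1", "Period"))
print(apply_delimiter("a", "OneParen"))
print(apply_delimiter("iv", "TwoParens"))
```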

orderedlist_collect(content: dict, depth: int) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Collect and separate translatable and untranslatable areas within an ordered list element through recursion. Calls to list_collect to handle recursion through block_collect.

Pandoc: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Block

Parameters
  • content – Ordered list (attributes and a list of items, each a list of blocks): [ ListAttributes, [[Block]] ].

  • depth – The current depth of the list, used for indentation.

Returns

List of translatable and untranslatable areas within an ordered list element.

quote: str = '"'#
quoted_collect(content: dict) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Collect and separate translatable and untranslatable areas within quotation marks. Quotation element is delimited by quotation marks.

Pandoc: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Inline

Parameters

content – The types of quotation marks used and the text (list of inlines) from Inline element: [ QuoteType, [Inline] ].

Returns

List containing the parsed collection of Quoted content.

rawblock_collect(content: dict) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Pick translatable and non-translatable parts from a rawblock.

Pandoc: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Block

Parameters

content – The Raw block [ Format, Text ].

Returns

List of single NoTranslate -element containing Markdown representation of rawblock element.

rawinline_collect(content: dict) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Collect and separate translatable and untranslatable areas within a rawinline element.

Pandoc: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Inline

Parameters

content – RawInline from Inline: [ Format, Text ].

Returns

List containing the parsed collection of rawinline content.

span_collect(content: dict) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Collect and separate translatable and untranslatable areas within a generic inline container with attributes.

Pandoc: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Inline

Parameters

content – Generic inline container with attributes: [Attr, [Inline] ].

Returns

List containing the parsed collection of span area.

table_collect(content: dict) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Collect table areas as untranslatable.

Refer to Pandoc definition for tables: https://hackage.haskell.org/package/pandoc-types-1.22.1/docs/Text-Pandoc-Definition.html#t:Block

Parameters

content – Table content as dict.

Returns

A list with a single NoTranslate element containing the Markdown representation of the table.

tex_collect(content: str) list[timApp.document.translation.translationparser.TranslateApproval][source]#

Collect and separate translatable and untranslatable areas within a LaTeX element.

Parameters

content – String which contains LaTeX area.

Returns

List containing the parsed collection of LaTeX content.

timApp.document.translation.translationparser.to_alphabet(num: int) str[source]#

Converts the start number of Pandoc’s alphabetical list to the corresponding letter.

Parameters

num – The list’s starting number.

Returns

The letter corresponding to the starting number.
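
The conversion can be illustrated with a minimal sketch. The function name to_alphabet_sketch is hypothetical, and the conventions it uses (lower-case letters, 1 maps to "a", and numbers above 26 continue as "aa", "ab", ...) are assumptions rather than TIM’s confirmed behaviour:

```python
def to_alphabet_sketch(num: int) -> str:
    """Map a list start number to letters: 1 -> 'a', 26 -> 'z', 27 -> 'aa'.

    Hypothetical stand-in for to_alphabet; the wrap-around rule is an assumption.
    """
    if num < 1:
        raise ValueError("start number is assumed to be positive")
    letters = []
    while num > 0:
        # Shift by one so that 1..26 map onto remainders 0..25.
        num, remainder = divmod(num - 1, 26)
        letters.append(chr(ord("a") + remainder))
    # Digits were produced from least to most significant.
    return "".join(reversed(letters))
```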

timApp.document.translation.translationparser.to_roman_numeral(num: int) str[source]#

Converts the start number of Pandoc’s Roman numeral list to the corresponding Roman numeral. Source: https://stackoverflow.com/questions/28777219/basic-program-to-convert-integer-to-roman-numerals

Parameters

num – The list’s starting number.

Returns

The Roman numeral corresponding to the starting number.
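
A common way to do this kind of conversion is to subtract the largest fitting value repeatedly. The sketch below shows that technique; the name to_roman_numeral_sketch, the upper-case output and the 1..3999 range check are assumptions, not necessarily what the module does:

```python
# Value/symbol pairs in descending order, including the subtractive forms.
ROMAN_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman_numeral_sketch(num: int) -> str:
    """Convert a positive integer to an upper-case Roman numeral (hypothetical helper)."""
    if not 0 < num < 4000:
        raise ValueError("only 1..3999 is handled in this sketch")
    parts = []
    for value, symbol in ROMAN_PAIRS:
        # Take as many copies of the current symbol as fit, keep the rest.
        count, num = divmod(num, value)
        parts.append(symbol * count)
    return "".join(parts)
```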

timApp.document.translation.translator module#

Most notably, this module contains the TranslationService interface that different machine translators must implement in order to be integrated into TIM’s machine translation feature.

Other notable things include a database model for the API-keys of machine translator services and a processor/wrapper by which the different translators can be used to translate text from one language to another.

class timApp.document.translation.translator.LanguagePairing(value: dict[str, list[timApp.document.translation.language.Language]])[source]#

Bases: object

Maps standardized codes of (source) Languages to lists of (target) Language objects.

value: dict[str, list[timApp.document.translation.language.Language]]#
class timApp.document.translation.translator.RegisteredTranslationService(**kwargs)[source]#

Bases: timApp.document.translation.translator.TranslationService

A translation service whose use is constrained by user group.

id#

Translation service identifier.

register(user_group: timApp.user.usergroup.UserGroup) None[source]#

Set some state to the service object based on user group.

Parameters

user_group – The somehow related user group.

Returns

None.

service_name#

Human-readable name of the machine translator. Also used as an identifier.

timApp.document.translation.translator.TranslateBlock#

Typedef representing logically connected parts of translatable and non-translatable text.

class timApp.document.translation.translator.TranslateProcessor(translator_code: str, s_lang: str, t_lang: str, user_group: timApp.user.usergroup.UserGroup | None)[source]#

Bases: object

translate(pars: list[timApp.document.translation.translator.TranslationTarget]) list[str][source]#

Translate a list of text-containing items using the TranslationService-instance and languages set at initialization.

Parameters

pars – TIM-paragraphs containing Markdown to translate.

Returns

The translatable text contained in input paragraphs translated according to the processor-state (languages and the translator).

class timApp.document.translation.translator.TranslationService(**kwargs)[source]#

Bases: sqlalchemy.ext.declarative.api.Model

Represents the information and methods that must be available from all possible machine translators.

get_languages(source_langs: bool) list[timApp.document.translation.language.Language][source]#

Return languages supported by the TranslationService.

Parameters

source_langs – Whether source languages must be returned.

Returns

The list of supported source or target languages.

id#

Translation service identifier.

languages() timApp.document.translation.translator.LanguagePairing[source]#

Get the language-combinations for translations supported with the service.

Returns

The supported mapping of languages to translate to and from with this TranslationService.

service_name#

Human-readable name of the machine translator. Also used as an identifier.

supports(source_lang: timApp.document.translation.language.Language, target_lang: timApp.document.translation.language.Language) bool[source]#

Check if the service supports a language-combination.

Parameters
  • source_lang – Language to translate from.

  • target_lang – Language to translate into.

Returns

True, if the service can translate from source_lang to target_lang.
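
One plausible way to answer this question is a lookup in the source-to-targets mapping described for LanguagePairing. The sketch below is only an illustration: Lang, Pairing and supports_sketch are hypothetical stand-ins, and the code attribute is an assumed name for the standardized language code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lang:
    """Hypothetical stand-in for Language; only the code is modelled."""
    code: str


@dataclass
class Pairing:
    """Hypothetical stand-in for LanguagePairing."""
    value: dict[str, list[Lang]]


def supports_sketch(pairing: Pairing, source: Lang, target: Lang) -> bool:
    # Unknown source languages simply have no targets.
    targets = pairing.value.get(source.code, [])
    return any(t.code == target.code for t in targets)
```

With a pairing of {"en": [Lang("fi"), Lang("sv")]}, translating from English to Finnish would be reported as supported, while Finnish to English would not.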

supports_tag_handling(tag_type: str) bool[source]#

Check if the service supports tag handling in translations. For example, some services let XML tags mark parts of the text that should be kept as-is and left untouched by the machine translation: “My name is Dr. <protect>Oak</protect>.”

NOTE this is related to the kinda HACKY way of handling Markdown-tables in DeepL-translation.

Parameters

tag_type – Type of the tag. Some services for example support “xml” or “html”.

Returns

True, if the tag type is supported.
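
The protection idea from the example above can be sketched as a pair of helpers. The tag name protect is taken from the example sentence; the helper names and the regular expression are hypothetical and do not describe how any particular service or TIM itself implements tag handling:

```python
import re

# Assumed tag name, borrowed from the example sentence in the description.
IGNORE_TAG = "protect"


def protect(text: str) -> str:
    """Wrap text that the translator should leave untouched."""
    return f"<{IGNORE_TAG}>{text}</{IGNORE_TAG}>"


def strip_protection(text: str) -> str:
    """Remove the opening and closing protection tags after translation."""
    return re.sub(rf"</?{re.escape(IGNORE_TAG)}>", "", text)
```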

translate(texts: list[list[timApp.document.translation.translationparser.TranslateApproval]], source_lang: timApp.document.translation.language.Language, target_lang: timApp.document.translation.language.Language, *, tag_handling: str = '') list[str][source]#

Translate texts from source to target language.

The implementor of this method should return the (translated) text in the same order as found in the input texts-parameter originally.

Parameters
  • texts – The texts marked for translation or not. A convention would be to pass as much of the translatable text as possible in this parameter in order to minimize the amount of separate translation-calls.

  • source_lang – Language to translate from.

  • target_lang – Language to translate into.

  • tag_handling – Tag representing a way to separate or otherwise control translated text with the translation service. A HACKY way to handle the special case of translating (html) tables.

Returns

List of strings found inside the items of texts-parameter, in the same order and translated.
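
The ordering and batching requirements can be sketched as follows. This is not the interface itself: Segment is a hypothetical (text, should_translate) pair standing in for TranslateApproval, and translate_batch is an assumed callable representing a single call to some translation backend.

```python
from typing import Callable

# Hypothetical stand-in for TranslateApproval: (text, should_translate).
Segment = tuple[str, bool]


def translate_blocks_sketch(
    texts: list[list[Segment]],
    translate_batch: Callable[[list[str]], list[str]],
) -> list[str]:
    # Gather every translatable piece into one batch to minimise backend calls.
    batch = [text for block in texts for text, ok in block if ok]
    translated = iter(translate_batch(batch))
    results = []
    for block in texts:
        # Rebuild each block in its original order, leaving protected text as-is.
        results.append(
            "".join(next(translated) if ok else text for text, ok in block)
        )
    return results
```

The sketch relies on the backend returning one result per input string, in the same order, which is what lets the output line up with the input blocks.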

usage() timApp.document.translation.translator.Usage[source]#

Get the service’s usage status.

Returns

The current usage of this TranslationService (for example status of an API-key).

class timApp.document.translation.translator.TranslationServiceKey(**kwargs)[source]#

Bases: sqlalchemy.ext.declarative.api.Model

Represents an API-key (or any string value) that is needed for using a machine translator and that one or more users are in possession of.

api_key#

The key needed for using related service.

static get_by_user_group(user_group: timApp.user.usergroup.UserGroup | None) timApp.document.translation.translator.TranslationServiceKey[source]#

Query a key based on a group that could have access to it.

Parameters

user_group – The group that wants to use a key.

Returns

The first matching TranslationServiceKey instance, if one is found.

group: timApp.user.usergroup.UserGroup#

The group that can use this key.

group_id#
id#

Key identifier.

service: timApp.document.translation.translator.TranslationService#

The service that this key is used in.

service_id#
to_json() dict[source]#

Create a JSON representation of data related to the TranslationServiceKey instance.

Returns

The TranslationServiceKey instance’s needed fields in a dict.

class timApp.document.translation.translator.TranslationTarget(value: str | timApp.document.docparagraph.DocParagraph)[source]#

Bases: object

Type that can be passed around in translations.

get_text() str[source]#
value: str | timApp.document.docparagraph.DocParagraph#
class timApp.document.translation.translator.Usage(character_count: int, character_limit: int)[source]#

Bases: object

Contains information about the usage of a translator service.

character_count: int#
character_limit: int#
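
As an illustration of how the two fields might be consumed, the sketch below computes the remaining quota. UsageSketch and characters_left are hypothetical; whether the real Usage class is a dataclass, and how callers use it, is not stated here.

```python
from dataclasses import dataclass


@dataclass
class UsageSketch:
    """Hypothetical mirror of Usage with the same two fields."""
    character_count: int
    character_limit: int


def characters_left(usage: UsageSketch) -> int:
    # Clamp at zero in case the count has already passed the limit.
    return max(usage.character_limit - usage.character_count, 0)
```
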
timApp.document.translation.translator.replace_md_aliases(text: str) str[source]#

Replace the aliases that are used in place of Markdown-syntax-characters.

On some machine translators (tested with DeepL), Markdown syntax characters break more easily than their HTML-style counterparts. The aliasing is baked into the translation parser, but the text must be converted back to Markdown style in order to follow TIM’s preferences.

Parameters

text – Text to replace the HTML-tags of.

Returns

Text with the HTML-tags replaced.

Module contents#