Template talk:Lang
This is the talk page for discussing improvements to the Lang template.
Archives: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13. Auto-archiving period: 4 months.
This template does not require a rating on Wikipedia's content assessment scale.
Template:Lang is permanently protected from editing because it is a heavily used or highly visible template. Substantial changes should first be proposed and discussed here on this page. If the proposal is uncontroversial or has been discussed and is supported by consensus, editors may use {{edit template-protected}} to notify an administrator or template editor to make the requested edit. Usually, any contributor may edit the template's documentation to add usage notes or categories.
Any contributor may edit the template's sandbox. Functionality of the template can be checked using test cases.
This template was considered for deletion on 2006 February 20. The result of the discussion was "keep".
To help centralize discussions and keep related topics together, the talk pages for all help pages, categories, MediaWiki messages and templates related to cite errors redirect here.
lang-my outputs tofu on my browser (FF)
I've been removing lang-my where I come across it because it turns Burmese script into tofu. Not sure what the problem is, but I assume it's forcing a font to display that I don't have installed. I have a number of Burmese fonts, though, including generic ones like Noto, so display shouldn't be a problem. — kwami (talk) 09:26, 31 August 2024 (UTC)
- Don't do that without evidence that {{lang-my}} is at fault. Here are examples of differently written Burmese text:
- မြန်မာအက္ခရာ ← plain text; no markup
- မြန်မာအက္ခရာ ← <span>မြန်မာအက္ခရာ</span>
- မြန်မာအက္ခရာ ← <span lang="my">မြန်မာအက္ခရာ</span>
- {{lang-my|မြန်မာအက္ခရာ}} ← {{lang-my|မြန်မာအက္ခရာ}} {{lang-my|မြန်မာအက္ခရာ}}
- For me, all of the above render correctly (win 10, chrome). Do any of the above render correctly for you?
- We have had discussions with you about fonts in the past:
- None of those discussions revealed a problem with {{lang}}, the various {{lang-xx}}, or Module:Lang.
- —Trappist the monk (talk) 13:54, 31 August 2024 (UTC)
- The results of 1 and 2 display correctly (as well as the code of all 5). lang="my" appears to be the problem, and lang-my appears to inherit that problem. — kwami (talk) 14:16, 31 August 2024 (UTC)
- It is your browser that interprets the lang="my" attribute. If it does not interpret the attribute correctly, you will get rubbish for a rendering. Here I have switched the language tags (don't do this in mainspace):
- မြန်မာအက္ခရာ ← <span lang="ja">မြန်မာအက္ခရာ</span> – lang="my" switched to lang="ja"
- မြန်မာအက္ခရာ ← <span lang="ru">မြန်မာအက္ခရာ</span> – lang="my" switched to lang="ru"
- For me, they both render correctly.
- —Trappist the monk (talk) 14:40, 31 August 2024 (UTC)
- Okay, it's my browser then. Formatting those as 'ja' or 'ru' works for me too, but 'my' ruins it. Bizarre that FF doesn't render 'my' by default. I'll look for overrides. Thanks. — kwami (talk) 15:13, 31 August 2024 (UTC)
- Huh, same for Geʽez script. A bit of tofu in the basic block; the subsequent blocks are completely tofu except for the last, the obscure Extended-B, which displays perfectly, just as the obscure Burmese block does. But in this case, changing the language setting to 'ja' or 'ru' doesn't help. — kwami (talk) 05:07, 6 October 2024 (UTC)
- I was reminded recently that it isn't the browser that maintains fonts but rather it is the operating system. When the browser wants to display something, it uses the operating system to do it. If your operating system doesn't support these fonts then no display. At Geʽez script, my browser (chrome, win 10) displays all of the unicode characters except those in Ethiopic Extended-B. These are from Ethiopic Extended-A and display correctly both as plain text and when marked up by {{lang}}:
- ꬁꬂꬃꬄꬅꬆ ← plain text
- ꬁꬂꬃꬄꬅꬆ ← <span title="Ge'ez-language text"><span lang="gez">ꬁꬂꬃꬄꬅꬆ</span></span> ← {{lang|gez|ꬁꬂꬃꬄꬅꬆ}}
- —Trappist the monk (talk) 13:56, 6 October 2024 (UTC)
- Thanks for that. What's weird is that I get the opposite results. I'd expect that anything that supports Extended-B would support anything earlier. If nothing past a certain point was displayed, I'd think I needed to update something. But it's stuff earlier than a certain point that's the problem. — kwami (talk) 21:43, 6 October 2024 (UTC)
- Help:Multilingual support (Indic) might help. – Jonesey95 (talk) 03:45, 8 October 2024 (UTC)
- Thanks. — kwami (talk) 03:50, 8 October 2024 (UTC)
{{lang-my}} has been deleted; see Wikipedia:Templates_for_discussion/Log/2024_September_27/lang-??_templates. Calls to that template in this discussion have been disabled.
—Trappist the monk (talk) 19:20, 7 November 2024 (UTC)
Block level
Is there a version of this template for use on block-level content? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:37, 2 September 2024 (UTC)
- This template. It will correctly wrap <poem>...</poem> tags, ordered, unordered, and definition lists, and content wrapped in <div>...</div> tags.
- —Trappist the monk (talk) 17:22, 2 September 2024 (UTC)
- Odd then that the opening sentence of the documentation refers to a "span of text". I'll change that. But what about simple paragraphs, singly or in multiple? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:43, 2 September 2024 (UTC)
- A span of text does not necessarily mean the html <span>...</span> tags. The term span has been used as a descriptor since the first version (permalink) of the documentation (then held at Template talk:Lang). I would suppose that had the original author (Editor Monedula) meant the html <span>...</span> tags, they would have written something to the effect:
- The purpose of this template is to indicate that text in HTML <span>...</span> tags belongs to a particular language.
- Of course, at the time, {{lang}} only supported inline text.
- Paragraphs written as normal wikipedia paragraphs are supported.
- —Trappist the monk (talk) 18:53, 2 September 2024 (UTC)
- Yes; I was saying it was odd that it had never been updated to say that it covered block level content. I have now done so. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:20, 2 September 2024 (UTC)
- I have seen Linter errors caused by the use of this template with block content. This version of my sandbox lists one missing end tag (for <p>) and at least one misnested pair of <i>...</i> tags. – Jonesey95 (talk) 22:07, 3 September 2024 (UTC)
Template-protected edit request on 5 September 2024
This edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request.
Can someone please remove the following comments from Module:lang/data?:
- ["lij"] = "Ligurian (Romance language)" on line 384, because the article for the ISO 639-3 code lij is now 'Ligurian language',
- ["lij-mc"] = "Monégasque language" on line 385, because the correct article for the ISO 639-3 code lij-mc is 'Monégasque dialect',
- ['qwm'] = "Kuman (Russia)" on line 390, because the article 'Kuman (Russia)' now redirects to 'Cumans', and because the correct article for the ISO 639-3 code qwm is 'Cuman language',
- and ["xlg"] = "Ligurian (ancient language)" on line 394, because the article for the ISO 639-3 code xlg is now 'Ligurian language (ancient)'. PK2 (talk; contributions) 04:09, 5 September 2024 (UTC)
- To do as you have asked would not have been the optimal solution.
- ["lij"] = "Ligurian (Romance language)" can be deleted because the language name for lij in Module:Lang/data/iana languages is 'Ligurian'
- ["lij-mc"] = "Monégasque language" because there is a duplicate in another table that would have caused lij-mc to link to 'Monégasque language'
- ['qwm'] = "Kuman (Russia)" can be deleted but the resulting link would be to Kuman (Russia) language from the language name for qwm in ~/iana languages: 'Kuman (Russia)'
- ["xlg"] = "Ligurian (ancient language)" can be deleted but the resulting link would be Ligurian (Ancient) language from the language name for xlg in ~/iana languages: 'Ligurian (Ancient)'
- So, I have:
- deleted ["lij"] = "Ligurian (Romance language)"
- modified ["lij-mc"] = "Monégasque language" so that it points to 'Monégasque dialect': {{lang|lij-mc|fn=name_from_tag|link=yes}} → Monégasque
- deleted ['qwm'] = "Kuman (Russia)"
- added ['qwm'] = "Cuman" to the override table
- modified ["xlg"] = "Ligurian (ancient language)" so that it points to 'Ligurian language (ancient)': {{lang|xlg|fn=name_from_tag|link=yes}} → Ligurian
- —Trappist the monk (talk) 14:35, 5 September 2024 (UTC)
Hanja
For {{lang|ko-Hani}} (supposed to be for Hanja), it renders the "traditional" characters used for Hanja as simplified characters on iOS. This seems to be undesirable; Hanja doesn't use most of the simplified characters.
For example, on iOS {{lang|ko-Hani|龜}} renders incorrectly using the simplified char (⻱). However, on Mac desktop this issue doesn't occur.
I feel like we should recommend against people using ko-Hani or ko-Hant, and just ask them to stick to ko, which doesn't have this issue. seefooddiet (talk) 04:13, 8 September 2024 (UTC)
- This is not an issue for {{lang}}. The character, no matter how it is rendered, is the same unicode character U+9F9C from the CJK Unified Ideographs unicode block. From your browser's point of view, the character is just a series of digits. Your browser and the operating system under which it is running decide which (of many) font faces is used to convert that series of digits to the character displayed on the screen. You can control that to some extent by providing the appropriate script subtag when you write a {{lang}} template but ultimately, the font face is chosen by the browser and its OS.
- I suspect that iOS has physical limitations (available memory?) that determine how many font faces are available. If I understand the tables in CJK Unified Ideographs (search for 9F9C) there are seven ways to write the character that is 9F9C – 3 Chinese, 2 Korean, 1 Japanese, and 1 other (Vietnamese?). There are 20,735 characters identified in CJK Unified Ideographs; many (most?) of those have multiple ways to write a CJK character so it would not surprise me to learn that the iOS/browser designers elected to fall back to one or two of those ways when rendering a CJK character.
- Regardless, when appropriate, we should always identify the correct script and not presume that all browsers have the same design as your iOS/browser. And who knows, perhaps at iOS v30 or whatever, the problem as you see it will have been resolved.
- —Trappist the monk (talk) 13:19, 8 September 2024 (UTC)
I'd argue you don't need script (writing system) tagging. Machines can easily identify the script by checking the code point of each character in a string.
Language tagging is needed for distinguishing different languages using the same script (e.g. English, Spanish; Russian, Bulgarian; etc.) or for distinguishing different orthographies using the same script in a language (e.g. Norwegian Bokmål/Nynorsk, Chinese simplified/traditional, etc.); it is not needed for distinguishing different scripts (Latin, Cyrillic, etc.).
Also, Hani is for text consisting of Chinese characters (hanzi, kanji, hanja) only. Hanja forms of Korean terms can also contain hangul (e.g. 서울特別市 – 서울 does not have hanja), so ko-Hani is not really appropriate anyway. I think ko is good enough. 172.56.232.227 (talk) 23:36, 8 September 2024 (UTC)
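To make the code-point argument concrete, here is a minimal Python sketch. It is not part of Module:Lang; the helper name `scripts_in` is hypothetical, and taking the first word of each Unicode character name as a script label is a shortcut assumed for brevity rather than a proper script-property lookup.

```python
import unicodedata


def scripts_in(text: str) -> set[str]:
    """Return rough script labels for the letters in text.

    The label is the first word of the Unicode character name
    (e.g. "HANGUL", "CJK", "LATIN"); this is a heuristic, not the
    official Unicode Script property.
    """
    scripts = set()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, and spaces
        name = unicodedata.name(ch, "")
        scripts.add(name.split(" ")[0] if name else "UNKNOWN")
    return scripts


# The mixed-script example from the comment above.
print(scripts_in("서울特別市"))
print(scripts_in("龜"))
```

The sketch also shows where code points stop helping: 龜 is reported only as CJK, and nothing in the code point says whether Korean, Japanese, or Chinese glyph shapes are wanted. That choice is what the language tag conveys to the browser.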
- Apparently I wasn't as clear as I ought to have been. I do not support writing es-Latn or ru-Cyrl, etc. But, for Spanish transliterated into Greek, for example, es-Grek is appropriate.
- "Hanja forms of Korean terms can also contain hangul." If I understand our article on Hanja, it is Chinese characters used to write Korean text. When that occurs, it would seem that the correct thing to do is to mark the text with ko-Hani. IANA seems to support this with this definition for Hani (see the IANA language-subtag-registry file):
%%
Type: script
Subtag: Hani
Description: Han
Description: Hanzi
Description: Kanji
Description: Hanja
Added: 2005-10-16
%%
- —Trappist the monk (talk) 14:05, 9 September 2024 (UTC)
- In fact, there is a code specifically for hangul+hanja Korean text: ko-Kore. But for some reason no one uses this on Wikipedia.
- Anyway, ko is good enough. 172.56.232.227 (talk) 04:09, 10 September 2024 (UTC)
- Oh neat, I didn't know that! Now I do, thank you. Remsense ‥ 论 06:19, 10 September 2024 (UTC)
- ko-Kore is not supported by IANA and so is not supported by this template:
%%
Type: language
Subtag: ko
Description: Korean
Added: 2005-10-16
Suppress-Script: Kore
%%
- {{lang|ko-Kore|龜}} → [龜] Error: {{Lang}}: script: kore not supported for code: ko (help)
- —Trappist the monk (talk) 06:26, 10 September 2024 (UTC)
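The registry record quoted above can be read mechanically. The Python sketch below parses one record and checks a script subtag against its Suppress-Script field. The function names are hypothetical, and the link between Suppress-Script and the error message is an inference from this thread; Module:Lang (which is written in Lua) may decide this differently.

```python
def parse_record(record: str) -> dict[str, list[str]]:
    """Parse one language-subtag-registry record into field -> values.

    Fields such as Description may repeat, so every value is kept in a list.
    """
    fields: dict[str, list[str]] = {}
    for line in record.strip().splitlines():
        line = line.strip()
        if not line or line == "%%":
            continue  # record separators carry no data
        key, _, value = line.partition(":")
        fields.setdefault(key.strip(), []).append(value.strip())
    return fields


def script_is_suppressed(fields: dict[str, list[str]], script: str) -> bool:
    """True if the script subtag matches the record's Suppress-Script (assumed rule)."""
    suppressed = (s.lower() for s in fields.get("Suppress-Script", []))
    return script.lower() in suppressed


# The record for ko as quoted in this discussion.
KO_RECORD = """\
%%
Type: language
Subtag: ko
Description: Korean
Added: 2005-10-16
Suppress-Script: Kore
%%
"""

ko = parse_record(KO_RECORD)
print(script_is_suppressed(ko, "Kore"))
print(script_is_suppressed(ko, "Hani"))
```

Under that assumed rule, Kore is flagged for ko while Hani is not, which is consistent with the error shown for {{lang|ko-Kore|龜}}.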
- Oh, that's a shame. In any case, Japanese is an analogous case as it also uses a mixed script, so simply ko would seem to suffice, with ko-Hani also usable for hanja-only text. Remsense ‥ 论 06:29, 10 September 2024 (UTC)
- Correct me if I'm wrong, but I think we're in agreement that ko-Hani is fine if it's exclusively Hanja, but if there is Korean mixed script then the more general ko is more accurate. seefooddiet (talk) 06:16, 11 September 2024 (UTC)
- Bingo! Remsense ‥ 论 07:56, 11 September 2024 (UTC)
Possible bug
At the bottom of the page, List of transgender public officeholders in the United States is in the category "Category:Articles containing Neapolitan-language text", despite not having any Neapolitan text. I'm not seeing anything labeled {{lang|nap}} or anything like that, either. Snowman304|talk 13:47, 15 September 2024 (UTC)
- That page transcludes Template:Transgender sidebar which does use that. Gonnym (talk) 14:38, 15 September 2024 (UTC)
- Gotcha! Thanks Snowman304|talk 14:47, 15 September 2024 (UTC)
Template-protected edit request on 18 September 2024
This edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request.
Can someone please categorise the template {{lang-ku}} under Category:Iranian multilingual support templates instead of Category:Indo-Iranian multilingual support templates, and the templates {{lang-bn}}, {{lang-hi}}, {{lang-ne}}, {{lang-pa}}, {{lang-sa}} and {{lang-ur}} under Category:Indo-Aryan multilingual support templates instead of Category:Indo-Iranian multilingual support templates, because the categories 'Indo-Aryan multilingual support templates' and 'Iranian multilingual support templates' are more specific than the category 'Indo-Iranian multilingual support templates'? PK2 (talk; contributions) 03:44, 18 September 2024 (UTC)
- Done. Another reason why this system of creating hundred of templates like this is horrible maintenance-wise, when one template with a language code works. Gonnym (talk) 08:25, 18 September 2024 (UTC)
merge language-specific templates
For years I've wanted to create a {{lang-??}} template that would replace all of those "hundred of templates". Alas, {{lang-xx}}, the most obvious choice for a template name, is used as a redirect to Template:Lang § Language-specific templates. One might argue that the language-specific templates need not be mentioned in Template:Lang/doc if {{lang-??}} was a template that accepted the same parameters as the language-specific templates. {{lang-x}} is used for documentation for the language-specific templates and would become superfluous if we created a single {{lang-??}} template.
We might:
- create a redirect {{language-specific templates}}
- replace all instances of {{lang-xx}} with {{language-specific templates}} so we could recover the {{lang-xx}} name
- modify Module:lang to have a lang-xx() entry point
- create {{lang-xx}} as a template that invokes the new lang-xx() entry point to Module:Lang
- create Template:lang-xx/doc from {{lang-x}}
- create language-tagged index of categories (as a new submodule?) (struck)
- replace appropriate instances of the {{lang-XX|...}} templates with {{lang-xx|XX|...}} where -xx is literal and XX is the language tag and subtags if any (not all are appropriate, {{lang-zh}} for example; there are also {{lang-XX}} templates that have been 'augmented')
- delete appropriate {{lang-XX}} templates that are supported by Module:lang (not all are appropriate)
- replace instances of {{language-specific templates}} with links to {{lang-xx}}
- delete {{language-specific templates}}
- cleanup the mess
No doubt I've missed something here, not the least of which is community approval to make this change.
—Trappist the monk (talk) 14:34, 18 September 2024 (UTC) 16:11, 18 September 2024 (UTC) +category list 17:16, 18 September 2024 (UTC) strike category list
- Yeah, that all sounds great and I support it. The past few years saw us move from instances of templates with multiple language or country versions, to one single template ({{ISO 639 name}}: TfD; {{In lang}}: TfD (part 1) and TfD (part 2); {{Globalize}}: TfD; {{Contains special characters}}: TfD; {{Wikt-lang}}: TfD), this isn't different. Another option for the name can be {{lang2}} (which currently is an unused unrelated redirect) since "lang-xx" doesn't have any semantic meaning either. Gonnym (talk) 15:24, 18 September 2024 (UTC)
- It also seems that most usages of {{lang-xx}} is from transclusions of Module:Road data/strings/doc. Gonnym (talk) 16:05, 18 September 2024 (UTC)
- I have removed all but 16 usages of {{lang-xx}}. The remaining usages appear to be generated by an error, possibly in this module. If someone wants to dig in to the remaining 16, we can free up the template name for a better use. – Jonesey95 (talk) 16:18, 19 September 2024 (UTC)
- Thanks for doing that. I am beginning to favor {{langx}}; easier to write and no pre-existing conflicts to clean up. I am working to implement {{langx}} in Module:Lang/sandbox:
- —Trappist the monk (talk) 16:59, 19 September 2024 (UTC)
- If semantic meaning is a requirement, perhaps the solution is a change to {{lang}} where we add a parameter |<something>= that causes Module:Lang to select lang(), lang_xx_inherit(), or lang_xx_italic() depending on the language tag supplied in the template call. The replacement in article space then becomes {{lang-XX|...}} → {{lang|XX|<something>=yes|...}}. I can imagine that editors won't like that so much and would want a more-or-less familiar shortcut which brings us back to {{lang-xx}} or {{lang2}} or {{langx}} or {{lang+}} or ...
- —Trappist the monk (talk) 16:11, 18 September 2024 (UTC)
- I agree. If we add too much character count it will fail. Gonnym (talk) 16:19, 18 September 2024 (UTC)
- Made this topic its own section.
- I'm going to commandeer {{langx}} for use as a testbed/demonstrator with a module in my sandbox.
- —Trappist the monk (talk) 16:44, 18 September 2024 (UTC)
- Category list. The categories listed in these various {{lang-??}} templates (like those listed at Template talk:Lang § Template-protected edit request on 18 September 2024) seem to be mostly collections of related templates (see Category:Iranian multilingual support templates as an example). Because a single {{langx}} template can't be categorized in this way there is no need to support those collection categories. I have struck it from the list.
- —Trappist the monk (talk) 17:16, 18 September 2024 (UTC)
- Yes, that's another thing that gets simplified. The category system at Category:Wikipedia multilingual support templates will get trimmed by quite a lot. Gonnym (talk) 17:39, 18 September 2024 (UTC)
- The sandbox module is pretty simple, doesn't do error checking (leaves that for _lang_xx() in Module:Lang) and chooses upright font if the language tag is listed in a table of upright tags; italic else:
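The sandbox code itself is not reproduced here. As a rough Python sketch of the selection rule just described (the real module is written in Lua; the table contents and the names `UPRIGHT_TAGS` and `langx_style` are hypothetical), the choice might look like this:

```python
# Hypothetical sample of tags rendered upright; the real table is
# maintained by the module and is much longer.
UPRIGHT_TAGS = {"he", "ja", "ko", "zh"}


def langx_style(tag: str) -> str:
    """Pick the font style for a language tag.

    Upright if the tag is listed in the upright table, italic otherwise.
    No validation is done here; in the description above, error checking
    is left to _lang_xx() in Module:Lang.
    """
    return "upright" if tag.lower() in UPRIGHT_TAGS else "italic"


print(langx_style("es"))
print(langx_style("he"))
```

A plain set lookup keeps the wrapper trivial, which matches the stated intent of leaving all real work to the existing rendering code.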
- I suppose that the next thing to do is to hack on Module:Lang/sandbox so that it can support both {{langx}} and {{lang-??}}. That will be necessary if or when we transition from the one to the other. I think that we ought to leave support for {{lang-??}} in the module so that the ~155 wikis that use it can adapt to the change in their own time.
- —Trappist the monk (talk) 18:45, 18 September 2024 (UTC)
- Module:Episode list, Module:Nihongo, and Module:Lang/utilities will need to be adjusted if we transition to {{langx}}.
- —Trappist the monk (talk) 19:11, 18 September 2024 (UTC)
- Yes, I agree with leaving in the lang-?? support. I think maybe a note should be added to its documentation that this usage is the deprecated method. Gonnym (talk) 19:12, 18 September 2024 (UTC)
- I had thought to enforce the deprecation by testing the value returned in lines 27 & 28; if en and calling _lang_xx, return an error message.
- —Trappist the monk (talk) 19:33, 18 September 2024 (UTC)
- That's also a good idea. Gonnym (talk) 21:17, 18 September 2024 (UTC)
- Module:Lang/sandbox now supports {{langx}}. For the most part, {{langx}} uses the code already present for the {{lang-??}} templates. But because of that, it is necessary that all of the testcases in Module:Lang/testcases pass. As of this writing, they do.
- Because many of the {{lang-??}} templates render text in an upright font, I created Module:Lang/langx which holds several tables so that Module:Lang/sandbox can render the {{langx}} template identically to the corresponding {{lang-??}}. For example:
- {{lang-es|casa}} → {{lang-es|casa}} → {{lang-es|casa}}
- {{langx|es|casa}} → [[Spanish language|Spanish]]: <i lang="es">casa</i> → Spanish: casa
- {{lang-he|עִבְרִית}} → {{lang-he|עִבְרִית}} → {{lang-he|עִבְרִית}}
- {{langx|he|עִבְרִית}} → [[Hebrew language|Hebrew]]: <span lang="he" dir="rtl">עִבְרִית</span> → Hebrew: עִבְרִית
- In my sandbox (permalink), all of the {{lang-??}} templates are compared to their {{langx}} counterparts. There are two {{lang-??}} templates that do not match. Both of those wrap the module invoke with <span class="Unicode">...</span>: {{lang-mrj}} and {{lang-sty}}. I have a vague memory that suggests that it was once necessary to use that class but with certain changes to MediaWiki, the requirement for the Unicode class was removed. If I do remember correctly, these two templates might be edited to remove the class span. If, for some reason, the class is required, Module:lang and Module:lang/langx can be modified to support it for mrj and sty.
- At this writing, Category:Lang-x templates lists 1156 {{lang-??}} and other templates. This number includes the 10 templates in Category:Constructed language multilingual support templates but, rightly, does not include the 14 templates listed in Category:Lang-x templates with other than ISO 639.
- Of the 1156, four are redirects. The remaining 13 templates are not supported by {{langx}} though some might be:
- {{Language with name}} – an older, manual form of {{langx}}; calls {{lang}} to do text rendering
- {{lang-ku-Arab}} – uses {{language with name}} and {{script/Arabic}}
- {{lang-mnc}} – uses {{language with name}} and has support for two types of transliteration; might be converted to lang-xx and {{langx}}? Manchu language; would need to add transliteration identifiers to Module:Lang/data
- {{lang-kmr}} – uses {{language with name}}; can be converted to lang-xx and {{langx}}? Kurmanji Kurdish language (since converted to Module:Lang 2024-09-22)
- {{lang-grc-gre}} – invalid language tag; Module:Lang does not support legitimate IETF extlang subtags; gre is not a legitimate extlang subtag
- {{lang-ka}} – uses {{language with name}} and {{ka-translit}}
- {{lang-tdd}} – uses {{language with name}}; can be converted to lang-xx and {{langx}}? Tai Nuea language (since converted to Module:Lang 2024-09-22)
- {{lang-prk}} – uses {{language with name}}; can be converted to lang-xx and {{langx}}? Parauk language (since converted to Module:Lang 2024-09-22)
- {{lang-zh}} – uses Module:Lang-zh
- {{lang-wbm}} – uses {{language with name}}; can be converted to lang-xx and {{langx}}? Vo language (since converted to Module:Lang 2024-09-22)
- {{lang-su-fonts}} – wraps invoke in <span lang="su" style="font-family:'Noto Sans Sundanese', 'Sundanese Unicode 2013'; font-size:;">...</span> tag
- {{lang-rus}} – uses {{language with name}} and {{IPA}}
- {{spell-nv}} – uses {{lang}} wrapped in <span style="font-family: Aboriginal Sans, DejaVu Sans, Calibri, Arial Unicode MS, sans-serif;">...</span> tag; does not belong in Category:Lang-x templates
- Many (most?) of the {{lang-??}} templates include category links in their wikitext. Most appear to use some form of 'Category:<language name> multilingual support template' category. Leaving out those categories, this search returns 13 templates with some sort of category link. These are:
- {{lang-el}} – Category:Instances of Lang-el using second unnamed parameter; {{2}} is the transliteration parameter so why do we care? (template modified and category deleted 2024-09-24)
- {{lang-sh}} – Category:Articles using Lang-sh with second positional parameter; Serbo-Croatian gives equal standing to Cyrillic and Latin scripts so the 'transliteration' parameter is inappropriate for rendering both scripts; for this, use {{lang-x2}} or the more specific {{lang-sh-Cyrl-Latn}} or {{lang-sh-Latn-Cyrl}}
- {{lang-ko}} – Category:Korean name templates; may be a Korean template but not tied to Korean names exclusively (moved to Category:Korean language templates 2024-09-24)
- {{lang-okm}} – same
- {{lang-sco}} – Category:Lang-x templates with unsupported parameters; |abbr= documented but not supported; no evidence of use in the wild except ~/testcases (fixed 2024-09-23)
- {{lang-mn}} – Category:Mongolian language; redundant to the Module:Lang-created category: Category:Articles containing Mongolian-language text
- {{lang-eu}} – Category:Basque templates; if these {{lang-??}} templates go away (as they should) nothing lost
- {{lang-euq}} – Category:Basque templates; same
- {{lang-pap}} – Category:Papiamento; same
- {{lang-xno}} – Category:Anglo-Norman literature; template is not literature; as above (moved to Category:Norman language 2024-09-24)
- {{lang-cdo}} – Category:Min Dong Chinese support templates; only item in category; both can go away (category deleted 2024-09-24)
- {{lang-clm}} – Category:Klallam multilingual support templates; shouldn't be in this list but is because the category link has a tab character after the colon; apparently no cat of that name with or without the extra whitespace (fixed 2024-09-23)
- {{lang-ykg}} – Category:Yukaghir languages; not a language (template removed from category 2024-09-24)
- I think that Module:Lang/sandbox is ready to go. Yea or nay? If yea then we need to consider the process of retiring the 1100+ {{lang-??}} templates.
- —Trappist the monk (talk) 18:55, 21 September 2024 (UTC)
- I'll of course support it. After the replacement of the templates, and deletion, it will be easier to spot the leftover templates that need additional work (like the ones in your list above). Gonnym (talk) 22:12, 21 September 2024 (UTC)
- This sounds like an excellent simplification. I am sure that we will run into a few stumbles along the way, but edge cases can either be dealt with or left behind as exceptions to the general "no lang-xx" templates practice. – Jonesey95 (talk) 20:41, 22 September 2024 (UTC)
- Right then, Module:Lang updated from Module:Lang/sandbox so {{langx}} is live; Template:Langx/doc created from a hacked version of Template:Lang-x/doc (needs work; see the TODOs there).
- —Trappist the monk (talk) 22:26, 22 September 2024 (UTC)
- I have converted {{lang-kmr}}, {{lang-tdd}}, {{lang-prk}}, and {{lang-wbm}} in the list above to directly use Module:Lang.
- —Trappist the monk (talk) 01:01, 23 September 2024 (UTC)
- Is there anything I can do to help us get to the point where we are ready with replacement? Gonnym (talk) 17:17, 25 September 2024 (UTC)
- I suspect that we're ready with the exceptions of writing a bot to do the replacements and deciding how to word a TfD that will provoke the fewest editors into reaching for their pitchforks and at the same time gain the greatest support. You are far more experienced with TfD and what succeeds there than I, so if you have a suggestion for that... Do we need to mark all 1140-ish templates with TfD notices? If so, that will also need doing; could probably do that with AWB – of course that is the sort of thing that brings out the torches and pitchforks...
- —Trappist the monk (talk) 18:47, 25 September 2024 (UTC)
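For the bot replacement mentioned above, the core rewrite is small. The Python sketch below is only an illustration under stated assumptions: it handles calls that have at least one parameter, the `SKIP` set is drawn from the exceptions listed earlier in this thread and is not a complete list, and the names `LANG_XX`, `SKIP`, and `to_langx` are hypothetical. A real bot would need a proper wikitext parser and review of edge cases.

```python
import re

# Matches the opening of {{lang-XX|...}} where XX is a 2-3 letter primary
# tag with optional subtags. Templates such as {{lang-x2}} do not match.
LANG_XX = re.compile(r"\{\{\s*[Ll]ang-([A-Za-z]{2,3}(?:-[A-Za-z0-9]+)*)\s*\|")

# Tags whose templates were noted in this discussion as needing special
# handling (illustrative and incomplete; compared in lowercase).
SKIP = {
    "zh", "rus", "ka", "mnc", "ku-arab", "grc-gre",
    "su-fonts", "sh-cyrl-latn", "sh-latn-cyrl",
}


def to_langx(wikitext: str, skip: set[str] = SKIP) -> str:
    """Rewrite {{lang-XX|...}} openings to {{langx|XX|...}}, leaving skipped tags alone."""
    def repl(match: re.Match) -> str:
        tag = match.group(1)
        if tag.lower() in skip:
            return match.group(0)  # keep the original call untouched
        return "{{langx|" + tag + "|"
    return LANG_XX.sub(repl, wikitext)


print(to_langx("{{lang-es|casa}} and {{lang-zh|c=龜}}"))
```

Only the opening of each call is rewritten, so the remaining parameters pass through unchanged, which relies on {{langx}} accepting the same parameters as the template it replaces.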
- We do need to tag all templates, but |type=disabled (in my opinion) should be used so the pages aren't completely full of TfD everywhere (which usually gets people angry).
- Regarding the TfD text...well I'm sure we're going to get the angry mob no matter what. But a few things we can mention
- Character count: (langx|??) 8 vs (lang-??) 7, so only one character more, a non-issue
- Removing the need for:
- Maintenance: Adding categories
- Fixes: Template:Lang-tdd, Template:Lang-prk, Template:Lang-clm has had a broken category for over two years
- Creating new language templates (many are still being created and left unused): Template:Lang-mtm, Template:Lang-nio, Template:Lang-ohu examples of recently created templates
- Watching 1500+ templates for vandalism (do you maybe have an example?)
- Past experience with similar nominations {{ISO 639 name}} (TfD), {{In lang}} (TfD (part 1) and TfD (part 2)), {{Wikt-lang}} (TfD) has proven that it works.
- Gonnym (talk) 19:03, 25 September 2024 (UTC)
- Yeah, all of those TfD notices do cause anger. But, editors will also get angry for not being notified when all of a sudden a bot shows up to rewrite all of the {{lang-??}} templates. Is there a middle ground? Perhaps we show the TfD notice on one or two of the templates with significant use? Perhaps {{lang-de}} (~40280 articles) or {{lang-ru}} (~95500 articles)? Or, we can choose four or five templates that aren't used as much?
- How about this? Too long; won't read? Too technical? Too detailed? Too...? reworked 18:51, 26 September 2024 (UTC)
{{lang-??}} templates listed at [[]] with the single template {{langx}}.
The {{lang-??}} templates are all more-or-less forks of some ancient ancestor. Like {{lang}}, the primary purpose of these templates is to render non-English text in a way that is html-correct and compliant with the Manual of Style. {{langx}} uses the same rendering code as the {{lang-??}} templates so, given the same language tag and text, renders an identical output:
{{lang-es|casa}} → {{lang-es|casa}} → {{lang-es|casa}}
{{langx|es|casa}} → [[Spanish language|Spanish]]: <i lang="es">casa</i> → Spanish: casa
Like {{lang}}, {{langx}} supports all of the 8000+ languages listed in the IANA language-subtag-registry file. {{lang-??}}: one template for one language; {{langx}}: one template for 8000 languages.
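To make the proposed replacement concrete, here is a minimal Python sketch of the kind of rewrite a bot might perform. It is not the code of any actual bot; the function name `to_langx`, the regular expression, and the contents of `EXCLUDED_TAGS` are all assumptions made for illustration (the excluded tags are simply ones mentioned in this discussion as wrapper or custom-label templates).

```python
import re

# Hypothetical exclusion list: tags whose {{lang-??}} templates are wrappers
# or use custom labels, so they are left alone (illustrative only).
EXCLUDED_TAGS = {"ka", "bcs", "mo-Cyrl"}

# Matches the opening of a {{lang-??|...}} call and captures the language tag.
LANG_XX = re.compile(r"\{\{\s*[Ll]ang-([A-Za-z0-9-]+)\s*\|")


def to_langx(wikitext: str) -> str:
    """Rewrite {{lang-??|...}} openings as {{langx|??|...}}, skipping excluded tags."""

    def repl(match: re.Match) -> str:
        tag = match.group(1)
        if tag in EXCLUDED_TAGS:
            return match.group(0)  # leave wrapper templates untouched
        return "{{langx|" + tag + "|"

    return LANG_XX.sub(repl, wikitext)
```

Only the template name and first parameter change; everything after the first pipe is passed through untouched, which is the point of the "same language tag and text" claim above. A real bot would also have to cope with redirects, odd whitespace, and nested templates, none of which this sketch attempts.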
Background
|
---|
The
For editors who need another language template, their options til now have been:
Previous TfDs related to language tagging:
|
- —Trappist the monk (talk) 11:57, 26 September 2024 (UTC); reworked 18:51, 26 September 2024 (UTC)
- Maybe put the sections relevant to the nomination together (the start and the end), and then add in the technical behind-the-scenes information for anyone wanting a more detailed explanation of the differences.
There are approximately 1145 lang-?? templates; all more-or-less forks of some ancient ancestor. Like {{lang}}, the primary purpose of these templates is to render non-English text in a way that is html-correct and compliant with the Manual of Style. {{langx}} uses the same rendering code as the lang-?? templates so, given the same language tag and text, renders an identical output: -examples- Like {{lang}}, {{langx}} supports all of the 8000+ languages listed in the IANA language-subtag-registry file. lang-??: one template for one language; {{langx}}: one template for 8000 languages.
Gonnym (talk) 17:10, 26 September 2024 (UTC)
- Reworked some. Better? Worse? Still too...?
- —Trappist the monk (talk) 18:51, 26 September 2024 (UTC)
- I think this looks good. @Jonesey95 thoughts? Gonnym (talk) 19:00, 26 September 2024 (UTC)
- My experience with TFD is that sometimes biting off a huge chunk creates too much drama and backfires. I would pick ten simple, widely used templates from the set (i.e. don't pick any edge cases; pick ones that will definitely convert cleanly), explain briefly what you propose to replace them with, and then say that if it works, you will bring the remaining 1,100 templates back to TFD using the ten-template TFD as a basis for consensus. Link to this discussion. Doing the process in two phases will probably take less time and cause less drama than trying to do it in one phase. – Jonesey95 (talk) 19:06, 26 September 2024 (UTC)
- I guess I don't like this option. It means multiple TfDs which can have multiple outcomes. In the case where the TfD results in merge/delete for some but not for others (because different crowds of editors) there is nothing to prevent the recreation of those templates that were merged/deleted unless we salt those templates. What a headache. For me, I would rather the proposal be accepted or rejected as a whole; {{langx}} will still be here regardless of the outcome (unless someone does a successful reverse TfD to replace/delete {{langx}}).
- I can see that it might be necessary to have something to offer so that the enacting of an affirmative would be done in stages rather than as one giant bot run to replace all {{lang-??}} templates. But, that need not be offered from the outset but withheld until needed. But this option is different from Editor Jonesey95's option.
- —Trappist the monk (talk) 22:06, 26 September 2024 (UTC)
- Go for it. I will support either direction. – Jonesey95 (talk) 22:13, 26 September 2024 (UTC)
- I also prefer, in this case, a complete nomination. Gonnym (talk) 22:19, 26 September 2024 (UTC)
- You didn't answer my "Reworked some. Better? Worse? Still too...?" question...
- —Trappist the monk (talk) 23:05, 26 September 2024 (UTC)
- I said I think it looks good like that. It explains what we are doing and why and then lets editors who want to read into it more do so. Gonnym (talk) 08:13, 27 September 2024 (UTC)
- @Trappist the monk can you look at the /doc? I'm not sure |code= is correct for this template. Not sure if the others are also still relevant. Gonnym (talk) 23:13, 29 September 2024 (UTC)
- |1= and |code= should not be listed separately; they are the same thing in {{langx}}. If both are supplied, |1= wins. |code= is not set by the template (that is only for {{lang-??}}). If |code= is used, then the positional parameters for text (and transliteration and translation if needed) must use their named parameters |text=, |translit=, and |translation= or their explicitly stated numerical aliases |2=, |3=, |4=.
- —Trappist the monk (talk) 00:21, 30 September 2024 (UTC)
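The precedence rule described above can be sketched in a few lines of Python. This is only a model of the stated behaviour, not the module's code: the function name `resolve_langx_args` is hypothetical, and the choice to let the numbered alias win over the named parameter for text, transliteration and translation is an assumption (the comment above only states that |1= wins over |code=).

```python
def resolve_langx_args(args: dict) -> dict:
    """Model how {{langx}} parameters might be resolved from a dict of raw args.

    Keys are parameter names as strings, e.g. "1", "code", "text", "2".
    """
    return {
        # Stated in the discussion: |1= and |code= are the same; |1= wins.
        "code": args.get("1") or args.get("code"),
        # Assumption: numbered aliases take precedence over named parameters.
        "text": args.get("2") or args.get("text"),
        "translit": args.get("3") or args.get("translit"),
        "translation": args.get("4") or args.get("translation"),
    }
```

For example, `{{langx|code=es|text=casa}}` and `{{langx|es|casa}}` resolve to the same code and text under this model.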
Break
I've come across Template:Lang-az-Cyrl and Template:Lang-lmo-IT that aren't in Category:Lang-x templates (not sure why) and aren't in the TfD nomination. Should they be? --Gonnym (talk) 09:05, 30 September 2024 (UTC)
- {{Lang-az-Cyrl}} is one of a couple of dozen that escaped the search. I will add them as an addendum this morning.
- {{Lang-lmo-IT}} is a wrapper around {{Language with name}} for the Bergamasque dialect of Lombard so not a 'language' per se. Not currently supported by Module:Lang except by the tag lmo-IT which the module knows as Lombard ({{Language with name}} uses {{lang}} for html compliance).
- —Trappist the monk (talk) 11:38, 30 September 2024 (UTC)
- Looking at Template:Lang-bcs, it links (via a MoS invalid redirect) to Serbo-Croatian. That has the code of "hbs". Shouldn't that template also be part of the regular list then? Gonnym (talk) 14:39, 3 October 2024 (UTC)
- bcs is the IANA language tag for Kohumono, a Nigerian language, so nothing to do with Serbo-Croatian. {{lang-bcs}} uses a custom label and is a wrapper template. The initial list and the addenda are only supposed to list those templates that directly invoke Module:lang because those templates comprise the vast majority of {{lang-??}} templates. After we dispose of those templates, what remains can be dealt with case-by-case.
- —Trappist the monk (talk) 15:07, 3 October 2024 (UTC)
- Ah ok, I didn't even check if bcs was an actual code for something else. How am I not surprised. Gonnym (talk) 15:21, 3 October 2024 (UTC)
- I've converted Template:Lang-lmo-cr to use the standard style and it seems to be working correctly. That can also be added to the list. Gonnym (talk) 14:47, 3 October 2024 (UTC)
- Ah, the dialect isn't recognized and needs a manual label, so doesn't fit. Gonnym (talk) 14:51, 3 October 2024 (UTC)
- I have created Category:Pages using Lang-xx templates to collect pages from all namespaces that use a {{lang-??}} template that calls one of the module functions lang_xx_inherit() or lang_xx_italic(). This category can be used as a resource for a bot when (if) the TfD closes in the affirmative. A category is better than a cirrus search which, for the {{lang-??}} templates, times out.
- —Trappist the monk (talk) 14:01, 5 October 2024 (UTC)
- Nice work. The category will also catch all sorts of edge case formatting of template parameters that searches (or searchers) have a difficult time finding. A note: if categories are working as they have historically, this category may take days or weeks to fill up. Expect to be dealing with stragglers for a while. Individual templates can and should always be checked via "What links here" before they are deleted. – Jonesey95 (talk) 16:25, 5 October 2024 (UTC)
- Which is why I created the category now, before the TfD is done and before I take Monkbot 20 to WP:BRFA. Of course this all assumes that the TfD closes in the affirmative.
- —Trappist the monk (talk) 18:55, 5 October 2024 (UTC)
- WP:BRFA filed: Wikipedia:Bots/Requests for approval § Monkbot 20
- —Trappist the monk (talk) 14:46, 7 October 2024 (UTC)
- I have tweaked Module:Lang and Module:Lang/langx to emit a category link whenever {{langx}} has one of the language tags from a template listed at Wikipedia:Templates for discussion/Log/2024 September 27/lang-?? templates § excluded templates. See Category:Langx uses unsupported language tag. The purpose of this is to help (over?) enthusiastic editors to not convert the excluded templates and for the rest of us to know where to look. Pages in that category are sorted by language tag.
- —Trappist the monk (talk) 15:45, 13 October 2024 (UTC)
- Good idea. I wish the BRFA had already been given a trial so your bot could have already handled the replacement. Gonnym (talk) 15:53, 13 October 2024 (UTC)
- Even had it been through BRFA, there isn't a chance that it could have got through the 13,777 pages in Category:Pages using Lang-xx templates. I have been manually editing thousands of pages using the bot's code and have hardly made a dent. I think that I have fixed all of the category and file namespace pages and (so far) managed to keep the total count to just below 600,000 ... the category is still being populated.
- —Trappist the monk (talk) 16:12, 13 October 2024 (UTC)
- I've restored the usages of lang-ka in the category. I think the category is catching false positives. Your edit here seems to be correct as {{Lang-sh}} is tagged, but maybe the code is catching {{Lang-sh-Cyrl-Latn}} or one of the others. Gonnym (talk) 08:40, 14 October 2024 (UTC)
- Fixed. {{lang-sh}} sets |script=Latn which, in the module, gets appended to the language subtag: sh → sh-Latn. {{lang-sh-Latn}} does the same. When it came time to test the language tag against the list of unsupported language tags, both {{lang-sh}} and {{lang-sh-Latn}} triggered the category. Fixed by testing the unmodified language tag, sh, against the list of unsupported language tags. This should be adequate because most editors who are tweaking {{lang-??}} to {{langx}} won't change the ?? portion of the template.
- —Trappist the monk (talk) 14:00, 14 October 2024 (UTC)
Moldovan Cyrillic
An editor moved {{Lang-mo-Cyrl}} to {{Moldovan Cyrillic}}, which broke the documentation nicely. Many of the transclusions were replaced with the new name. It may need some special attention during the migration. – Jonesey95 (talk) 14:23, 9 October 2024 (UTC)
- That template is excluded from the TfD because it uses a custom label.
- —Trappist the monk (talk) 14:29, 9 October 2024 (UTC)
a way to mark something as being in multiple languages
Maybe this is pie-in-the-sky, or a different matter entirely, but it would be nice if there were a way to mark something as being in multiple languages, e.g., Czech and Slovak from Chort: A chort (Russian: чёрт, Belarusian and Ukrainian: чорт, Serbo-Croatian čort or črt, Polish: czart and czort, Czech and Slovak: čert, Slovene: črt) Snowman304|talk 19:12, 18 September 2024 (UTC)
- Not in these templates. The primary purpose of these templates is to provide correct html markup for non-English text. html allows only one lang= attribute per tag. Which one of these multiple languages would apply? Browsers use this attribute to choose a proper font; screen readers use the attribute to control pronunciation. Do Belarusians and Ukrainians pronounce 'чорт' the same way? If not then that suggests that a different way of writing that lead sentence should be preferred.
- —Trappist the monk (talk) 19:43, 18 September 2024 (UTC)
- Gotcha, I wasn't thinking about those things at all. Snowman304|talk 21:08, 18 September 2024 (UTC)
Italics in foreign-language text
I'm struggling with what to do with foreign-language text containing italic text while following default rules on foreign-language italicization. Specifically, I'm working on Template:Translated blockquote. The default rules are described at Template:Lang#Automatic italics and defined at Module:Lang#L-996.
Option | Source | Issue |
---|---|---|
{{lang|fr|Je suis un clown nommé ''Maurice''|italic=unset}} | Category:Lang and lang-xx template errors | Doesn't use the default italicization |
{{lang|fr|Je suis {{noitalic|English}}.}} | Template:Lang#Automatic italics | Uses Template:Noitalic, when the content should invert italics relative to the surrounding text. |
tûndra | Template:Lang#italic parameter | Doesn't use the default italicization |
I have edited Template:Lang/with italics (permalink) as a proof-of-concept that can accept the following kinds of markup:
Markup | Renders as |
---|---|
{{Lang/with italics|en|Some text}} | Some text |
{{Lang/with italics|en|Some <i>italic</i> text}} | Some italic text |
{{Lang/with italics|fr|Je suis française.}} | Je suis française. |
{{Lang/with italics|fr|Je ''suis'' française.}} | Je suis française. |
{{Lang/with italics|he|לעז}} | לעז |
{{Lang/with italics|he|''לעז''}} | [לעז] Error: {{Lang}}: text has italic markup (help) |
My implementation is really klunky, so this isn't an edit request. It just seemed easier for me to implement in the template rather than the Lua module.
Questions:
- Why doesn't Template:Lang accept italics in its text, as Template:Lang/with italics does?
- What do you recommend I do with Template:Translated blockquote? At the moment, it uses |italic=invert. It could use Template:Lang/with italics by a more permanent name, eg Template:Lang/with italics.
Daask (talk) 20:08, 18 September 2024 (UTC)
- {{lang}} emits errors because in the beginning of this module's life, there were a bunch of {{lang|es|''casa''}}, holdovers from the time that Latn-script text had to be manually italicized. This doesn't happen so much anymore now that editors have learned the 'new' way. But, this italics prohibition brought with it the problem of what to do with mixed italic/upright text. The solution to that was |italic=unset and |italic=invert. So far as I know, there has been no call for any other options.
- What is wrong with using |italic=invert? Does it not do what you need doing?
- —Trappist the monk (talk) 21:47, 18 September 2024 (UTC)
- @Trappist the monk: The |italic=default only italicizes Roman-script text, whereas |italic=invert always italicizes the text, regardless of script.
- Eg. {{Lang|italic=invert|he|לעז}} → לעז vs. {{Lang/with italics|he|לעז}} → לעז
- Daask (talk) 13:32, 19 September 2024 (UTC)
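The distinction being drawn here (default italics for Roman-script text only) can be approximated in Python by inspecting the text itself. This is an illustrative approximation, not how Module:Lang makes the decision; the function names are hypothetical, and relying on Unicode character names is an assumption of this sketch.

```python
import unicodedata


def is_latin_text(text: str) -> bool:
    """True when every alphabetic character is a Latin-script letter."""
    letters = [ch for ch in text if ch.isalpha()]
    # unicodedata.name() returns e.g. 'LATIN SMALL LETTER C WITH CEDILLA'.
    return bool(letters) and all(
        unicodedata.name(ch, "").startswith("LATIN") for ch in letters
    )


def default_italic(text: str) -> bool:
    """Model of the default rule: italicize Roman-script text only."""
    return is_latin_text(text)
```

Under this model French text is italicized by default, while Hebrew or Cyrillic text stays upright unless a parameter such as |italic=yes or |italic=invert overrides it.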
- Maybe we could add an option |allow-italics=yes to omit error messages about italics within the text? Daask (talk) 13:39, 19 September 2024 (UTC)
- On second thought, Category:Lang and lang-xx template errors is empty except for a citation template issue, so I suggest the Template:Lang/with italics behavior be made the default. These error messages are no longer necessary. Daask (talk) 13:41, 19 September 2024 (UTC)
- I disagree. These italics errors do still appear. The template is responsible for styling the rendered non-English text so it considers italic markup an error unless the editor has explicitly directed the template to allow the markup.
- —Trappist the monk (talk) 14:44, 19 September 2024 (UTC)
- Yes: "|italic=default only italicizes Roman-script text" – this determination happens at lines 996–1003; see also lines 94–135.
- The purpose of invert is to flip italicized text within upright text so that you get upright text within italicized text. This is a completely bogus example because the English text should never be marked up as Hebrew:
- {{Lang|italic=invert|he|some italic text followed by inverted Hebrew text ''לעז'' and then some more italic text}}
- some italic text followed by inverted Hebrew text לעז and then some more italic text
- So, the module inverts everything to the opposite markup:
- some italic text followed by inverted Hebrew text ''לעז'' and then some more italic text
- some italic text followed by inverted Hebrew text לעז and then some more italic text
- becomes:
- ''some italic text followed by inverted Hebrew text ''לעז'' and then some more italic text''
- some italic text followed by inverted Hebrew text לעז and then some more italic text
- If there is no italic markup, |italic=invert is the same as |italic=yes as you demonstrated in your example. Conversely, when there is only italicized text:
- {{Lang|he|''לעז''|italic=invert}}
- לעז
- Your example:
- {{Lang/with italics|he|''לעז''}} → [לעז] Error: {{Lang}}: text has italic markup (help)
- can be achieved with any of these:
- {{Lang|he|לעז|italic=yes}} → לעז
- {{Lang|he|לעז|italic=invert}} → לעז
- {{Lang|he|''לעז''|italic=unset}} → לעז
- These |italic= parameter values are working as they are intended to work.
- —Trappist the monk (talk) 14:44, 19 September 2024 (UTC)
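The effect of |italic=invert described above can be modelled in Python by splitting the text on the wiki italic marker and flipping each run. This is a simplified model, not the module's implementation; the function names are hypothetical, and the sketch deliberately ignores bold (''') markup and unbalanced markers.

```python
def split_italics(text: str) -> list:
    """Split wikitext on '' markers into (segment, is_italic) pairs."""
    parts = text.split("''")
    # Odd-numbered pieces sit between a pair of '' markers, so they are italic.
    return [(seg, i % 2 == 1) for i, seg in enumerate(parts) if seg]


def invert_italics(text: str) -> list:
    """Flip every run: upright becomes italic and italic becomes upright."""
    return [(seg, not italic) for seg, italic in split_italics(text)]
```

Text with no markup comes back as a single italic run (the same result as |italic=yes), and text that is wholly wrapped in '' comes back as a single upright run, matching the two cases worked through above.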
- @Trappist the monk: I have currently set Template:Translated blockquote to use Template:Lang/with italics, because I see no way to use Template:lang. I need the default behavior (which Template:Lang/with italics detects via Template:lang/italicize), but I also need to omit error messages. I apologize for being overly bold in suggesting that the error messages are no longer useful, but I need a means to omit them. Daask (talk) 14:55, 19 September 2024 (UTC)
- Daask, I think you should not implement any italics for Cyrillic until you get sufficient consensus to overturn Wikipedia:Manual of Style/Text formatting, in particular, MOS:BADITALICS. Is this sandboxed now? If so, please do not release it until a wider discussion has been had about it. Mathglot (talk) 01:12, 5 October 2024 (UTC)
- @Mathglot: I'm confused by your comment. {{Lang/with italics}} and {{Lang}} use the same default italicization. {{Lang/with italics}} just omits the error messages when the text contains manual italicization. I have no intention of proposing changes to WP:MOS related to this topic. Can you give an example of your concern? Daask (talk) 12:50, 8 October 2024 (UTC)
- Ah, I now see your comments in § Is it applied to transliterarions? and think I understand your concern. You want to ensure that {{Lang/with italics}} enforces MOS:BADITALICS by throwing errors on manual italicization of non-Roman scripts. I created it because {{Lang}} was throwing errors on italicization in French text, but I see that my examples included italicized non-Roman text, which are not acceptable. I'll adjust the template accordingly momentarily. Daask (talk) 13:01, 8 October 2024 (UTC)
- Yes, that's what I meant. Mathglot (talk) 15:03, 9 October 2024 (UTC)
Is it applied to transliterarions?
Please see Talk:Kompromat#Why_is_the_word_so_small?. Two issues: (2) the complaint about font size and (1) my question: is the usage {{lang|ru|Kompromat}} (Kompromat) valid, or does only {{lang|ru|компромат}} make sense? --Altenmann >talk 23:54, 4 October 2024 (UTC)
- Please use {{lang|xx-Latn}} or {{tlit}} for transliterations: IETF codes assume the "native" script with a bare language code, so ru assumes Cyrillic (i.e. explicitly ru-Cyrl). Using the transliteration template {{tlit|ru}} would tag as ru-Latn (i.e. Russian written using the Latin alphabet).
- So, {{lang|ru|Компромат}} {{tlit|ru|Kompromat}} → Компромат Kompromat. If you have any questions lmk Remsense ‥ 论 23:59, 4 October 2024 (UTC)
- Thx, great. But what about the complaint in Talk:Kompromat#Why_is_the_word_so_small?? --Altenmann >talk 00:11, 5 October 2024 (UTC)
- I've responded there. Remsense ‥ 论 00:20, 5 October 2024 (UTC)
- Not great at all; the italics make me start off seeing it as Cyrillic italic. I read {{tlit|ru|Kompromat}} → Kompromat as the italicized version of the unpronounceable mish-mash Котротаъ, which rendered in italics almost looks like the word under discussion (here rendered on two lines, to illustrate the problem):
- Котротаъ – fake word 'Котротаъ' in italics
- Kompromat – from {{tlit|ru|Kompromat}} → Kompromat – real Russian word Компромат, romanized
- See the problem? Makes me backtrack and reparse. The ' r ' is a clue, but it depends how clear my glasses are, and what time it is. Isn't there a guideline somewhere about not italicizing Cyrillic? There's a good reason for that. Mathglot (talk) 01:06, 5 October 2024 (UTC)
- Found it: MOS:BADITALICS. Mathglot (talk) 01:13, 5 October 2024 (UTC)
Georgian italics
In Langx, Georgian (code "ka") is currently italicized by default but shouldn't be, per WP:FOREIGNITALIC. — Goszei (talk) 22:54, 12 October 2024 (UTC)
- Use {{lang-ka}}. That template is not one that will be converted to {{langx}} because it is based on {{language with name}} and also uses {{ka-translit}}.
- {{lang-ka|ქართული ენა}} → Georgian: ქართული ენა
- I expect in a future version of {{langx}} to implement the same auto-italic code that is used by {{lang}}:
- {{lang|ka|ქართული ენა}} → ქართული ენა
- If you are seeing editors switching {{lang-ka|...}} to {{lang|ka|...}}, please ask them to stop.
- —Trappist the monk (talk) 23:24, 12 October 2024 (UTC)
Lua error in Module:Lang at line 1422: attempt to concatenate a nil value
This error shows on the page Wicked City (1987 film). 118.3.227.103 (talk) 15:40, 13 October 2024 (UTC)
- Ping Trappist the monk (last editor), it also shows up at MOS:FORITA. Sam Sailor 15:54, 13 October 2024 (UTC)
- Wow, that was fast! 118.3.227.103 (talk) 16:14, 13 October 2024 (UTC)
Error when displaying Japanese text
I don't know if this is the right place for a bug report, but instead of the Japanese text and romaji equivalent I get this message: "Lua error in Module:Lang at line 1422: attempt to concatenate a nil value.".
The text was displaying correctly until I clicked on the donate button with the scroll-wheel (which opened the page in a new tab). Now any page I go on has this error message instead of the Japanese text, even when I refresh or close and reopen a page.
I am using Firefox and Ecosia. Luu-meer (talk) 15:44, 13 October 2024 (UTC)
Tracking categories
Could you add the following tracking categories to the module?
- Unknown parameters (Module:Check for unknown parameters or manually)
- When langx is used with |label=none, since that usage should just be converted to lang instead.
Gonnym (talk) 08:30, 14 October 2024 (UTC)
- We might do an unsupported parameters test in future.
- We might create a maint cat for |label=none, but:
- {{lang-es|casa|lit=house|label=none}} → {{lang-es|casa|lit=house|label=none}}
- {{langx|es|casa|lit=house|label=none}} → casa, 'house'
- {{lang|es|casa|lit=house}} → [casa] Error: {{Lang}}: invalid parameter: |lit= (help)
- There is a set of parameters that are common to both {{lang}} and {{langx}}: |code=, {{{1}}}, |text=, {{{2}}}, |rtl=, |italic=, |italics=, |i=, |size=, |proto=, |nocat=, |cat=
- We must not categorize any {{langx}} with |label=none that uses parameters not supported by {{lang}}: |translit=, |translit-std=, |translit-script=, |translation=, |lit=, {{{4}}}, |label=, |link=, |script=, |region=, |variant=, |engvar=
- —Trappist the monk (talk) 14:27, 14 October 2024 (UTC)
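The categorization rule above amounts to a set comparison, sketched here in Python. The parameter names come from the list in the comment above; the function name `convertible_to_lang` and the decision to ignore |label= itself when checking (since |label=none is what triggers the check) are assumptions of this sketch, not the module's behaviour.

```python
# Parameters common to {{lang}} and {{langx}}, as listed in the discussion.
LANG_PARAMS = {
    "code", "1", "text", "2", "rtl", "italic", "italics",
    "i", "size", "proto", "nocat", "cat",
}


def convertible_to_lang(params: dict) -> bool:
    """True when a {{langx}} call with |label=none uses only {{lang}} parameters."""
    if params.get("label") != "none":
        return False
    # Assumption: |label= is excluded from the check because it is the trigger.
    others = set(params) - {"label"}
    return others <= LANG_PARAMS
```

A call such as `{{langx|es|casa|label=none}}` would qualify for the maintenance category under this model, while `{{langx|es|casa|lit=house|label=none}}` would not, because |lit= has no {{lang}} equivalent.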
- It's great that you still know the ins and outs of this module. I really only thought the difference between these two templates is the existence of a label, not that langx has other unique features. Gonnym (talk) 14:32, 14 October 2024 (UTC)
- Related to this, is the fact that these features aren't offered for lang a deliberate decision? Gonnym (talk) 14:35, 14 October 2024 (UTC)
- If a deliberate decision was taken, I was not party to it. My goal when creating Module:Lang was to provide a uniform support structure for as many {{lang-??}} templates as possible. The commonalities between {{lang}} and the {{lang-??}} were not considered except to reuse code that supports both.
- Before you suggest it, I'm not interested in thinking about expanding the {{lang}} parameter set; too much else going on right now. Let us first finish consolidating {{lang-??}} into {{langx}}.
- —Trappist the monk (talk) 14:52, 14 October 2024 (UTC)
- I suggested earlier that "[we] might do an unsupported parameters test in future". I have implemented that:
- {{lang/sandbox|ar|نص العنصر النائب|script=Arab}} → [نص العنصر النائب] Error: {{Lang}}: invalid parameter: |script= (help)
- |script=, |region=, and |variant= parameters are not supported by {{lang}} because those IETF subtags can/should be part of the language tag. The same should also apply to {{langx}} but because we didn't think about this while Monkbot/task 20 was running we may be stuck with these as {{langx}} parameters. On the other hand, these searches:
- suggest that we might deprecate these parameters. We could write an awb script to create proper IETF language tags from the |code=/|script=/|region=/|variant= subtag parameters and then remove support for them. I have added |script=/|region=/|variant= parameter detection to {{langx}} which will add a maintenance message and Category:Langx deprecated parameters when any of these parameters are used:
- {{langx/sandbox|es|region=419|Casa}} → [Casa] Error: {{Langx}}: invalid parameter: |region= (help)
- Once cleared, that maint category and message go away to be replaced with the invalid parameter error message.
- For completeness, these searches for {{lang}} with |script=, |region=, and |variant= parameters:
- (all articles returned by these searches will end up in Category:Lang and lang-xx template errors)
- —Trappist the monk (talk) 17:15, 13 November 2024 (UTC)
- Is the fix for the above to move the value from the script/region/variant to the code area? So ar-Arab? Gonnym (talk) 18:34, 13 November 2024 (UTC)
- Yes – except that Arab is not a valid script subtag for ar:
- {{lang/sandbox|ar-Arab|نص العنصر النائب}} → [نص العنصر النائب] Error: {{Lang}}: script: arab not supported for code: ar (help)
- but:
- {{lang/sandbox|ar|Placeholder text|script=Latn}} → {{lang/sandbox|ar-Latn|Placeholder text}} → <span title="Arabic-language text"><i lang="ar-Latn">Placeholder text</i></span> → Placeholder text
- —Trappist the monk (talk) 18:52, 13 November 2024 (UTC)
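Folding the separate subtag parameters into a single IETF language tag, as discussed above, could be sketched in Python like this. The function name `build_tag` is hypothetical and this is not the AWB script mentioned above; the subtag order (language, script, region, variant) follows BCP 47, and no validation is attempted, so a combination the module rejects (such as ar-Arab) would still be produced here.

```python
def build_tag(code: str, script: str = "", region: str = "", variant: str = "") -> str:
    """Join language, script, region and variant subtags in BCP 47 order."""
    return "-".join(part for part in (code, script, region, variant) if part)
```

So `{{langx|ar|Placeholder text|script=Latn}}` would become `{{langx|ar-Latn|Placeholder text}}`, and `{{langx|es|region=419|Casa}}` would become `{{langx|es-419|Casa}}`.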
- Is |rtl= used by langx or does it detect automatically the languages that use that? Gonnym (talk) 16:53, 14 November 2024 (UTC)
- When certain scripts are specified (see here), {{lang}} and {{langx}} will apply dir="rtl":
- {{langx|es-Arab|text}} → [text] <span style="color:#d33">Error: {{Langx}}: Latn text/non-Latn script subtag mismatch ([[:Category:Lang and lang-xx template errors|help]])</span> → [text] Error: {{Langx}}: Latn text/non-Latn script subtag mismatch (help)
- When certain languages are specified (see here), {{langx}} will apply dir="rtl":
- {{langx|ydg|text}} → [[Yidgha language|Yidgha]]: <i lang="ydg" dir="rtl">text</i> → Yidgha: text
- These lists are not comprehensive. I suspect that dir="rtl" is rarely actually needed except in cases where the browser gets confused (ltr digits mixed with rtl text) so the ~/langx mechanism can probably go away; maybe the ~/data list too.
- —Trappist the monk (talk) 17:43, 14 November 2024 (UTC); 18:59, 14 November 2024 (UTC) fix ~/langx link
lang-en
{{langx}} shouldn't say "The non-English text to display." when |en| is allowed (as it should, since lang-en is being merged with it). Or at least "Text" shouldn't be a "Required field" as I can put "Literal translation". Web-julio (talk) 03:31, 19 October 2024 (UTC)
- The English Wikipedia is written in English so there is relatively little need to use any of {{lang}}, {{lang-en}}, or {{langx}} to mark up English-language text.
- The primary purpose of any of the lang templates is to properly construct html markup around non-English text so that browsers and screen readers know how to display or speak non-English text. At {{lang-en}} is this (preserved here because someday {{lang-en}} and its subpages will be deleted):
Because this is English Wikipedia, the facts that a) the content is in English by default, and that b) the word "English" refers to the English language, are generally taken to be understood. Unlike many multilingual support templates, this template does not link the language name by default. To activate the link, add the |link=yes parameter.
In most cases, there is no reason to use this template, unless you have a specific technical need for it. This template exists principally as a placeholder for interwiki purposes.
Legitimate use almost always involves automation. The vast majority of needed uses of this template are cases where {{lang-xx}} has values, possibly including en, inserted for xx automatically by software tools such as templates and bots.
Some editors would also include using it in lists and tables that are using other such templates (e.g. {{lang-es}} for Spanish) to provide multiple translations of something, where consistency of output is desirable. However even in these cases it is better to use plain text, because {{lang-en|foo}} is three characters longer than simply English: foo and wastes performance on template parsing. That said, the form {{lang-en|foo}} could be useful in such a table in a linguistics or language usage article, where a link to English language could be genuinely relevant in the context.
It is rarely ever useful in ordinary article prose. Instead, for translating a foreign word, use {{gloss}}:
{{lang-es|casa}}, {{gloss|house}}
giving
- {{lang-es|casa}}, {{gloss|house}}
rather than:
{{lang-es|casa}}, {{lang-en|'house'}}
giving
- {{lang-es|casa}}, {{lang-en|'house'}}
which is pointless on en.wikipedia.org.
- That, in whole or in part, should perhaps be included (as a note?) in the {{langx}} documentation. The {{langx}} doc might also be tweaked to incorporate parts of the {{lang}} documentation.
- I don't know what you mean by "Text" shouldn't be a "Required field" as I can put "Literal translation". Explain?
- —Trappist the monk (talk) 14:26, 19 October 2024 (UTC)
- For example, cases like {{lang|es|casa}} ({{langx|en|house}}) → casa (English: house) are a very understandable use of lang-en/langx|en. Web-julio (talk) 03:37, 21 October 2024 (UTC)
Template | Languages | Scripts | Transliterations | Translation | Labels |
---|---|---|---|---|---|
{{Hani}}
|
Any | — | — | No | No |
{{CJKV}}
|
Yes | Always | |||
{{lang-zh}}
|
Chinese |
|
|
Yes | Optional |
{{Nihongo}}
|
Japanese | Japanese writing system[a] | Hepburn | Yes | Optional |
{{Nihongo2}}
|
Japanese | Japanese writing system[a] | — | No | No |
{{Korean}}
|
Korean |
|
Yes | Optional | |
{{Hanja}}
|
Korean | Hanja | — | No | Always |
{{Vi-nom}}
|
Vietnamese | Chữ Nôm | — | No | No |
{{Lang}}
|
Any | Any | Any | No | No |
{{Langx}}
|
Any | Any | Any | Yes | Optional |
- ^ a b c No parameter for giving a kana transcription; mixed orthography can be used.
- ^ A single "Korean" parameter—suitable for giving a Hangul transcription of a written word used in multiple languages, but not transcribing hanja in a Korean-specific context.
- ^ A single "Vietnamese" parameter—suitable for giving a transcription of a written word used in multiple languages, but not transcribing in a Vietnamese-specific context.
--HarJIT (talk) 13:54, 25 October 2024 (UTC)
Typo in "Langx |italic= parameter operation" section
In the Italic=value (last section of table), in the second entry, we see {{Langx|ru|''тундра''|italic=}invert}}. There appears to be an extra right-brace right after "italic=". Tarl N. (discuss) 13:19, 28 October 2024 (UTC)
- I didn't realize my request was going to Template talk:Lang. The typo I'm referring to is in the Template:Langx section "Langx |italic= parameter operation". Why does the talk page for langx drop one here? Tarl N. (discuss) 02:21, 31 October 2024 (UTC)
- That error was fixed with this edit. This talk page is the centralized discussion page for several related templates and modules.
- —Trappist the monk (talk) 02:52, 31 October 2024 (UTC)
- Ah, thanks. Tarl N. (discuss) 00:26, 1 November 2024 (UTC)
Missing languages
We need the ability to feed languages outside ISO, such as Old West Norse, Old East Norse, Old Swedish, Early Modern Swedish, Late Modern Swedish, etc. Blockhaj (talk) 08:27, 30 October 2024 (UTC)
- No we do not, in my opinion. Remsense ‥ 论 09:19, 30 October 2024 (UTC)
- Your reasoning? Why limit ourselves? Blockhaj (talk) 10:34, 30 October 2024 (UTC)
- Same reason as always: it serves insufficient concrete benefit to editors or readers, while increasing technical, conceptual, and potentially epistemological complexity. At this level of diachronic granularity, whose schemas are we meant to use? There's a reason ISO took on the task of producing a standard for this to begin with, wouldn't you agree? Remsense ‥ 论 10:44, 30 October 2024 (UTC)
- I disagree with the argument "insufficient concrete benefit to editors or readers". Current limits are limiting in a bad way. I feel we should instead strive for commonality with Wiktionary, whose expanded schemas I propose we use. Blockhaj (talk) 11:23, 30 October 2024 (UTC)
- As you are locking in the argument that there are concrete issues to be solved, would you mind directly articulating what they are? Remsense ‥ 论 22:20, 30 October 2024 (UTC)
- If there is sufficient need, we can create IETF private use tags for languages not directly supported in the IANA language-subtag-registry file. The list of currently supported private-use tags is at Template:Lang § Private-use language tags.
- Language templates based on Module:Lang will not adopt the mishmash of nonstandard tags that are supported at wiktionary.
- —Trappist the monk (talk) 13:15, 30 October 2024 (UTC)
extra params?
|anglicization= / |anglisation= and |romanization= / |romanisation= would be useful. |translation=, |transliteration=, and |lit= provide a translation, transliteration, and literal meaning; but if something has an older anglicization (e.g. Crackow), that should also be available, as should a romanized form that differs from the transliteration, whether because of some oddball or non-English choices in letter/character use, or because the language uses both Latin and non-Latin script, making the Latin-script version not a transliteration; likewise for reducing extended Latin alphabets to basic Latin alphabetic forms. -- 65.92.246.77 (talk) 11:32, 30 October 2024 (UTC)
Private-use language tags
I propose the addition of the following private-use tags:
- Old East Norse: non-x-east
- Old Norwegian: nor-x-old
- Middle Norwegian: nor-x-middle
- Old West Norse: non-x-west
- Old Swedish: swe-x-old
- Early Modern Swedish: swe-x-ems
- Late Modern Swedish: swe-x-lms
- Middle Danish: dan-x-middle
- Modern Danish: dan-x-modern
Blockhaj (talk) 17:18, 1 November 2024 (UTC)
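As a rough check of the form of the tags proposed above, here is a small Python sketch. It assumes the shape used by tags mentioned elsewhere on this page (lmo-x-berg, et-x-seto, fr-x-frainc): a two- or three-letter primary language subtag, then x, then private-use subtags of one to eight alphanumeric characters each. The function name is_private_use_tag is made up for illustration, and the check says nothing about whether Module:Lang actually supports a given tag or whether the primary subtag is the right one.

```python
import re

# Simplified shape: 2-3 letter primary language subtag, the singleton "x",
# then one or more private-use subtags of 1-8 alphanumerics each.
PRIVATE_USE_RE = re.compile(r"[a-z]{2,3}-x(-[a-z0-9]{1,8})+")


def is_private_use_tag(tag: str) -> bool:
    """Return True when tag looks like <language>-x-<private use> (hypothetical helper)."""
    return PRIVATE_USE_RE.fullmatch(tag.lower()) is not None


for candidate in ("non-x-east", "swe-x-ems", "dan-x-middle", "fr-gallo"):
    print(candidate, is_private_use_tag(candidate))
```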
tracking sr usage with issues
@Trappist the monk I noticed {{lang-sr}} was deleted after the bot replaced its usage, but it also had a couple of semantic problems previously discussed at Template talk:Lang-sr and Talk:Romanization of Serbian that were never resolved:
- a lot of text is marked as just "Serbian" but we don't know if it's Latin, in which case it should be italicized, or if it's Cyrillic, in which case it shouldn't
- for example the lead section of Belgrade has:
- Serbian: Београд / Beograd
- and the latter part of that fails MOS:FOREIGNITALIC
- its third parameter was sometimes used to show the other script, but would mark it as "romanization", which may or may not be good - when discussing 500-year-old sources it's probably fine, but when discussing something from the last 50 years it's basically very weird
- for example as it was before this fix:
- and there is no "romanization" in the latter half of the 20th century, the company's name in Latin was of the same significance as its name in Cyrillic
How can we address these now with langx? Can we get at least some tracking categories if these symptoms are detected, so they can be checked? --Joy (talk) 09:54, 8 November 2024 (UTC)
- If this is such a problem, why wasn't {{lang-sr}} deleted long ago? Didn't we create {{lang-x2}}, {{lang-sr-Cyrl-Latn}}, and {{lang-sr-Latn-Cyrl}} specifically to address this issue? Also, {{lang-sr-Cyrl}} and {{lang-sr-Latn}}?
- This crude search finds about 4900 articles that use {{langx|sr|...}} and this crude search finds about 1500 articles that have {{langx|sr|<parameter>|<another parameter>|...}}. <another parameter> could be a named parameter or a 'transliteration'.
- I am opposed to one-off special-case code. Module:Lang/langx has a list of unsupported language tags. Use of {{langx}} with any of those tags adds the page to Category:Langx uses unsupported language tag. I will add sr to that list. In future, some of the currently unsupported language tags will be converted to supported private use tags. After that, I expect that the module will be tweaked so that the remaining unsupported language tags will cause the module to emit error messages.
- —Trappist the monk (talk) 14:26, 8 November 2024 (UTC) 15:19, 8 November 2024 (UTC) additional templates
- I would guess the reason is that nobody in the know really wanted to create a TfD that would have required a check and possibly a change to 5k articles when lang-sr can be perfectly fine if the input text is only one Cyrillic parameter. We don't want to emit error messages to readers for that. How can we best manage this process of converting to different tags?
- BTW I also noticed that the old template had code to add Category:Instances of Lang-sr using second unnamed parameter since 2016, so the removal of this part is a bit of a regression. --Joy (talk) 16:55, 8 November 2024 (UTC)
- The day after I created Module:Lang/langx, I made myself a TODO-note wondering if {{langx}} couldn't auto-italicize in a manner similar to {{lang}}. Sometime later I wrote a hack to do just that. I have moved that hack into Module:Lang/sandbox. What the {{langx/sandbox}} renderings look like compared to the live {{lang}} and {{langx}} template renderings can be seen in this version of my sandbox (permalink). The hack should probably be rewritten so that Module:Lang will work for those other-language wikis that don't / won't support {{langx}}. Any {{lang-??}} templates that remain after the conversion will need to be checked to ensure that they continue to work as they were intended.
- —Trappist the monk (talk) 20:44, 8 November 2024 (UTC)
- OK so if I read that right, overall the outcome would be that Serbian Latin would be italicized, and combinations still need manual interventions en masse? --Joy (talk) 21:41, 8 November 2024 (UTC)
- Of course, but you knew that. The new:
{{langx|sr|Београд / Beograd|lit=White City}}
- is just as broken as the old:
{{lang-sr|Београд / Beograd|lit=White City}}
- which is why you wrote {{lang-sr-Cyrl-Latn}} and its companions:
- I imagine that you might write an awb script that is sufficiently clever to create {{lang-sr-Cyrl-Latn}} from {{langx|sr|Београд|Beograd|White City}}. Mayhaps even from {{langx|sr|Београд / Beograd|lit=White City}}.
- —Trappist the monk (talk) 23:01, 8 November 2024 (UTC)
- Okay, but none of this addresses my original point - how do we find them first. This issue may affect sr, sh, cnr and uz IIRC, can't we just have a tracking category for this whole class of lang-x2 languages? --Joy (talk) 10:19, 9 November 2024 (UTC)
- Did I not suggest how to find articles that use {{langx|sr|...}}? Repeating the second of those suggested searches here with similar searches for the other three language tags:
- I am opposed to special-case code.
- —Trappist the monk (talk) 19:41, 9 November 2024 (UTC)
- I mean we can genericize it even further - Category:Articles containing Serbian-language text shows 20.5k, why wouldn't we simply distinguish those 1.5k... and in turn why not have a tracking category for labeled vs. not labeled for each language. Is there a particular cost to having two subcategories instead of just one? --Joy (talk) 21:33, 9 November 2024 (UTC)
- I have written a simple awb script that trawls the search results above and lists those articles that have {{langx}} templates that are candidates for conversion to {{lang-<tag>-Cyrl-Latn}}. I have put the four lists in your user space; see User:Joy/candidate articles for lang-xx-Cyrl-Latn.
- —Trappist the monk (talk) 19:05, 10 November 2024 (UTC)
Next steps?
Nice job with clearing and deleting all the templates from the TfD!
From the left over templates, we have
- those at Category:Lang-x templates with other than ISO 639. I think that if we aren't planning on deleting them, then we should support them with the private use range.
- templates with IPA support like Template:Lang-rus. Can we add |ipa= support built-in to the module?
Another question which I have is regarding the script templates at Category:Script–font templates. If font support is needed for specific languages, why don't we support it via the module? Is the text less clear with us not always using it? Are some of these outdated with newer Unicode support?
Regarding #Tracking categories, I think making the difference between lang and langx only being the label is the right way to handle this, as the label=no situation is not only unnecessary code in text, but it also disables all other labels. Gonnym (talk) 10:11, 12 November 2024 (UTC)
- Of the templates originally in Category:Lang-x templates with other than ISO 639, several have been converted to be usable by {{lang}} and {{langx}}:
- Template:Lang-ast-leo → Leonese: text → added ast-ES language tag (used internally by {{lang}}) to Module:Lang/data
- Template:Lang-az-Arab → [text] Error: {{Langx}}: Latn text/non-Latn script subtag mismatch (help) → az-Arab is a properly formed IETF language tag that was ignored by wrapped {{Language with name}} template
- Template:Lang-fr-gallo → Gallo: text → added fr-gallo to Module:Lang/data
- Template:Lang-fra-que → Quebec French: text → added fr-CA to Module:Lang/data
- Template:Lang-ku-Cyrl → [text] Error: {{Langx}}: Latn text/non-Latn script subtag mismatch (help) → ku-Cyrl is a properly formed IETF language tag; converted from {{Language with name}}
- Template:Lang-lmo-IT → Bergamasque: text → added lmo-x-berg to Module:Lang/data
- Template:Lang-oc-gascon → Gascon: text → oc-gascon is a properly formed IETF language tag ignored by wrapped {{Language with name}}
- which leaves us with these:
- Template:Lang-1ca – Old Anatolian Turkish is a defunct Turkic language; private use tag might be possible: trk-x-oldanat; don't know if trk is the right base tag
- Template:Lang-est-sea – Seto is a dialect of Estonian; private use tag might be possible: et-x-seto
- Template:Lang-fra-frc – private use tag might be possible: fr-x-frainc
- These are not languages so we probably ought not support them with Module:Lang; that being the case, these templates don't belong in Category:Lang-x templates with other than ISO 639:
- Template:Lang-sq-definite – definiteness is a linguistic construct
- Template:Lang-uniturk – Uniform Turkic Alphabet is a writing system
- Template:Lang-vi-chunom – chữ Nôm is a writing system; applies custom styling with {{Vi-nom}}
- Template:Lang-vi-hantu – chữ Hán is a writing system; applies custom styling with {{Vi-nom}}
- I don't currently have an opinion about styling templates. I suspect that there are editors who will demand styling because they prefer the styled form over the default form:
- I suspect that there would be a deal of work to be done were we to attempt to consolidate the various scripts and their (sometimes) attendant css files.
- I don't really understand what you mean by your #Tracking categories comment. And, if that comment was a continuation of that other discussion, doesn't the comment belong there?
- —Trappist the monk (talk) 19:06, 12 November 2024 (UTC)
- Nice, good job again on shortening the list!
- Regarding 1ca, looking at the article, trk seems the most correct.
- est-sea's language in the article seems a bit confusing. It says it's a South Estonian but that article's infobox does not list Estonian as a parent (the lead does though). Its most recent parent according to the infobox is Võro language. Not sure if et-x-seto is the most correct.
- fr-x-frainc seems good.
- Regarding script templates, I thought the reason was not just visible preference but because it renders it correctly, but maybe that isn't true, or always true. I think though that it's probably better for the wiki if we use consistent fonts so we don't have instances of the above Hebrew translation which look different, even on the same page. It will also make for shorter code on the pages themselves if we don't need to apply the script template manually.
- Some templates that use script and currently can't be merged: Template:Lang-ku-Arab
- Others:
- Template:Lang-ka and Template:Transl-grc usages can be converted if we support an automatic transliteration (|auto=yes or something), which will call their respective templates (if they exist).
- Template:Lang-rus can be converted if we support |IPA= or if we remove support of IPA from outside that template. In general though, I don't think it's smart of us to have lang-rus around as that's an opening for yet another batch of templates created in similar style.
- Gonnym (talk) 10:59, 13 November 2024 (UTC)
auto italics for {{langx}}
At present, {{langx}} uses a list of language tags scraped from those now deleted {{lang-??}} templates that called lang_xx_inherit(). That function sets the initial rendering style for a {{lang-??}} template to upright. The list of tags is in Module:Lang/langx at lines 1–536 (permalink).
In the sandbox, I have adapted the auto italics code used by {{lang}} so that we aren't limited by the hard-coding in the inherit_t list. Serbian is a good example. That language gives equal status to Cyrillic and Latin text. Currently, the live version of {{langx|sr|<text>}} renders <text> in an upright font regardless of script. The proposed sandbox version renders Cyrillic <text> in an upright font and Latin <text> in an italic font. {{lang}} renderings here for reference:
- српски језик ← {{lang|sr|српски језик}}
- Serbian: српски језик ← {{langx|sr|српски језик}}
- Serbian: српски језик ← {{langx/sandbox|sr|српски језик}}
- srpski jezik ← {{lang|sr|srpski jezik}}
- Serbian: srpski jezik ← {{langx|sr|srpski jezik}}
- Serbian: srpski jezik ← {{langx/sandbox|sr|srpski jezik}}
Without objection, I shall update the live version of the module to support auto italics.
—Trappist the monk (talk) 23:28, 12 November 2024 (UTC)
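For readers who want the auto-italics idea in concrete terms, here is a minimal Python sketch. It is not the Module:Lang code (which is Lua); the function names is_latn and render are made up, and "wholly Latn" is approximated by checking that every letter's Unicode character name starts with LATIN.

```python
import html
import unicodedata


def is_latn(text: str) -> bool:
    """Approximate 'wholly Latn script': every letter has a LATIN Unicode name."""
    letters = [ch for ch in text if ch.isalpha()]
    return bool(letters) and all(
        unicodedata.name(ch, "").startswith("LATIN") for ch in letters
    )


def render(tag: str, text: str) -> str:
    """Italic <i> for Latin-script text, upright <span> for everything else."""
    element = "i" if is_latn(text) else "span"
    return f'<{element} lang="{html.escape(tag)}">{html.escape(text)}</{element}>'


print(render("sr", "srpski jezik"))
print(render("sr", "српски језик"))
```

The point of the sketch is only that the upright/italic decision comes from the text itself rather than from a hard-coded list of language tags.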
- Good idea on making the source of information of both styles the same. Gonnym (talk) 11:03, 13 November 2024 (UTC)
lang error that currently can't be fixed within the template
At Adoptionism#Ebionites (and I've seen this issue in many other places), the code used is {{lang|hbo|אביונים|ebyonim}}; this produces an error as {{lang}} does not support transliteration. This can be fixed by changing to use {{langx}}, however the label it will produce for the language isn't wanted there. |label=none can be used, but then it also removes the label for the romanization, which is wanted there. One can remove the transliteration outside the template, but that just defeats the purpose of the template.
What should happen in my opinion, and I've said this somewhere in one of the above sections, is that {{lang}} and {{langx}} should have the same secondary features regarding transliteration and literal translation, with the difference being that Langx produces a language label and Lang does not (but does produce labels for the other parts). Gonnym (talk) 17:00, 14 November 2024 (UTC)
Broken usage of langx
I'm not sure how this template works, but this page is complaining about a missing parameter "p", and I'm not sure how to fix it. x42bn6 Talk Mess 18:24, 15 November 2024 (UTC)
- The page was calling {{lang-ru}} with |p=. The template has been deleted, so I don't know if |p= (for "pronunciation", possibly) was a valid parameter. An admin will be able to check. – Jonesey95 (talk) 18:51, 15 November 2024 (UTC)
- Some history – I didn't go back to the very beginning:
- changed from {{lang-ru}} to {{lang-rus}} at this edit – {{lang-rus}} supports the |p= parameter
- changed from {{lang-rus}} to {{lang-ru}} at this edit – {{lang-ru}} ignored the unsupported |p= parameter
- changed from {{lang-ru}} to {{langx|ru|...}} at this edit – {{langx}} ignored the unsupported |p= parameter until just a day or so ago; now it emits an error message when editors give it parameters that it does not support.
- —Trappist the monk (talk) 19:00, 15 November 2024 (UTC)
- So it looks like one possible fix is to change the template transclusion back to {{lang-rus}}. Or is that creating more work in the future? This error is present in other articles, such as Denis Cheryshev. – Jonesey95 (talk) 19:17, 15 November 2024 (UTC)
- For now changing is the fix. I did however propose that we either disentangle the unsupported features from -rus or add support for them so other languages can use. There is really almost no reason at all for any specific-language template to stay after the creation of langx. Gonnym (talk) 19:24, 15 November 2024 (UTC)
- Pending more granular tracking categories or sorting within the category, an insource search shows 63 articles with this particular error. Most appear to be using lang|ru, but at least a few are using lang|zh, which I have not investigated. – Jonesey95 (talk) 14:32, 16 November 2024 (UTC)
- It looks like there is also an error message with "sc", which presumably refers to script. Mellk (talk) 13:35, 22 November 2024 (UTC)
- Thanks, but it is not necessary for you to report each instance of unknown parameters causing error messages. They are all collected in Category:Lang and lang-xx template errors which at present lists 659 pages.
- —Trappist the monk (talk) 13:57, 22 November 2024 (UTC)
- Since this is related to lang-rus, the issue is not just "p=". Mellk (talk) 14:06, 22 November 2024 (UTC)
- The 'issue' is {{lang}} and {{langx}} with parameters that are not known to those templates. The issue is not confined to {{lang-rus}} or {{lang-zh}} templates that have been improperly changed to {{lang}} or {{langx}}. Here are searches that are not parameter specific for both templates:
- {{lang}} ~680 articles
- {{langx}} ~190 articles
- Yep, there is a lot of junk out there. You still don't need to make a report here for every subgroup of errors that you encounter out there.
- —Trappist the monk (talk) 14:43, 22 November 2024 (UTC)
- I did not plan to make a report for every error. I also did not say that the errors are confined to lang-rus (that is pretty obvious when the search above showed that it was not just ru). I was referring to the fix suggested above. Mellk (talk) 14:59, 22 November 2024 (UTC)
- I think it is also possible to move pronunciation to the IPA template. I was under the impression that lang-rus would eventually be replaced, but it seems like this is not the case yet? Mellk (talk) 09:38, 22 November 2024 (UTC)
Lang error category without error message?
Church Slavonic is in Category:Lang and lang-xx template errors, but I am unable to find a red error message. Maybe I just can't see it. – Jonesey95 (talk) 19:30, 15 November 2024 (UTC)
Do you see it here:[a]
[ⱌⱃⰽⰲⰰⱀⱁⱄⰾⱁⰲⱑⱀⱄⰽⱜ ⰵⰸⰻⰽⱜ] Error: {{Langx}}: invalid parameter: |script= (help)
Fixing the deprecated |script= parameter (cu → cu-Glag) resolves the problem.[a]
Croatian Church Slavonic: ⱌⱃⰽⰲⰰⱀⱁⱄⰾⱁⰲⱑⱀⱄⰽⱜ ⰵⰸⰻⰽⱜ, romanized: crkavnoslověnskь jezikь
- ^ Croatian Church Slavonic: ⱌⱃⰽⰲⰰⱀⱁⱄⰾⱁⰲⱑⱀⱄⰽⱜ ⰵⰸⰻⰽⱜ, romanized: crkavnoslověnskь jezikь
It has been a while, but I've seen these before and if my failing memory is correct, always associated with {{efn}}
. I was never able to figure out why the invalid error message gets sandwiched into and corrupts the maintenance message.
—Trappist the monk (talk) 20:04, 15 November 2024 (UTC)
- No, I do not see an error message in this talk page section. Maybe my custom CSS is suppressing it? When I inspect the page, I see:
<span class="lang-comment" style="font-style: normal; display: none; color: #33aa33; margin-left: 0.3em;">{{langx}} uses deprecated parameter(s) </span>
Note the display:none. – Jonesey95 (talk) 14:33, 16 November 2024 (UTC)
- I can see error messages above now, and in the 20 October 2024 version of Church Slavonic. This appears to be resolved. – Jonesey95 (talk) 18:47, 22 November 2024 (UTC)
Use in headers
If there is non-English text in section headers, should we use this template? E.g. == Hello ({{lang|ko|안녕}}) ==
seefooddiet (talk) 23:31, 15 November 2024 (UTC)
- Isn't this a question for the appropriate WP:MOS talk page? Templates and wikilinks are discouraged in section headings; see MOS:HEADINGS. {{lang}} is a template and, unless |nocat=yes is set, will create a category wikilink. I can imagine that we could make {{lang}} subst-able in a way that it knows that it is being subst'd so won't emit a category. Once subst'd you'd end up with a header that looks like this:
== Hello (<span title="Korean-language text"><span lang="ko">안녕</span></span>) ==
- I don't know if there are any rules regarding html markup in headings so posing your question elsewhere would be a good idea. Start at WT:MOS?
- —Trappist the monk (talk) 00:24, 16 November 2024 (UTC)
text/script mismatch
I've been picking away at Category:Langx deprecated parameters and noticed multiple instances of {{lang}} and {{langx}} templates where <text> does not match the script specified by the script subtag. For example, this:
{{langx|tly-Latn|Фәхрәддин Әбосзодә}} → [Фәхрәддин Әбосзодә] Error: {{Langx}}: Non-latn text (pos 1)/Latn script subtag mismatch (help)
In that template, <text> is clearly not Latn script but {{langx}} doesn't notice and so incorrectly renders <text> in italic form.
So, in the sandbox, I've fixed that, at least partially. To support auto-italics, Module:Lang evaluates <text> to see if it is wholly Latn script. When it is not, <text> is rendered upright (unless overridden by |italic=). Since we know that <text> is or is not Latn script, we can check the script subtag (if present) to see that it is appropriate. In the example above, the Cyrillic <text> does not match the -Latn subtag.
Conversely, when <text> is Latn script, a mismatch exists when the script subtag is not -Latn:
{{langx|tly-Cyrl|Text}} → [Text] Error: {{Langx}}: Latn text/non-Latn script subtag mismatch (help)
Again {{langx}} does not notice so <text> is incorrectly rendered in upright form.
Fixed in the ~/sandbox:
{{langx/sandbox|tly-Latn|Фәхрәддин Әбосзодә}} → [Фәхрәддин Әбосзодә] Error: {{Langx}}: Non-latn text (pos 1)/Latn script subtag mismatch (help)
{{langx/sandbox|tly-Cyrl|Text}} → [Text] Error: {{Langx}}: Latn text/non-Latn script subtag mismatch (help)
Same applies to {{lang}} so:
{{lang/sandbox|tly-Latn|Фәхрәддин Әбосзодә}} → [Фәхрәддин Әбосзодә] Error: {{Lang}}: Non-latn text (pos 1)/Latn script subtag mismatch (help)
{{lang/sandbox|tly-Cyrl|Text}} → [Text] Error: {{Lang}}: Latn text/non-Latn script subtag mismatch (help)
Without objection, I shall implement this in the live module.
—Trappist the monk (talk) 16:15, 17 November 2024 (UTC)
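The mismatch check described above can be sketched in Python roughly as follows. This is an illustration only, not the Module:Lang implementation: the helper names are made up, the script subtag is found with a simplified scan of the tag (first four-letter alphabetic subtag before any singleton), Latin text is detected via Unicode character names, and the messages are shortened versions of the ones shown above (no position reporting).

```python
import unicodedata
from typing import Optional


def is_latn(text: str) -> bool:
    """Approximate 'wholly Latn script': every letter has a LATIN Unicode name."""
    letters = [ch for ch in text if ch.isalpha()]
    return bool(letters) and all(
        unicodedata.name(ch, "").startswith("LATIN") for ch in letters
    )


def script_subtag(tag: str) -> Optional[str]:
    """Return the script subtag of an IETF-style tag, if any (simplified parse)."""
    for subtag in tag.split("-")[1:]:
        if len(subtag) == 1:  # a singleton such as "x" ends the ordinary subtags
            return None
        if len(subtag) == 4 and subtag.isalpha():
            return subtag.title()
    return None


def script_mismatch(tag: str, text: str) -> Optional[str]:
    """Return an error message when text and script subtag disagree, else None."""
    script = script_subtag(tag)
    if script is None:
        return None  # nothing to compare against
    latn = is_latn(text)
    if latn and script != "Latn":
        return "Latn text/non-Latn script subtag mismatch"
    if not latn and script == "Latn":
        return "Non-Latn text/Latn script subtag mismatch"
    return None


print(script_mismatch("tly-Latn", "Фәхрәддин Әбосзодә"))
print(script_mismatch("tly-Cyrl", "Text"))
```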
Category renames
Now that almost all lang-xx have been deleted, the categories should be renamed to "Lang and langx".
Also, Template:My has ended in deletion, so if the bot can help with that replacement it would be great. Gonnym (talk) 16:20, 18 November 2024 (UTC)
- Switching {{my}} to {{lang}} is outside of the Monkbot/task 20 remit. One might write an awb task to do the job though I notice that there are others already doing the work. Unless {{my}} lingers for longer than it should (don't know how long that is) I guess I wouldn't worry about it.
- Yeah, categories should be renamed. I suppose that can happen at any time so long as it happens at about the same time that we update Module:Lang to use the new names. The module should continue to support the existing names for those wikis that don't support {{langx}}.
- —Trappist the monk (talk) 17:09, 18 November 2024 (UTC)
fn lang_xx_inherit parameter values removed
Trappist the monk recently did a major overhaul of Module:Lang in order to implement Template:Langx. (Thanks!) In the process, he removed |fn=lang_xx_inherit, |fn=lang_xx_italic, and |fn=lang from Module:Lang as "no longer required". However, this broke Template:Translated blockquote, which depended on this feature, and is used in mainspace articles.
Based on the old documentation, I believe this documents the equivalent replacements:
Old code | New code |
---|---|
{{Lang |
{{Lang |
{{Lang |
{{Langx |
{{Lang |
{{Langx |
Please correct me if the old and new code columns above are not exactly equivalent. I thought I would document this here in case any other template editors experienced similar errors from the removal of this functionality. I have yet to fix Template:Translated blockquote but plan on it in the next few days. Daask (talk) 21:03, 18 November 2024 (UTC)
- Restored in Module:Lang/sandbox. |fn=lang_xx_inherit and |fn=lang_xx_italic were created so that editors didn't have to create yet another {{lang-??}} template; |fn=lang just came along for the ride. With the advent of {{langx}} that generic use is no longer required.
- We don't check parameter use for the useful utilities: |fn=is_ietf_tag, |fn=is_lang_name, |fn=name_from_tag, and |fn=tag_from_name; name_from_tag shown here for completeness.
- Test the fix in {{Translated blockquote/sandbox}} by switching {{lang}} to {{lang/sandbox}}.
- —Trappist the monk (talk) 00:15, 19 November 2024 (UTC)
- @Trappist the monk: Template:Lang/sandbox, and Template:Translated blockquote/sandbox, which now uses it, work as expected. Do you intend to restore these features to Template:Lang? Daask (talk) 14:24, 19 November 2024 (UTC)
- Yeah, I think I have to. Some version of Module:Lang is used on ~160 MediaWiki sites. There may be sites that rely on |fn=.
- —Trappist the monk (talk) 15:15, 19 November 2024 (UTC)
Putting lang inside of langx?
I was curious whether there is any point in putting lang inside of langx. For examples, see any of these. These are all single nestings, but I have also seen cases with multiple {{lang}} inside of one {{langx}}. Frietjes (talk) 15:44, 19 November 2024 (UTC)
- None that I can think of unless the editor felt that the tool-tip was a requirement. Regardless, such constructs result in improper html and pointless category link duplication. For example:
{{langx|ain|{{lang|ain-Kana|アィヌ}}}}
→[[Ainu language|Ainu]]: <span lang="ain"><span title="Ainu (Japan)-language text"><span lang="ain-Kana">アィヌ</span></span>[[Category:Articles containing Ainu (Japan)-language text]]</span>[[Category:Articles containing Ainu (Japan)-language text]]
- the first category link (in English) is marked up as Ainu.
- The above was a conversion from:
{{lang-ain|アィヌ}}, {{transl|ain|Aynu}}
- to:
{{lang-ain|{{lang|ain-Kana|アィヌ}}, {{lang|ain-Latn|Aynu}}
- at this edit by SrpskiAnonimac.
- I can see no real useful reason why {{lang}}/{{langx}} should be nested. Don't do that.
- The fix for the above, as it currently exists in Ainu people § Names, is:
{{langx|ain-Kana|アィヌ}}
→ Ainu: アィヌ
- For others like this one from Roman province § Republican period where the two language tags are different:
{{langx|el|{{lang|grc|ἐπαρχίᾱ}}}}
- the fix is to use the language tag that directly wraps the text (no doubt there will be exceptions):
{{langx|grc|ἐπαρχίᾱ}}
→ Ancient Greek: ἐπαρχίᾱ
- —Trappist the monk (talk) 16:51, 19 November 2024 (UTC)
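For the simplest nestings discussed here, the suggested fix (keep the language tag that directly wraps the text) could be applied mechanically. The following is only a sketch in Python with a hypothetical helper name. It assumes a single {{lang}} with exactly two positional parameters nested directly inside a {{langx}} with no other parameters, and as noted above there will be exceptions that need a human decision.

```python
import re

# Matches only the simple case: one {{lang|tag|text}} nested directly inside
# {{langx|tag|...}} with no further parameters on either template.
NESTED = re.compile(
    r"\{\{langx\|[^|{}]+\|\{\{lang\|([^|{}]+)\|([^|{}]+)\}\}\}\}",
    re.IGNORECASE,
)


def flatten_nested_langx(wikitext):
    """Rewrite {{langx|a|{{lang|b|text}}}} as {{langx|b|text}}."""
    return NESTED.sub(r"{{langx|\1|\2}}", wikitext)
```

Anything more complicated (several {{lang}} templates inside one {{langx}}, or extra named parameters) is deliberately left untouched by this pattern.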
Errors in the template documentation
I am seeing what I believe are new errors in the template documentation. In the table headed "Langx |italic= parameter operation", I see many cells with output like "script= (help)". I suspect that an unescaped pipe in the error message output may be causing something unwanted to happen. – Jonesey95 (talk) 15:06, 20 November 2024 (UTC)
- Yep, fixed.
- —Trappist the monk (talk) 15:24, 20 November 2024 (UTC)
- Much better; thank you. – Jonesey95 (talk) 19:04, 20 November 2024 (UTC)
Non-latn text/Latn script subtag mismatch errors in ancient Iranian articles
Articles regarding ancient Iranian society like Mithra, Mantra (Zoroastrianism)#Etymology and Saoshyant#Etymology are showing this error recently, and I'm not sure how to fix them. —CX Zoom[he/him] (let's talk • {C•X}) 13:09, 26 November 2024 (UTC)
- Do you really mean to romanize Miθra and Miθraʰ with 'θ' (Greek small letter theta)? Do you really mean to romanize Astwat̰-әrәta and astvat-әrәta with 'ә' (Cyrillic small letter schwa)?
- Apparently there is no unicode for Latin theta so that may require some sort of modification to Module:Lang if, in fact, you did really mean to use the Greek theta character. There is a Latin small letter schwa: 'ə'. Wouldn't that be the correct choice when romanizing Astwat̰-әrәta and astvat-әrәta?
- —Trappist the monk (talk) 15:15, 26 November 2024 (UTC)
- Sorry, I don't know much about how romanization works, but I believe you are correct about the schwa symbol. For Latin theta, I think there needs to be an exception. Or maybe {{transliteration}} would fit better here? I saw it work fine in some other articles. —CX Zoom[he/him] (let's talk • {C•X}) 17:41, 26 November 2024 (UTC)
{{transliteration|ae|Miθra}} should emit an error message because Greek theta is not Latin theta and in the rendering, 'Miθra' is marked up as Latin text:
<span title="Avestan-language romanization"><i lang="ae-Latn">Miθra</i></span>
- Miθra
- For the same reason, were we using {{langx}}, there should be an error message:
{{langx|ae|𐬨𐬌𐬚𐬭𐬀|Miθra}}
[[Avestan language|Avestan]]: <span lang="ae" dir="rtl">𐬨𐬌𐬚𐬭𐬀</span>, <small>romanized: </small><span title="Avestan-language romanization"><i lang="ae-Latn">Miθra</i></span>
- Avestan: 𐬨𐬌𐬚𐬭𐬀, romanized: Miθra
- These need to be fixed.
- I think that I have a solution to the {{lang|ae-Latn|Miθra}} where 'θ' is the Greek form but I'll hold off on implementing that until I've fixed the missing transliteration error messaging.
- —Trappist the monk (talk) 19:21, 26 November 2024 (UTC)
- I have tweaked the sandbox so that when the Greek theta (U+03B8) is the only non-Latin character in a string of text, it is assumed to represent the non-existent (in Unicode) Latin theta. Here are a variety of illustrations:
- For {{lang}}:
{{Lang/sandbox|ae-Latn|Miθraʰ}}
→ Miθraʰ – assume Latin theta because Latn script specified and all other characters in <text> are Latin script
{{Lang/sandbox|ae-Cyrl|Miθraʰ}}
→ [Miθraʰ] Error: {{Lang}}: Latn text/non-Latn script subtag mismatch (help) – assume Latin theta because all other characters in <text> are Latin script; script/text mismatch: Cyrl script specified but <text> is Latin script
{{Lang/sandbox|ae|Miθraʰ}}
→ Miθraʰ – assume Latin theta because all other characters in <text> are Latin script
- When theta is the only character in <text>:
{{Lang/sandbox|ae-Latn|θ}}
→ θ – assume Latin theta because Latn script specified
{{Lang/sandbox|ae-Cyrl|θ}}
→ θ – assume Cyrillic theta because Cyrl script specified – Greek/Cyrillic Unicode mismatch not checked
{{Lang/sandbox|ae|θ}}
→ θ – assume Greek theta because script not specified
- For {{langx}}:
{{Langx/sandbox|ae-Latn|Miθraʰ}}
→ Avestan: Miθraʰ – assume Latin theta because Latn script specified and all other characters in <text> are Latin script
{{Langx/sandbox|ae-Cyrl|Miθraʰ}}
→ [Miθraʰ] Error: {{Langx}}: Latn text/non-Latn script subtag mismatch (help) – assume Latin theta because all other characters in <text> are Latin script; script/text mismatch: Cyrl script specified but <text> is Latin script
{{Langx/sandbox|ae|Miθraʰ}}
→ Avestan: Miθraʰ – assume Latin theta because all other characters in <text> are Latin script
- When theta is the only character in <text>:
{{Langx/sandbox|ae-Latn|θ}}
→ Avestan: θ – assume Latin theta because Latn script specified
{{Langx/sandbox|ae-Cyrl|θ}}
→ Avestan: θ – assume Cyrillic theta because Cyrl script specified – Greek/Cyrillic Unicode mismatch not checked
{{Langx/sandbox|ae|θ}}
→ Avestan: θ – assume Greek theta because script not specified
- For {{langx}} with <translit>:
- For {{transliteration}}:
{{transliteration/sandbox|ae|Miθra}}
→ Miθra – assume latin theta because <code> is a language tag
{{transliteration/sandbox|ae|θ}}
→ θ – assume latin theta because <code> is a language tag
{{transliteration/sandbox|latn|θ}}
→ θ – assume latin theta because <code> is a script tag
{{transliteration/sandbox|cyrl|θ}}
→ θ – assume latin theta because <code> is a script tag
{{transliteration/sandbox|ru|ш}}
→ [ш] Error: {{Transliteration}}: transliteration text not Latin script (pos 1) (help) – error because <translit> not latn script
{{transliteration/sandbox|cyrl|ш}}
→ [ш] Error: {{Transliteration}}: transliteration text not Latin script (pos 1) (help) – error because <translit> not latn script
- Without objection, I shall update the live module.
- —Trappist the monk (talk) 20:38, 27 November 2024 (UTC)
- Updated.
- —Trappist the monk (talk) 17:33, 28 November 2024 (UTC)
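The theta handling for {{lang}} and {{langx}} illustrated in this thread can be summarised in a short sketch. This is Python, not the module's Lua, and the function name is hypothetical. It assumes that "Latin" can be approximated by a Unicode character name beginning with LATIN, and that modifier letters such as ʰ can be ignored when looking at the other characters. What happens when the other characters are not Latin is not spelled out above, so the sketch simply leaves the theta as Greek in that case.

```python
import unicodedata

GREEK_THETA = "\u03b8"


def assumed_theta_script(text, script=None):
    """Return the script a Greek theta is assumed to stand for, or None if absent.

    Assumptions (not taken from Module:Lang): Latin-ness is approximated by the
    Unicode character name, and modifier letters (category Lm) are ignored.
    """
    if GREEK_THETA not in text:
        return None
    others = [
        ch for ch in text
        if ch != GREEK_THETA and ch.isalpha() and unicodedata.category(ch) != "Lm"
    ]
    if others:
        # Theta is the only non-Latin character: read it as the Latin theta
        # that Unicode does not encode separately.
        if all(unicodedata.name(ch, "").startswith("LATIN") for ch in others):
            return "Latn"
        return "Grek"  # assumption: mixed with non-Latin letters, theta stays Greek
    # Theta alone: the script subtag, when present, decides.
    return script.capitalize() if script else "Grek"
```

The separate subtag comparison (for example `ae-Cyrl` against text that is otherwise Latin) is a different step and is not part of this sketch.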
Category:Transliteration template errors $2
The article First Sino-Japanese War, in the sidebar box entitled "First Sino-Japanese War", contains a transliteration error and also appears to be assigning the nonexistent category Category:Transliteration template errors $2. I suspect that recent changes to this module or one of its subpages has caused this new, nonexistent category to appear. – Jonesey95 (talk) 18:45, 28 November 2024 (UTC)
- Fixed I think; the miscoding (on my part) also added articles to Category:Lang and lang-xx template errors $2. The article count in Category:Lang and lang-xx template errors was going down, which is an expected result of the change. On the other hand, Category:Transliteration template errors was not changing so I was beginning to wonder why. Now I know why.
- —Trappist the monk (talk) 19:35, 28 November 2024 (UTC)
- I figured it was a small typo like this. I don't go looking for these things, but I look at a lot of pages with errors in my travels, and I often stumble across new entries in error reports and categories that are caused by template and module changes. – Jonesey95 (talk) 21:03, 28 November 2024 (UTC)
Lone Common-script letter causes the non-Latin error
The lone {{transl|ar|ʾ}} (U+02BE ʾ MODIFIER LETTER RIGHT HALF RING) triggers the error as in DIN 31635. Looks like it works okay in longer words with other letters present, but not alone. – MwGamera (talk) 21:02, 28 November 2024 (UTC)
- To determine if <text> is Latin script, Module:Lang uses Module:Unicode data. U+02BE is:
{{#invoke:Unicode data|lookup|script|02BE}}
→ Zyyy
- For a Latn determination, the <text> must contain at least one Latn-script character and then may contain one or more characters from Zinh (Code for inherited script), Zyyy (Code for undetermined script), Zzzz (Code for uncoded script) scripts.
- Giving {{transliteration}} an okina (U+02BB), an apostrophe (U+0027) – or any other punctuation – will cause the same error message return:
{{transl|ar|ʻ}}
→ ʻ – okina: Zyyy ← {{#invoke:Unicode data|lookup|script|02BB}}
{{transl|ar|'}}
→ ['] Error: {{Transliteration}}: transliteration text not Latin script (pos 1) (help) – apostrophe: Zyyy ← {{#invoke:Unicode data|lookup|script|0027}}
- —Trappist the monk (talk) 23:41, 28 November 2024 (UTC)
- I mean, I can see that this is happening, that's why I mentioned it being of the Common script (aliased to the ISO code Zyyy here), but the result is clearly undesirable. Maybe the template wasn't meant to be used with single letters, but if the usage is appropriate (and it seems to be to me) then the check is incorrect. I'm sure it might help catching some mistakes, but the Script property of characters used and the language tag to mark it up with are conceptually related but different things. Since I'm not sure what exactly was intended, I'm just pointing out another place where the current solution falls short and needs someone's attention. – MwGamera (talk) 13:26, 29 November 2024 (UTC)
- This needs to be fixed ASAP, as editors are responding by just removing the template. Remsense ‥ 论 04:25, 1 December 2024 (UTC)
- Is there any value in placing single punctuation in a language tag? Do screen readers read these differently? Gonnym (talk) 09:35, 1 December 2024 (UTC)
- I have no idea what screen readers do with symbols like that, but it affects font choice and other styling. I would consider it desirable to have all transliterations (or transcriptions) consistently marked up the same way no matter if they are of just a single letter (or phoneme) or of a longer word. – MwGamera (talk) 14:31, 1 December 2024 (UTC)
- I agree. I wrote the original is_Latin function (back when Module:Unicode data wasn't restricted to template editors) and I think in view of the cases of lone modifier letters, Module:lang should use a different function that checks that there are no non-Latin characters (for instance, no Cyrillic or Greek characters), but permits Common and Inherited characters. That might not be sufficient as I think some Greek characters are used in orthography of Latin-script languages and have no Latin-script equivalents (I can look for specific cases if there is interest), but it's an improvement. Lone Common-script characters should have the correct markup, and I should have thought of these cases when I was creating the function. — Eru·tuon 05:40, 29 December 2024 (UTC)
- In fact, I believe that when I wrote the is_Latin function, it was only being used to decide whether to italicize foreign-language text (MOS:FOREIGN). I didn't intend it to decide whether {{transl}} should display an error. — Eru·tuon 00:36, 30 December 2024 (UTC)
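The difference between the two checks discussed in this thread (at least one Latn character required, versus merely no non-Latin characters) can be shown with a small Python sketch. The names are hypothetical and this is not the is_Latin function from Module:Unicode data. As an assumption, Latin is approximated by the Unicode character name, and anything that is not a letter, plus modifier letters, is treated as Common/Inherited.

```python
import unicodedata


def char_class(ch):
    """Classify a character as 'latin', 'other' or 'neutral' (approximation)."""
    if unicodedata.name(ch, "").startswith("LATIN"):
        return "latin"
    if ch.isalpha() and unicodedata.category(ch) != "Lm":
        return "other"  # a letter from some non-Latin script
    return "neutral"    # punctuation, marks, modifier letters, spaces, digits


def is_latin_strict(text):
    """At least one Latin character and no letters from other scripts."""
    classes = {char_class(ch) for ch in text}
    return "latin" in classes and "other" not in classes


def has_no_non_latin(text):
    """Only rejects text that contains letters from other scripts."""
    return all(char_class(ch) != "other" for ch in text)
```

A lone ʾ (U+02BE) or a lone apostrophe fails the strict check but passes the lenient one, while a Cyrillic letter fails both.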
lang sandbox edits
@Gonnym: Something about this edit broke Module:Lang/sandbox so that the testcases fail.
Also: maker_error_span() should be make_error_span()?
—Trappist the monk (talk) 15:41, 30 November 2024 (UTC)
- Fixed both. make_error_span() could probably be replaced with make_error_msg() which also handles the span. I just created it to have that code be in one place while it was there. Gonnym (talk) 22:37, 30 November 2024 (UTC)
Issue with use in links
Discussion at Wikipedia:Main Page/Errors#Friday's FA has identified an issue with (some browsers') display of the title attribute for code like:
''[[École Polytechnique massacre|{{Lang|fr|École Polytechnique|italic=no}} massacre]]''
where the displayed link contains text in more than one language (arguable in this case, but the point is general).
This could be remedied by allowing suppression of the title attribute, by writing, say:
''[[École Polytechnique massacre|{{Lang|fr|École Polytechnique|italic=no|title=no}} massacre]]''
or possibly better still by simply removing the title attribute completely.
Why do we need that attribute?
Can we apply one or other solution? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:26, 30 November 2024 (UTC)
Wrong font for lzh (Literary Chinese)
When using "lang|lzh" for Literary Chinese texts, it seems to be using a Taiwanese font?
For example, 有 is typically written as 月 which is also seen in historical texts such as in the Kangxi dictionary (inherited glyphs). But in the Taiwanese standard, they prefer to write it as ⺼ which is modern orthography (Traditional Chinese characters ≠ Literary Chinese characters). Another example would be 遣 where the radical ⻌ would be written as ⻍ according to the inherited glyphs, while the Taiwanese standard is ⻎. The template uses ⻎ instead of ⻍. How would one change it so that the template would use fonts (such as I.Ming) that are based on the inherited glyphs rather than the Taiwanese Traditional characters fonts (which are based on handwriting and their own standard)? Lachy70 (talk) 07:25, 4 December 2024 (UTC)
- This is only related to the fonts your system picks to render specific languages, and has nothing to do with Wikipedia. Remsense ‥ 论 08:04, 4 December 2024 (UTC)
- It doesn't use Taiwanese font for me. Unfortunately browsers allow configuring default fonts only for a handful of languages (if at all) and lzh isn't among them. And system configuration might be difficult. The easiest way is to add something like [lang]:lang(lzh){font-family:"I.Ming"} to your user style either in your browser, or just for when you're logged in to Wikipedia at common.css or global.css. The template documentation already covers that at § Applying styles. But if you think of something like overriding default fonts for everyone regardless of their system configuration, then this is something that definitely should not be done (MOS:FONTFAMILY). – MwGamera (talk) 05:51, 5 December 2024 (UTC)
Update to Module:Lang/sandbox
I've modified Module:Lang/sandbox to allow {{Wikt-lang}} to use the language html attribute logic instead of having to duplicate the entire code. Testcases at Module talk:Lang/testcases have all passed so nothing seems to have been broken. Let me know if you have any comments before I update. Gonnym (talk) 09:58, 16 December 2024 (UTC)
Transliteration whitelist
@Trappist the monk I don't think having a blanket whitelist of arbitrary non-Latin script characters makes sense, and especially not one which is as random as [ʻʼʾʿΔαβγδθσφχϑьᾱῑ῾上入去平]. This is totally unsustainable, since it will constantly need to be expanded (e.g. I can already see that ъ is missing, which crops up in various Slavicist transcriptions), and it also opens the door to false-negatives, because most of these will not be acceptable characters in the vast majority of languages. This seems like an artificially-imposed maintenance burden for increasingly little gain.
What I suggest is:
- Convert to form NFD before checking, which removes the need to have precomposed characters like ᾱῑ.
- Allow all common script characters.
- Allow any characters marked with Latn in the Unicode ScriptExtensions.txt file.
- Generate a warning message via mw.message instead of a big error message, as it's overkill.
- Create a maintenance category, and add all transcriptions containing non-Latin-script characters to it by default.
- Allow language-specific exceptions, specified in the data somewhere. These should only be added for really common cases.
- Implement an override, which can be specified using a parameter. This should be used in all other cases. Suggest it in the warning, too ("If this is correct, please...").
Theknightwho (talk) 16:34, 2 January 2025 (UTC)
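Some of the suggestions above (NFD first, allow common characters, language-specific exceptions) can be sketched in a few lines. This is a Python illustration with hypothetical names, not a proposal for the Lua code. It assumes Latin can be approximated by the Unicode character name, and the Script_Extensions lookup, the warning, the category, and the override parameter are all left out.

```python
import unicodedata


def offending_characters(text, allowed=frozenset()):
    """List letters that are neither Latin nor explicitly allowed.

    The text is converted to NFD first, so a precomposed character such as
    U+1FB1 is checked as its base letter plus a combining mark. `allowed` is a
    hypothetical per-language exception set.
    """
    result = []
    for ch in unicodedata.normalize("NFD", text):
        if not ch.isalpha():
            continue  # combining marks, punctuation, symbols, spaces, digits
        if unicodedata.category(ch) == "Lm":
            continue  # modifier letters are treated as common here (assumption)
        if unicodedata.name(ch, "").startswith("LATIN"):
            continue
        if ch in allowed:
            continue
        result.append(ch)
    return result
```

After NFD, an exception list only needs the base Greek alpha to cover the alpha-with-macron form. A string like the Japanese title mentioned in the reply below would still be flagged for its katakana character, which is where an explicit override would come in.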
- Yeah, I know, really crude. I did that for the avoidance of conflict.
- Hadn't thought about NFD and ScriptExtensions; I will.
- mw.message? Not sure how that would be used. My experience with mw.message() is limited to rendering error messages with $1, $2, etc replacements. Can it be used to render messages someplace other than directly in the rendered article? Or were you perhaps thinking of mw.addWarning()?
- Maintenance categories are problematic because quite often, {{transliteration}} is used in wikilinks and {{ill}} templates:
- Emitting a category wikilink inside another wikilink breaks the rendering.
- Yep, overrides are necessary because stuff like this: {{transl|ja|Ama Kakeru ミ☆ Jōshikōsei}}.
- —Trappist the monk (talk) 19:57, 2 January 2025 (UTC)
- I strongly agree with User:Theknightwho on the problems with the whitelist. I think the underlying problem with this breaking change stems from mixing two separate uses of the term 'Latn', without being clear about transliteration requirements.
- -Latn is the script portion of the IETF language tag, which is used to set the lang= attribute (RFC-4646), which affects the display style of the inline text containing element (among other things, as noted by Template:Lang#Rationale). It is important that a single transliterated string has a consistent display style across all its characters, and with other transliterations in the same document. It's a sensible requirement for a en-wiki transliteration template where 'romanization' is a near synonym to use a 'Latn' display style.
- Latn is also used in Unicode for the "predominant" [script value] of a single code-point: "if the predominant use of the character is in one script, but it is also used in others, then it takes the Script property value associated with that predominant use". This is a different (glyph level) classification, and doesn't directly relate to transliteration.
- It's hard to find a concrete example in the specs, so this could perhaps be explained better, but it is in fact completely reasonable to have a Greek theta character displayed side-by-side with Latn characters, all using the same Latn display style. This is what is required for Etruscan transliterations, and all the other non-Latn Unicode-script-class examples previously mentioned, including the "modifier" half circles used for Arabic, and the ъ mentioned above.
- The same string could be displayed using Greek display rules, but it would look wrong. It would also be wrong to use mixed styles in the same string. A 'Latin theta' is a semantically different symbol, which is why it has a different Unicode code point, and is also incorrect to substitute.
- The number of characters, or the Unicode script classification of any adjacent characters, are irrelevant for the display purposes if the transliteration is valid. Single character transliterations are totally valid. Ironically, the most obvious use is in transliteration tables.
- [ʻʼʾʿΔαβγδθσφχϑьᾱῑ῾上入去平] demonstrates that Unicode Script classification of individual glyphs is a different concern from a consistent transliteration display style. I have no idea what the CJK glyphs are doing there, I cannot verify any of it. It looks like nonsense. I know what the Greek symbols are, and don't even doubt those CJK are valid in some transliteration of something, but this partial list has no value AFAICT.
- The current IS-LATIN whitelist function is misnamed. It's more of a is-valid-transliteration-string/char, but as stated above is of little value, impossible to maintain, and additionally seems to be based on misunderstandings.
- Not only is it prone to false-positives, but every "true"-positive error it catches mis-characterises the problem. It's not that the string contains a non-"Unicode Script = Latn" character, rather the character possibly is not a valid transliteration symbol. At best, this is a heuristic for maintenance purposes, but even then it needs to be considerably smarter and have a better idea of what is and isn't valid transliteration. It is not appropriate for this to be raising error messages. Warnings at most, but it'd still be annoyingly noisy.
- Template:Transliteration/testcases are appallingly light and most of these basic transliteration cases that broke and seem to be a total surprise should be covered. Salpynx (talk) 21:08, 3 January 2025 (UTC)
- @Salpynx Just FYI, 上入去平 refer to the four tones of Middle Chinese, which are of fundamental importance in Chinese linguistics, so it's not that weird that they've come up. No modern variety has retained the Middle Chinese tone system (Mandarin having 4 tones is a coincidence - it's not a one-to-one conversion), and they're diaphonemic anyway (so IPA is out), so you sometimes see them given next to readings in a similar fashion to the tone numbers used in Wade-Giles or Jyutping. Theknightwho (talk) 02:16, 4 January 2025 (UTC)
How do I include a non-literal translation?
The langx template has a translation parameter, but it produces "lit. [text]". What should I do if I want to include a non-literal translation? TryKid [dubious – discuss] 18:21, 2 January 2025 (UTC)
- You don't have to use the translation parameter:
{{langx|es|casa}}, 'dwelling'
→ Spanish: casa, 'dwelling'
- Include punctuation and any descriptive text as you see fit. Of course, if you do sommat like that, some helpful editor is likely to come along and 'fix' your carefully crafted non-literal translation...
- —Trappist the monk (talk) 20:07, 2 January 2025 (UTC)
- I see. It's strange that the parameter automatically defaults to "literal translation". I think most of the useful translations included on Wikipedia aren't literal, but are cited to sources which make thoughtful decisions on how to translate something (e.g. Haravijaya, the reason I asked this question). Having a "lit." parameter seems like a magnet inviting original research from editors to translate something themselves.
- Any chance of changing this to something more sensible, maybe two separate lit and translation parameters? regards, TryKid [dubious – discuss] 21:23, 2 January 2025 (UTC)
- That is exactly what we do on Wiktionary, so I agree that it's a good idea. The difference is especially relevant if you're dealing with idioms: e.g. Greek ξεβράκωτος στ' αγγούρια (xevrákotos st’ angoúria, "caught with one's pants down; unprepared", lit. "pantsless among the cucumbers"). Theknightwho (talk) 02:22, 4 January 2025 (UTC)