#1938892: Switch from ISO-3166-1 country data to CLDR unicode data led to the addition of some territories and a region, none of which represent countries. Specifically, the addition of:

  • Diego Garcia
  • Ceuta and Melilla
  • Canary Islands
  • Saint Martin
  • Outlying Oceania

was rightly disputed in #1938892-43: Switch from ISO-3166-1 country data to CLDR unicode data.

So most urgently, we have to decide what to do with these specific territories, as these constitute a regression vs. D7 and therefore a critical bug.

Apart from that, we also have to define the criteria that decide whether a territory is included in the list. And we need to make sure these inclusion criteria are enforced automatically whenever CLDR adds another territory to its data.
This part is less critical, so it might be spun off to a followup, or not.

Comments

Pancho’s picture

Issue summary:View changes

followup or not

Pancho’s picture

After looking into the data and doing some more research, I can confirm that the non-country territories included in CLDR are not a random choice. CLDR had good reasons to include them, because each of them has a special status in some area of use that requires it to be covered.

Rather the error is on our side. While CLDR gives us great locale data for every relevant territory, we can't take for granted that each of these territories is a country and therefore should be included in a general-use country list.
While we're already stripping 'EU' (European Union) from the list, we have to do the same for a number of territories that don't represent countries.

Easy cases:

Let's take a look one by one at the territories disputed by Alan and some more disputable territories:

+      'AC' => t('Ascension Island'),
+      'TA' => t('Tristan da Cunha'),

ISO 3166-1 alpha-2 codes, exceptionally reserved "on request of UPU for stamp issuing area"

+      'CP' => t('Clipperton Island'),
+      'DG' => t('Diego Garcia'),

ISO 3166-1 alpha-2 codes, exceptionally reserved "on request of ITU for location of certain telecommunications installations"

+      'EA' => t('Ceuta and Melilla'),
+      'IC' => t('Canary Islands'),

ISO 3166-1 alpha-2 codes, exceptionally reserved "on request of WCO for area not covered by European Union Customs arrangements"

'EU'

ISO 3166-1 alpha-2 code, exceptionally reserved "for any application needing to represent the name European Union".
This wasn't disputed because we already figured out that this needs to be stripped.

+      'QO' => t('Outlying Oceania'),

Unofficial, ISO 3166-1 alpha-2 private-space code.
Wikipedia says: "The Unicode Common Locale Data Repository assigns QO to represent Outlying Oceania (a multi-territory region containing Antarctica, Bouvet Island, the Cocos (Keeling) Islands, Christmas Island, South Georgia and the South Sandwich Islands, Heard Island and McDonald Islands, the British Indian Ocean Territory, the French Southern Territories, and the United States Minor Outlying Islands),"
I couldn't find any explanation of why CLDR assigned this region a two-letter code, but it seems to cover mostly or completely uninhabited outlying islands, so implementations can use it instead of the individual territory codes in order to shorten the list of countries.

These are clear cases: while they are valid and meaningful records in certain circumstances, they are not countries by any means and therefore don't belong in our country list.

Possible Solution:

To get rid of them, we want to look up the ISO country list:
http://www.iso.org/iso/home/standards/country_codes/country_names_and_co...
in order to define which countries should be included.
And then load the respective country names from CLDR.
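The approach described above could be sketched roughly as follows. This is only an illustration in Python, not the actual patch: the two sample data sets and the function name `filter_countries` are hypothetical, and a real implementation would load the full ISO code list and the CLDR names from their respective sources.

```python
# Hypothetical excerpt of CLDR territory names, keyed by two-letter code.
cldr_territories = {
    "AC": "Ascension Island",
    "DE": "Germany",
    "DG": "Diego Garcia",
    "EA": "Ceuta and Melilla",
    "FR": "France",
    "IC": "Canary Islands",
    "QO": "Outlying Oceania",
    "XK": "Kosovo",
}

# Hypothetical excerpt of officially assigned ISO 3166-1 alpha-2 codes.
iso_assigned = {"DE", "FR"}


def filter_countries(cldr_names, iso_codes):
    """Keep only CLDR entries whose code is officially assigned by ISO."""
    return {
        code: name
        for code, name in sorted(cldr_names.items())
        if code in iso_codes
    }


country_list = filter_countries(cldr_territories, iso_assigned)
print(country_list)
```

With this rule, ISO decides which codes are included and CLDR only supplies the names. Note that 'XK' is dropped along with the reserved codes.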

Case of Kosovo:

Life would be so easy if it weren't for the case of Kosovo:

+      'XK' => t('Kosovo'),

Unofficial, ISO 3166-1 alpha-2 private-space code.
Wikipedia says: "The code XK is being used by the European Commission,[20] Switzerland,[21] the Deutsche Bundesbank,[22] and other organizations as a temporary country code for Kosovo."

Now, Kosovo is recognized by more than 100 UN member states, which is a lot more recognition than many countries on the ISO list have. Still, it is unclear when Kosovo will be assigned an official country code, and which one it might be.
So in the interim, using XK seems a good choice, and many international organizations, countries etc. temporarily use 'XK'.

But should we throw out Kosovo again, until it has received an official ISO-3166-1 code?
That's the real question here, and it requires some more consideration on our general inclusion rule.

Damien Tournoud’s picture

The other option is to not filter anything and import the whole list. It's up to the user to decide what the list is used for.

For example:

  • for postal address applications, you probably would want the AC/TA codes.
  • if you run a website locating telecommunication installations, you probably want CP/DG

etc.

What harm does it do to have those additional entries in the list?

Alan D.’s picture

@Pancho
Kosovo is going to be the Palestine of the next few years, as it is not recognised by Russia, just as Palestine was not recognised by the USA, hence the name change only this year. Kosovo should be added in the near future, as it is a member of both the IMF and the World Bank (as per the ISO definitions, they need to be used there before inclusion). I'd say skip it to avoid the update pain.

@Damien Tournoud
Guess that is just a user experience thing, like what would you think if Alaska was presented in the list or if Tibet was? These are either baffling or potentially very politically charged.

Pancho’s picture

@Damien:
Oh, I'm absolutely fine with leaving the first 6 entries in the list, and also re-add 'EU'.
All I want to avoid is that these 7 entries appear in our generic "country list" as it is used in the installer. Also, we should avoid wrong implementations of the list in contrib.

So even if we want to go that route, we still need to figure out on which basis we filter them. And we need to do that generically, because potentially more of them could be added. So either we maintain a second - codes only - list taken directly from the ISO data, or we add some status metadata to the records. The latter seems better.
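A rough sketch of the status-metadata variant, written in Python for illustration only. The `Territory` record, the status constants and `list_territories` are hypothetical names, not part of any existing API; the statuses assigned to the sample codes follow the classification given earlier in this thread.

```python
from typing import NamedTuple


class Territory(NamedTuple):
    name: str
    status: str


# Hypothetical status values for the metadata attached to each record.
OFFICIAL = "official"               # officially assigned ISO 3166-1 code
RESERVED = "exceptionally-reserved"  # e.g. reserved on request of UPU/ITU/WCO
PRIVATE = "private-use"             # unofficial, private-space code

TERRITORIES = {
    "AC": Territory("Ascension Island", RESERVED),
    "DE": Territory("Germany", OFFICIAL),
    "DG": Territory("Diego Garcia", RESERVED),
    "EU": Territory("European Union", RESERVED),
    "QO": Territory("Outlying Oceania", PRIVATE),
    "XK": Territory("Kosovo", PRIVATE),
}


def list_territories(records, allowed=frozenset({OFFICIAL})):
    """Return code => name for records whose status is in `allowed`.

    Anything with a status outside `allowed` is left out, so a newly
    added territory never enters the generic list by accident.
    """
    return {
        code: territory.name
        for code, territory in records.items()
        if territory.status in allowed
    }


generic_country_list = list_territories(TERRITORIES)
postal_list = list_territories(TERRITORIES, {OFFICIAL, RESERVED})
```

The full data stays available for use cases like the ones Damien mentions, while the installer would only ask for the default, official-only list.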

@Alan D.:
The update pain is not the problem here. We need a proper, generic update path anyway, and a simple code switch would be the least of our problems.
No, IMHO the real point is: we know that Kosovo is a country, more than the Palestinian Territories have ever been. And everybody else, including Serbia and Russia, knows it just as well. And we can be quite sure that Kosovo will eventually get a country code.
None of this means I'm for or against Kosovo being a country; it's just about the facts IMHO.

So does applying common sense mean we should deviate from the official ISO list in this case, and make Kosovo a first-class entry in the country list ahead of ISO?
Would it prove that we're not as bureaucratic and politically blocked as ISO is? Or would it lead us on a slippery slope, opening a can of worms in terms of political disagreement?

I'm not sure about this. What I am sure of is that for Drupal users in Kosovo it will continue to be a problem if their country doesn't appear on the list, and that they would notice and appreciate its inclusion at once. For people in Serbia it might only be a political nuisance, if they happen to come across the Kosovo entry.

Alan D.’s picture

Btw, I'm arguing against the points that I happened to uphold before doing the research for the Countries module. I simply have had to let go of the western point of view. Drupal is fundamentally a US product so a western point of view here is not a bad thing! The default language is en-US after all.

"we know that Kosovo is a country"

This is the political hot potato when it comes to country recognition :)

As Westerners watching TV, then yes.

As governmental authorities of 2 of the 5 permanent seats on the UN security council, then no. Neither Russia nor China have recognized Kosovo. Then it is about 50 / 50 after that.

And as India hasn't either, I think I can safely say that the governmental bodies of over 50% of the world's population have not recognized the independence of Kosovo yet.

https://en.wikipedia.org/wiki/International_recognition_of_Kosovo

However, it is actually membership within the UN bodies that drives the ISO spec, and Kosovo is listed by the IMF and the World Bank. Behind-closed-doors diplomacy with China and/or Russia must be holding this up. South Sudan (SS) only took about a month to get listed.

"more than the Palestinian Territories have ever been"

Be careful not to stand on anyone's toes. People die in this and other sovereignty debates. I keep things neutral in this one by simply stating the recognition status of both sides.

"we add some status metadata to the records"

There is a 3-year-old issue that started this, for simple name changes. Those got the blessing to go ahead post string freeze, and nothing has made it into Drupal 7 yet. Will this be any different? An automated process would be best if possible.

Anyways, I'm making too much noise in these threads, repeating the same messages. I must unsubscribe and put my energy into the Countries modules so that everyone can be happy when D8 comes out. ;)

Peace

bojanz’s picture

I should note that Serbia has started normalizing relations with Kosovo, so it's a matter of time before they are fully recognized (might take years though).
I am in favor of adding Kosovo to the list, though I am unsure if the country-code-rename could cause problems afterwards (hopefully they'll just get the code before D8 is released, which will solve it).

catch’s picture

Category:bug» task
Priority:Critical» Normal

Sorry, I don't think this is critical at all. If I have a (personal) preference, it'd be for more regions in the list rather than fewer - if a specific site really wants to constrain the list for whatever reason, nothing stops them.

Most site owners won't pay attention to the specifics of what's in and isn't in the list, most visitors to a site will be looking for whatever region they're in rather than scouring it for regions they disagree with.

There's no functional bug here, and the other issue ensured this is the 'best possible' standardized list we know about.

The fact that the list might be updated after release, and that country codes might change, is a valid problem, but that's one we can tackle when the list updates, I think. Leaving open for that.

Alan D.’s picture

Issue summary:View changes

linkfix