I'm investigating some non-Apache code that claims to work around
encoding bugs in many pom.xml files on Maven Central by rewriting the
POMs to use Latin-1. The claim is that many pom.xml files on Maven
Central are actually encoded as Latin-1 (ISO-8859-1) rather than UTF-8,
but their XML declarations either omit the encoding or state the wrong one.
How likely is this? Is anyone aware of such malformed pom.xml files?
Are there any checks in place that would prevent such a pom.xml file
from being published?
I need to figure out whether I should focus my attention on looking
for bad pom.xml files or looking for bugs in the XML parsing code. :-)
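
For anyone who wants to check a local copy of some POMs, here is a rough Python sketch of the test I have in mind. It flags a file as suspect when its XML declaration says UTF-8 (or names no encoding, which XML treats as UTF-8 when there is no UTF-16 byte order mark) but the bytes do not decode as UTF-8. The function names `declared_encoding` and `classify_pom` are my own, not from Maven or any library. The sketch assumes the files are not UTF-16, and it cannot tell Latin-1 from UTF-8 in a pure-ASCII file, where the two encodings are byte-identical.

```python
import re
from typing import Optional

# Encoding pseudo-attribute of an XML declaration, e.g.
# <?xml version="1.0" encoding="ISO-8859-1"?>
_XML_DECL = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(data: bytes) -> Optional[str]:
    """Return the lowercased encoding named in the XML declaration, or None."""
    if data.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 byte order mark
        data = data[3:]
    m = _XML_DECL.match(data)  # the declaration must be at the very start
    return m.group(1).decode("ascii").lower() if m else None


def classify_pom(data: bytes) -> str:
    """Classify raw POM bytes as 'ok', 'suspect', or 'declares-other'."""
    enc = declared_encoding(data)
    if enc not in (None, "utf-8", "utf8"):
        # The file names a non-UTF-8 encoding; a parser should honor it.
        return "declares-other"
    try:
        data.decode("utf-8")  # strict decoding: raises on invalid bytes
    except UnicodeDecodeError:
        # Claims (or defaults to) UTF-8 but is not valid UTF-8.
        return "suspect"
    return "ok"
```

To use it, read each file in binary mode, for example `classify_pom(pathlib.Path("pom.xml").read_bytes())`, and look at whatever comes back as "suspect". If nothing does, that would point toward the XML parsing code rather than the POMs.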