Re: Allowed characters in GAV and how/where to sanitize?


Hervé BOUTEMY
Why do you say that Maven can't handle 4 part versions?

I fixed MNG-3010 nearly 10 years ago, for Maven 3.0.0-alpha-1.

That fix removed a versioning limitation; the enhancement is documented at
https://cwiki.apache.org/confluence/display/MAVENOLD/Versioning

Since this fix, Maven supports an arbitrary number of version parts
(and Resolver's GenericVersionScheme is a slight evolution of Maven Artifact's
ComparableVersion).
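
To illustrate what "an arbitrary number of parts" means in practice, here is a
minimal Python sketch. It is not ComparableVersion or GenericVersionScheme: it
only handles purely numeric, dot-separated versions and pads the shorter one
with zeros. The function name compare_numeric_versions is made up for this
example.

```python
from itertools import zip_longest


def compare_numeric_versions(a: str, b: str) -> int:
    """Return -1, 0 or 1 for purely numeric, dot-separated versions.

    Simplified illustration only: qualifiers such as "-SNAPSHOT" or
    "-beta-1" are not handled here.
    """
    parts_a = [int(p) for p in a.split(".")]
    parts_b = [int(p) for p in b.split(".")]
    # Pad the shorter list with zeros so any number of parts can be compared.
    for x, y in zip_longest(parts_a, parts_b, fillvalue=0):
        if x != y:
            return -1 if x < y else 1
    return 0
```

With this, compare_numeric_versions("1.2.3.4", "1.2.3") reports the 4-part
version as the newer one, with no special casing for the number of parts.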

Regards,

Hervé

On Wednesday, 10 January 2018, 15:07:08 CET, Andreas Sewe wrote:

> Fred Cooke wrote:
> > Re versions, I know the background on it, but it annoys me that maven can't
> > handle 4 part versions, 1.2.3.4 as sometimes it's handy to do a patch level
> > that deep. Lots of messed up software in the world :-)
>
> Are you sure that's still the case (the parts-restriction, not the
> messiness of software ;-)?
>
> At least the Maven Resolver uses a versioning scheme that's quite
> flexible [1]. Not sure if the flexibility at this low level bubbles up
> all the way to the top, though. Maybe one of the Maven developers can chime in.
>
> > Format should be N[.N as many times as needed][optional hyphen and
> > qualifier of some sort] or something like that. Not hard limited to 1 2 or
> > 3 parts.
>
> AFAICT, that's what GenericVersionScheme does.
>
> Hope this helps,
>
> Andreas
>
> [1]
> <https://github.com/apache/maven-resolver/blob/3fc53c052f538169cb7dc6aa9ed9052514b569ca/maven-resolver-util/src/main/java/org/eclipse/aether/util/version/GenericVersionScheme.java#L31>
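
The format Fred describes above, N[.N as many times as needed][optional hyphen
and qualifier], can be written down as a regular expression. The Python sketch
below encodes only that informal description; it is not a statement of what
GenericVersionScheme actually accepts, and the qualifier alphabet and the name
matches_format are assumptions made for the example.

```python
import re

# N, then ".N" any number of times, then an optional "-qualifier".
# The qualifier alphabet (letters, digits, ".", "-", "_") is an assumption.
VERSION_FORMAT = re.compile(r"\d+(?:\.\d+)*(?:-[A-Za-z0-9._-]+)?")


def matches_format(version: str) -> bool:
    """True if the whole string follows the N[.N...][-qualifier] shape."""
    return VERSION_FORMAT.fullmatch(version) is not None
```

Under these assumptions "1.2.3.4" and "1.0-SNAPSHOT" match, while "1..2" and
"1.2-" do not.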



---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: Allowed characters in GAV and how/where to sanitize?

Hervé BOUTEMY
On Wednesday, 10 January 2018, 10:21:35 CET, Andreas Sewe wrote:
> Hervé BOUTEMY wrote:
> > notice that Central contains artifacts produced by Maven but also by other
> > tools: I did some analysis myself and found strange things also that are
> > clearly not produced by Maven. Scala for example produces some artifacts
> > that I doubt could be referenced by Maven.
>
> Yes, Maven is not the only tool that can deploy artifacts to Central --
> and that's a good thing.
+1

>
> > Then: what do we call "broken"?
> > Something that seems "clearly" related to a typo?
> > Something that can't be consumed by Maven?
> > Something that people who produced the release (with any tooling) won't
> > consume for syntactic reasons on the result? Something that they won't
> > consume for other reasons? (like for example because it's continuous
> > deployment and it's the 4th version of the day)
>
> I wouldn't go so far as to treat version=1.6.2.1 as an illegal version
I was not talking about a version with 4 parts: Maven 3 supports an arbitrary
number of parts.
I was talking about an artifact that is released 4 times per day, because it's
continuous delivery (I suppose): IMHO, the vast majority of such releases are
never used.

> (after all, I can imagine someone legitimately using a qualifier
> scheme like 1.2.3-os=linux), but there are IMHO two cases which I always
> consider broken:
>
> - Spaces in any of the components of a GAV
> - A colon in any of the components of a GAV
+1

>
> Spaces are just likely to cause trouble for some tool further down the line.
>
> And for colons we know that they will cause trouble, being the default
> separator for GAVs when written as a single string.
>
> Aside from those characters, I would probably just ban a few characters
> (non-printable control characters). A bit similar to what XML did with
> its NCName (non-colon name) production [1].
+1

>
> However, for groupId and artifactId we already have much stricter rules
> (A-Z, a-z, 0-9, ., -, _), so the argument can be made that
> versions/classifiers/extensions should also be made up of a more limited
> character set as well.
+1
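
The rules discussed so far (no spaces, no colons, no control characters
anywhere, plus the stricter character set for groupId and artifactId) could be
checked roughly like this. It is a Python sketch of the proposal in this
thread, not of any check Maven or the Resolver performs today; gav_problems and
its messages are invented for illustration.

```python
import re

# Strict set quoted above for groupId and artifactId.
STRICT = re.compile(r"[A-Za-z0-9._-]+")


def gav_problems(group_id: str, artifact_id: str, version: str) -> list:
    """Collect human-readable problems; an empty list means no complaint."""
    problems = []
    for name, value in (("groupId", group_id),
                        ("artifactId", artifact_id),
                        ("version", version)):
        if any(ch.isspace() for ch in value):
            problems.append(f"{name} contains whitespace")
        if ":" in value:
            problems.append(f"{name} contains a colon")
        if any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in value):
            problems.append(f"{name} contains a control character")
    # Only groupId and artifactId get the strict character set here;
    # whether versions should too is exactly the open question.
    for name, value in (("groupId", group_id), ("artifactId", artifact_id)):
        if not STRICT.fullmatch(value):
            problems.append(f"{name} uses characters outside A-Z a-z 0-9 . - _")
    return problems
```

Note that a version like 1.2.3-os=linux passes this check, since the strict
set is not applied to versions in the sketch.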

>
> In particular, care should be taken that the path component can still
> be parsed unambiguously, so allowing '.' in a classifier is probably
> asking for trouble.
+1 again
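
A small Python sketch of why '.' in a classifier is ambiguous, assuming the
usual file name shape artifactId-version[-classifier].extension. The helper
artifact_filename and the coordinates are made up for the example.

```python
def artifact_filename(artifact_id, version, classifier, extension):
    """Build the file name; the classifier may be empty."""
    name = f"{artifact_id}-{version}"
    if classifier:
        name += f"-{classifier}"
    return f"{name}.{extension}"


# Two different (classifier, extension) pairs...
first = artifact_filename("demo", "1.0", "sources", "tar.gz")
second = artifact_filename("demo", "1.0", "sources.tar", "gz")

# ...produce the same file name, so it cannot be parsed back unambiguously.
print(first == second)
```

Both calls yield demo-1.0-sources.tar.gz, so a tool reading the file name
cannot tell which classifier was meant.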

>
> > And what can we do?
> > For past artifacts, removing anything is not really an option: IMHO,
> > the issue is worth neither the effort nor breaking our base rule about
> > immutability.
> > For the future, perhaps we can do something:
> > - at Maven level, sure we can and we should improve controls as much as
> > possible
>
> Yes, if only because at this level we can provide the best error messages,
> as the error is recognized closest to the user.
>
> > - on other build tools: perhaps we should try not only to implement checks
> > in Maven but also document rules for other tools to implement same rules
>
> The Maven Resolver is a great place to enforce some rules in
> DefaultArtifact (or whatever replaces it). Granted, not everyone deploys
> using the Maven Resolver, but it's *the* place that knows about all the
> intricacies of the repository layout already.
>
> > - on repo managers used by the publishers: same rules documentation
> > prerequisite, but other tools target
>
> Well, Nexus already has some checks in place, to avoid versions like
> "1/../../other-artifact/2". However, groupIds like "org...example" are
> still accepted (deployed under org/example).
probably ".." should be forbidden also
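
As a rough Python sketch of such a check: the groupId is split on dots to form
the directory path, and empty segments (which is what "org...example"
produces) are rejected, as are versions containing path separators or "..".
This only illustrates the idea; it is not what Nexus or the Resolver
implement, and both function names are hypothetical.

```python
def group_id_to_path(group_id: str) -> str:
    """Map a groupId to its directory, rejecting suspicious segments."""
    segments = group_id.split(".")
    for segment in segments:
        # An empty segment comes from a leading, trailing or doubled dot.
        if segment == "":
            raise ValueError(f"empty segment in groupId {group_id!r}")
        if "/" in segment or "\\" in segment:
            raise ValueError(f"path separator in groupId {group_id!r}")
    return "/".join(segments)


def check_version(version: str) -> str:
    """Reject versions that could escape their own directory."""
    if "/" in version or "\\" in version or ".." in version:
        raise ValueError(f"unsafe version {version!r}")
    return version
```

Here "org.example" maps to org/example, while "org...example" and
"1/../../other-artifact/2" both raise ValueError instead of being silently
normalized.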

>
> > - on sync to central: this is the only location where some rules can be
> > checked for absolutely any new artifact then really interesting at a first
> > glance. But making rules evolve at this level is really hard since there
> > is no real feedback process I know of when base Central publication rules
> > are not met. Base Central publication rules were defined from the
> > beginning (signature, ...), then are implemented by publishers' repo
> > managers. I suppose failed controls done by sync to central (then sync
> > blocked) are rare: I'm not sure there is a strong process/tooling. And
> > adding it would cost some management: not easy. IMHO, we should start by
> > first detecting if there are really issues on new artifacts these days
> > before trying to take actions at this level.
>
> That being said, I think the first step is to document the syntax for
> GAVs somewhere (e.g., at [2] or [3]).
+1
there is an edit button near the title to find the source and propose a PR :)

Regards,

Hervé

>
> Best wishes,
>
> Andreas
>
> [1] <https://www.w3.org/TR/REC-xml-names/#NT-NCName>
> [2] <http://maven.apache.org/pom.html#Maven_Coordinates>
> [3] <http://maven.apache.org/ref/3.5.2/maven-model/maven.html>


