Proper way to build a Maven repository without Internet access

Sean Horan
Hi all,

I am tasked with ensuring that the Maven build process of a large
government/enterprise-class system does not reach out to the Internet.  Our
Jenkins server's local maven repository has 10,000 POMs.  There are many
individual builds that are specific to our product and what we customize
for government clients.

I have a lot of devops experience but practically no experience with Maven
and Java beyond struggling to set this up.

We are using Artifactory and I'm not sure whether a generic or
Maven-specific repository is suitable for this project.

As I'm trying to understand it, I am using curl in a find/curl loop adapted
from
https://github.com/jfrog/project-examples/blob/master/bash-example/deploy-folder-by-checksum.sh

to traverse the ~/.m2/repository on our existing Jenkins server and HTTP
PUT it over to Artifactory.  This script would be hardened and sent to
internal customers to sync as part of the development process.

The problem I am seeing is that the build process is looking for
maven-metadata.xml, which does not exist on our server.  We do have
-companyname and -central variants of the XML file (e.g., for the
maven-source-plugin) that are slightly different.

I have the sense that my approach to this is off and I'm in over my head so
I could use some help.

Any pointers in the right direction would be more than welcome.

We are using Maven 3.3.9 and JDK 8 on CentOS 7 and cannot upgrade at this
time.

Sean Horan

Re: Proper way to build a Maven repository without Internet access

Anders Hammar
Depending on what you mean by "not reach out to the Internet", you could
configure Maven to point all requests at your repository manager (e.g.
Artifactory). The repo manager would then access pre-defined Internet
repositories and cache them. Read up on how repo managers like Artifactory or
Sonatype Nexus work for more details.

If you don't want any system in your network to access the Internet, then you
have a bigger problem, as you somehow need to get hold of the external
artifacts that I'm sure you're using as dependencies. One possible solution
in that case could be to contact Sonatype, as I believe they have a
commercial service where you can get an offline copy of the Central
Repository that you can then deploy internally.

My suggestion is to get someone that knows how Maven works involved.

/Anders

On Fri, Nov 8, 2019 at 12:22 AM Sean Horan <[hidden email]> wrote:

Re: Proper way to build a Maven repository without Internet access

Robert Scholte-8
In reply to this post by Sean Horan
If you don't have a repository manager in place to serve your developers (that is, one bound to the Internet), then I consider that a huge risk. Developers will find other ways to get their libraries, which means you can no longer know whether they came from a valid and secure location.
A repository manager should be the single point of trust. Only this way can you control all incoming (and optionally outgoing) traffic.

Robert

Re: Proper way to build a Maven repository without Internet access

Sean Horan
The developers can use whatever repos on the Internet they feel are
appropriate, but it is up to the team lead to sync that up to the build
server, which does not have access to the public Internet.

Syncing the local repository that exists now on the build server with
Artifactory is where I'm blocked because maven is looking for
maven-metadata.xml files that I don't have and I don't understand the
difference between maven-metadata.xml, maven-metadata-central.xml, and
maven-metadata-mycompanyname.xml that are in the folder.
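
For context: in a local repository, Maven keeps one copy of the metadata per remote repository it consulted, named maven-metadata-<repository id>.xml, whereas a remote repository serves the plain maven-metadata.xml. A minimal sketch for listing the directories that only have the per-repository copies follows; the function name missing_metadata_dirs is made up for illustration.

```shell
#!/usr/bin/env bash
# List directories under a local repository that contain
# maven-metadata-<repo id>.xml files but no plain maven-metadata.xml.
# missing_metadata_dirs is a hypothetical helper name, not a Maven tool.
missing_metadata_dirs() {
  local repo_dir="${1:-.}"
  find "$repo_dir" -type f -name 'maven-metadata-*.xml' -print |
    while IFS= read -r f; do
      dir=$(dirname "$f")
      # Report the directory only when the plain file is absent
      [ -e "$dir/maven-metadata.xml" ] || printf '%s\n' "$dir"
    done | sort -u
}
```

Running `missing_metadata_dirs ~/.m2/repository` prints one directory per line, which gives a rough idea of how many metadata files a repository manager would have to provide.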

Sean

On Sun, Nov 10, 2019 at 11:31 PM Robert Scholte <[hidden email]>
wrote:

> If you don't have a repository manager in place to serve your developers
> (that is, bound to the internet), then I consider that as a huge risk.
> Developers will find other ways to get their libraries, which means you
> can't know anymore if they came from a valid and secure location.
> A repository manager should be the only place that acts as the single
> point of trust. Only this way you can control all incoming (and optionally
> outgoing) traffic.
>
> Robert
> On 8-11-2019 00:22:28, Sean Horan <[hidden email]> wrote:
> Hi all,
>
> I am tasked with ensuring that the Maven build process of a large
> government/enterprise-class system does not reach out to the Internet. Our
> Jenkins server's local maven repository has 10,000 POMs. There are many
> individual builds that are specific to our product and what we customize
> for government clients.
>
> I have a lot of devops experience but practically no experience with Maven
> and Java beyond struggling to set this up.
>
> We are using Artifactory and I'm not sure whether a generic or
> Maven-specific repository is suitable for this project.
>
> As I'm trying to understand it, I am using curl in a find/curl loop adapted
> from
>
> https://github.com/jfrog/project-examples/blob/master/bash-example/deploy-folder-by-checksum.sh
>
> to traverse the ~/.m2/repository on our existing Jenkins server and HTTP
> PUT it over to Artifactory. This script would be hardened and sent to
> internal customers to sync as part of the development process.
>
> The problem I am seeing is that the build process is looking for
> maven-metadata.xml which does not exist on our server. We do have
> -companyname and -central XML files for eg, the maven-source-plugin that
> are slightly different.
>
> I have the sense that my approach to this is off and I'm in over my head so
> I could use some help.
>
> Any pointers in the right direction would be more than welcome.
>
> We are using Maven 3.3.9 and JDK8 on Centos 7 and cannot upgrade at this
> time.
>
> Sean Horan
>
Reply | Threaded
Open this post in threaded view
|

Re: Proper way to build a Maven repository without Internet access

Jason Young
Sean, last time I checked, Maven does not handle concurrent Maven processes
using the same local repository--the kind that the Maven client sets up.

What does work that I know of is using a company-wide repo like Nexus or
Artifactory. IME, issues with maven-metadata.xml, etc. do not come up in
this configuration--I don't ever think about it myself. Each client will
still have its own local repository, but will go to Nexus or Artifactory
for their artifacts. Optionally, you can let Nexus or Artifactory fetch
(once) all the third-party artifacts you need from Maven Central. With your
CI server and all devs using the organization's central artifact repo,
everyone is using the same artifacts as each other all the time.

More specifically, `group:project:1.2.3` is conventionally just one
artifact, not possibly a couple of different artifacts, as enforced by
artifact repos like Nexus, Artifactory, and Maven Central. Once someone
uploads a non-SNAPSHOT artifact to your company's artifact repository, it
never changes (assuming you configure the repository according to
convention). Certain things you can add on to the version like `-SNAPSHOT`
indicate mutability.
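
To illustrate the convention described above: a version ending in -SNAPSHOT is treated as mutable, anything else as a fixed release. A tiny sketch; is_snapshot_version is a made-up helper name, not a Maven command.

```shell
#!/usr/bin/env bash
# Return success (0) when the given version string follows the
# -SNAPSHOT naming convention, failure (1) otherwise.
# is_snapshot_version is a hypothetical helper for illustration only.
is_snapshot_version() {
  case "$1" in
    *-SNAPSHOT) return 0 ;;
    *)          return 1 ;;
  esac
}
```

For example, `is_snapshot_version 1.2.3-SNAPSHOT` succeeds while `is_snapshot_version 1.2.3` fails, so a sync script could use it to treat the two kinds of artifact differently.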

Someone please correct any misconceptions / outdated info on my part.

On Mon, Nov 11, 2019 at 2:59 PM Sean Horan <[hidden email]> wrote:

Re: Proper way to build a Maven repository without Internet access

stephenconnolly
In reply to this post by Sean Horan
So I know that Sonatype have or had a feature in Nexus that lets you approve
which dependencies can be consumed by developers from its hosted Maven
repo. If you used that, you could then replicate the Nexus storage back-end
to the offline network via sneaker-net (or, better, a DMZ that only has
access to the developer Nexus).

It is unclear whether JFrog has a comparable feature.
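
The sneaker-net transfer could be sketched roughly as follows. This is not a Nexus or Artifactory feature, just plain tar and sha256sum; the function name pack_for_transfer and the directory layout are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Pack an exported repository directory into a tarball and write a SHA-256
# file next to it, so the copy can be verified on the offline network.
# pack_for_transfer is a hypothetical helper; pass absolute paths.
pack_for_transfer() {
  local src_dir="$1" out="$2"
  local out_dir out_name
  out_dir=$(dirname "$out")
  out_name=$(basename "$out")
  tar -czf "$out" -C "$src_dir" . || return 1
  # Run sha256sum from the archive's directory so the checksum file
  # records a relative file name and still verifies after being moved.
  ( cd "$out_dir" && sha256sum "$out_name" > "$out_name.sha256" )
}
```

On the receiving side, `sha256sum -c export.tar.gz.sha256` (run in the directory holding both files) would confirm that the archive arrived intact before it is unpacked.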

Re: Proper way to build a Maven repository without Internet access

Sean Horan-2
I actually got it mostly working.

. is the repository directory in the examples.

There were only a handful of maven-metadata.xml files that were missing
that it was looking for, so on a hunch I copied the
maven-metadata-central.xml to maven-metadata.xml with this one-liner:

find . -name maven-metadata-central.xml | while IFS= read -r f; do cp "$f" "${f%-central.xml}.xml"; done

and HTTP PUT them up with this one-liner:

find . -name maven-metadata.xml | while IFS= read -r f; do art=${f#./}; echo "$f"; curl -w "%{http_code}\n" -s -k -u 'admin:<password>' -T "$f" "http://artifactory.mycompany.com:8081/artifactory/maven-mirror/${art}"; done
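
If the same loop is later extended from the metadata files to the whole local repository, the local-only bookkeeping files probably should not be uploaded. A sketch under that assumption; should_upload and target_url are hypothetical helper names, and the list of skipped files reflects what Maven typically writes into ~/.m2/repository rather than anything confirmed for this setup.

```shell
#!/usr/bin/env bash
# Decide whether a file from a local Maven repository is worth uploading.
# Skips bookkeeping files that only make sense inside ~/.m2/repository.
should_upload() {
  case "$(basename "$1")" in
    _remote.repositories|*.lastUpdated|resolver-status.properties) return 1 ;;
    maven-metadata-*.xml|maven-metadata-*.xml.sha1) return 1 ;;
    *) return 0 ;;
  esac
}

# Map a path printed by `find .` (e.g. ./org/foo/1.0/foo-1.0.pom)
# to the URL it would be PUT to under the given repository base URL.
target_url() {
  local base="${1%/}" path="${2#./}"
  printf '%s/%s\n' "$base" "$path"
}

# Usage sketch (not run here; BASE and the credentials are placeholders):
#   BASE=http://artifactory.mycompany.com:8081/artifactory/maven-mirror
#   find . -type f | while IFS= read -r f; do
#     should_upload "$f" &&
#       curl -s -u 'admin:<password>' -T "$f" "$(target_url "$BASE" "$f")"
#   done
```

Keeping the filter and the URL mapping as separate functions makes it possible to dry-run the loop with echo before anything is actually sent.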

This is what ended up working in settings.xml:

<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0 http://maven.apache.org/xsd/settings-1.0.0.xsd">

    <servers>
        <server>
            <username>admin</username>
            <password>*****</password>
            <id>companybuild</id>
        </server>
    </servers>

    <pluginGroups>
        <pluginGroup>org.sonarsource.scanner.maven</pluginGroup>
    </pluginGroups>

    <profiles>
        <profile>
            <id>artifactory</id>
            <repositories>
                <repository>
                    <id>snapshots</id>
                    <name>libs-snapshot-local</name>
                    <snapshots />
                    <!-- mvn warns about this tag here and does not recognize it;
                         updatePolicy is only valid inside <snapshots> or <releases>.
                         <updatePolicy>always</updatePolicy> -->
                    <url>http://artifactory.mycompany.com:8081/artifactory/libs-snapshot-local</url>
                </repository>
                <repository>
                    <id>releases</id>
                    <name>libs-release-local</name>
                    <snapshots>
                        <enabled>false</enabled>
                    </snapshots>
                    <url>http://artifactory.mycompany.com:8081/artifactory/libs-release-local</url>
                </repository>
            </repositories>
            <pluginRepositories>
                <pluginRepository>
                    <id>plugins-release</id>
                    <name>plugins-release</name>
                    <snapshots>
                        <enabled>false</enabled>
                    </snapshots>
                    <url>http://artifactory.mycompany.com:8081/artifactory/plugins-release</url>
                </pluginRepository>
                <pluginRepository>
                    <id>snapshots</id>
                    <name>plugins-snapshot</name>
                    <snapshots />
                    <url>http://artifactory.mycompany.com:8081/artifactory/plugins-snapshot</url>
                </pluginRepository>
            </pluginRepositories>
        </profile>
    </profiles>

    <activeProfiles>
        <activeProfile>artifactory</activeProfile>
    </activeProfiles>

    <mirrors>
        <mirror>
            <id>artifactory</id>
            <mirrorOf>*</mirrorOf>
            <name>Artifactory</name>
            <url>http://artifactory.mycompany.com:8081/artifactory/maven-mirror/</url>
        </mirror>
    </mirrors>
</settings>

This settings file is cropped from what we're using and it's messy but I'm
not touching it for now.
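
Since the goal is that the build never leaves the internal network, a quick sanity check on a settings.xml like the one above could look like this. It is only a sketch: external_urls is a made-up helper, it relies on simple text matching rather than real XML parsing, and it will also pick up any url elements that sit inside comments.

```shell
#!/usr/bin/env bash
# Print every URL found in <url> elements of a settings.xml whose host
# differs from the expected internal host. Prints nothing if all match.
# external_urls is a hypothetical helper, not part of Maven.
external_urls() {
  local settings="$1" internal_host="$2"
  # Join all lines first so URLs wrapped onto their own line are still found
  tr -d '\n\r' < "$settings" |
    grep -o '<url>[^<]*</url>' |
    sed -e 's/<url>[[:space:]]*//' -e 's/[[:space:]]*<\/url>//' |
    awk -v h="$internal_host" '{
      u = $0
      sub(/^https?:\/\//, "", u)   # strip the scheme
      split(u, parts, /[:\/]/)     # host is the text before the first ":" or "/"
      if (parts[1] != h) print $0
    }'
}
```

For example, `external_urls ~/.m2/settings.xml artifactory.mycompany.com` would list any repository or mirror URL that points somewhere other than the internal Artifactory host.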

What it's failing on now is a tgz we have in the repo that Maven thinks is
corrupt, even though I can untar it without problems.  I'm asking for help at
my company, but I may just have to dig into it.
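
The thread does not say why Maven considers the tgz corrupt. One thing that may be worth ruling out (an assumption, not a diagnosis) is a mismatch between the file and the .sha1 checksum file stored next to it in the repository. A minimal sketch; verify_sha1 is a hypothetical helper and it assumes the sha1sum tool from GNU coreutils.

```shell
#!/usr/bin/env bash
# Compare a file's SHA-1 with the value stored in its .sha1 sidecar file.
# Returns 0 on match, 1 on mismatch, 2 when there is no sidecar file.
# verify_sha1 is a hypothetical helper name.
verify_sha1() {
  local file="$1" expected actual
  [ -f "$file.sha1" ] || { echo "no sidecar: $file.sha1" >&2; return 2; }
  # A sidecar may hold just the hash or "hash  filename"; keep the first field
  expected=$(awk 'NR==1 {print tolower($1)}' "$file.sha1")
  actual=$(sha1sum "$file" | awk '{print $1}')
  [ "$expected" = "$actual" ]
}
```

Calling `verify_sha1 path/to/artifact.tgz` on the copy in the local repository and on a freshly downloaded copy from Artifactory would show whether the checksum file and the archive still agree.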

--sean

On Wed, Nov 13, 2019 at 3:02 PM Stephen Connolly <
[hidden email]> wrote:

Re: Proper way to build a Maven repository without Internet access

Henrik-2
In reply to this post by stephenconnolly
JFrog has an air-gap how-to. Do a search for "using-artifactory-with-an-air-gap".
Maybe that can help you?

Henrik
 
